Data source: The study was conducted at a tertiary care center and included JOAG patients who required trabeculectomy. The study adhered to the tenets of the Declaration of Helsinki and was approved by the Institutional Ethics Committee (All India Institute of Medical Sciences, New Delhi).

Patient inclusion criteria: The study included JOAG patients who met the following criteria: diagnosis between 10 and 40 years of age, an IOP >18 mmHg in the eyes on two or more occasions, open angles on gonioscopy in both eyes, glaucomatous optic neuropathy, and a need for trabeculectomy for IOP control. Only patients with a minimum of five years of follow-up after surgery were included in the study.

Patient exclusion criteria: Patients were excluded from the study if they had a history of steroid use; any other retinal or neurologic pathology; evidence of secondary causes of raised IOP such as pigment dispersion, pseudoexfoliation, or trauma; any pathology detected on gonioscopy, such as angle recession, irido-trabecular contact, or peripheral anterior synechiae; or a history of any previous ocular surgery.

Data collection: All subjects underwent a detailed history and examination. IOP was measured using Goldmann applanation tonometry and recorded preoperatively as the baseline IOP (IOP prior to medication) and at follow-up visits (1, 3, 6, 12, 24, 36, 48 and 60 months post-surgery). The Canadian Target IOP Workshop criteria, with additional central visual field loss criteria according to the Enhanced Glaucoma Severity Score, were used to stage the disease at presentation. Indications for trabeculectomy were uncontrolled IOP on maximally tolerable medical therapy (as defined by the treating doctor) or non-compliance with medical management.

Surgical technique and intraoperative factors: Prior to surgical intervention, eyes were randomly assigned to the use of MMC. The eyes underwent trabeculectomy under peribulbar anesthesia. A fornix-based trabeculectomy was performed by a single glaucoma specialist (VG). Intraoperatively, the Tenon’s thickness was graded as thin or thick. A partial-thickness rectangular superficial scleral flap (hinged at the limbus) was dissected. In eyes assigned to MMC use, MMC 0.2 mg/ml was applied for 2 min using a sponge placed subconjunctivally, and the area was then irrigated. The scleral flap dimensions were measured with an intraoperative caliper. Paracentesis for gradual decompression was followed by fistula formation. Two different techniques were used: a traditional method using Vannas scissors to form a 1 mm x 2 mm sclerostomy, or a 1 mm Kelly scleral punch. The size of the sclerostomy made with the Kelly scleral punch was further varied by using the punch one, two, three, or four times consecutively. A peripheral iridectomy (PI) was performed and its size was graded as small, moderate, or large. Two releasable sutures and one fixed suture were used to close the superficial scleral flap. The conjunctiva was sutured with 10–0 nylon. Patients were started on a topical antibiotic-steroid combination for six weeks postoperatively and a topical cycloplegic for 2 weeks. The patients were followed up for over five years and any adverse events related to the surgery were recorded.

ML model training: The steps involved in ML model training are given below:

Data pre-processing: Data pre-processing, a mandatory step in an ML study, was carried out. All records in the data were encoded and indexed by a unique patient identifier (Figure 1a). In all, we had 13 parameters: two demographic parameters (gender and age at diagnosis); six preoperative parameters (presenting BCVA, cup-disc ratio (CDR), baseline IOP, disease severity, duration of preoperative medical treatment, and Tenon thickness); and five intraoperative parameters (administration of intraoperative MMC, scleral flap length and width, scleral fistulization technique, and PI size). To develop and evaluate ML models in the present study, we used only demographic, preoperative and intraoperative surgical data to predict the trabeculectomy outcome.

Outcomes assessed: Two criteria were used to define success at the fifth-year follow-up. In Criterion A, success was defined as a post-operative IOP ≤ 18 mmHg, and in Criterion B, success was defined as a ≥ 50% reduction in IOP from baseline (Figure 1b). Among the eyes that achieved success, those not on any supplemental glaucoma medication were considered a complete success, and those that required supplemental medical therapy were considered a qualified success. Eyes requiring a repeat glaucoma surgery or with IOP ≤ 4 mmHg were considered failures.
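
The outcome definitions above can be expressed as a small labelling function. The following is a minimal Python sketch, not the code used in the study: the `EyeOutcome` record, its field names, and the `classify` function are hypothetical, and the sketch assumes that an eye that does not meet the IOP threshold of the chosen criterion is counted as a failure.

```python
from dataclasses import dataclass


@dataclass
class EyeOutcome:
    """Hypothetical record for one eye at the fifth-year visit."""
    baseline_iop: float    # mmHg, IOP prior to medication
    iop_5yr: float         # mmHg, IOP at the fifth-year follow-up
    on_medication: bool    # supplemental glaucoma medication in use
    repeat_surgery: bool   # a repeat glaucoma surgery was required


def classify(eye: EyeOutcome, criterion: str) -> str:
    """Label an eye as 'complete success', 'qualified success' or 'failure'."""
    # Repeat surgery or IOP <= 4 mmHg is a failure under both criteria.
    if eye.repeat_surgery or eye.iop_5yr <= 4:
        return "failure"

    if criterion == "A":
        # Criterion A: post-operative IOP <= 18 mmHg.
        met = eye.iop_5yr <= 18
    elif criterion == "B":
        # Criterion B: >= 50% reduction in IOP from baseline.
        reduction = (eye.baseline_iop - eye.iop_5yr) / eye.baseline_iop
        met = reduction >= 0.5
    else:
        raise ValueError("criterion must be 'A' or 'B'")

    # Assumption: not meeting the chosen threshold counts as a failure.
    if not met:
        return "failure"
    return "qualified success" if eye.on_medication else "complete success"
```

For the binary classification task, both success labels would be collapsed into a single "success" class.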

Feature selection: Feature selection is an important step that eliminates redundant and irrelevant input parameters from the training data. We carried out feature selection using the feature selection modules in the WEKA package (v3.8.4). We used all the available ‘Attribute Evaluators’ with different ‘Search Algorithms’ (FSA) within the WEKA package to identify the most discriminatory parameters.

Cross-validation technique: The accuracy of each new model was assessed using cross-validation. We used ten-fold cross-validation, in which the whole dataset was divided into ten subsets; nine subsets were used for training and the remaining subset was used for testing. The procedure was repeated ten times so that each subset was used once for testing the model.
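
The splitting scheme described above can be illustrated in a few lines of Python. This is a minimal sketch using only the standard library, not the WEKA implementation used in the study: the function name `k_fold_indices`, the shuffling seed, and the plain (unstratified) assignment of records to folds are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once so that folds do not follow the original record order.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # The other k - 1 folds together form the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each record appears in exactly one test fold, so every record contributes one out-of-sample prediction to the pooled performance estimate.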

Tools and techniques used for ML model building and evaluation: We used the freely available WEKA package for the development of ML models. The binary classification problem addressed in the study was to predict whether the surgery was successful at the fifth year of follow-up. The training set was used to evaluate the performance of 80 algorithms from the eight main classifier groups (Bayes, functions, lazy, meta, mi, misc, rules, trees) available in the WEKA (v3.8.4) package. All the probable combinations of input parameters were used, leading to more than 350 models.

External validation of the prioritized ML models: External validation is imperative to assess model reproducibility and generalizability in terms of calibration and discrimination. An external validation dataset comprising five unrelated JOAG patients was used to validate the performance of the three prioritized prediction models.

Statistical analysis: The Shapiro-Wilk test was used to assess the normality of the data. Preoperative and intraoperative factors in the success and failure groups were compared using the independent t-test for normally distributed continuous variables, the Mann-Whitney U test for non-normally distributed continuous variables, Fisher’s exact test for categorical variables when more than 20% of cells had an expected count of less than five, and the Chi-square test for categorical variables when less than 20% of cells had an expected count of less than five.
Different models were evaluated, compared, and prioritized based on their accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC) and mean area under the receiver operating characteristic curve (AUROC). The ROC curve was generated using the Python scikit-learn library. The relative clinical importance of the selected input parameters was analysed and ranked. Since there were both categorical and continuous variables, Kendall’s correlation coefficient was calculated to assess the relationships between the input parameters. Statistical analyses were performed using SPSS (SPSS for Windows, v. 26.0; SPSS, Inc., Chicago, IL) and the R statistical package (v3.5).
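
For reference, the model evaluation metrics named above can be computed from binary labels as follows. This is an illustrative Python sketch using only the standard library. It is not the WEKA or scikit-learn code used in the study, the function names are hypothetical, and the labels are assumed to be coded 1 for success and 0 for failure. The AUROC is computed here through its rank interpretation: the probability that a randomly chosen success receives a higher score than a randomly chosen failure, with ties counted as one half.

```python
import math


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # MCC denominator; zero when any row or column of the table is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auroc(y_true, scores):
    """AUROC as the fraction of (success, failure) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which makes it more informative when the success and failure groups differ in size.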