Machine learning of intraoperative variables to test feasibility of multivariable prediction modelling for postoperative complications in thoracic surgery: a prospective cohort study

Biniam Kidane; Atif Ul Aftab; Eagan J. Peters; Sadeesh Srinathan; Gordon Buduhan; Lawrence Tan; Emma Poole; Michael Domaratzki

doi:10.21037/jtd-2025-1-2513

Original Article

Biniam Kidane^1,2,3,4#, Atif Ul Aftab^5#, Eagan J. Peters⁶, Sadeesh Srinathan^1,7, Gordon Buduhan⁸, Lawrence Tan¹, Emma Poole¹, Michael Domaratzki⁹

¹Section of Thoracic Surgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Canada; ²CancerCare Manitoba Research Institute, University of Manitoba, Winnipeg, Canada; ³Department of Physiology & Pathophysiology, University of Manitoba, Winnipeg, Canada; ⁴Department of Biomedical Engineering, University of Manitoba, Winnipeg, Canada; ⁵Department of Computer Science, University of Manitoba, Winnipeg, Canada; ⁶Department of Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, Canada; ⁷Population Health Research Institute, McMaster University, Hamilton, Canada; ⁸Division of Thoracic Surgery, University of British Columbia Faculty of Medicine, University of British Columbia, Kelowna, Canada; ⁹Department of Computer Science, Western University, London, Canada

Contributions: (I) Conception and design: B Kidane, M Domaratzki; (II) Administrative support: B Kidane, M Domaratzki; (III) Provision of study materials or patients: B Kidane, S Srinathan, G Buduhan, L Tan; (IV) Collection and assembly of data: A Ul Aftab, E Poole; (V) Data analysis and interpretation: B Kidane, A Ul Aftab, M Domaratzki; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

^#These authors contributed equally to this work as co-first authors.

Correspondence to: Biniam Kidane, MD, MSc. Section of Thoracic Surgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, GH611 - Health Sciences Centre, 820 Sherbrook Street, Winnipeg, MB R3A 1R9, Canada; CancerCare Manitoba Research Institute, University of Manitoba, Winnipeg, Canada; Department of Physiology & Pathophysiology, University of Manitoba, Winnipeg, Canada; Department of Biomedical Engineering, University of Manitoba, Winnipeg, Canada. Email: bkidane@hsc.mb.ca.

Background: Among patients undergoing thoracic surgery, the impact of intraoperative variables on postoperative complications is unclear. Because patients receiving one-lung ventilation (OLV) experience further unique intraoperative stressors, a knowledge gap exists around the impact of intraoperative predictor variables that may not be well-accounted for in existing risk prediction models. The objectives of this study were therefore to (I) assess the feasibility of measuring intraoperative variables using modern machine learning techniques; (II) determine if machine learning of intraoperative parameters predicts postoperative complications; and (III) compare model performance of machine learning against a set of known preoperative predictors.

Methods: A prospective cohort study was performed of consecutive patients undergoing thoracic surgery with OLV at a single Canadian centre. Intraoperative ventilatory and hemodynamic data were captured. Machine learning was used to predict postoperative complications based on: (I) preoperative variables only; (II) intraoperative variables only; and (III) combined data from both preoperative and intraoperative variables. Machine learning classification algorithms using random forests, support vector machines, and logistic regression models were compared. These models were analyzed with and without synthetic minority over-sampling technique (SMOTE). All models used the same split of training and test data.

Results: Of 121 surgeries, there were 54 (44.6%) sublobar resections, 51 (42.2%) lobectomies, 3 (2.5%) pneumonectomies, and 13 (10.7%) non-pulmonary surgeries. 100 (82.6%) surgeries were minimally invasive. Mean operative time was 122.3±76.4 minutes. Postoperative complications occurred in 35 (28.9%) patients. Support vector machine classification algorithm using SMOTE of intraoperative data alone predicted complications with the highest F1 score of 0.8298. The most important parameters driving prediction accuracy were (I) duration of operative time; (II) mean and maximum heart rate; and (III) fraction of inspired oxygen. Using preoperative data alone and adding preoperative data to intraoperative data did not improve F1 score (max F1 score =0.6643 and 0.7647 respectively).

Conclusions: Machine learning of intraoperative data may be feasible to predict postoperative complications. Therefore, intraoperative data could be used to augment existing risk prediction models. However, this study is hypothesis-generating only. Analysis of larger-scale samples, focused on specific types of complications, is required to improve predictions.

Keywords: Complications; intraoperative; machine learning; one-lung ventilation (OLV); thoracic surgery

Submitted Dec 01, 2025. Accepted for publication Feb 13, 2026. Published online Apr 27, 2026.

Highlight box

Key findings

• Machine learning of intraoperative data among patients undergoing thoracic surgery using one-lung ventilation is feasible.

• Machine learning of intraoperative variables can describe risk of postoperative complications using multiple models as measured by F1-score with and without synthetic minority over-sampling technique.

What is known and what is new?

• Previous research shows that several preoperative variables can be used to predict postoperative complications among patients undergoing thoracic surgery with area under the curve up to 0.737 in the largest datasets.

• The results of this study show that intraoperative variables may have the potential to supplement known preoperative predictors to risk-stratify patients for postoperative complications, as demonstrated by F1-score up to 0.8298.

What is the implication, and what should change now?

• These findings generate the hypothesis that intraoperative parameters could be analyzed in addition to preoperative variables to augment existing risk-prediction models for postoperative complications.

• Additional studies using machine learning should be performed on a larger scale to confirm the results of this study and relate them to specific complications by type.

Introduction

Background

Thoracic surgeries are associated with high rates of postoperative complications. Patients who experience complications are at risk for increased in-hospital mortality and length-of-stay (1). Multiple preoperative variables are associated with the development of postoperative complications. Examples include older age, male sex, higher body mass index, prior smoking history, increased baseline medical comorbidities, lower pulmonary function measurements, and reduced health-related quality-of-life (2-4). Identifying these predictor variables is important in order to inform perioperative risk management strategies (5). In contrast to preoperative variables, relatively few intraoperative variables have been identified as predictors of postoperative complications. Established intraoperative predictors include need for open thoracotomy and extended pulmonary resections (2,3). However, such factors may still be anticipated in the preoperative setting. By comparison, other intraoperative variables may confer increased risk of complications but are difficult to account for prior to surgery. These include operative duration, surgical complexity, and the use of one-lung ventilation (OLV).

Intraoperative mechanical ventilation may be a driver of lung injury in up to 33% of patients undergoing major surgery and is associated with early postoperative mortality (6). Ventilator-induced lung injury may occur through several mechanisms: alveolar overdistention with high tidal volume ventilation (volutrauma), high airway pressures (barotrauma), and systemic inflammation induced by alveolar trauma and high inspired oxygen content (biotrauma) (7,8). Compared to nonthoracic populations, patients undergoing thoracic surgery experience further increased rates of pulmonary complications (9-12). There are two likely explanations. First, most patients who have lung surgery have pre-existing lung pathology, which makes them more vulnerable to damage from mechanical ventilation. Second, these operations often use OLV, whereby the lung being operated on is collapsed. Therefore, the collective ventilatory trauma forces are exerted on a single ventilated lung instead of being divided between two lungs. Although lung protective ventilation strategies have been proposed to minimize the risk of OLV, there is variation in the definition and implementation of these strategies (13,14).

Rationale and knowledge gap

Because patients receiving OLV for thoracic surgery experience increased intraoperative stressors, there may be predictor variables not well-accounted for in the existing risk prediction models that are based primarily on preoperative factors. Without identifying pertinent intraoperative variables among these patients, it is difficult to mitigate their role in the development of postoperative complications. As such, a knowledge gap exists around the impact of intraoperative variables on postoperative complications among patients undergoing OLV for thoracic surgery.

Objective

The objectives were to (I) assess the feasibility of measuring intraoperative parameters systematically using modern data capture and machine learning techniques; (II) generate hypotheses to determine if these parameters could predict postoperative complications among patients undergoing thoracic surgery with OLV; and (III) explore model performance relative to a set of known preoperative predictor variables. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1-2513/rc) (15).

Methods

A prospective cohort study was performed of eligible consecutive patients at Health Sciences Centre Winnipeg in Canada from 2017–2018. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the University of Manitoba Health Research Ethics Board (No. HS20888). All participants gave informed consent before taking part in this study. Inclusion criteria were adult inpatients (age >18 years) admitted for elective thoracic surgery requiring OLV. Demographic and baseline comorbidity variables were collected prospectively. Intraoperative ventilatory and hemodynamic parameters were captured using TrendFace Solo software (ixellence) (16). These included heart rate, arterial line blood pressures, tidal volume, fraction of inspired oxygen (FiO₂), airway pressures, and end-tidal carbon dioxide. The primary outcome was the occurrence of any postoperative complication. These were prospectively assessed using the Ottawa Thoracic Morbidity and Mortality System. This is a standardized and validated approach to document and grade complications (17). Complications were collected by the clinical team and adjudicated in duplicate: (I) initial adjudication by the attending thoracic surgeon; and (II) final adjudication by all thoracic surgeons after formal review in quality assurance meetings. There were no actions to blind assessment of the predictor or outcome variables. Formal sample size calculations were not performed as sample size was determined by practical considerations including time, cost, and available personnel.

Statistical analysis

Three machine learning models were compared: random forests, support vector machines, and logistic regression. Details of the classification algorithms, hyperparameter tuning, and handling of time series data in the machine learning process are described in Appendix 1. All code was implemented in Python using scikit-learn and imbalanced-learn (18,19). The models were used to predict postoperative complications based on three sets of parameters: (I) preoperative variables only; (II) intraoperative variables only; and (III) combined variables from both preoperative and intraoperative data. Because the number of intraoperative observations varied with surgery duration, summary features such as mean and maximum heart rate were created to represent the intraoperative data. This allowed all machine learning models to work with a consistent number of features per patient. Missing predictor data were handled either by median imputation using SimpleImputer from scikit-learn or, for variables with large amounts of missing data, by removing the variable entirely.
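The feature-construction and imputation steps described above can be illustrated with a minimal sketch. It uses only the Python standard library rather than the scikit-learn SimpleImputer named in the text, and all patient identifiers, traces, and values are hypothetical and are not study data. The helper names `summarize_series` and `impute_median` are assumptions introduced for illustration only.

```python
from statistics import mean, median

def summarize_series(values):
    """Collapse a variable-length intraoperative series into fixed-size features."""
    observed = [v for v in values if v is not None]
    if not observed:
        return {"mean": None, "max": None}
    return {"mean": mean(observed), "max": max(observed)}

def impute_median(column):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]

# Hypothetical heart-rate traces (beats/min); lengths differ with surgery duration.
heart_rate = {
    "patient_a": [72, 75, 80, 78],
    "patient_b": [90, 95, 110, 105, 100, 98],
    "patient_c": [65, 70],
}

# Every patient ends up with the same two features regardless of trace length.
features = {pid: summarize_series(trace) for pid, trace in heart_rate.items()}

# Hypothetical preoperative column with one missing value.
fev1_percent = [84.0, None, 90.0, 70.0]
fev1_imputed = impute_median(fev1_percent)
```

In a train/test workflow, the median would normally be computed from the training rows only and then applied to the test rows, so that no information from the test set reaches the model.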

The dataset was randomly split into two sets: the training set (80%) and the test set (20%). Models were trained on the training set only. All models used the same split of training and test data. Results were obtained using 5-fold cross-validation repeated ten times and a grid search over hyperparameters. The hyperparameters were: (I) number of trees and maximum tree depth for random forests; (II) regularization parameter C, kernel, and gamma for support vector machines; and (III) inverse of regularization strength C and penalty for logistic regression. Synthetic minority over-sampling technique (SMOTE) was used to compensate for the imbalance in surgical outcomes (i.e., those with postoperative complications versus those without) (20). The training set was balanced to equal numbers of complication and non-complication instances. SMOTE was applied strictly within the training set only and not to the test set. A summary of the process to obtain data, over-sampling methods, and machine learning to predict outcomes is depicted in Figure 1 (21).
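The order of operations described above (split first, then over-sample the training portion only) can be sketched as follows. This is a simplified, standard-library stand-in for SMOTE and not the imbalanced-learn implementation used in the study: it interpolates between a minority row and one of its nearest minority neighbours. The function names, the two features, and the randomly generated data are all assumptions for illustration.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions roughly equal."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def smote_like(X, y, minority=1, k=3, seed=0):
    """Balance a binary training set by adding interpolated minority rows."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # Nearest minority neighbours of the base row, excluding the row itself.
        others = [r for r in minority_rows if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out

# Hypothetical data: 20 patients, 6 with complications (label 1).
# Features are [operative minutes, mean heart rate]; values are random.
data_rng = random.Random(42)
y = [1] * 6 + [0] * 14
X = [[data_rng.uniform(60, 240), data_rng.uniform(55, 110)] for _ in y]

train_idx, test_idx = stratified_split(y, test_fraction=0.2, seed=1)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

# Over-sampling touches the training rows only; the test rows stay as observed.
X_bal, y_bal = smote_like(X_train, y_train, minority=1, k=3, seed=1)
```

Keeping the synthetic rows out of the test set matters because performance measured on interpolated patients would not reflect performance on real ones.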

Figure 1 Flow diagram summarizing the stepwise process to obtain perioperative data, the over-sampling methods used to process the data, and the machine learning used to predict postoperative complications. This figure was reused/adapted with permission from Ul Aftab’s Master’s degree thesis: Ul Aftab A. Predicting Perioperative Outcomes from Surgical Data during One Lung Ventilation. 2020 (21).

For all models, F1 score was measured. This score summarizes the classification performance of a model and is calculated as F1 = 2pr/(p + r), where p represents precision and r represents recall in the test set. Here, p = true positive/(true positive + false positive) and r = true positive/(true positive + false negative) (22). The F1 score ranges between 0 and 1, with 1 representing perfect classification. F1 score was used primarily for model comparison, not for individual risk estimation. At a chosen threshold, F1 score is considered more informative than area under the curve in imbalanced datasets where the presence and absence of an outcome are not equally frequent (i.e., occurrence of postoperative complications in this study) (22). For comparison, a receiver operating characteristic curve was also constructed using predicted probabilities.
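As a worked illustration of the formula above, the following sketch computes precision, recall, and F1 from predicted and observed labels. The labels are hypothetical and the function name `precision_recall_f1` is an assumption; the study itself used scikit-learn, and the exact averaging behind the reported scores is not specified here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision (p), recall (r), and F1 = 2pr/(p + r) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical test-set labels: 1 = complication, 0 = no complication.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

# Two true positives, one false positive, one false negative.
p, r, f1 = precision_recall_f1(y_true, y_pred)
```

Here p = 2/3 and r = 2/3, so F1 = 2/3. The five true negatives do not enter the calculation at all, which is one reason the score is sensitive to how well the minority class is handled.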

F1 score was used because the primary outcome of complications was relatively infrequent. Therefore, both correctly identifying high-risk patients (i.e., promoting sensitivity) and minimizing false positives (i.e., promoting precision) are clinically important. F1 score permits a balance between these two competing metrics. This reflects how surgeons balance competing interests in real-world clinical decision-making, where perioperative interventions are made based on a given patient’s balance of probabilities for an outcome. Furthermore, because this was an exploratory study, F1 score allows for the identification of predictive signals without proceeding immediately to claims of clinical utility, which a small sample size could render misleading. As such, F1 score suits the exploratory intent of the study while also offering a clinically relevant evaluation of classification.
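The balance between precision and recall can be made explicit by viewing F1 as one member of the general F-beta family. This generalization is standard but was not used in this study; it is shown only to clarify what equal weighting means, with p and r defined as above and β an assumed weighting parameter.

```latex
F_\beta = \frac{(1+\beta^2)\,p\,r}{\beta^2\,p + r},
\qquad
F_1 = \frac{2\,p\,r}{p + r} \quad (\beta = 1)
```

Setting β greater than 1 gives recall more weight than precision, and β less than 1 does the reverse, so F1 corresponds to treating the two as equally important.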

Results

Of 121 enrolled patients, there were 54 (44.6%) sublobar resections, 51 (42.2%) lobectomies, 3 (2.5%) pneumonectomies, and 13 (10.7%) non-pulmonary surgeries. Minimally invasive surgery was used for 100 (82.6%) patients. Mean operative time was 122.3±76.4 minutes. There were no deaths; 11 patients had missing forced expiratory volume in one second data and 17 patients had missing diffusing capacity of the lungs for carbon monoxide data. Due to a large amount of missing data for the following variables, these were removed from the analysis: mean arterial blood pressure, systolic arterial blood pressure, positive end-expiratory pressure, and peak inspiratory pressure. Postoperative complications occurred in 35 (28.9%) patients, with 22 (18.1%) being pleural or pulmonary. There were no missing outcome data. A comparison of patients with and without complications can be found in Table 1.

Table 1

Baseline and treatment characteristics of patients with and without postoperative complications undergoing thoracic surgery with one-lung ventilation

Variable	With complications (n=35)	Without complications (n=86)	P value
Age, years	66.4±12.1	63.0±12.8	0.18
Male sex	21 (60.0)	46 (53.5)	0.55
BMI (kg/m²)	26.6±5.4	28.8±5.8	0.08
% predicted FEV₁	84.9±22.1	86.0±21.1	0.81
% predicted DLCO	75.7±16.4	82.8±17.9	0.08
Pack-years smoked	21.2±16.7	23.2±23.4	0.66
Surgeon			0.27
1	12 (34.3)	33 (38.4)
2	5 (14.3)	21 (24.4)
3	11 (31.4)	14 (16.3)
4	7 (20.0)	18 (20.9)
Operation type			0.005
VATS wedge resection	8 (22.9)	42 (48.8)
VATS lobectomy	12 (34.3)	28 (32.6)
Open sublobar resection	3 (8.6)	1 (1.2)
Open lobectomy	7 (20.0)	4 (4.7)
Pneumonectomy	0 (0.0)	3 (3.5)
Open/VATS non-pulmonary	5 (14.3)	8 (9.3)
Charlson Comorbidity Index ≥3	11 (31.4)	24 (27.9)	0.82
Operative time (minutes)	163.0±90.1	106.5±62.3	<0.001

Data are presented as mean ± SD or n (%). BMI, body mass index; DLCO, diffusing capacity of the lungs for carbon monoxide; FEV₁, forced expiratory volume in one second; SD, standard deviation; VATS, video-assisted thoracoscopic surgery.

Table 2 summarizes F1 scores yielded by the different machine learning models. Each model was derived from the same sample of participants (n=121) and postoperative complications (35/121). Without SMOTE, F1 scores ranged from 0.66–0.73. The best F1 score without SMOTE was derived from support vector machines with a radial basis function kernel using combined preoperative and intraoperative data (F1 score =0.73±0.03). A principal component analysis of the multi-dimensional support vector machine data was performed. Intraoperative data alone yielded the same prediction accuracy as known and established preoperative predictors. For comparison, a receiver operating characteristic curve was also constructed using predicted probabilities from the most accurate model without use of SMOTE. Area under the curve was 0.73±0.06. After applying SMOTE to account for the disproportionately fewer patients with complications compared to those without complications, intraoperative data alone yielded an F1 score of 0.83±0.009 using support vector machine classification with a radial basis function kernel. The most important parameters driving prediction accuracy were (I) duration of operative time; (II) mean and maximum heart rate; and (III) FiO₂.

Table 2

F1 scores of different algorithms to predict postoperative complications among patients undergoing thoracic surgery with one-lung ventilation

Classification algorithm	Dataset	F1 score without SMOTE	F1 score with SMOTE
Random forests	Combined	0.6604	0.6783
	Intraoperative	0.6642	0.6537
	Preoperative	0.6643	0.5807
Support vector machine	Combined	0.7338	0.7647
	Intraoperative	0.6627	0.8298
	Preoperative	0.6627	0.5596
Logistic regression	Combined	0.6903	0.7163
	Intraoperative	0.6627	0.6028
	Preoperative	0.6627	0.6175

SMOTE, synthetic minority over-sampling technique.

The F1 scores by themselves cannot be used to predict patient-level outcome probabilities. This is because the F1 score is a model-level metric: it summarizes the harmonic mean of precision and recall of a classification model across a dataset at a specific decision threshold. As such, the F1 scores here are defined per model and serve to compare model performance; they are not defined per individual and cannot be assigned to an individual patient. Clinical risk should therefore be evaluated using predicted probabilities and decision curve analysis. F1 scores may then be used separately to compare classification performance among the candidate models to assist with appropriate model selection.

Discussion

Key findings

These results suggest that (I) systematic and real-time analysis of intraoperative data is feasible; (II) analysis of intraoperative data may help predict postoperative complications among patients undergoing thoracic surgery with OLV; and (III) models using intraoperative data could be used to predict complications in tandem with known preoperative predictors should the hypothesis generated by this study be confirmed on a larger scale. However, discussion around these findings must also be contextualized with the acknowledgement that the variables identified are predictive markers, not necessarily causal risk factors.

Strengths and limitations

This study is limited by its small sample size and the associated low event rates. Thus, there is a risk of over-fitting. This, combined with the high dimensionality of intraoperative variables, may inflate F1 scores and overestimate model performance. As a result, the model could generalize poorly to other patients. Over-fitting could also distort estimates of predictor importance, leading clinicians to incorrect conclusions about which intraoperative variables matter. Multiple mitigation strategies were used to address this risk, including the use of SMOTE as the over-sampling technique and F1 score as the measure of model performance. However, studies with larger sample sizes and event rates are still required to confirm these findings. In addition, larger sample sizes will improve interpretability, as types of complications (e.g., respiratory versus nonrespiratory) were not analyzed separately due to the small sample size. By not distinguishing types of complications, we may have masked organ-specific risk relationships, thereby limiting the potential clinical applicability of the model.

Another key limitation is that the analysis did not use nested cross-validation, which introduced optimistic bias to the results. F1 score also limits the understanding of which error type (i.e., false positives or false negatives) is more important because the calculation of F1 weighs them both equally. Therefore, F1 does not account for the fact that false negatives and false positives have different consequences, which is often crucial in clinical decision-making. Moreover, calibration and decision curve analysis were not employed, which means that the accuracy of predicted probabilities is unknown. However, as this analysis was exploratory and not intended to inform clinical risk estimation, these assessments will be deferred to future confirmatory analyses on a larger scale. Several potential intraoperative predictor variables were removed from the analysis because of large amounts of missing data. We were therefore unable to assess whether they demonstrated an association with postoperative complications. The low event rate of specific complications necessitated the generic primary outcome of any complication, which also limits the ability to make inferences about specific complications in relation to intraoperative variables.

Nevertheless, this limitation posed by the primary outcome may in turn highlight a possible strength. Prediction accuracy would be expected to decrease in the absence of a coherent, internally consistent outcome. For example, it is not likely that urinary tract infection and respiratory failure are caused by the same constellation of intraoperative risk factors. Combining different complications such as these into one outcome risks diluting the relationship between intraoperative exposures and outcomes, thereby masking true associations that may exist. Despite this, intraoperative data demonstrated good prediction of generic complications. As such, studies with larger sample sizes and higher event rates of specific complications could potentially yield improved model performance for those complications.

Comparison with similar research

Previous large-scale studies have assessed the risk of developing complications after thoracic surgery (2,3). The European Society of Thoracic Surgeons’ (ESTS) EuroLung1 and EuroLung2 analyses, derived from more than 82,000 pulmonary resections, report area under the curve of 0.710 and 0.737 respectively for predicting postoperative complications (2). The Society of Thoracic Surgeons General Thoracic Surgery Database (STS GTSD), derived from more than 27,000 pulmonary resections for lung cancer, reports area under the curve of 0.70 for predicting composite major postoperative morbidity and mortality (3). Whereas the ESTS and STS GTSD estimates are based primarily on preoperative variables among pulmonary resections, our study also included non-pulmonary resections and offers hypothesis-generating F1 scores using intraoperative factors. This suggests the possibility of augmenting the criterion standard prediction accuracy of the STS GTSD and ESTS databases with additional research incorporating machine learning of intraoperative variables (23).

Explanations of findings

The results suggest that increased operative duration may contribute to a higher risk of complications. This is consistent with recent research among patients undergoing pulmonary lobectomy (24). There are several interrelated mechanisms to which this may be attributed. Increased operative duration entails increased exposure of the ventilated lung to potentially injurious OLV. It may also mean increased surgical trauma and atelectrauma to the portion of the operated lung that is not resected (25). Furthermore, operative duration may account for some degree of surgical complexity. This is because more complex operations require more time to complete (26). They are also more likely to result in complications (27). However, this is not always true. For example, pneumonectomy is associated with higher risk of postoperative complications than lobectomy (28), but tends to take less time to complete (29,30). Our analysis suggests that operative duration is an important risk factor even after controlling for surgical complexity or extent of lung resected. Thus, while it is difficult to disentangle the true mechanism driving the relationship between increased operative duration and higher risk of complications, it appears that increased length of exposure to intraoperative trauma (both ventilatory and surgical) is important.

Mean and maximum intraoperative heart rate were independent parameters driving prediction accuracy of postoperative complications. This is consistent with prior research showing the association between intraoperative nociceptive response index and postoperative complications after lung resection surgery (31). It is also well-known that elevated preoperative heart rate and heart rate variability are risk factors for postoperative cardiopulmonary morbidity in this patient population (32,33). To explain this, heart rate variability may be conceived as a proxy for autonomic nervous system function. Thus, it is theorized that heart rate variability may represent autonomic dysfunction that leads to adverse events (34). Extreme intraoperative variations in heart rate may therefore act as physiologic reactions to intraoperative adverse events or as sources of the adverse events themselves (a relationship which itself is difficult to disentangle within the limits of this small observational study).

Higher intraoperative FiO₂ was the third important driver for prediction accuracy of postoperative complications. This is consistent with previous evidence describing the relationship between elevated intraoperative FiO₂ and postoperative pulmonary complications in patients undergoing pulmonary resection (35,36). The finding supports the goal of lowest feasible FiO₂ as part of lung-protective strategies during OLV surgeries (37). However, there are often competing physiologic and clinically important demands during intraoperative management of gas exchange during OLV (38-40). For example, elevated FiO₂ and heart rate may reflect clinician response to intraoperative stress rather than modifiable exposures to begin with. Additional studies such as this one, which assess real time variations in multiple parameters, are required to further elucidate how the management of these competing demands affects clinically important outcomes in addition to physiological outcomes.

Implications and actions needed

Risk prediction for complications after thoracic surgery remains imperfect. Previous best estimates remain less than area under the curve of 0.80 (2,3), which is below the threshold considered excellent or outstanding (41). These estimates use mostly preoperative predictors that are static reflections of baseline states, usually separated temporally and causally from the pathway between surgery and outcomes. In contrast, the present analysis assessed dynamic predictors that are closer in time to surgery (i.e., intraoperative) and likely closer causally to the pathway between surgery and outcomes. Intraoperative factors that were previously accounted for in risk prediction are largely limited to the need for a thoracotomy (rather than minimally invasive surgery) and extended pulmonary resections (2,3). However, there are additional intraoperative variables that could impact risk of complications. Unfortunately, these are not often assessed systematically or feasibly accounted for. In addition to known intraoperative variables, such as duration and complexity of surgery (24,27), increased heart rate and FiO₂ during OLV are conceivable sources of risk that our study reveals in this small-scale population. Future studies using machine learning to examine the relationship between intraoperative variables and postoperative complications can be performed on a larger scale to confirm these findings, relate them to specific complications, and enhance existing risk-prediction models.

While these results are currently exploratory only, they could translate into clinical practice if confirmed on a larger scale. During an operation, machine learning of intraoperative data would classify patients as “high risk” or “low risk” for postoperative complications. This would be based on real-time features such as vital signs, procedure duration, and potentially additional factors such as blood loss and medication doses. The threshold selected through tuning the model would maximize the F1 score, thus balancing the identification of true complications (recall/sensitivity) and false alarm avoidance (precision). This could result in an ability to change ventilatory management or allow for patients to be enrolled in randomized trials of trigger-based ventilatory changes versus routine care. In the more immediate future, patients identified as high risk intraoperatively could be selected for higher-level monitoring and for a longer period of in-hospital monitoring. In many modern thoracic centers, lung resection is increasingly performed as an outpatient or short-stay procedure, with some people going home the same day and most patients being discharged on postoperative day one. Intraoperative prediction of high risk could inform selection of patients for fast-track pathways.
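The threshold-tuning step mentioned above can be sketched as a simple search over candidate cut-offs. This is a minimal illustration under stated assumptions: the predicted probabilities and outcomes are hypothetical, the function names are invented for this example, and in practice the threshold would be chosen on training or validation data rather than on the test set.

```python
def f1_at_threshold(y_true, probabilities, threshold):
    """F1 when patients with probability >= threshold are flagged high risk."""
    y_pred = [1 if p >= threshold else 0 for p in probabilities]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn  # algebraically equivalent to 2pr/(p + r)
    return 2 * tp / denom if denom else 0.0

def best_threshold(y_true, probabilities):
    """Return the observed probability that maximizes F1 when used as a cut-off."""
    candidates = sorted(set(probabilities))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, probabilities, t))

# Hypothetical validation data: 1 = complication, 0 = no complication.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probabilities = [0.10, 0.30, 0.35, 0.40, 0.55, 0.60, 0.65, 0.90]

threshold = best_threshold(y_true, probabilities)
f1 = f1_at_threshold(y_true, probabilities, threshold)

# Patients at or above the chosen cut-off would be labelled "high risk".
high_risk = [p >= threshold for p in probabilities]
```

With these made-up values the search selects a cut-off of 0.35, which captures all four complications at the cost of two false alarms. A different clinical priority, such as avoiding missed complications at any cost, would favour a different cut-off than the one that maximizes F1.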

Translation into real-time decision support would require several steps. First, internal validation within the existing dataset is needed to ensure reproducibility of the classification threshold. Next, prospective validation in a larger, separate cohort should be done to assess generalizability. External validation across multiple centers is then required to evaluate applicability across different patient populations, surgeon practices, and intraoperative monitoring systems. Last, user testing and workflow integration would need to be conducted to ensure that risk prediction alerts are interpretable and clinically meaningful and that they do not exacerbate alarm fatigue. Ultimately, a continuous quality improvement approach would be undertaken to recalibrate the model as needed and to monitor intervention efficacy relative to resource utilization.
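One possible way to examine the reproducibility of the classification threshold during internal validation is bootstrap resampling. The sketch below assumes this approach; it is not necessarily the method that would be used in practice. All function names, the number of resamples, and the toy data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def best_f1_threshold(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Return the score cut-off with the highest F1 (ties keep the lowest cut-off)."""
    best_t, best_f1 = min(scores), -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


def bootstrap_thresholds(
    y_true: Sequence[int], scores: Sequence[float], n_boot: int = 200, seed: int = 0
) -> List[float]:
    """Re-select the F1-optimal cut-off on n_boot resamples drawn with replacement."""
    if len(set(y_true)) < 2:
        raise ValueError("both outcome classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    thresholds: List[float] = []
    while len(thresholds) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one outcome class
        thresholds.append(best_f1_threshold(ys, [scores[i] for i in idx]))
    return thresholds


def percentile_interval(
    values: Sequence[float], lower: float = 0.025, upper: float = 0.975
) -> Tuple[float, float]:
    """Approximate percentile interval taken directly from the sorted bootstrap values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Hypothetical toy data (1 = complication, 0 = no complication); not study data.
y_true = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.90, 0.15]

boot = bootstrap_thresholds(y_true, scores)
low, high = percentile_interval(boot)
print(f"95% bootstrap interval for the cut-off: {low:.2f} to {high:.2f}")
```

A wide interval would suggest that the selected cut-off depends heavily on which patients happen to be in the sample, which is a plausible concern in a small dataset and a reason to confirm the threshold in a larger, separate cohort.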

Conclusions

This study shows that machine learning of intraoperative variables may be a feasible method to model the occurrence of postoperative complications among patients undergoing thoracic surgery using OLV. Within the constraints of this small dataset, intraoperative variables appear to carry a strong predictive signal. This generates the hypothesis that models based on intraoperative variables could assess the risk of postoperative complications and thereby supplement known models that use preoperative predictor variables. If further characterized on a larger scale, these methods may be used to assess the relationship between intraoperative variables and specific types of postoperative complications. In doing so, intraoperative data could augment existing risk prediction systems that are based predominantly on preoperative predictors. Overall, these findings contribute to the growing body of literature that identifies intraoperative risk factors for complications and the actions that can be taken to prevent them from occurring.

Acknowledgments

Part of this material has been presented in the thesis of A.U.A.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1-2513/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1-2513/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1-2513/prf

Funding: This work was funded by the University of Manitoba Department of Surgery Geographical Full-Time Research Grant.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1-2513/coif). B.K. serves as an unpaid editorial board member of Journal of Thoracic Disease from February 2025 to January 2027. All authors report that this study was supported by the University of Manitoba Department of Surgery Geographical Full-Time Research Grant. The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the University of Manitoba Health Research Ethics Board (No. HS20888). All participants gave informed consent before taking part in this study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Schieren M, Stoelben E, Weber J, et al. Postoperative Complications After Thoracic Surgery-An Analysis From the German Thorax Registry. J Cardiothorac Vasc Anesth 2025;39:2377-83.
2. Brunelli A, Cicconi S, Decaluwe H, et al. Parsimonious Eurolung risk models to predict cardiopulmonary morbidity and mortality following anatomic lung resections: an updated analysis from the European Society of Thoracic Surgeons database. Eur J Cardiothorac Surg 2020;57:455-61.
3. Tong BC, Bonnell LN, Habib RH, et al. The Society of Thoracic Surgeons 2024 Risk Models for Lung Cancer Resection: Continued Refinement and Improved Outcomes. Ann Thorac Surg 2025;119:777-85.
4. Peters EJ, Buduhan G, Tan L, et al. Preoperative quality of life predicts complications in thoracic surgery: a retrospective cohort study. Eur J Cardiothorac Surg 2024;66:ezae301.
5. Odor PM, Bampoe S, Gilhooly D, et al. Perioperative interventions for prevention of postoperative pulmonary complications: systematic review and meta-analysis. BMJ 2020;368:m540.
6. Fernandez-Bustamante A, Frendl G, Sprung J, et al. Postoperative Pulmonary Complications, Early Mortality, and Hospital Stay Following Noncardiothoracic Surgery: A Multicenter Study by the Perioperative Research Network Investigators. JAMA Surg 2017;152:157-66.
7. Peel JK, Funk DJ, Slinger P, et al. Tidal volume during 1-lung ventilation: A systematic review and meta-analysis. J Thorac Cardiovasc Surg 2022;163:1573-1585.e1.
8. Bruinooge AJG, Mao R, Gottschalk TH, et al. Identifying biomarkers of ventilator induced lung injury during one-lung ventilation surgery: a scoping review. J Thorac Dis 2022;14:4506-20.
9. Piccioni F, Spagnesi L, Pelosi P, et al. Postoperative pulmonary complications and mortality after major abdominal surgery. An observational multicenter prospective study. Minerva Anestesiol 2023;89:964-76.
10. Yang CK, Teng A, Lee DY, et al. Pulmonary complications after major abdominal surgery: National Surgical Quality Improvement Program analysis. J Surg Res 2015;198:441-9.
11. Piccioni F, Langiano N, Bignami E, et al. One-Lung Ventilation and Postoperative Pulmonary Complications After Major Lung Resection Surgery. A Multicenter Randomized Controlled Trial. J Cardiothorac Vasc Anesth 2023;37:2561-71.
12. Abram J, Spraider P, Martini J, et al. Flow-controlled versus pressure-controlled ventilation in thoracic surgery with one-lung ventilation - A randomized controlled trial. J Clin Anesth 2025;103:111785.
13. Kidane B, Choi S, Fortin D, et al. Use of lung-protective strategies during one-lung ventilation surgery: a multi-institutional survey. Ann Transl Med 2018;6:269.
14. Kidane B, Peel JK, Seely A, et al. National practice variation in pneumonectomy perioperative care among Canadian thoracic surgeons. Interact Cardiovasc Thorac Surg 2017;25:872-6.
15. von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol 2008;61:344-9.
16. Avidan MS, Zhang L, Burnside BA, et al. Anesthesia awareness and the bispectral index. N Engl J Med 2008;358:1097-108.
17. Ivanovic J, Al-Hussaini A, Al-Shehab D, et al. Evaluating the reliability and reproducibility of the Ottawa Thoracic Morbidity and Mortality classification system. Ann Thorac Surg 2011;91:387-93.
18. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res 2011;12:2825-30.
19. Lemaître G, Nogueira F, Aridas C. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning. J Mach Learn Res 2017;18:1-5.
20. Chawla N, Bowyer K, Hall LO, et al. SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res 2002;16:321-57.
21. Ul Aftab A. Predicting Perioperative Outcomes from Surgical Data during One Lung Ventilation. University of Manitoba; 2020. Available online: https://mspace.lib.umanitoba.ca/server/api/core/bitstreams/fc22b1b2-59e5-4768-8bbd-a1735c39d777/content
22. Ahmed Z, Das S. A Comparative Analysis on Recent Methods for Addressing Imbalance Classification. SN Comput Sci 2024;5:1-18.
23. Towe CW, Kuo EY, Feczko A, et al. The Society of Thoracic Surgeons General Thoracic Surgery Database: 2024 Update on Outcomes and Research. Ann Thorac Surg 2025;119:733-43.
24. de Angelis P, Tan KS, Chudgar NP, et al. Operative Time is Associated With Postoperative Complications After Pulmonary Lobectomy. Ann Surg 2023;278:e1259-66.
25. Bussières JS, Marques E. Atelectasis in one lung ventilation: the good, the bad, and the ugly: a narrative review. Curr Chall Thorac Surg 2023;5:38.
26. Haeuser L, Cone EB, Cole AP, et al. Are work relative value units correlated with operative duration of common surgical procedures? Am J Manag Care 2022;28:148-51.
27. Morris MS, Deierhoi RJ, Richman JS, et al. The relationship between timing of surgical complications and hospital readmission. JAMA Surg 2014;149:348-54.
28. Chen J, Soultanis KM, Sun F, et al. Outcomes of sleeve lobectomy versus pneumonectomy: A propensity score-matched study. J Thorac Cardiovasc Surg 2021;162:1619-1628.e4.
29. Matsuo T, Imai K, Takashima S, et al. Outcomes and pulmonary function after sleeve lobectomy compared with pneumonectomy in patients with non-small cell lung cancer. Thorac Cancer 2023;14:827-33.
30. Wang L, Pei Y, Li S, et al. Left sleeve lobectomy versus left pneumonectomy for the management of patients with non-small cell lung cancer. Thorac Cancer 2018;9:348-52.
31. Okamoto T, Matsuki Y, Ogata H, et al. Association between averaged intraoperative nociceptive response index and postoperative complications after lung resection surgery. Interact Cardiovasc Thorac Surg 2022;35:ivac258.
32. Fu D, Wu C, Li X, et al. Elevated preoperative heart rate associated with increased risk of cardiopulmonary complications after resection for lung cancer. BMC Anesthesiol 2018;18:94.
33. Lai Y, Zhou S, Tian L, et al. Preoperative heart rate variability as a predictor of postoperative pneumonia and lung function recovery in surgical lung cancer patients: a prospective observed study. BMC Cancer 2025;25:404.
34. Keim OC, Bolwin L, Feldmann RE Jr, et al. Heart rate variability as a predictor of intraoperative autonomic nervous system homeostasis. J Clin Monit Comput 2024;38:1305-13.
35. Choi A, Deng H, Fuller M, et al. Intraoperative FiO₂ and risk of impaired postoperative oxygenation in lung resection: A propensity score-weighted analysis. J Clin Anesth 2025;101:111739.
36. Okahara S, Shimizu K, Suzuki S, et al. Associations between intraoperative ventilator settings during one-lung ventilation and postoperative pulmonary complications: a prospective observational study. BMC Anesthesiol 2018;18:13.
37. Piccioni F, Droghetti A, Bertani A, et al. Recommendations from the Italian intersociety consensus on Perioperative Anesthesa Care in Thoracic surgery (PACTS) part 2: intraoperative and postoperative care. Perioper Med (Lond) 2020;9:31.
38. Kidane B, Palma DC, Badner NH, et al. The Potential Dangers of Recruitment Maneuvers During One Lung Ventilation Surgery. J Surg Res 2019;234:178-83.
39. Peel JK, Funk DJ, Slinger P, et al. Positive end-expiratory pressure and recruitment maneuvers during one-lung ventilation: A systematic review and meta-analysis. J Thorac Cardiovasc Surg 2020;160:1112-1122.e3.
40. Kormish J, Ghuman T, Liu RY, et al. Temporal and Spatial Patterns of Inflammation and Tissue Injury in Patients with Postoperative Respiratory Failure after Lung Resection Surgery: A Nested Case-Control Study. Int J Mol Sci 2023;24:10051.
41. Mandrekar JN. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol 2010;5:1315-6.

Cite this article as: Kidane B, Ul Aftab A, Peters EJ, Srinathan S, Buduhan G, Tan L, Poole E, Domaratzki M. Machine learning of intraoperative variables to test feasibility of multivariable prediction modelling for postoperative complications in thoracic surgery: a prospective cohort study. J Thorac Dis 2026;18(4):290. doi: 10.21037/jtd-2025-1-2513
