Using machine learning to predict paediatric 30-day unplanned hospital readmissions: a case-control retrospective analysis of medical records, including written discharge documentation

Huaqiong Zhou; Matthew A. Albrecht; Pamela A. Roberts; Paul Porter; Philip R. Della

doi:10.1071/AH20062

RESEARCH ARTICLE (Open Access)



Huaqiong Zhou ^A ^B , Matthew A. Albrecht ^B , Pamela A. Roberts ^B , Paul Porter ^B ^C and Philip R. Della ^B ^D ^E


^A General Surgical Ward, Princess Margaret Hospital for Children, Perth, WA 6008, Australia.

^B School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia. Email address: h.zhou@curtin.edu.au; matthew.albrecht@curtin.edu.au; p.a.roberts@curtin.edu.au; paul.porter@curtin.edu.au

^C Joondalup Health Campus, Joondalup, WA 6027, Australia.

^D Visiting Professor, College of Nursing, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.

^E Corresponding author. Email: p.della@curtin.edu.au

Australian Health Review 45(3) 328-337 https://doi.org/10.1071/AH20062
Submitted: 14 April 2020 Accepted: 18 June 2020 Published: 12 April 2021

Journal Compilation © AHHA 2021 Open Access CC BY-NC-ND

Abstract

Objectives To assess whether adding clinical information and written discharge documentation variables improves prediction of paediatric 30-day same-hospital unplanned readmission compared with predictions based on administrative information alone.

Methods A retrospective matched case-control study audited the medical records of patients discharged from a tertiary paediatric hospital in Western Australia (WA) between January 2010 and December 2014. A random selection of 470 patients with unplanned readmissions (out of 3330) was matched to 470 patients without readmissions based on age, sex, and principal diagnosis at the index admission. The predictive utility of three groups of variables (administrative, administrative and clinical, and administrative, clinical and written discharge documentation) was assessed using standard logistic regression and machine learning.

Results Inclusion of written discharge documentation variables significantly improved prediction of readmission compared with models that used only administrative and/or clinical variables in standard logistic regression analysis (χ²₁₇ = 29.4, P = 0.03). Highest prediction accuracy was obtained using a gradient boosted tree model (C-statistic = 0.654), followed closely by random forest and elastic net modelling approaches. Variables highlighted as important for prediction included patients’ social history (legal custody or patient was under the care of the Department for Child Protection), languages spoken other than English, completeness of nursing admission and discharge planning documentation, and timing of issuing discharge summary.

Conclusions The variables of significant social history, low English language proficiency, incomplete discharge documentation, and delay in issuing the discharge summary add value to prediction models.

What is known about the topic? Despite written discharge documentation playing a critical role in the continuity of care for paediatric patients, limited research has examined its association with, and ability to predict, unplanned hospital readmissions. Machine learning approaches have been applied to various health conditions and demonstrated improved predictive accuracy. However, few published studies have used machine learning to predict paediatric readmissions.

What does this paper add? This paper presents the findings of the first known study in Australia to assess and report that written discharge documentation and clinical information improve unplanned rehospitalisation prediction accuracy in a paediatric cohort compared with administrative data alone. It is also the first known published study to use machine learning for the prediction of paediatric same-hospital unplanned readmission in Australia. The results show improved predictive performance of the machine learning approach compared with standard logistic regression.

What are the implications for practitioners? The identified social and written discharge documentation predictors could be translated into clinical practice through improved discharge planning and processes, to prevent paediatric 30-day all-cause same-hospital unplanned readmission. The predictors identified in this study include significant social history, low English language proficiency, incomplete discharge documentation, and delay in issuing the discharge summary.

Keywords: administrative data, clinical information, discharge planning, discharge summary, follow-up plan, machine learning, medical records, paediatric hospital readmissions, paediatric unplanned readmissions, retrospective analysis, social history, social predictors, written discharge documentation.

Introduction

The identification of predictive factors associated with paediatric unplanned readmission to hospital can be used to improve discharge planning processes, and thereby help prevent such readmissions. Prior research has uncovered many of these factors; a recent systematic review¹ of the existing literature extracted 36 unique predictors associated with paediatric unplanned hospital readmissions from 44 studies. The four most commonly cited predictors were comorbidity, health insurance status, length of stay (LOS), and age at the index admission. The review highlighted that statistical identification of predictors depended on what variables were examined in each of the studies. In 33 of the 44 studies, administrative databases and medical records were both accessed. In the remaining 11 studies, only administrative variables were analysed. The number of examined variables ranged from 2² to 44³. Extracting variables from electronic or hard-copy medical records enriches the data and may help to rectify coding errors in the administrative dataset. Manual review of medical records does, however, incur a significant time and financial impost. Nevertheless, the enhanced prediction capability gained by including such information may result in significant reductions in readmission rate and healthcare costs.

In addition to sociodemographic and clinical information, three paediatric studies examined the association between written discharge documentation (e.g. follow-up plan or discharge summary) and unplanned readmissions, but the results were not consistent.⁴^–⁶ In this project, written discharge documentation refers to not only the discharge summary, but also the last entry within the patient progress notes by doctors, allied healthcare providers, and nurses, as this method allows for comprehensive review of the input of all members of the inter-professional healthcare team to the discharge documentation. Variations between studies in how this information is extracted and analysed, along with its effect on prediction of readmissions, suggest that further investigations are warranted. Written discharge documentation plays a critical role in the continuity of care following hospital discharge, but extracting these data is challenging for researchers.⁷^–⁹

Apart from adding variables for predictive model development, advances in statistical analysis methods may also improve prediction accuracy, especially with large healthcare datasets. Logistic regression analysis methods are commonly employed in predicting paediatric unplanned hospital readmissions. Advanced machine learning analysis approaches have also been applied to adult¹⁰ and paediatric¹¹^–¹⁴ unplanned hospital readmissions, because of their potential to improve predictive model performance.¹⁵ The commonly applied approaches included random forests,¹⁶ least absolute shrinkage and selection operator (LASSO),¹¹^,¹⁶^,¹⁷ and gradient boosted decision trees.¹⁶^–¹⁸ However, the number of paediatric studies remains limited, and these studies have so far analysed only administrative data.¹¹^–¹⁴

In a recently published study,¹⁹ we developed a logistic regression model based on 16 administratively collected variables, as electronic medical records were not available. The model was found to have moderate discriminative ability for 30-day all-cause readmission at a tertiary paediatric hospital in Western Australia (WA) (C-statistic = 0.645).

Study aim

This study added clinical information and written discharge documentation to determine whether these variables improve prediction of 30-day same-hospital unplanned readmissions compared with administratively collected variables alone. Prediction accuracy was also compared between standard logistic regression analysis and machine learning approaches.

Methods

Study design

A retrospective matched case-control study was conducted, which audited the medical records of patients discharged from a tertiary paediatric facility in WA that has approximately 250 000 inpatient and outpatient visits each year.²⁰ Ethics approvals were obtained from the Human Ethics Research Committee of Health Service, Department of Health, WA (2015/55), Children’s Hospital (2015015EP), and Curtin University (HR184/2015).

Data source

The patients included in this study were discharged between 1 January 2010 and 31 December 2014. The original electronic administrative inpatient dataset was extracted from the WA Hospital Morbidity Data Collection (WAHMDC). A total of 3330 patients (4.55%) experienced 30-day unplanned hospital readmission.²¹ Hospital readmission was operationalised as an unexpected hospitalisation within 30 days as measured from an index admission, where the readmission was related to the principal diagnosis of the index admission. The identification of unplanned hospital readmissions in this study was based on the combination of admission type (emergency) and the principal diagnosis of the subsequent admission following the index admission. Because of the burden associated with extracting data from medical records, out of the initial dataset, 550 patients with readmissions were randomly selected and matched to 550 patients without readmissions by age, sex, principal diagnosis of the index admission, and proportion of principal diagnosis. The randomisation and matching were performed using Coarsened Exact Matching.²² Due to the unavailability of medical records for some patients, the final number of paired patients was 470 (total patients = 940).
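The matching step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the age bands in `coarsen_age`, the field names, the diagnosis codes and the one-to-one pairing rule are all assumptions made for illustration, and the published Coarsened Exact Matching method also covers stratum weighting, which is omitted here.

```python
import random
from collections import defaultdict

def coarsen_age(age_years):
    """Hypothetical age bands; the study does not report its cut-points."""
    if age_years < 1:
        return "<1"
    if age_years < 5:
        return "1-4"
    if age_years < 13:
        return "5-12"
    return "13+"

def match_pairs(patients, seed=0):
    """Pair each readmitted patient with one non-readmitted patient
    drawn at random from the same (age band, sex, diagnosis) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(lambda: {"case": [], "control": []})
    for p in patients:
        key = (coarsen_age(p["age"]), p["sex"], p["diagnosis"])
        strata[key]["case" if p["readmitted"] else "control"].append(p)
    pairs = []
    for groups in strata.values():
        rng.shuffle(groups["case"])
        rng.shuffle(groups["control"])
        # zip stops at the shorter list, so unmatched patients are dropped
        pairs.extend(zip(groups["case"], groups["control"]))
    return pairs

# Toy records (invented for illustration only)
patients = [
    {"id": 1, "age": 0.5, "sex": "F", "diagnosis": "J21", "readmitted": True},
    {"id": 2, "age": 0.8, "sex": "F", "diagnosis": "J21", "readmitted": False},
    {"id": 3, "age": 7.0, "sex": "M", "diagnosis": "K35", "readmitted": True},
    {"id": 4, "age": 9.0, "sex": "M", "diagnosis": "K35", "readmitted": False},
    {"id": 5, "age": 9.0, "sex": "M", "diagnosis": "J45", "readmitted": False},
]
pairs = match_pairs(patients)
matched_ids = sorted((case["id"], control["id"]) for case, control in pairs)
print(matched_ids)
```

In this toy example, patient 5 has no readmitted counterpart in the same stratum and is therefore left unmatched.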

Sample size

Sample size was calculated based on the association between written discharge documentation and unplanned paediatric readmissions. Previous research⁵ found the absence of a written discharge plan demonstrated an odds ratio (OR) of 1.55 for readmissions. Other substantive predictive variables, such as comorbidity, possessed ORs from 1.18 to 5.61.²^,²³^,²⁴ Therefore, we considered the OR for written discharge documentation to be suitable for a baseline power calculation. Assuming a rate of 40% written discharge absence/incompleteness from the larger dataset, we would need 332 matched case-control pairs (with continuity correction; total = 664, for power = 0.8, and α = 0.05), assuming an equal proportion of rehospitalisations in each group.²⁵ Given our current sample size of 940, we have the power to detect a variable with an OR of 1.45.
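The arithmetic behind a calculation of this kind can be sketched as follows. This is an illustration only: it uses a common unmatched two-proportion approximation without continuity correction, which is an assumption and not necessarily the formula or software the authors used. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def exposure_in_cases(odds_ratio, p_controls):
    """Exposure proportion among cases implied by an OR and the control exposure rate."""
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))

def n_per_group(p0, p1, alpha=0.05, power=0.8):
    """Approximate group size for comparing two proportions (two-sided test,
    equal group sizes, no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return numerator / (p1 - p0) ** 2

p0 = 0.40                          # assumed rate of absent/incomplete documentation
p1 = exposure_in_cases(1.55, p0)   # rate implied among readmitted patients
n = n_per_group(p0, p1)
print(round(p1, 3), math.ceil(n))
```

Under these assumptions the result falls in the same range as the 332 pairs reported above; a continuity correction or a paired (McNemar-type) formula would give a different figure.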

Machine learning methods can sometimes require a substantially larger sample size. We therefore used multiple machine learning methods with specific reference to methods that use strong regularisation (e.g. the elastic net) – recommended for situations with a high variable to sample size ratio – and looked for consistency across algorithms for identifying important variables.

Extracted variables

Three groups of variables were analysed (Table 1). The first group (16 administrative variables) was extracted from the initial electronic dataset; the second group (11 clinical information variables) was extracted from patients’ medical records; and the third group (13 variables on written discharge documentation) was extracted from the last written entry of healthcare providers in patient progress notes and/or from the clinical care pathway. The data extraction was completed by HZ, using a data collection form to ensure consistency. PRD was consulted about any queries. The written discharge documentation variables were initially extracted from patients’ medical records and then categorised as ‘Yes/No/Not Applicable’. In particular, the Nursing Admission and Discharge Planning Form consists of multiple entry areas to be recorded (this form is divided into two sections, Admission and Discharge Planning); the form was categorised as ‘complete’ only when all areas of the form were recorded. Partially recorded forms were considered ‘incomplete’. The filled contents of the form were extracted and assessed against variables of ‘Significant social history (legal custody or patient was under the care of the Department for Child Protection)’, ‘language spoken other than English’, ‘known allergies’, ‘discharge information’, ‘discharge medication information’, and ‘follow-up information’.

**Table 1. Three groups of extracted variables**

Missing data

The numbers of missing values were as follows: Significant social history (0 without readmission, 1 with readmission); Source of referral transport (55 without readmission, 59 with readmission); and Completeness of Nursing Admission and Discharge Planning Form (6 without readmission, 14 with readmission). Missing data were imputed by random forest imputation using the missForest package in R.²⁶ This method performs well compared with other imputation procedures, can impute both continuous and categorical data, and allows for interaction and non-linear effects. We used default parameter settings from missForest (number of trees = 100, and max iterations = 10).

Statistical analysis

Data processing and analyses were conducted in R (version 3.5.1).²⁷

Model comparison of the three sets of variables

This study was interested in whether a group of variables improved prediction, and, to reduce the number of comparisons, we compared three groups of variables by sequentially fitting three logistic regression models: (1) Administrative variables only; (2) Administrative and clinical variables; (3) Administrative, clinical, and written discharge documentation variables.

Analysis of deviance with Chi-squared (χ²) test was used for determining significance. Analysis of individual variables was not conducted at this stage, but is included in Table 2 for comparison. To complement the logistic regression we used machine learning to highlight variables of relevance for prediction.

**Table 2. Characteristics of patients with readmission and without readmission**
Data are presented as mean±s.d. or n (%) unless otherwise noted. LOS, length of stay; SEIFA, Socioeconomic Indexes for Areas; ICU, intensive care unit; ED, emergency department

Prediction models

Multiple methods were used to ensure consistency and robustness across models, and included logistic regression, stepwise logistic regression, random forest, elastic net, and gradient boosted trees. Performance was evaluated using the C-statistic across the ten repeats of the ten-fold cross-validation.
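As an illustration of the evaluation scheme, the sketch below computes the C-statistic as the proportion of (readmitted, not readmitted) pairs in which the readmitted patient receives the higher predicted risk, and generates the index splits for ten repeats of ten-fold cross-validation. The function names and the toy labels and scores are assumptions for illustration; the study itself used the ‘caret’ package in R.

```python
import random

def c_statistic(labels, scores):
    """Concordance probability: chance that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return concordant / (len(pos) * len(neg))

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, test_indices); each repeat reshuffles the data."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(k):
            yield r, f, order[f::k]

# Toy example (invented values)
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = c_statistic(labels, scores)

# 940 patients, 10 x 10-fold: 100 held-out test sets
splits = list(repeated_kfold(940))
print(round(auc, 3), len(splits))
```

A C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation; averaging over the 100 held-out folds gives the mean values used to compare models.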

Stepwise regression methods are standard selection methods in the relevant, existing literature. The ‘glmStepAIC’ method within the ‘caret’ package²⁸ was used for forward stepwise selection in the logistic regression model with the Akaike information criterion (AIC) penalty. Backward elimination gave the same results as forward selection; therefore, only forward selection is reported.

Elastic net mixes two regression penalty methods: the least absolute shrinkage and selection operator (LASSO)²⁹ penalty, and the ridge penalty.³⁰ It provides stable and sparse estimates of model parameters. The LASSO penalty produces sparse models by shrinking coefficients, with a proportion shrunk exactly to 0. The ridge penalty smoothly shrinks all coefficients towards 0, while retaining all variables in the model. We used the ‘glmnet’ package within ‘caret’ to perform the elastic net. Optimal parameters were evaluated using grid search (α and λ between 0 and 1, with 0.02 step increments).
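The way the two penalties are mixed can be shown in a short sketch. It follows the parameterisation used by ‘glmnet’, in which α sets the balance between the LASSO and ridge terms and λ sets the overall strength; the function name and example coefficients are illustrative assumptions, and the grid simply enumerates the values described above.

```python
def elastic_net_penalty(coefs, alpha, lam):
    """Penalty added to the model's loss:
    lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm)."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)

# alpha = 1 gives the pure LASSO penalty, alpha = 0 the pure ridge penalty
coefs = [1.0, -2.0, 0.0]
lasso_only = elastic_net_penalty(coefs, alpha=1.0, lam=0.5)
ridge_only = elastic_net_penalty(coefs, alpha=0.0, lam=0.5)

# Grid of candidate values from 0 to 1 in steps of 0.02 for both parameters
grid_values = [round(i * 0.02, 2) for i in range(51)]
tuning_grid = [(a, l) for a in grid_values for l in grid_values]
print(lasso_only, ridge_only, len(tuning_grid))
```

Each (α, λ) pair in the grid would be scored by cross-validation and the best-performing pair retained.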

Random forests build multiple decision trees to create a ‘forest’ of trees. Each tree is built on a bootstrapped sample of the training data and, at each split, a random subset of the features is chosen for prediction. The number of variables randomly sampled at each split ranged from 2 to 10, in steps of 2. We used the ‘randomForest’³¹ implementation within ‘caret’.
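The two sources of randomness in a random forest can be sketched as below. This is not the ‘randomForest’ implementation; the helper names are hypothetical, and the figure of 40 features is simply the count of listed variables (16 + 11 + 13), which may differ from the number of model columns once categorical variables are encoded.

```python
import random

def bootstrap_indices(n_rows, rng):
    """Sample row indices with replacement to form one tree's training set."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def candidate_features(n_features, mtry, rng):
    """Pick the random subset of features considered at a single split."""
    return rng.sample(range(n_features), mtry)

rng = random.Random(42)
mtry_grid = list(range(2, 11, 2))   # 2, 4, 6, 8, 10 as described above
rows = bootstrap_indices(940, rng)
features = candidate_features(40, mtry_grid[0], rng)
print(mtry_grid, len(set(rows)), features)
```

Because sampling is with replacement, each tree typically sees only part of the distinct patients, which is what makes the trees differ from one another.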

Gradient boosted decision trees are similar to random forests. Trees are iteratively grown using the outcomes from a previously grown tree, applying a larger weighting to the errors from the previous tree’s classifications. The ‘xgboost’ implementation³² within ‘caret’ was used. The following tuneable parameters were determined by grid search: interaction depth (from 1 to 5), fraction of variables randomly sampled for each tree (0.1, 0.2, 0.5), and minimum loss reduction to make a split γ = 3, 5, 7. The learning rate η = 0.01, and number of trees = 500.
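The tuning grid described above can be enumerated explicitly. In this sketch the dictionary keys borrow xgboost-style parameter names (`max_depth`, `colsample_bytree`, `gamma`, `eta`, `nrounds`); mapping the text's ‘interaction depth’ to `max_depth` is an assumption, and the source does not list the exact grid object that was passed to ‘caret’.

```python
from itertools import product

# Parameters held fixed in the text
fixed = {"eta": 0.01, "nrounds": 500}

# One candidate configuration per combination of the tuned parameters
grid = [
    {"max_depth": depth, "colsample_bytree": col, "gamma": gamma, **fixed}
    for depth, col, gamma in product(range(1, 6), (0.1, 0.2, 0.5), (3, 5, 7))
]
print(len(grid))
print(grid[0])
```

Each configuration would be evaluated with the same repeated cross-validation as the other models, so the search involves 5 × 3 × 3 candidate settings.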

The five modelling methods were selected primarily because they are the most commonly used methods in the current hospital readmission literature.

Variable selection

For models with in-built selection (stepwise regression, gradient boosted tree, and elastic net), variable selection was done through the model fitting procedure. For the random forest, we selected the top ten variables according to their variable importance. Supplementary Table S1 presents the relative variable importance for the random forest algorithm. The built-in ‘varImp’ function from the ‘caret’ package was used to calculate importance. Variable importance quantifies the relative contributions of each variable to the model, defined as the number of times a variable is selected for splitting, weighted by the improvement to the model, and averaged.

Results

Patients’ characteristics, based on the three groups of variables, for the with-readmission group and without-readmission group, are presented in Table 2. The length of the index admission (mean ± s.d.) was longer in the with-readmission group compared with the without-readmission group (3.3 ± 6.6 vs 3.0 ± 6.9 days). Patients with significant social history were almost doubled in the with-readmission group compared with the without-readmission group (52 (11.1%) vs 8 (6.0%)). Five patients in the with-readmission group required an interpreter service, but none in the without-readmission group required this service. The mean length of delay in issuing a discharge summary was longer in the with-readmission group compared with the without-readmission group (22.9 ± 39.9 vs 16.8 ± 34.3 days).

Comparison of administrative, administrative and clinical, and administrative, clinical, and written discharge documentation variable groups

The improvement in prediction of unplanned hospital readmissions for each set of variables (administrative, administrative and clinical, and administrative, clinical, and written discharge documentation) was sequentially assessed using logistic regression model comparison with standard significance testing. A model with only administrative variables did not significantly improve prediction (administrative model vs intercept only model, χ²₃₂ = 27.4, P = 0.70). By contrast, the inclusion of clinical variables significantly improved prediction over the administrative-only model (χ²₁₂ = 86.1, P < 0.01), and the inclusion of written discharge documentation variables further improved prediction over the administrative and clinical variables model (χ²₁₇ = 29.4, P = 0.03).
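The P values above follow from the reported deviance differences and degrees of freedom, which can be checked with a short sketch. The function `chi2_sf` is a hypothetical helper written for this illustration; it evaluates the upper tail of the chi-squared distribution through a series expansion of the incomplete gamma function and is not the routine used in the original R analysis.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df
    degrees of freedom, via the series for the lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    # All terms are positive, so the sum is numerically stable
    while term > total * 1e-15:
        n += 1
        term *= half_x / (a + n)
        total += term
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a)
    lower = total * math.exp(log_prefactor)
    return max(0.0, 1.0 - lower)

# Deviance differences and degrees of freedom as reported in the text
p_admin = chi2_sf(27.4, 32)      # administrative vs intercept-only
p_clinical = chi2_sf(86.1, 12)   # adding clinical variables
p_discharge = chi2_sf(29.4, 17)  # adding written discharge documentation
print(round(p_admin, 2), p_clinical < 0.01, round(p_discharge, 2))
```

Each test compares the drop in deviance between two nested models against a chi-squared distribution whose degrees of freedom equal the number of parameters added.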

Prediction model performance of standard logistic regression to machine learning approaches

Prediction performance for each method obtained from the 10 × 10-fold cross-validation is presented in Table 3. The best performing prediction model according to the mean area under the receiver operating characteristic (ROC) curve (C-statistic) was the gradient boosted tree model using all three sets of variables (administrative, clinical, and written discharge documentation), followed closely by the random forest and elastic net. Consistent with the logistic regression above, models using only administrative data performed no better than chance, and substantial improvements in the C-statistic were seen by including clinical and written discharge documentation data.

**Table 3. Model performance comparison**

Fig. 1 presents the ROC curves for each machine learning algorithm. ROC curves were extracted from the predictions of the 10 × 10-fold cross-validation.

**Fig. 1.** Receiver operating characteristic (ROC) curves from the 10-fold cross-validations of each predictive modelling approach.

Variables included in the prediction models

Table 4 presents the variables selected for each model. Due to the failure of the administrative data analysed in isolation to provide predictions above chance, variable selections for Model 1 are considered unreliable and are marked with a circle for comparison purposes.

**Table 4. Variables selected by each model**
GLM, logistic regression; G-S, stepwise logistic regression; RF, random forest; EN, elastic net; XGB, gradient boosted tree

There was considerable concordance in the variables deemed useful for prediction across models. Principally, variables representing clinical information, including usage of hospital services within the past 12 months (number of admissions, emergency department (ED) presentations, and outpatient clinic attendance), number of past medical histories recorded in the progress notes, social history, and language spoken other than English were selected across multiple models, including by the elastic net. Variables relating to written discharge documentation were also selected, including completion of nursing admission and discharge planning documentation and date of discharge summary issued. The elastic net did not select any administrative variable and the gradient boosted tree selected only one administrative variable in Model 3 (distance from the hospital).

Languages spoken other than English/interpreter services requirement was selected by the elastic net and stepwise logistic regression models. It is worth noting that a total of five patients in the dataset required interpreter service at the index admission and all of them experienced 30-day unplanned readmission. The low cell count potentially precludes the variable from emerging as a useful predictor in other models, and suggests caution in interpreting the influence of this variable given the low count.

Discussion

We present a matched case-control study using retrospective analysis of patients’ medical records to identify paediatric 30-day all-cause same-hospital unplanned readmissions. Model prediction improvements were identified when adding clinical information and written discharge documentation compared with the available administrative data. Previous paediatric studies³^,¹¹^–¹⁴^,³³^–³⁵ reported predictive model performance, with only one study³⁴ examining both clinical and administrative data by reviewing patients’ medical record charts. Previous studies that have applied machine learning to paediatric readmission prediction obtained performance similar to¹³ or better than¹¹^,¹² that of the current study. However, our study used a matched case-control design, with matching across age, sex, and diagnosis, which may better identify factors contributing to readmission that are distinct from diagnosis.

Four of the identified predictors in this study were consistent with previous research, including the number of hospitalisations prior to the index admission,²³ day of discharge,³⁶ LOS,³³ and the number of comorbidities.²^,²³^,²⁴^,³³^,³⁷ Previous studies have also investigated socioeconomic status in terms of area-level deprivation²^,³⁸ and type of health insurance.²⁴^,³³^,³⁷^,³⁸ This study extracted patients’ significant social history (e.g. under the care of the Department for Child Protection) from their medical records and found a positive association with readmissions. The use of an interpreter service was also selected as a predictor of unplanned readmissions; however, interpreter service usage was selected by only two of the models (stepwise logistic regression and elastic net), suggesting some caution when interpreting the utility of this variable in predicting readmission. Furthermore, there is inconsistency in the literature with respect to this variable and how well it is able to predict readmission. Previous studies⁶^,³⁹ that have examined whether speaking a language other than English was associated with unplanned hospital readmissions have been inconclusive due to low numbers of cases in the dataset,⁴⁰ as was the case in this study’s dataset. Future studies could use a sample enriched for patients requiring interpreter services to examine whether this requirement contributes significantly to readmission.

Social history and English language proficiency are routinely assessed at the time of admission, and this study highlights the need for early commencement of discharge planning for these patients.¹⁹ Patients identified as having significant social history at the time of admission require a designated hospital-based social worker to assess and provide social needs for the family/caregiver. The social worker should also collaborate with other healthcare providers to implement a discharge planning process that ensures continuity of care at home, post-discharge.⁴¹ Interpreter services should be available throughout hospitalisation for families/caregivers with language barriers, and are crucial at the time discharge information is delivered by doctors and nurses. The ‘teach-back process’ is also recommended to ensure families’/caregivers’ understanding of the discharge information.⁴²

The quality of written discharge documentation was examined in this study. Incomplete nursing admission and discharge planning documentation, and delay in issuing discharge summaries were associated with unplanned readmissions. Previous research is inconsistent in reporting the association between written discharge documentation and readmissions. One study⁵ found that not providing a written instructional discharge plan to caregivers of children with asthma resulted in a 1.55 times higher readmission rate. A second study⁶ reported that having discharge follow-up plans contributed to readmissions; however, this result was possibly due to the low rate of primary care provider follow-up plan documentation in the discharge summary. A third study⁴ examined the association between asthma patients who were given follow-up appointments and asthma patient readmissions, but the results were inconclusive. Completeness of discharge documentation may reflect the comprehensiveness of the discharge information conveyed to families/caregivers.⁸ However, our study did not examine in depth what discharge information is communicated between healthcare providers and families/caregivers, or how it is communicated. A clinical observational study is, therefore, required to explore communication practice at discharge. It is imperative to complete and distribute discharge summaries to the caregiver’s/family’s general practitioner prior to sending a patient home.⁷^,⁴³ Discharge summaries contain detailed admission information for when the patient seeks medical advice following hospital discharge, and therefore may prevent unnecessary return ED visits or even unplanned readmissions.

This is the first known study using machine learning approaches to predict paediatric unplanned readmissions in Australia. Stepwise logistic regression, random forest, elastic net, and gradient boosted tree approaches were utilised and compared with standard logistic regression analysis. We found modestly greater prediction accuracy using machine learning for the identification of unplanned readmissions, especially using gradient boosted trees. An adult population study¹⁷ similarly found substantially improved prediction of unplanned hospital readmissions using machine learning.

A limitation of this study is that principal diagnosis of the index admission was not examined as a predictor because it was used to match cases and controls. This study is also limited by its specific local context of WA. In comparison to the literature, this study was based on 470 matched case-control pairs, a small sample size, due to the difficulty and cost of auditing patients’ medical records. Therefore, use of electronic medical records is warranted to allow easy access not only to clinical information but also to written discharge documentation information. A larger sample size is also required to further leverage the benefit of machine learning approaches in the development of predictive models for unplanned paediatric readmissions, as we used a highly constrained approach to prevent overfitting. This retrospective case-control study used historical data from 2010 to 2014, which may reduce the relevance to current clinical practice. However, risk factors associated with paediatric unplanned hospital readmissions have remained stable over the last decade, based on our recently published systematic review,¹ indicating that the datasets used in this study provided relevant information regarding current readmission factors.

Conclusions

Adding clinical information and written discharge documentation demonstrated incremental improvements in prediction of paediatric unplanned hospital readmissions. Machine learning approaches, especially gradient boosted trees, achieved improved prediction accuracy over standard logistic regression analysis. Social and written discharge documentation variables including social history, poor English language proficiency, incomplete discharge documentation, and delay in issuing discharge summary, add value to prediction and our understanding of unplanned hospital readmissions. These predictors could also be translated into clinical practice of discharge planning to help prevent paediatric 30-day all-cause same-hospital unplanned readmission.

Competing interests

The authors have no competing interests to declare.

Acknowledgements

The authors acknowledge staff at the Health Information & Administrative Services, Child & Adolescent Health Service, for their assistance in retrieving medical records, especially Dr Julia Logan, Head of Department. This study was supported by a grant from the Australian Research Council – ARC Linkage Grant (Project ID: LP140100563) and the Chief Investigator is PRD. HZ was also supported by an Academic Support Grant 2016 and an Academic Research Grant 2014 from the Nursing and Midwifery Office, WA Department of Health.

References

[1] Zhou H, Roberts PA, Dhaliwal SA, Della PR. Risk factors associated with paediatric unplanned hospital readmissions: a systematic review. BMJ Open 2019; 9 e020554. PMID: 30696664.

[2] Wijlaars LP, Hardelid P, Woodman J, Allister J, Cheung R, Gilbert R. Who comes back with what: a retrospective database study on reasons for emergency readmission to hospital in children and young people in England. Arch Dis Child 2016; 101 714–8. PMID: 27113555.

[3] Minhas SV, Chow I, Feldman DS, Bosco J, Otsuka NY. A predictive risk index for 30-day readmissions following surgical treatment of pediatric scoliosis. J Pediatr Orthop 2016; 36 187–92.

[4] Feng JY, Toomey SL, Zaslavsky AM, Nakamura MM, Schuster MA. Readmissions after pediatric mental health admissions. Pediatrics 2017; 140 e20171571

[5] Topal E, Gucenmez OA, Harmanci K, Arga M, Derinoz O, Turktas I. Potential predictors of relapse after treatment of asthma exacerbations in children. Ann Allergy Asthma Immunol 2014; 112 361–4.

[6] Coller RJ, Klitzner TS, Lerner CF, Chung PJ. Predictors of 30-day readmission and association with primary care follow-up plans. J Pediatr 2013; 163 1027–33.

[7] Choudhry AJ, Baghdadi YMK, Wagie AE, Habermann EB, Cullinane DC, Zielinski MD. Readability of discharge summaries: with what level of information are we dismissing our patients? Am J Surg 2016; 211 631–36.

[8] Coghlin DT, Leyenaar JK, Shen M, Bergert L, Engel R, Hershey D, Mallory L, Rassbach C, Woehrlen T, Cooperberg D. Pediatric discharge content: a multisite assessment of physician preferences and experiences. Hosp Pediatr 2014; 4 9–15.

[9] Olsen MR, Hellzen O, Skotnes LH, Enmarker I. Content of nursing discharge notes: associations with patient and transfer characteristics. Open Nurs J 2012; 2 277–87.

[10] Artetxe A, Beristain A, Graña M. Predictive models for hospital readmission risk: a systematic review of methods. Comput Methods Programs Biomed 2018; 164 49–64.

[11] Jovanovic M, Radovanovic S, Vukicevic M, Pouke SV, Delibasic B. Building interpretable predictive models for pediatric hospital readmission using Tree-Lasso logistic regression. Artif Intell Med 2016; 72 12–21.

[12] Stiglic G, Wang F, Davey A, Obradovic Z. Pediatric readmission classification using stacked regularized logistic regression models. AMIA Annual Symp Proc 2014; 2014 1072–81.

[13] Wolff P, Grana M, Rios SA, Yarza MB. Machine learning readmission risk modeling: a pediatric case study. BioMed Res Int 2019; 2019 8532892

[14] Janjua MB, Reddy S, Samdani AF, Welch WC, Ozturk AK, Price AV, Weprin BE, Swift DM. Predictors of 90-day readmission in children undergoing spinal cord tumor surgery: a nationwide readmissions database analysis. World Neurosurg 2019; 127 e697–706.

[15] Wiens J, Shenoy E. Machine learning for healthcare: on the verge of a major shift in healthcare epidemiology. Clin Infect Dis 2018; 66 149–53.

[16] Frizzell JD, Liang L, Schulte PJ, Yancy CW, Heidenreich PA, Hernandez AF, Bhatt DL, Fonarow GC, Laskey WK. Prediction of 30-day all-cause readmissions in patients hospitalized for heart failure: comparison of machine learning and other statistical approaches. JAMA Cardiol 2017; 2 204–209.

[17] Yang C, Delcher C, Shenkman E, Ranka S. Predicting 30-day all-cause readmissions from hospital inpatient discharge data. In 2016 IEEE 18th International Conference on e-Health Networking, Applications and Services (Healthcom), 14–16 September 2016, Munich, Germany. IEEE; 2016. doi:10.1109/HealthCom.2016.7749452

[18] Golas SB, Shibahara T, Agboola S, Otaki H, Sato J, Nakae T, Hisamitsu T, Kojima G, Felsted J, Kakarmath S, Kvedar J, Jethwani K. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak 2018; 18 44

[19] Zhou H, Della P, Porter P, Roberts P. Risk factors associated with 30-day all-cause unplanned hospital readmissions at a tertiary children’s hospital in Western Australia. J Paediatr Child Health 2020; 56 524–46.

[20] Child and Adolescent Health Service. History and design: Princess Margaret Hospital. Available at: https://pch.health.wa.gov.au/About-us/History/Princess-Margaret-Hospital [verified 16 February 2021].

[21] Zhou H, Della P, Roberts P, Porter P, Dhaliwal S. A 5-year retrospective cohort study of unplanned readmissions in an Australian tertiary paediatric hospital. Aust Health Rev 2019; 43 662–71.

[22] Blackwell M, Iacus S, King G, Porro G. cem: coarsened exact matching in Stata. Stata J 2009; 9 524–46.

[23] Beck CE, Khambalia A, Parkin PC, Raina P, Macarthur C. Day of discharge and hospital readmission rates within 30 days in children: a population-based study. Paediatr Child Health 2006; 11 409–12.

[24] Berry J, Hall DE, Kuo DZ, Cohen E, Agrawal R, Feudtner C, Hall M, Kueser J, Kaplan W, Neff J. Hospital utilization and characteristics of patients experiencing recurrent readmissions within children’s hospital. JAMA 2011; 305 682–90.

[25] Schlesselman JJ, Stolley PD. Case-control studies: design, conduct, analysis. New York: Oxford University Press; 1982.

[26] Stekhoven DJ, Bühlmann P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics 2012; 28 112–8.

[27] R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical computing; 2018. Available at: https://www.R-project.org/ [verified 20 October 2020].

[28] Kuhn M. The caret package. 2019. Available at: https://cran.r-project.org/web/packages/caret/index.html [verified 12 October 2020].

[29] Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc Series B (Methodological) 1996; 58 267–88. Available at: https://www.jstor.org/stable/2346178 [verified 12 October 2020].

[30] Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Series B (Statistical Methodology) 2005; 67 301–20. Available at: https://www.jstor.org/stable/3647580?seq=1 [verified 12 October 2020].

[31] Liaw A, Wiener M. Classification and regression by randomForest. R News 2002; 2/3 18–22. Available at: https://cogns.northwestern.edu/cbmg/LiawAndWiener2002.pdf [verified 12 October 2020].

[32] Chen T, He T. xgboost: eXtreme gradient boosting. Package version 1.2.0.1. 2020. Available at: https://cran.r-project.org/web/packages/xgboost/vignettes/xgboost.pdf [verified 1 May 2019].

[33] Feudtner C, Levin JE, Srivastava R, Goodman DM, Slonim AD, Sharma V, Shah SS, Pati S, Fargason C, Hall M. How well can hospital readmission be predicted in a cohort of hospitalized children? A retrospective, multicenter study. Pediatrics 2009; 123 286–93.

[34] Sacks JH, Kelleman M, McCracken C, Glanville M, Oster M. Pediatric cardiac readmissions: an opportunity for quality improvement? Congenit Heart Dis 2017; 12 282–8.

[35] Vo D, Zurakowski D, Faraoni D. Incidence and predictors of 30-day postoperative readmission in children. Pediatric Anaesth 2018; 28 63–70.

[36] Auger K, Davis M. Pediatric weekend admission and increased unplanned readmission rates. J Hosp Med 2015; 10 743–45.

[37] Khan A, Nakamura MM, Zaslavsky AM, Jang J, Berry JG, Feng JY, Schuster MA. Same-hospital readmission rates as a measure of pediatric quality of care. JAMA Pediatr 2015; 169 905–12.

[38] Sills MR, Hall M, Colvin JD, Macy ML, Cutler GJ, Bettenhausen JL, Morse RB, Auger KA, Raphael JL, Gottlieb LM, Fieldston ES, Shah SS. Association of social determinants with children’s hospitals’ preventable readmissions performance. JAMA Pediatr 2017; 170 350–8.

[39] Richards MK, Yanez D, Goldin AB, Grieb T, Murphy WM, Drugas GT. Factors associated with 30-day unplanned pediatric surgical readmission. Am J Surg 2016; 212 426–32.

[40] Toomey S, Peltz A, Loren S, Tracy M, Williams K, Pengeroth L, Ste Marie A, Onorato S, Schuster MA. Potentially preventable 30-day hospital readmissions at a children’s hospital. Pediatrics 2016; 138 e20154182

[41] Heenan D, Birrell D. Hospital-based social work: challenges at the interface between health and social care. Br J Soc Work 2019; 49 1741–58.

[42] Kornburger CK, Gibson C, Sadowski S, Maletta K, Klingbeil C. Using ‘teach-back’ to promote a safe transition from hospital to home: an evidence-based approach to improving the discharge process. J Pediatr Nurs 2013; 28 282–91.

[43] Hoyer EH, Odonkor CA, Bhatia SN, Leung C, Deutschendorf A, Brotman DJ. Association between days to complete inpatient discharge summaries with all-payer hospital readmissions in Maryland. J Hosp Med 2016; 11 393–400.