Issue: 2016 > October > original article

Predicting adverse health outcomes in older emergency department patients: the APOP study

J. de Gelder, J.A. Lucke, B. de Groot, A.J. Fogteloo, S. Anten, K. Mesri, E.W. Steyerberg, C. Heringhaus, G.J. Blauw, S.P. Mooijaart


Prognosis, emergency department, adverse outcomes, mortality, older adults


Older patients presenting to the emergency department (ED) experience high rates of adverse outcomes,1 but they form a heterogeneous group and it is unknown who is at highest risk. The incidence of adverse outcomes is particularly high at three months, with a mortality rate of about 10% and increased functional dependence in 10-45% of patients.1 Early identification of those at highest risk offers an opportunity to guide preventive interventions and to inform treatment decisions.2
Current models use either severity of disease or existing geriatric vulnerability for prediction. The Modified Early Warning Score (MEWS) is an indicator of disease severity and has been shown to be valuable in predicting worse in-hospital outcomes in older patients.3 However, the prognostic value of the MEWS for long-term outcomes in older adults is unknown. The Identification of Seniors At Risk (ISAR)4 and the Triage Risk Screening Tool (TRST)5 focus on existing geriatric vulnerabilities, such as functional and cognitive impairment, to predict adverse health outcomes. Neither of these tools accurately identifies high-risk patients,6,7 although this vulnerable group benefits most from an increased level of attention.
We conducted a prospective follow-up study among all older patients who visited the EDs in the region of Leiden. The aim was to develop and validate a prediction model for adverse health outcomes in older emergency patients. To reflect the condition of the patient, demographics, severity of disease indicators and existing geriatric vulnerability were all taken into account.


Study design and setting

We performed a prospective follow-up study in the EDs of two hospitals in the region of Leiden, the Netherlands. The Leiden University Medical Center (LUMC, derivation cohort) is an academic hospital with a level 1 trauma centre and Alrijne Hospital (location Leiderdorp, validation cohort) is a peripheral hospital with a level 2 trauma centre. All patients who met the inclusion criterion and none of the exclusion criteria were considered eligible. The inclusion criterion was age ≥ 70 years at a first presentation to the ED within the study period. Patients were excluded if they were triaged with the highest urgency (code red), or if we could not approach them because of an unstable medical condition, because the nurse or physician did not give permission to enter the room for any reason, or because of impaired mental status without an authorised relative to provide informed consent. Patients with a language barrier and patients who left the waiting room were also not eligible. Patients were enrolled 7 days a week for 12 weeks, with 24-hour coverage in the LUMC and 12-hour coverage (10.00 am to 10.00 pm) in Alrijne Hospital. Written informed consent was obtained before inclusion. The medical ethics committee of the LUMC and Alrijne waived the necessity for formal approval of the present study, as the study closely follows routine care.

Organisation of emergency care in the Netherlands
Basic health insurance is mandatory in the Netherlands and covers care from general practitioners (GPs), hospital care and specialist care. Emergency care is provided by GPs and in EDs. Patients in need of immediate care can contact the GP or the GP out-of-hours service, call for an ambulance or go to an ED by themselves. Depending on the urgency, patients are expected to contact or visit the GP first. As is the case at the LUMC and Alrijne, an increasing number of GP out-of-hours centres are located close to the ED to avoid unnecessary ED visits.8 In EDs a triage nurse first prioritises patients based on the severity of their condition; the patient is then directed either to the emergency room or to the waiting room. In the Leiden region, the LUMC and Alrijne Hospital are the only two EDs, together serving an unselected catchment area of 400,000 inhabitants of all ages. Neither ED has special rooms or trajectories for older patients. Two patient groups bypass the ED and were therefore impossible to include: 1) older patients with a myocardial infarction who were sent directly from the ambulance to the catheterisation room, and 2) older patients with a cerebrovascular accident (CVA) who were eligible for thrombolytic therapy, who underwent a brief primary assessment in the ED and were then sent to the neurology ward after a CT scan.

Data collection
We included patients in the LUMC from September to November 2014 and in Alrijne Hospital from March to June 2015. In both hospitals teams of medical students were present at the ED from 10.00 am until 10.00 pm to enrol patients, and in the LUMC the ED staff were responsible for inclusion from 10.00 pm until 10.00 am. Before the start of the inclusion period, the medical students and the ED staff of the LUMC attended training sessions to ensure that the questionnaires were administered in a uniform way. The ideal moment for administering the questionnaire turned out to be 30-45 minutes after the patient’s arrival at the ED, when the patient had spoken to the physician and was waiting for lab results or further analysis. The questionnaire took 5-10 minutes to complete. A representative was permitted to answer questions when the patient was unable to provide answers, with the exception of the cognition and self-reported quality of life questions. Answers were recorded on a tablet computer and sent directly to a secured database. Additional medical data were extracted automatically from the medical records, verified manually and added to the database.

At baseline, data on three domains were assessed: demographics, severity of disease indicators and geriatric measurements. Demographics consisted of age, gender, living arrangements and level of education. A low level of education was defined as elementary school, basic education as community college, middle education as secondary education and high education as higher vocational training or university. Severity of disease indicators consisted of characteristics related to the ED visit: way of arrival, triage category by the Manchester Triage System,9 fall-related ED visit, indication to measure vital signs and indication to perform a blood test. Whether the visit was fall related was obtained by asking the question: Is the reason for presentation related to a fall? Indication to measure vital signs or laboratory tests was scored positively when, at the moment of presentation, vital signs needed to be measured or a laboratory test was ordered based on the Manchester Triage System and local protocols. Geriatric measurements consisted of the number of different medications mentioned by the patient, history of diagnosed dementia reported by patient or proxy, current use of a walking device, the Identification of Seniors at Risk (ISAR)4 screening tool, the Six Item Cognitive Impairment test (6CIT)10 and the Katz Index of Activities of Daily Living (ADL)11 questionnaire. The ISAR was developed for patients aged ≥ 65 and aims to predict the risk of adverse health outcomes six months after the ED visit. The ISAR consists of six dichotomous questions and scores range from 0 to 6 with higher scores denoting higher risk. The 6CIT is a short cognition test and was validated in a Dutch population against the Mini Mental State Examination (MMSE)12 with a score on the 6CIT of ≥ 11 indicating cognitive impairment (MMSE < 24).13 Six questions lead to a score ranging from 0 to 28 with higher scores indicating more cognitive impairment. 
The Katz ADL indicates functional status two weeks before presentation to the ED to eliminate possible effects of the acute illness and consists of six dichotomous questions on dependence in bathing, dressing, toileting, transfer, eating and incontinence. Scores range from 0 to 6 with higher scores an indication of more dependency.

The main outcome of the study was a composite of functional decline or mortality at 90-day follow-up. Functional decline was defined as an increase of at least one point in the Katz ADL score or new institutionalisation, the latter defined as a higher level of assisted living at 90 days after the ED visit. We analysed 90-day mortality separately. Mortality can be seen as the ultimate decline and can therefore be combined with functional decline. On the other hand, the intervention strategy could differ for patients at high risk of mortality, and for that reason we developed a separate prediction model for 90-day mortality. A model solely for functional decline is not feasible: excluding deceased patients would imply that the model is only applicable to patients who will not die within a certain period, which is unknown at the moment of presentation. Three months after the ED visit the patient was contacted by telephone. If there was no response after three attempts on three consecutive days, the GP was contacted to verify the phone number and living status. Finally, a letter with the follow-up questions was sent to patients who had not moved to a higher level of assisted living and who were alive according to the information from the GP. Data concerning mortality were derived from the municipal records.
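The outcome definition above can be written down as a small rule. The following is only an illustrative sketch: the record layout, the field names and the ordinal coding of the level of assisted living are assumptions made for this example and are not taken from the study database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    """Hypothetical follow-up record for one patient (field names are assumed)."""
    katz_baseline: int                # Katz ADL score (0-6) before the ED visit
    katz_90d: Optional[int]           # Katz ADL score at 90 days, None if not assessed
    living_level_baseline: int        # assumed ordinal code: higher = more assisted living
    living_level_90d: Optional[int]   # same coding at 90 days, None if not assessed
    died_within_90d: bool


def functional_decline(p: FollowUp) -> bool:
    """At least one point increase in Katz ADL, or a higher level of assisted living."""
    katz_increase = p.katz_90d is not None and p.katz_90d - p.katz_baseline >= 1
    new_institutionalisation = (
        p.living_level_90d is not None
        and p.living_level_90d > p.living_level_baseline
    )
    return katz_increase or new_institutionalisation


def composite_outcome(p: FollowUp) -> bool:
    """Composite outcome: functional decline or death within 90 days."""
    return p.died_within_90d or functional_decline(p)
```

Because death counts as an event, deceased patients stay in the denominator, which is the reason given in the text for not modelling functional decline on its own.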

Statistical analysis
Baseline characteristics are presented as mean with standard deviation (SD) in case of normal distribution, median with interquartile range (IQR) in case of skewed distribution or as numbers with percentages (%). Adequate statistical power for obtaining good predictions requires a minimum of 10 events per candidate predictor.14 This rule was followed for the composite outcome. To reduce the number of candidate predictors, the most relevant questions from the questionnaires (Katz ADL, ISAR, 6CIT) were pre-selected. Single questions with the highest R-square values on the entire questionnaire score were selected and added to the list of candidate predictors for development of the prediction model. Missing predictors were imputed via single imputation techniques.15 The prediction model was derived via backward elimination with Akaike’s Information Criterion (equivalent to p < 0.157 for predictors with 1 df). With this technique the least contributing candidate predictors are deleted until the deterioration in model fit is too large. Discrimination of the models was assessed with the area under the receiver operating characteristic curve (AUC). Internal validation was conducted with a bootstrap procedure of 500 samples, in which we repeated the backward elimination procedure in each bootstrap sample to estimate the optimism-corrected performance that is expected if the derived prediction model is applied in other datasets. The internal validation procedure also provided a shrinkage factor to adjust the estimated regression coefficients for overfitting.16 The adjusted regression equation provides predictions for new individuals. It was validated in the Alrijne patients.17 Calibration of the model, which reflects how well predicted and observed outcomes agree, was examined by using the adjusted regression equation. Calibration was examined graphically with calibration plots and with a goodness-of-fit test (Hosmer and Lemeshow test18).
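Two of the numerical rules in this paragraph can be checked with a few lines of code. The sketch below shows where the p < 0.157 equivalence comes from (dropping a predictor with 1 df changes the AIC by the likelihood-ratio statistic minus 2, so the predictor is retained when that statistic exceeds 2) and applies the rule of 10 events per candidate predictor. The function names are illustrative, and the event count of 230 is the number of composite outcome events reported for the derivation cohort.

```python
import math


def chi2_upper_tail_1df(x: float) -> float:
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def max_candidate_predictors(events: int, events_per_predictor: int = 10) -> int:
    """Largest number of candidate predictors allowed by the events-per-predictor rule."""
    return events // events_per_predictor


# AIC keeps a 1-df predictor when its likelihood-ratio statistic exceeds 2.
aic_equivalent_p = chi2_upper_tail_1df(2.0)
print(round(aic_equivalent_p, 3))        # 0.157

# 230 composite outcome events in the derivation cohort.
print(max_candidate_predictors(230))     # 23
```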
The formula 1/(1 + e^(-linear predictor)) was applied with the adjusted regression equation to determine the individual risks of experiencing the outcome. Performance of the model for the patients with the highest 30%, 20% and 10% predicted risk was evaluated according to sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio and negative likelihood ratio, with 95% confidence intervals. To compare our model performance with the existing six-item ISAR questionnaire, predictive performance of the ISAR on different cut-off points was also calculated. The level of statistical significance was set at p < 0.05. Statistical analyses were performed using IBM SPSS Statistics package (version 20) and R version 3.1.1.
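As an illustration of how an individual risk follows from a regression equation, the sketch below applies the logistic formula to a linear predictor. The intercept, coefficients, predictor names and shrinkage factor are hypothetical values chosen for the example; the actual equations are given in the legend of table 2. Shrinkage is applied here to the coefficients only and the intercept is taken as given, which is an assumption of this sketch.

```python
import math
from typing import Dict


def predicted_risk(
    intercept: float,
    coefficients: Dict[str, float],
    values: Dict[str, float],
    shrinkage: float = 1.0,
) -> float:
    """Individual risk = 1 / (1 + e^(-linear predictor))."""
    linear_predictor = intercept + shrinkage * sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model and patient, for illustration only.
example_coefficients = {"age": 0.03, "arrival_by_ambulance": 0.5}
example_patient = {"age": 80, "arrival_by_ambulance": 1}
example_risk = predicted_risk(-4.0, example_coefficients, example_patient)
```

A shrinkage factor below 1 pulls the linear predictor towards the intercept, so extreme predictions become less extreme.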


In the three-month inclusion period a total of 995 older patients presented to the ED of the LUMC. Of these, 19 patients were excluded due to a language barrier or leaving the waiting room. Another 92 patients could not be approached due to their medical condition, resulting in 884 eligible patients. Of these, 65 patients were missed for inclusion and 68 patients refused informed consent, which led to a study population of 751 patients (85% of eligible patients). Similarly, 881 patients were included in Alrijne Hospital (figure 1).
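The inclusion flow for the LUMC cohort can be retraced from the counts given above. This short check only restates the reported numbers; the variable names are chosen for the example.

```python
presented = 995          # older patients presenting to the LUMC ED
excluded = 19            # language barrier or left the waiting room
not_approachable = 92    # could not be approached due to medical condition

eligible = presented - excluded - not_approachable

missed = 65              # missed for inclusion
refused = 68             # refused informed consent

included = eligible - missed - refused
inclusion_percentage = 100 * included / eligible

print(eligible, included, round(inclusion_percentage))  # 884 751 85
```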
The median age was 78 years (IQR 74-83) in LUMC patients and 80 years (IQR 75-84) in Alrijne patients (table 1). In the LUMC, 405 (53.9%) patients arrived by ambulance and 201 (26.1%) were triaged as ‘very urgent’. In Alrijne, 432 (49.0%) arrived by ambulance and 58 (6.6%) were triaged as ‘very urgent’. In both hospitals the majority of the older patients were independent, with a median Katz ADL score of 0 (IQR 0-1).
In total 230 LUMC patients (30.6%) experienced the composite outcome within 90 days of follow-up and 71 patients (9.5%) died. In Alrijne Hospital 247 patients (28.0%) had the composite outcome and 84 (9.5%) died.
Details of the univariate and multivariable analyses for both outcomes can be found in supplemental tables 1 and 2. The final model for the composite outcome combined age, arrival by ambulance, number of different medications, help needed with bathing or showering, hospital admission in the past six months, help needed at home on a regular basis and history of dementia (table 2). Ninety-day mortality was best predicted by combining age, gender, living arrangements, a fall prior to the ED visit, indication for blood tests and needing help with dressing. Accuracy of the final models was fair to good: in the derivation cohort the area under the curve (AUC) was 0.73 (95% CI 0.69-0.77) for the composite outcome and 0.79 (95% CI 0.73-0.85) for mortality. External validation in Alrijne patients showed an AUC of 0.71 (95% CI 0.67-0.75) for the composite outcome and 0.67 (95% CI 0.60-0.73) for mortality. The formulas of the original final models for calculating the individual risk can be found in the legend of table 2.
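For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient with the outcome received a higher predicted risk than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison on made-up data; it is not the software routine used in the study and it does not produce confidence intervals.

```python
from typing import Sequence


def auc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Area under the ROC curve as the proportion of concordant pairs (ties count half)."""
    with_event = [r for r, y in zip(risks, outcomes) if y == 1]
    without_event = [r for r, y in zip(risks, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("both outcome classes are needed to compute the AUC")
    concordant = 0.0
    for r_event in with_event:
        for r_no_event in without_event:
            if r_event > r_no_event:
                concordant += 1.0
            elif r_event == r_no_event:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Made-up predicted risks and observed outcomes for four patients.
toy_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(toy_auc)  # 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without the outcome.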
Calibration of the predicted probabilities was satisfactory (figure 2), with all Hosmer and Lemeshow goodness-of-fit p-values above 0.05. A stricter threshold for assigning patients to the high-risk group increased specificity, PPV and the positive likelihood ratio (table 3). The PPV ranged from 0.55 (95% CI 0.48-0.61) to 0.69 (95% CI 0.57-0.79) for the composite outcome and from 0.21 (95% CI 0.16-0.27) to 0.36 (95% CI 0.26-0.49) for mortality, depending on the threshold chosen. This implies that in the highest risk group 69% of the patients experienced the composite outcome and 36% died.
The conventional cut-off of 2 points or higher for the ISAR resulted in a PPV of 0.40 (95% CI 0.36-0.45) for composite outcome and 0.13 (95% CI 0.10-0.16) for mortality (supplemental table 3). Raising the ISAR cut-off to a score of ≥ 4 points to define patients as high risk yielded a PPV of 0.50 (95% CI 0.40-0.60) for the composite outcome and 0.18 (95% CI 0.12-0.26) for mortality.
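The performance measures reported in this section all derive from a 2×2 table of predicted risk group against observed outcome. The sketch below shows the calculations with hypothetical counts; the study’s own 2×2 tables are not reproduced here, and confidence intervals are omitted.

```python
import math
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, predictive values and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratios are unbounded when specificity is exactly 1 or 0.
        "lr_positive": sensitivity / (1 - specificity) if specificity < 1 else math.inf,
        "lr_negative": (1 - sensitivity) / specificity if specificity > 0 else math.inf,
    }


# Hypothetical counts: 60 patients flagged as high risk, 40 of whom had the outcome.
example_metrics = diagnostic_metrics(tp=40, fp=20, fn=60, tn=180)
```

With a stricter threshold fewer patients are flagged, which tends to lower sensitivity and raise specificity and PPV, the pattern described for table 3.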


New externally validated prediction models were presented for older emergency patients by using a combination of demographics, severity of disease indicators and geriatric vulnerability. Performance of the models was satisfactory, with good accuracy and high PPVs.
The predictors used in our models have previously been shown to be predictive of negative health outcomes in other models. The Identification of Seniors at Risk (ISAR)4 tool and the Triage Risk Screening Tool (TRST)5 were developed for older patients at the ED. The ISAR is suitable for all older patients, whereas the TRST was developed for those discharged home. Both tools include predominantly geriatric vulnerabilities, such as functional and cognitive impairment, and are validated for prediction of negative health outcomes, including functional decline and mortality.4,19,20 Scoring systems for disease severity are also used to predict negative health outcomes, of which the Acute Physiology and Chronic Health Evaluation (APACHE II)21 and the Modified Early Warning Score (MEWS)3 are well known. APACHE II is available online and predicts mortality in intensive care unit patients by using an algorithm consisting of 12 physiological and two disease-related variables. The MEWS weighs the severity of five physiological parameters to identify patients at risk of clinical deterioration and can be used as a bedside evaluation instrument to predict mortality and admission in ED patients.22 The MEWS and APACHE II scores were developed for prediction of worse in-hospital outcomes, whereas their prognostic capabilities in the longer term are unknown, especially in the older population. Recently we showed that directly available clinical data describing disease severity and geriatric vulnerability can be used for prediction in hospitalised older patients.23 The present study also selected predictors reflecting the acute condition of the older patient visiting the ED and developed prediction models with high specificity and high PPVs.
As shown in table 2, demographics and severity of disease indicators are important for predicting mortality, and geriatric measurements for predicting the composite outcome. It is plausible that patients with functional impairment at baseline have a higher risk of further decline, which stresses the importance of obtaining these measurements of functional capacity in combination with the other parameters for accurate prediction. We showed that a history of dementia was associated with a lower risk of the composite outcome. This was an unexpected finding. It may be explained by a larger proportion of patients with dementia living in an institution, and thus being better protected from poor outcome, or by patients with dementia being less severely ill but referred to the ED sooner. Alternatively, it could be a chance finding in a small group of patients. Another notable finding is that a comparable percentage of patients in both hospitals arrived by ambulance, while the patients in the LUMC were triaged more urgently (table 1). We do not have an explanation for this finding, since there are many reasons to arrive by ambulance and both hospitals use the Manchester Triage System. However, we showed that patients who arrive by ambulance are at increased risk of experiencing the composite outcome. Expected or unexpected, the final prediction models have to be tested in a different population or setting to support general applicability. External validation of both models in the Alrijne patients resulted in comparable discrimination for the composite outcome and a decrease in AUC of 0.12 for mortality. It is difficult to explain this decrease in discrimination for mortality. The fact that the inclusion timeframe differed between hospitals (24 hours in the LUMC vs. 10 am to 10 pm in Alrijne Hospital) is unlikely to have influenced the results substantially, as only a very limited number of patients were included during the night, and endpoints did not differ between those included during the ‘daytime’ vs. those included at ‘night’. More likely explanations are minor differences in the study population, in ED protocols or in parameters which we did not or cannot measure.
Predictive performance of a comparable model, the ISAR,4 was analysed in the same study population (supplemental table 3). The performance was characterised by high sensitivities and low specificities, resulting in relatively low PPVs and high NPVs. As a consequence, the ISAR is more useful to ‘rule out’ patients at high risk, whereas our models target patients at highest risk. Prediction of individual risk scores for multiple outcomes, as shown with the composite outcome and mortality, enables emergency physicians to guide preventive interventions and tailored treatment decisions. As an example, for patients with a predicted risk of the composite adverse outcome of 50% to 65%, safety procedures could be applied, whereas a predicted risk of 65% or higher could lead to more intensive interventions. On the one hand, standardised interventions should be administered, such as nursing these patients in a comfortable bed and informing the general practitioner. On the other hand, the predicted risk could support the physician in deciding to start physiotherapy or in making an outpatient appointment to prevent deterioration. If the risk of 90-day mortality is also high, this could be an argument to spend more time on diagnostic and therapeutic shared decision making and advance care planning.
To date, there is no standard screening program for older adults in the ED. This could be due to the low proportion of evidence-based studies designed for the elderly,24,25 specifically due to the low number of clinical impact studies.26 The ultimate goal is to introduce a new generalised prediction tool, suitable for all older emergency patients and to design and test effectiveness of different interventions. Such a model should consist of patient-related parameters rather than organisation-dependent factors, such as the indication to perform measurements. The present model is accurate for older patients in Western Europe. We are planning such external validation studies, which will show whether the model needs to be updated to specific settings. The algorithm can simply be integrated in the electronic patient record to incorporate screening into routine care or be used as an application as developed on the website:
One of the limitations in the current study is the lack of baseline data on potentially important determinants such as malnutrition, depression and instrumental ADL functioning. Since time is scarce in the acute setting we had to limit the number of questions, instead of performing a comprehensive geriatric assessment. A second limitation is the low proportion of deceased patients within 90 days of follow-up. As a consequence, power for prediction of 90-day mortality was low. The major strength is the unselected representative study population. We included 85% of the eligible older patients 24/7 during 12 weeks. A second strength is the fact that demographics, severity of disease and geriatric vulnerability of the patient were taken into account as a reflection of the condition of the patient. 
In conclusion, we successfully developed and validated prediction models for 90-day composite outcome and 90-day mortality in older emergency patients. The benefits for patients by implementing these models with preventive interventions have to be further investigated. 


Conflict of Interest:

The authors declare no conflict of interest.
The Institute for Evidence-Based Medicine in Old Age (IEMO) is funded by the Dutch Ministry of Health and Welfare and supported by ZonMW (project number 62700.3002).


  1. Aminzadeh F, Dalziel WB. Older adults in the emergency department: a systematic review of patterns of use, adverse outcomes, and effectiveness of interventions. Ann Emerg Med. 2002;39:238-47. 
  2. Samaras N, Chevalley T, Samaras D, Gold G. Older patients in the emergency department: a review. Ann Emerg Med. 2010;56:261-9. 
  3. Cei M, Bartolomei C, Mumoli N. In-hospital mortality and morbidity of elderly medical patients can be predicted at admission by the Modified Early Warning Score: a prospective study. Int J Clin Pract. 2009;63:591-5.
  4. McCusker J, Bellavance F, Cardin S, Trepanier S, Verdon J, Ardman O. Detection of older people at increased risk of adverse health outcomes after an emergency visit: the ISAR screening tool. J Am Geriatr Soc. 1999;47:1229-37. 
  5. Meldon SW, Mion LC, Palmer RM, et al. A brief risk-stratification tool to predict repeat emergency department visits and hospitalizations in older patients discharged from the emergency department. Acad Emerg Med. 2003;10:224-32. 
  6. Carpenter CR, Shelton E, Fowler S, et al. Risk factors and screening instruments to predict adverse outcomes for undifferentiated older emergency department patients: a systematic review and meta-analysis. Acad Emerg Med. 2015;22:1-21. 
  7. Yao JL, Fang J, Lou QQ, Anderson RM. A systematic review of the identification of seniors at risk (ISAR) tool for the prediction of adverse outcome in elderly patients seen in the emergency department. Int J Clin Exp Med. 2015;8:4778-86. 
  8. Schafer W, Kroneman M, Boerma W, et al. The Netherlands: health system review. Health Syst Transit. 2010;12:v-xxvii, 1-228. 
  9. Mackway-Jones K. Manchester Triage Group. Emergency Triage. 1997. 
  10. Katzman R, Brown T, Fuld P, Peck A, Schechter R, Schimmel H. Validation of a short Orientation-Memory-Concentration Test of cognitive impairment. Am J Psychiatry. 1983;140:734-9. 
  11. Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW. Studies of illness in the aged: The index of ADL: a standardized measure of biological and psychosocial function. JAMA. 1963;185:914-9. 
  12. Folstein MF, Folstein SE, McHugh PR. ‘Mini-mental state’. A practical method for grading the cognitive state of patients for the clinician. J Psychiatric Res. 1975;12:189-98. 
  13. Tuijl JP, Scholte EM, de Craen AJ, van der Mast RC. Screening for cognitive impairment in older general hospital patients: comparison of the Six-Item Cognitive Impairment Test with the Mini-Mental State Examination. Int J Geriatr Psychiatry. 2012;27:755-62. 
  14. Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. New York: Springer; 2009. 
  15. Donders AR, van der Heijden GJ, Stijnen T, Moons KG. Review: a gentle introduction to imputation of missing values. J Clin Epidemiol. 2006;59:1087-91. 
  16. Van Houwelingen JC, Le Cessie S. Predictive value of statistical models. Stat Med. 1990;9:1303-25. 
  17. Moons KG, Kengne AP, Woodward M, et al. Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker. Heart. 2012;98:683-90. 
  18. Hosmer DW, Lemeshow S. Applied logistic regression. 2nd ed. New York: Wiley; 2000. 
  19. Buurman BM, van den Berg W, Korevaar JC, Milisen K, de Haan RJ, de Rooij SE. Risk for poor outcomes in older patients discharged from an emergency department: feasibility of four screening instruments. Eur J Emerg Med. 2011;18:215-20. 
  20. Hustey FM, Mion LC, Connor JT, Emerman CL, Campbell J, Palmer RM. A brief risk stratification tool to predict functional decline in older adults discharged from emergency departments. J Am Geriatr Soc. 2007;55:1269-74. 
  21. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818-29. 
  22. Bulut M, Cebicci H, Sigirli D, et al. The comparison of modified early warning score with rapid emergency medicine score: a prospective multicentre observational cohort study on medical and surgical patients presenting to emergency department. Emerg Med J. 2014;31:476-81. 
  23. De Gelder J, Lucke JA, Heim N, et al. Predicting mortality in acutely hospitalized older patients: a retrospective cohort study. Intern Emerg Med. 2016;11:587-94. 
  24. Broekhuizen K, Pothof A, de Craen AJ, Mooijaart SP. Characteristics of randomized controlled trials designed for elderly: a systematic review. PLoS One. 2015;10:e0126709. 
  25. Mooijaart SP, Broekhuizen K, Trompet S, et al. Evidence-based medicine in older patients: how can we do better? Neth J Med. 2015;73:211-8. 
  26. Steyerberg EW, Moons KG, van der Windt DA, et al. Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS Med. 2013;10:e1001381.