Screening Key Indicators for Acute Kidney Injury Prediction Using Machine Learning


  • Jiaming Wang
  • Bing Zhu
  • Pei Liu
  • Ruiqi Jia, School of Economics and Management, Beijing Jiaotong University, Beijing 100044
  • Lijing Jia
  • Wei Chen
  • Cong Feng, Department of Emergency, First Medical Center, General Hospital of People's Liberation Army, Beijing, 100853, China
  • Jing Li, School of Economics and Management, Beijing Jiaotong University


acute kidney injury, key indicator screening, machine learning, sequential forward selection, XGBoost


Acute kidney injury (AKI) is a common critical illness with high mortality. The large number of indicators recorded for AKI patients makes it difficult for clinicians to assess a patient’s condition quickly and accurately. This study used machine learning to screen for key indicators and then used only those indicators to predict AKI in advance, so that a small number of measurements could reliably predict AKI and support clinical decision-making. Sequential forward selection based on feature importance calculated by XGBoost was used to screen out 17 key indicators. Three machine learning algorithms were used for prediction: logistic regression (LR), decision tree, and XGBoost. To verify the validity of the method, data for 1,009 and 1,327 AKI patients were extracted from the MIMIC III and eICU-CRD databases, respectively. The MIMIC III database was used for internal validation, and the eICU-CRD database was used for external validation. For all three algorithms, prediction performance using only the key indicator dataset was very close to that using the full dataset. XGBoost performed best, LR was next, and the decision tree performed worst. The proposed key indicator screening method achieves good predictive performance while reducing the number of indicators.
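As a rough illustration of the screening idea in the abstract, the following minimal Python sketch shows one possible reading of sequential forward selection guided by an importance ranking. It is not the authors' implementation. The function name `forward_select`, the `min_gain` threshold, the indicator names, and the toy scores are all hypothetical; in practice the importance values would come from a fitted XGBoost model and `score_fn` would return a validation metric for a model trained on the candidate subset.

```python
from typing import Callable, Dict, List, Sequence


def forward_select(
    importance: Dict[str, float],
    score_fn: Callable[[Sequence[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedy forward selection over features ranked by importance.

    Features are tried in descending order of importance; a feature is kept
    only if adding it raises the score by more than `min_gain`.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    selected: List[str] = []
    best = float("-inf")
    for name in ranked:
        candidate = selected + [name]
        score = score_fn(candidate)
        if score > best + min_gain:
            selected, best = candidate, score
    return selected


# Hypothetical importance values (stand-ins for XGBoost feature importance).
toy_importance = {
    "indicator_a": 0.40,
    "indicator_b": 0.25,
    "indicator_c": 0.05,
    "indicator_d": 0.30,
}

# Hypothetical contribution of each indicator to a validation score.
toy_gain = {
    "indicator_a": 0.20,
    "indicator_d": 0.10,
    "indicator_b": 0.05,
    "indicator_c": 0.0,
}


def toy_score(features: Sequence[str]) -> float:
    # Stand-in for "train a model on these features and evaluate it".
    return 0.5 + sum(toy_gain[f] for f in features)


key_indicators = forward_select(toy_importance, toy_score)
print(key_indicators)
```

In this toy run, `indicator_c` is dropped because adding it does not improve the score, which mirrors the goal of keeping predictive performance close to that of the full indicator set while measuring fewer indicators.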


