Heart Disease Risk Classification and Feature Importance Based on Machine Learning Using Microdata SKI 2023

Authors

  • Zahra Mulki Syari'ati, Program Studi Rekam Medis dan Informasi Kesehatan, Poltekkes Kemenkes Tasikmalaya, Tasikmalaya, Indonesia
  • Maula Ismail Muhamad, Program Studi Rekam Medis dan Informasi Kesehatan, Poltekkes Kemenkes Tasikmalaya, Tasikmalaya, Indonesia
  • Lina Khasanah, Program Studi Rekam Medis dan Informasi Kesehatan, Poltekkes Kemenkes Tasikmalaya, Tasikmalaya, Indonesia
  • Bambang Karmanto, Program Studi Rekam Medis dan Informasi Kesehatan, Poltekkes Kemenkes Tasikmalaya, Tasikmalaya, Indonesia
  • Suratmi, Program Studi Terapan Kebidanan, Poltekkes Kemenkes Tasikmalaya, Tasikmalaya, Indonesia

DOI:

https://doi.org/10.36590/jika.v8i1.1929

Keywords:

heart disease, health survey, machine learning

Abstract

Cardiovascular disease remains the leading cause of mortality in Indonesia, with death rates increasing by more than 25% and national health expenditure reaching Rp17.92 trillion. Its complex risk profile requires population-based predictive approaches. The 2023 Indonesian Health Survey (SKI) provides extensive data suitable for machine learning-based risk modelling. This study aimed to develop a classification model for heart disease risk and to identify the dominant risk factors using the Random Forest algorithm. A case-control design was applied, and the data were split 80:20 into training and testing sets so that the model was evaluated on records not used for training. The analysis followed the Knowledge Discovery in Databases (KDD) framework: data selection, preprocessing, transformation, modelling, and evaluation. Random Forest was used for classification, while feature importance was assessed using Information Gain and Gain Ratio. Model performance was evaluated using accuracy, sensitivity, and specificity. Age, hypertension, and Body Mass Index (BMI) were identified as the most influential predictors. The model achieved an accuracy of 72.13%, a sensitivity of 73.11%, and a specificity of 71.26%, indicating stable classification performance on large-scale population data. The study is limited by its reliance on secondary data and the absence of external validation. These findings highlight the potential of machine learning to support population-based risk stratification and to inform targeted prevention strategies, contributing to evidence-based policy development and early screening programs in primary healthcare settings.
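As an illustration of the feature-ranking and evaluation measures named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' code: the function names are hypothetical, the records are invented toy values rather than SKI 2023 microdata, and the features are assumed to be categorical. Information Gain is computed as the drop in label entropy after splitting on a feature, Gain Ratio divides that by the entropy of the feature itself, and sensitivity and specificity are read off the confusion-matrix counts.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the split information of the feature."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity); assumes both classes occur in y_true."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy records (NOT survey data): 1 = present, 0 = absent.
hypertension = [1, 1, 1, 0, 0, 0, 1, 0]
heart_disease = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]

ig = information_gain(hypertension, heart_disease)
gr = gain_ratio(hypertension, heart_disease)
sens, spec = sensitivity_specificity(heart_disease, predicted)
print(f"IG={ig:.4f} GR={gr:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Ranking features by Gain Ratio rather than raw Information Gain reduces the bias toward features with many distinct categories, which may be why both measures are reported side by side.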

Published

2026-04-30

How to Cite

Syari’ati, Z. M., Muhamad, M. I., Khasanah, L., Karmanto, B., & Suratmi, S. (2026). Heart Disease Risk Classification and Feature Importance Based on Machine Learning Using Microdata SKI 2023. Jurnal Ilmiah Kesehatan (JIKA), 8(1), 78–89. https://doi.org/10.36590/jika.v8i1.1929