Association of Single Nucleotide Polymorphisms with Type 2 Diabetes Mellitus Using Random Forest Regression

Lina Herlina Tresnawati, Wisnu Ananta Kusuma, Sony Hartono Wijaya, Lailan Sahrina Hasibuan


Precision medicine can be developed by determining the association between genomic data, represented by single nucleotide polymorphisms (SNPs), and the phenotype of type 2 diabetes mellitus (T2D). Because SNP markers are extremely numerous, they must be ranked and filtered before association analysis. This paper aims to associate SNPs with the T2D phenotype. SNPs are ranked by importance score to select the significant ones. The selected SNPs are then associated with the T2D phenotype and examined for epistasis (interaction between SNPs). The method used is random forest regression. The study yields 301 significant SNPs. The ten top-ranked SNPs are associated with five candidate T2D proteins. Evaluation shows that the proposed association model has a mean absolute error (MAE) of 0.062. This result indicates that random forest regression succeeds in associating SNPs with the T2D phenotype and in examining epistasis between pairs of SNPs.
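The ranking-then-association workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype matrix is synthetic, the phenotype formula, the SNP indices (3, 10, 20), the cut-off `top_k`, and all hyperparameters are assumptions chosen for illustration, and scikit-learn's impurity-based `feature_importances_` stands in for whatever importance score the paper actually uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic genotypes (assumption): 400 samples x 50 SNPs,
# each coded 0/1/2 as a minor-allele count.
n_samples, n_snps = 400, 50
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)

# Hypothetical phenotype: SNP 3 acts additively, while SNPs 10 and 20
# contribute only through their product (a toy form of epistasis).
y = 0.5 * X[:, 3] + 0.4 * X[:, 10] * X[:, 20] + rng.normal(0.0, 0.1, n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: rank every SNP by its importance score from a random forest regressor.
ranker = RandomForestRegressor(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
ranking = np.argsort(ranker.feature_importances_)[::-1]

# Step 2: keep only the top-ranked SNPs (top_k is an illustrative cut-off).
top_k = 10
selected = ranking[:top_k]

# Step 3: refit on the selected SNPs and evaluate the association model with MAE.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train[:, selected], y_train)
mae = mean_absolute_error(y_test, model.predict(X_test[:, selected]))

print("Top-ranked SNP indices:", selected.tolist())
print("MAE on held-out samples:", round(float(mae), 3))
```

In this toy setting the three causal SNPs should appear near the top of the ranking. The sketch does not include an explicit pairwise epistasis test; it only shows that a forest can assign importance to SNPs whose effect comes from an interaction.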


Type 2 Diabetes Mellitus; Epistasis; Association Mapping; Random Forest Regression; Single Nucleotide Polymorphism








Copyright (c) 2019 JNTETI (Jurnal Nasional Teknik Elektro dan Teknologi Informasi)
