Detecting Cyberhate on Twitter Using Text Classification and Crowdsourced Labeling

Hadi Kurniawan Sidiq, Dana Sulistyo Kusumo, Indra Lukmana Sardi

Abstract


During the 2019 presidential election campaign in Indonesia, the public expressed support for the candidates in many forms, from distributing posters to posting content on social media. On Twitter, for example, many hashtags signaled support during the election, such as #2019gantipresiden, #2019tetapjokowi, and other hashtags related to the Indonesian presidential election. However, many tweets carrying these hashtags contain hate speech. Hate speech on the internet (cyberhate) can cause disputes between the supporters of the two presidential candidates, which may escalate into conflicts such as riots and other actions that harm the country. This study uses the Support Vector Machine (SVM) algorithm to detect cyberhate, achieving a best accuracy of 97%. It also applies crowdsourced labeling to label the dataset, which yields 98% valid data.
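
As an illustration of how crowdsourced labels might be turned into a usable dataset, here is a minimal sketch of majority-vote aggregation. It rests on assumptions: the abstract does not describe the aggregation rule, so the majority-vote scheme, the function name `aggregate_labels`, the `min_agreement` threshold, the label strings, and the sample votes are all hypothetical and are not taken from the study.

```python
from collections import Counter


def aggregate_labels(votes_by_tweet, min_agreement=0.5):
    """Aggregate crowd votes per tweet by majority vote (assumed scheme).

    Returns {tweet_id: (label, agreement)} for tweets whose top label is
    unique and whose share of votes exceeds min_agreement. Tweets with
    tied or missing votes are left out as unresolved.
    """
    accepted = {}
    for tweet_id, votes in votes_by_tweet.items():
        if not votes:
            continue  # no annotations collected for this tweet
        ranked = Counter(votes).most_common()
        top_label, top_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_count:
            continue  # tie between labels: treat as unresolved
        agreement = top_count / len(votes)
        if agreement > min_agreement:
            accepted[tweet_id] = (top_label, agreement)
    return accepted


# Hypothetical votes from three annotators (two for the last tweet).
votes = {
    "t1": ["hate", "hate", "not_hate"],
    "t2": ["not_hate", "not_hate", "not_hate"],
    "t3": ["hate", "not_hate"],
}
labels = aggregate_labels(votes)

# Share of tweets that received a resolved label in this toy example.
valid_share = len(labels) / len(votes)
```

In this toy input, `t3` is dropped because its two votes disagree, so only two of the three tweets end up with a label. The resolved labels could then feed a text classifier such as the SVM mentioned in the abstract.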

Keywords


Crowdsourced Labeling; Cyberhate Tweets; Hate Speech Detection; Text Classification



DOI: http://dx.doi.org/10.22146/jnteti.v8i4.530


Copyright (c) 2019 JNTETI (Jurnal Nasional Teknik Elektro dan Teknologi Informasi)
