Comparison of Feature Selection Methods in Sentiment Analysis (Case Study: Opinions on the 2017 DKI Jakarta Gubernatorial Election, PILKADA DKI 2017)

  • Edwar Edwar ITB STIKOM Bali
  • I Gusti Agung Ngurah Rai Semadi ITB STIKOM Bali
  • Muhamad Samsudin ITB STIKOM Bali
  • I Komang Dharmendra ITB STIKOM Bali

Abstract

In sentiment analysis, feature selection is a crucial step because it improves both the performance and the efficiency of the models. It also reduces the dimensionality of the data, which makes analysis faster. Selecting relevant features is nevertheless challenging, since choosing the wrong features can lower the accuracy of the resulting models. In this study, sentiment analysis was conducted on tweet data from the 2017 Jakarta gubernatorial election using TF-IDF term weighting combined with three feature selection methods: Recursive Feature Elimination (RFE), Chi Square, and Mutual Information. The selected features were used to train Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM) models, which were evaluated on accuracy, precision, recall, and F1-score. The experimental results showed that the TfidfVectorizer + RFE combination with the NBC model achieved the highest accuracy, 71.1111%, and also performed well on precision, recall, and F1-score.
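The comparison described in the abstract can be sketched with scikit-learn, which provides the TfidfVectorizer named above. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy texts and labels, the number of retained features K, the use of LinearSVC both as the SVM and as the ranking estimator inside RFE, and the use of MultinomialNB as the Naïve Bayes variant are all assumptions chosen for illustration.

```python
from functools import partial

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import RFE, SelectKBest, chi2, mutual_info_classif
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data (1 = positive, 0 = negative); NOT the study's tweets.
train_texts = [
    "great candidate with a clear program",
    "i support this candidate and his program",
    "good debate and honest answers",
    "terrible debate full of empty promises",
    "i do not trust this candidate",
    "bad program and empty promises",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["honest candidate with a good program", "empty promises and a bad debate"]
test_labels = [1, 0]

K = 5  # number of features to keep (assumed; the abstract does not state it)


def make_selector(name):
    """Return a fresh, unfitted feature selector for the given method name."""
    if name == "RFE":
        # RFE needs an estimator exposing coef_; a linear SVM is assumed here.
        return RFE(LinearSVC(), n_features_to_select=K)
    if name == "Chi Square":
        return SelectKBest(chi2, k=K)
    if name == "Mutual Information":
        return SelectKBest(partial(mutual_info_classif, random_state=0), k=K)
    raise ValueError(f"unknown selector: {name}")


def make_classifier(name):
    """Return a fresh, unfitted classifier for the given model name."""
    return MultinomialNB() if name == "NBC" else LinearSVC()


results = {}
for selector_name in ("RFE", "Chi Square", "Mutual Information"):
    for classifier_name in ("NBC", "SVM"):
        pipeline = Pipeline([
            ("tfidf", TfidfVectorizer()),            # TF-IDF term weighting
            ("select", make_selector(selector_name)),  # keep the K top-ranked terms
            ("clf", make_classifier(classifier_name)),
        ])
        pipeline.fit(train_texts, train_labels)
        predictions = pipeline.predict(test_texts)
        results[(selector_name, classifier_name)] = accuracy_score(test_labels, predictions)

for (selector_name, classifier_name), accuracy in results.items():
    print(f"TF-IDF + {selector_name} + {classifier_name}: accuracy = {accuracy:.2f}")
```

On data this small the printed accuracies carry no meaning; the sketch only shows how the six selector and classifier combinations can be built and scored in a uniform way. Precision, recall, and F1-score could be added in the same loop with the corresponding functions from sklearn.metrics.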

Published
2023-07-28
How to Cite
EDWAR, Edwar et al. Perbandingan Metode Seleksi Fitur Pada Analisis Sentimen (Studi Kasus Opini PILKADA DKI 2017). INFORMATICS FOR EDUCATORS AND PROFESSIONAL : Journal of Informatics, [S.l.], v. 8, n. 1, p. 11 - 18, july 2023. ISSN 2548-3412. Available at: <https://460290.0x60nl4us.asia/index.php/ITBI/article/view/2408>. Date accessed: 28 nov. 2024. doi: https://doi.org/10.51211/itbi.v8i1.2408.