Arabic Text Classification using Feature-Reduction Techniques for Detecting Violence on Social Media

Abstract

As the number of online users grows, so does the amount of data shared online. Techniques for discovering knowledge from these data can provide valuable information for detecting a range of problems, including violence. Violence has been one of the most significant problems facing humanity worldwide in recent years, and it is a particular concern in Arab countries. To help address it, this research focuses on detecting violence-related tweets. Text mining is an important technique for finding and predicting information from text. In this study, a text classification model is built to detect violence in Arabic dialects on Twitter using different feature-reduction approaches. The experiments compare three classifiers, namely bagging, K-nearest neighbors (KNN), and Bayesian boosting, across three feature-extraction approaches: root-based stemming, light stemming, and n-grams. In addition, the study applies the following feature-reduction techniques: support vector machine (SVM), Chi-squared (CHI), the Gini index, correlation, rules, information gain (IG), deviation, symmetrical uncertainty, and the IG ratio. The results show that bagging with tri-grams achieves the highest accuracy among the classifier and feature-extraction combinations, at 86.61%, and that, among the feature-reduction techniques, combining IG with SVM yields an accuracy of 90.59%.
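As a rough illustration of two of the ideas named in the abstract, the sketch below extracts word n-grams and ranks them by information gain (IG) against a class label. It is not the authors' implementation: the function names, the binary presence/absence treatment of each n-gram, and the four English placeholder texts with their labels are all assumptions made up for this example, and no stemming or classifier is shown.

```python
import math
from collections import Counter

def word_ngrams(tokens, n):
    """Return the contiguous word n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, feature):
    """IG of a binary presence/absence feature: H(Y) - H(Y | feature)."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional

def top_k_features(docs, labels, k):
    """Keep the k features with the highest IG (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda f: (-information_gain(docs, labels, f), f))
    return ranked[:k]

# Hypothetical toy data (not from the study): placeholder texts and labels.
toy_texts = [
    "attack on the city",
    "attack on the market",
    "match in the city",
    "festival in the market",
]
labels = ["violent", "violent", "other", "other"]

# Represent each text as the set of word bi-grams it contains.
docs = [set(word_ngrams(text.split(), 2)) for text in toy_texts]

# Feature reduction: keep only the three highest-IG bi-grams.
selected = top_k_features(docs, labels, 3)
print(selected)
```

In this toy corpus a bi-gram that appears in exactly the two "violent" texts separates the classes perfectly and receives the maximum IG of 1 bit, while a bi-gram spread evenly over both classes receives 0. A reduced feature set like `selected` would then be passed to a classifier such as bagging or KNN.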

Authors and Affiliations

Hissah ALSaif, Taghreed Alotaibi

  • EP ID EP550259
  • DOI 10.14569/IJACSA.2019.0100409

How To Cite

Hissah ALSaif, Taghreed Alotaibi (2019). Arabic Text Classification using Feature-Reduction Techniques for Detecting Violence on Social Media. International Journal of Advanced Computer Science & Applications, 10(4), 77-87. https://www.europub.co.uk/articles/-A-550259