Arabic Text Classification using Feature-Reduction Techniques for Detecting Violence on Social Media
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 4
Abstract
With the growing number of online users, the amount of data shared online has grown as well. Techniques for discovering knowledge from these data can provide valuable information for detecting various problems, including violence. Violence is one of the significant problems humanity has faced worldwide in recent years, and it is a particular problem in Arabic countries. To help address this issue, this research focuses on detecting violence-related tweets. Text mining is an important technique for finding and predicting information from text. In this study, a text classification model is built for detecting violence in Arabic dialects on Twitter using different feature-reduction approaches. The experiment comprises bagging, K-nearest neighbors (KNN), and Bayesian boosting using different feature-extraction methods, namely root-based stemming, light stemming, and n-grams. In addition, the study used the following feature-reduction techniques: support vector machine (SVM), Chi-squared (CHI), the Gini index, correlation, rules, information gain (IG), deviation, symmetrical uncertainty, and the IG ratio. The experiment showed that bagging with the tri-gram approach has the highest accuracy at 86.61%, and that, among the feature-reduction techniques, a combination of IG with SVM registers an accuracy of 90.59%.
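As an illustration of two of the steps named in the abstract, word n-gram extraction and Chi-squared (CHI) feature reduction, the following is a minimal sketch and not the authors' implementation. It assumes tweets are already tokenized and stemmed, uses English placeholder tokens in place of Arabic text, and treats labels as binary (1 for violence-related, 0 otherwise). The names `word_ngrams`, `chi_squared_scores`, `select_top_k`, `TOY_DOCS`, and `TOY_LABELS` are hypothetical.

```python
from collections import Counter

# Toy data (an assumption for illustration only): English placeholder
# tokens stand in for stemmed Arabic tweets; 1 = violence-related.
TOY_DOCS = [
    ["attack", "on", "the", "city"],
    ["bomb", "attack", "reported"],
    ["nice", "day", "in", "the", "city"],
    ["football", "match", "today"],
]
TOY_LABELS = [1, 1, 0, 0]


def word_ngrams(tokens, n):
    """Return the word n-grams of a token list as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def chi_squared_scores(docs, labels, n=1):
    """Score each n-gram with the chi-squared statistic of the 2x2 table
    (n-gram present/absent) x (label positive/negative)."""
    total = len(docs)
    positives = sum(labels)
    doc_freq = Counter()  # number of documents containing the n-gram
    pos_freq = Counter()  # number of positive documents containing it
    for tokens, label in zip(docs, labels):
        for feat in set(word_ngrams(tokens, n)):
            doc_freq[feat] += 1
            if label:
                pos_freq[feat] += 1
    scores = {}
    for feat, df in doc_freq.items():
        a = pos_freq[feat]      # present, positive
        b = df - a              # present, negative
        c = positives - a       # absent, positive
        d = total - df - c      # absent, negative
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the feature or the label is constant,
        # so the feature carries no information.
        scores[feat] = total * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring n-grams (ties broken alphabetically)."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


if __name__ == "__main__":
    unigram_scores = chi_squared_scores(TOY_DOCS, TOY_LABELS, n=1)
    print(select_top_k(unigram_scores, 3))
```

Passing `n=3` scores tri-grams instead of unigrams. The retained features would then be fed to a classifier such as bagging or KNN, which this sketch leaves out.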
Authors and Affiliations
Hissah ALSaif, Taghreed Alotaibi
Using the Facebook Iframe as an Effective Tool for Collaborative Learning in Higher Education
Facebook is increasingly becoming a popular environment for online learning. Despite the popularity of using Facebook as an e-learning tool, there is a limitation when it comes to presenting content: another platform is...
Domain and Schema Independent Semantic Model Verbalization: A Conceptual Overview
Semantic Web-based technologies have become extremely popular, and their success has spread across many domains in addition to computer science. Nevertheless, the reusability aspects associated with the cr...
Dynamic Allocation of Abundant Data Along Update Sub-Cycles To Support Update Transactions In Wireless Broadcasting
Supporting transaction processing over wireless broadcasting environments has attracted a considerable amount of research in mobile computing systems. To allow more than one conflicting transaction to be committed with...
Sentiment Analysis of Arabic Jordanian Dialect Tweets
Sentiment Analysis (SA) of social media content has become one of the growing areas of research in data mining. SA makes it possible to mine subjective public opinions from text in real time. This paper...
Optimizing the Behaviour of Web Users Through Expectation Maximization Algorithm and Mixture of Normal Distributions
The proposed work analyses users' behaviour in web access. Worldwide, web users browse different websites every second. The aim of this paper is to identify the behaviour of users in a time bound...