Twitter Sentiment Analysis in Under-Resourced Languages using Byte-Level Recurrent Neural Model
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 8
Abstract
Sentiment analysis in non-English languages can be more challenging than in English because publicly available resources for building high-accuracy prediction models are scarce. To alleviate this under-resourced problem, this article proposes using a byte-level recurrent neural model to generate text representations for Twitter sentiment analysis in the Indonesian language. Because the main part of the proposed model's training is unsupervised and does not require much labeled data, the approach can scale by using the large amount of unlabeled data that is easily gathered on the Internet, without depending heavily on human-generated resources. This paper also introduces an Indonesian dataset for general sentiment analysis. It consists of 10,806 tweets selected from a total of 454,559 tweets gathered directly from Twitter using the Twitter API. The 10,806 tweets are classified into three categories: positive, negative, and neutral. This Indonesian dataset could help the development of Indonesian sentiment analysis, especially general sentiment analysis, and encourage others to publish similar datasets in the future.
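To illustrate what "byte-level" input means here, the following is a minimal sketch, not the authors' implementation. It assumes tweets are encoded as UTF-8, so every tweet becomes a sequence of integers in the range 0 to 255 regardless of language or emoji. The function names and the sample tweet are hypothetical.

```python
from typing import List


def tweet_to_byte_ids(text: str) -> List[int]:
    """Encode a tweet as its UTF-8 byte values (each in 0..255)."""
    return list(text.encode("utf-8"))


def byte_ids_to_text(ids: List[int]) -> str:
    """Decode a byte-ID sequence back to text (invalid bytes are replaced)."""
    return bytes(ids).decode("utf-8", errors="replace")


# Hypothetical Indonesian tweet; the emoji occupies four bytes in UTF-8.
sample = "Filmnya bagus banget 😍"
ids = tweet_to_byte_ids(sample)
```

A recurrent model that reads such sequences needs a fixed input vocabulary of only 256 symbols, so no language-specific tokenizer or word list is required.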
Authors and Affiliations
Ridi Ferdiana, Wiliam Fajar, Desi Dwi Purwanti, Artmita Sekar Tri Ayu, Fahim Jatmiko
Implementation of Vision-based Object Tracking Algorithms for Motor Skill Assessments
Assessment of upper extremity motor skills often involves object manipulation, drawing or writing using a pencil, or performing specific gestures. Traditional assessment of such skills usually requires a trained person t...
A Comparative Study of Classification Algorithms using Data Mining: Crime and Accidents in Denver City the USA
In the last five years, crime and accident rates have increased in many cities of America. The advancement of new technologies can also lead to criminal misuse. In order to reduce incidents, there is a need to understan...
Hex Symbols Algorithm for Anti-Forensic Artifacts on Android Devices
Mobile phone technology has become one of the most common and important technologies that started as a communication tool and then evolved into key reservoirs of personal information and smart applications. With this in...
Design and Analysis of a Novel Low-Power SRAM Bit-Cell Structure at Deep-Sub-Micron CMOS Technology for Mobile Multimedia Applications
The growing demand for high density VLSI circuits and the exponential dependency of the leakage current on the oxide thickness is becoming a major challenge in deep-sub-micron CMOS technology. In this work, a novel Stati...
Service-Oriented Context-Aware Messaging System
In service-oriented computing, location or spatial models are required to model the domain environment whenever location or spatial relationships are utilised by users and/or services. This research presents an ontology...