Feature-rich PoS Tagging through Taggers Combination: Experience in Arabic
Journal Title: Transactions on Machine Learning and Artificial Intelligence - Year 2017, Vol 5, Issue 4
Abstract
Because a word can play different syntactic roles in different contexts, assigning the appropriate morphosyntactic category to each word in context is not trivial. Part of Speech (PoS) tagging is the task that addresses this problem. Several machine learning methods have been adapted for PoS tagging, such as Hidden Markov Models, Support Vector Machines, and Decision Trees. Based on these methods, language-independent PoS taggers have been developed, such as TnT, SVMTool, and TreeTagger. The main purpose of this work is to automatically combine the output of these standard PoS taggers and to investigate several options for performing this combination. The experiments are applied to Arabic, a morphologically complex language. In this paper, we highlight the use of these taggers through various experiments. The evaluations involve several tests on both Classical and Modern Standard Arabic, using trained/untrained and tagged/untagged data. Finally, a deeper investigation of Arabic PoS tagging through the combination of these language-independent taggers is performed.
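The abstract does not say which combination schemes were tested. As an illustration only, the sketch below shows one common option, token-level majority voting over the outputs of several taggers. The function name `combine_by_majority`, the tie-breaking rule (fall back to a designated tagger), and the sample tags are assumptions made for this example, not the authors' method.

```python
from collections import Counter
from typing import List, Sequence


def combine_by_majority(tag_sequences: Sequence[Sequence[str]],
                        fallback: int = 0) -> List[str]:
    """Combine per-token tags from several taggers by majority vote.

    tag_sequences holds one tag list per tagger, all aligned on the same tokens.
    When no single tag has the most votes, the tag proposed by the tagger at
    index `fallback` is used (a hypothetical tie-breaking rule).
    """
    if not tag_sequences:
        return []
    length = len(tag_sequences[0])
    if any(len(seq) != length for seq in tag_sequences):
        raise ValueError("all taggers must label the same tokens")

    combined = []
    for votes in zip(*tag_sequences):
        counts = Counter(votes)
        top_count = max(counts.values())
        # Tags that share the highest vote count for this token.
        winners = [tag for tag, count in counts.items() if count == top_count]
        combined.append(winners[0] if len(winners) == 1 else votes[fallback])
    return combined


# Made-up outputs of three taggers for a three-token sentence.
tagger_a = ["NOUN", "VERB", "ADJ"]
tagger_b = ["NOUN", "NOUN", "PREP"]
tagger_c = ["VERB", "NOUN", "ADV"]
combined_tags = combine_by_majority([tagger_a, tagger_b, tagger_c])
```

In this sketch the third token has no majority, so the first tagger's proposal is kept. Weighting each vote by a tagger's accuracy on held-out data would be another possible scheme.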
Authors and Affiliations
Imad Zeroual, Abdelhak Lakhouaja