Computational Intelligence Methods for Clustering of Sense Tagged Nepali Documents
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 1
Abstract
This paper presents a document clustering method that hybridizes the self-organizing map (SOM), particle swarm optimization (PSO), and the k-means clustering algorithm. Document representation is an important step in clustering. The common way to represent a text is the bag-of-words approach. This approach is simple but has two drawbacks, synonymy and polysemy, which arise from the ambiguity of words and the lack of information about the relations between them. To avoid these drawbacks, words in this paper are tagged with their senses in WordNet. Sense tagging provides the exact sense of each word. Feature vectors are generated from the sense-tagged documents, and clustering is carried out using the proposed hybrid SOM+PSO+K-means algorithm. In the proposed algorithm, SOM is first applied to the feature vectors to produce prototypes, and the K-means clustering algorithm is then applied to cluster the prototypes. The particle swarm optimization algorithm is used to find the initial centroids for K-means. Text documents in the Nepali language are used to test the hybrid SOM+PSO+K-means clustering algorithm.
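The three stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no parameters, so the SOM grid size, learning-rate schedule, PSO coefficients, fitness function (sum of squared distances to the nearest centroid), and all function names here are assumptions chosen for illustration. The input is assumed to be a numeric feature matrix already derived from sense-tagged documents.

```python
import numpy as np

def assign(points, centroids):
    """Return the nearest-centroid index per point and the total squared error."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1).sum()

def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Stage 1 (assumed settings): train a small SOM and return its prototypes."""
    rng = np.random.default_rng(seed)
    # Initialise prototypes from randomly chosen input vectors.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def pso_initial_centroids(points, k, n_particles=10, iters=30, seed=0):
    """Stage 2 (assumed settings): PSO search for k starting centroids."""
    rng = np.random.default_rng(seed)
    # Each particle encodes one candidate set of k centroids.
    pos = points[rng.integers(0, len(points), (n_particles, k))].astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([assign(points, p)[1] for p in pos])
    g = pbest_cost.argmin()
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    w, c1, c2 = 0.72, 1.49, 1.49  # inertia and acceleration weights (assumed)
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([assign(points, p)[1] for p in pos])
        better = cost < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], cost[better]
        g = pbest_cost.argmin()
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest

def kmeans(points, centroids, iters=50):
    """Stage 3: standard k-means started from the given centroids."""
    centroids = centroids.copy()
    for _ in range(iters):
        labels, _ = assign(points, centroids)
        new = centroids.copy()
        for j in range(len(centroids)):
            if np.any(labels == j):  # leave an empty cluster's centroid in place
                new[j] = points[labels == j].mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, assign(points, centroids)[0]

def cluster_documents(features, k, seed=0):
    """Cluster SOM prototypes, then label each document via its nearest prototype."""
    protos = train_som(features, seed=seed)
    init = pso_initial_centroids(protos, k, seed=seed)
    _, proto_labels = kmeans(protos, init)
    nearest_proto, _ = assign(features, protos)
    return proto_labels[nearest_proto]
```

Calling `cluster_documents(features, k=3)` on an array of shape `(n_documents, n_features)` returns one cluster label per document. Clustering the SOM prototypes instead of the raw vectors keeps the k-means stage small, and seeding it from the PSO result is one way to reduce its sensitivity to the starting centroids.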
Authors and Affiliations
Sunita Sarkar, Arindam Roy, Bipul Syam Purkayastha