Computational Intelligence Methods for Clustering of SenseTagged Nepali Documents

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 1

Abstract

This paper presents a document clustering method that hybridizes the self-organizing map (SOM), particle swarm optimization (PSO), and the k-means clustering algorithm. Document representation is an important step in clustering. The common way to represent a text is the bag-of-words approach. This approach is simple but has two drawbacks, synonymy and polysemy, which arise from the ambiguity of words and the lack of information about the relations between them. To avoid these drawbacks, words are tagged with their WordNet senses in this paper. Sense tagging provides the exact sense of each word. Feature vectors are generated from the sense-tagged documents, and clustering is carried out using the proposed hybrid SOM+PSO+K-means algorithm. In the proposed algorithm, SOM is first applied to the feature vectors to produce prototypes, and the K-means clustering algorithm is then applied to cluster the prototypes. The particle swarm optimization algorithm is used to find the initial centroids for the K-means algorithm. Text documents in the Nepali language are used to test the hybrid SOM+PSO+K-means clustering algorithm.
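The three stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the grid size, learning-rate schedule, PSO coefficients, sum-of-squared-error fitness, and the toy sense-count vectors are all assumptions chosen for illustration, and the function names (`train_som`, `pso_centroids`, `kmeans`, `hybrid_cluster`) are hypothetical.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist2(point, centers[i]))

def train_som(data, rng, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5):
    """Stage 1: fit a small SOM and return its weight vectors (prototypes)."""
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in grid]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                # assumed linear decay
        sigma = sigma0 * (1 - frac) + 0.3    # assumed neighborhood schedule
        for x in rng.sample(data, len(data)):
            bmu = nearest(x, weights)        # best-matching unit
            for j, w in enumerate(weights):
                h = math.exp(-dist2(grid[bmu], grid[j]) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

def sse(points, centers):
    """Assumed fitness: sum of squared distances to the nearest center."""
    return sum(min(dist2(p, c) for c in centers) for p in points)

def pso_centroids(points, k, rng, particles=10, iters=40):
    """Stage 2: PSO search for k initial centroids (one particle = k centroids)."""
    dim = len(points[0])
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(particles)]
    best = [[c[:] for c in p] for p in pos]
    best_fit = [sse(points, p) for p in pos]
    g = min(range(particles), key=lambda i: best_fit[i])
    gbest, gfit = [c[:] for c in best[g]], best_fit[g]
    for _ in range(iters):
        for i in range(particles):
            for c in range(k):
                for d in range(dim):
                    vel[i][c][d] = (0.7 * vel[i][c][d]
                                    + 1.5 * rng.random() * (best[i][c][d] - pos[i][c][d])
                                    + 1.5 * rng.random() * (gbest[c][d] - pos[i][c][d]))
                    pos[i][c][d] += vel[i][c][d]
            fit = sse(points, pos[i])
            if fit < best_fit[i]:
                best[i], best_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gfit:
                    gbest, gfit = [c[:] for c in pos[i]], fit
    return gbest

def kmeans(points, centers, iters=20):
    """Stage 3: standard k-means refinement starting from the given centers."""
    centers = [c[:] for c in centers]
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def hybrid_cluster(docs, k, seed=0):
    """SOM prototypes -> PSO initial centroids -> k-means on the prototypes."""
    rng = random.Random(seed)
    prototypes = train_som(docs, rng)
    init = pso_centroids(prototypes, k, rng)
    centers = kmeans(prototypes, init)
    return [nearest(d, centers) for d in docs]

# Toy feature vectors: hypothetical counts of four WordNet sense IDs per document.
docs = [[9, 8, 0, 0], [8, 9, 1, 0], [9, 9, 0, 1],
        [0, 1, 9, 8], [1, 0, 8, 9], [0, 0, 9, 9]]
labels = hybrid_cluster(docs, k=2)
print(labels)
```

In this sketch k-means runs on the SOM prototypes rather than on the documents themselves, matching the order described in the abstract; each document is then assigned to the nearest final centroid. The cluster IDs are arbitrary, so only the grouping of documents is meaningful.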

Authors and Affiliations

Sunita Sarkar, Arindam Roy, Bipul Syam Purkayastha

How To Cite

Sunita Sarkar, Arindam Roy, Bipul Syam Purkayastha (2015). Computational Intelligence Methods for Clustering of SenseTagged Nepali Documents. IOSR Journals (IOSR Journal of Computer Engineering), 17(1), 83-89. https://www.europub.co.uk/articles/-A-127150