A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval
Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 8, Issue 5
Abstract
In recent years, the development of information systems in fields such as business, academia, and medicine has led to a year-on-year increase in the amount of stored data. A vast majority of this data is stored in documents that are virtually unstructured. Text mining helps people process large volumes of information by imposing structure on text. Clustering is a popular technique for automatically organizing a large collection of text. In real application domains, however, the experimenter possesses some background knowledge that can help in clustering the data. Traditional clustering techniques are poorly suited to multiple data types and cannot handle sparse, high-dimensional data. Co-clustering techniques overcome both deficiencies by clustering documents and words simultaneously. Semantic understanding has become an essential ingredient of information extraction, and it is incorporated by adopting constraints as a semi-supervised learning strategy. This survey reviews the constrained co-clustering strategies that researchers have adopted to boost clustering performance. Experimental results on the 20-Newsgroups dataset show that the proposed method is effective for clustering textual documents. Furthermore, the proposed algorithm consistently outperformed all existing constrained clustering and co-clustering methods under different conditions.
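As a rough illustration of what clustering documents and words simultaneously can look like, the sketch below alternates between reassigning rows (documents) and columns (words) of a small document-term matrix so that each block of the matrix stays close to its block mean. This is a generic block-mean scheme chosen for brevity, not one of the constrained methods covered by the survey. The function names, the toy matrix, and the starting labels are all assumptions made for this example.

```python
def block_means(X, rows, cols, k, l):
    """Mean of X over each (row cluster, column cluster) block."""
    total = [[0.0] * l for _ in range(k)]
    count = [[0] * l for _ in range(k)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            total[rows[i]][cols[j]] += value
            count[rows[i]][cols[j]] += 1
    # An empty block gets mean 0.0 (an arbitrary choice for this sketch).
    return [[total[g][h] / count[g][h] if count[g][h] else 0.0
             for h in range(l)] for g in range(k)]


def cocluster(X, rows, cols, k, l, max_iter=20):
    """Alternate row and column reassignment until the labels stop changing."""
    rows, cols = list(rows), list(cols)
    n_rows, n_cols = len(X), len(X[0])
    for _ in range(max_iter):
        # Reassign each document to the row cluster whose block means fit it best.
        M = block_means(X, rows, cols, k, l)
        new_rows = [
            min(range(k), key=lambda g: sum(
                (X[i][j] - M[g][cols[j]]) ** 2 for j in range(n_cols)))
            for i in range(n_rows)
        ]
        # Reassign each word the same way, using the updated document labels.
        M = block_means(X, new_rows, cols, k, l)
        new_cols = [
            min(range(l), key=lambda h: sum(
                (X[i][j] - M[new_rows[i]][h]) ** 2 for i in range(n_rows)))
            for j in range(n_cols)
        ]
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols


# Hypothetical toy data: 4 documents x 4 words, with two topics.
X = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]
# Deliberately imperfect starting labels (assumed for the example).
doc_labels, word_labels = cocluster(X, [0, 0, 0, 1], [0, 1, 1, 1], k=2, l=2)
print(doc_labels, word_labels)
```

Because the word labels feed into the document update and vice versa, each side of the matrix helps correct the other. A constrained variant would additionally restrict which reassignments are allowed, for example by keeping documents known to belong together in the same cluster.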
Authors and Affiliations
S. Saranya, R. Munieswari