A Survey on Hadoop Assisted K-Means Clustering of Hefty Volume Images

Journal Title: International Journal on Computer Science and Engineering - Year 2014, Vol 6, Issue 3

Abstract

Basic K-means clustering can directly detect the objects in a remote sensing image or generate an overview of them. Software packages such as ENVI and ERDAS IMAGINE can do this work on PCs. However, processing large volumes of remote sensing images is constrained by limited hardware resources and long processing times. Parallel or distributed computing is the appropriate choice in such cases. This paper parallelizes the algorithm using Hadoop MapReduce, an open-source distributed computing framework and programming model. The introductory part explains the color representation of remote sensing images; RGB pixel values need to be converted to the CIELAB color space, which is better suited to distinguishing colors. An overview of traditional K-means is provided, and the later part describes the MapReduce programming model and the Hadoop platform for K-means. Parallelizing the algorithm requires customized MapReduce functions in two stages. The map and reduce functions for the algorithm are described in pseudocode. This method will be useful in many similar situations involving remote sensing images.
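
The map and reduce roles mentioned in the abstract can be illustrated with a minimal single-process sketch of one K-means iteration. This is not the paper's pseudocode and it does not use Hadoop: the function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`) are hypothetical, the shuffle phase is simulated with a dictionary, and pixels are assumed to be 3-tuples that have already been converted to CIELAB.

```python
from collections import defaultdict
from math import dist


def kmeans_map(pixel, centroids):
    """Map step: emit (index of the nearest centroid, pixel)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(pixel, centroids[i]))
    return nearest, pixel


def kmeans_reduce(pixels):
    """Reduce step: average all pixels assigned to one centroid."""
    n = len(pixels)
    return tuple(sum(channel) / n for channel in zip(*pixels))


def kmeans_iteration(pixels, centroids):
    """Run one map / shuffle / reduce pass and return the updated centroids."""
    groups = defaultdict(list)  # stands in for the framework's shuffle phase
    for pixel in pixels:
        key, value = kmeans_map(pixel, centroids)
        groups[key].append(value)
    # Assumption: a centroid with no assigned pixels keeps its previous position.
    return [
        kmeans_reduce(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]


if __name__ == "__main__":
    # Hypothetical pixel values, treated as (L*, a*, b*) triples.
    pixels = [(0, 0, 0), (2, 2, 2), (10, 10, 10), (12, 12, 12)]
    centroids = [(1, 0, 0), (11, 10, 10)]
    print(kmeans_iteration(pixels, centroids))
```

In a distributed setting, the map calls would run in parallel over splits of the image and the reduce calls would run once per centroid key; the iteration would be repeated until the centroids stop moving.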

Authors and Affiliations

Anil R Surve, Nilesh S Paddune

How To Cite

Anil R Surve, Nilesh S Paddune (2014). A Survey on Hadoop Assisted K-Means Clustering of Hefty Volume Images. International Journal on Computer Science and Engineering, 6(3), 113-117. https://www.europub.co.uk/articles/-A-88253