Optimization of Naïve Bayes Data Mining Classification Algorithm

Abstract

As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers in domains (data sets) that involve correlated features. Correlated features are features that have a mutual relationship or connection with each other; because they effectively measure the same underlying property, they are redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of its classification results while reducing the time needed to build the model from the training dataset. The aim is to improve the performance of Naïve Bayes by removing redundant correlated features before the dataset is given to the classifier. The paper presents the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce the accuracy of the classification algorithm. Finally, based on experiments using the WEKA data mining software, the paper reports significant improvements in both the accuracy of the Naïve Bayes classification algorithm and the time taken to build the model.
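The idea of removing redundant correlated features before classification can be illustrated with a minimal sketch. The abstract does not say which selection method the authors used in WEKA, so the Pearson-correlation filter below, the 0.95 threshold, and the names `pearson` and `drop_correlated` are all assumptions made for illustration only. The motivation is that Naïve Bayes multiplies per-feature likelihoods as if the features were independent, so a feature that duplicates another contributes the same evidence twice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant column as uncorrelated with everything
    return cov / sqrt(var_x * var_y)


def drop_correlated(rows, threshold=0.95):
    """Return the indices of the feature columns to keep.

    Hypothetical greedy filter (not the paper's stated method): a column is
    dropped when its absolute correlation with any already-kept column
    exceeds `threshold`.
    """
    columns = list(zip(*rows))  # transpose rows into feature columns
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: column 1 is exactly twice column 0, so it is redundant.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 8.0],
    [4.0, 8.0, 1.0],
]
kept = drop_correlated(rows)  # kept == [0, 2]
reduced = [[row[j] for j in kept] for row in rows]
```

The `reduced` rows would then be passed to a Naïve Bayes learner in place of the full dataset. The threshold is a tunable choice, and a greedy filter of this kind keeps whichever of two correlated features appears first, which may not always be the more informative one.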

Authors and Affiliations

Maneesh Singhal, Ramashankar Sharma

How To Cite

Maneesh Singhal, Ramashankar Sharma (2014). Optimization of Naïve Bayes Data Mining Classification Algorithm. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(8), -. https://www.europub.co.uk/articles/-A-18589