A Multilingual Datasets Repository of the Hadith Content

Abstract

Knowledge extraction from unstructured data is a challenging research problem in Natural Language Processing (NLP). It requires complex NLP tasks such as entity extraction and Information Extraction (IE), and one of the most challenging of these is extracting all the required entities in a structured format so that data analysis can be applied. Our focus is to explain how the data is extracted into datasets or a conventional database so that further text and data analysis can be carried out. This paper presents a framework for extracting Hadith data from authentic Hadith sources. Hadith is the collection of sayings of the Holy Prophet Muhammad, who is the last holy prophet according to Islamic teachings. This paper discusses the preparation of the dataset repository and highlights issues in the relevant research domain. The research problems of data extraction, pre-processing and data analysis, together with their solutions, are elaborated. The results have been evaluated using standard performance evaluation measures. The dataset is available in multiple languages and formats, and is free of cost for research purposes.

Authors and Affiliations

Ahsan Mahmood, Hikmat Ullah Khan, Fawaz K. Alarfaj, Muhammad Ramzan, Mahwish Ilyas

  • EP ID EP276758
  • DOI 10.14569/IJACSA.2018.090224

How To Cite

Ahsan Mahmood, Hikmat Ullah Khan, Fawaz K. Alarfaj, Muhammad Ramzan, Mahwish Ilyas (2018). A Multilingual Datasets Repository of the Hadith Content. International Journal of Advanced Computer Science & Applications, 9(2), 165-172. https://www.europub.co.uk/articles/-A-276758