A Novel Architecture of Agent based Crawling for OAI Resources

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 4

Abstract

Most search engines today compete to index as much of the Surface Web as possible while neglecting OAI content (PDF documents), which holds far more information than the Surface Web. This paper proposes a novel framework for an OAI-PMH based crawler that uses agents to extract metadata about OAI resources and store it in a repository, which is later queried through the OAI-PMH layer to generate XML pages containing the metadata. These pages are then added to the search engine's repository for indexing, which in turn increases the relevancy of the search engine. Agents are used to parallelize the whole process so that metadata extraction from multiple resources can be carried out simultaneously.
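
The pipeline described in the abstract (parallel agents extract metadata, a repository stores it, and an OAI-PMH layer serves it as XML) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `RESOURCES` dictionary, the `oai:example.org:*` identifiers, and the function names `extract_metadata`, `crawl`, and `list_records_xml` are all hypothetical, the agents are modelled as worker threads, and the XML is a simplified ListRecords-style page rather than a complete OAI-PMH response (it omits elements such as `responseDate`, `request`, and the `oai_dc:dc` wrapper).

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical stand-ins for OAI resources (e.g. PDF documents); a real agent
# would fetch and parse each resource instead of reading this dictionary.
RESOURCES = {
    "oai:example.org:1": {"title": "Paper One", "creator": "A. Author"},
    "oai:example.org:2": {"title": "Paper Two", "creator": "B. Author"},
}


def extract_metadata(identifier):
    """One agent: return the metadata record for a single resource."""
    fields = RESOURCES[identifier]
    return {"identifier": identifier, **fields}


def crawl(identifiers, max_agents=4):
    """Run agents in parallel and collect their records in a repository."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        records = pool.map(extract_metadata, identifiers)
        return {rec["identifier"]: rec for rec in records}


def list_records_xml(repository):
    """Render the repository as a simplified ListRecords-style XML page."""
    root = ET.Element(f"{{{OAI_NS}}}OAI-PMH")
    listing = ET.SubElement(root, f"{{{OAI_NS}}}ListRecords")
    for identifier in sorted(repository):
        rec = repository[identifier]
        record = ET.SubElement(listing, f"{{{OAI_NS}}}record")
        header = ET.SubElement(record, f"{{{OAI_NS}}}header")
        ET.SubElement(header, f"{{{OAI_NS}}}identifier").text = identifier
        metadata = ET.SubElement(record, f"{{{OAI_NS}}}metadata")
        ET.SubElement(metadata, f"{{{DC_NS}}}title").text = rec["title"]
        ET.SubElement(metadata, f"{{{DC_NS}}}creator").text = rec["creator"]
    return ET.tostring(root, encoding="unicode")


repository = crawl(list(RESOURCES))
xml_page = list_records_xml(repository)
```

The resulting `xml_page` string stands in for the XML pages that the abstract says are handed to the search engine for indexing. Threads are used here only because they are the simplest way to show simultaneous extraction; the paper's agents may be organized differently.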

Authors and Affiliations

Shruti Sharma, J. P. Gupta, A. K. Sharma

Keywords

OAI-PMH; Agents; Surface web; Hidden Web

EP ID EP91883