Extracting and Aligning the Data Using Tag Path Clustering and CTVS Method
Journal Title: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) - Year 2013, Vol 2, Issue 4
Abstract
Web databases generate query result pages in response to a user's query. Information extracted automatically from these pages is used in many web applications. We present a novel method, called tag path clustering, for record extraction from multiple attributes. It focuses on how a distinct tag path appears repeatedly in the DOM tree of the web document. It compares a pair of tag path occurrence patterns (called visual signals) to estimate how likely it is that the two tag paths represent the same list of objects. This paper introduces a similarity measure that captures how closely the signals appear and interleave. We propose a new record alignment method that aligns the attributes in a record, first pairwise and then holistically, using the CTVS method (combining tag and value similarity). We introduce a new technique to handle the case where the query result records (QRRs) are non-contiguous, which may be due to the presence of auxiliary information such as comments, recommendations, or advertisements. Nested structure is handled by a nested structure processing method.
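The idea of comparing tag path occurrence patterns can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `TagPathCollector`, the functions `tag_path_positions`, `segment_counts`, and `similarity`, the sample page `SAMPLE_HTML`, and the scoring formula are all assumptions made for illustration. The score is a simplified stand-in that rewards two tag paths for occurring close together and interleaving regularly.

```python
from collections import defaultdict
from html.parser import HTMLParser
from statistics import pvariance

# Tags that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TagPathCollector(HTMLParser):
    """Records, for each distinct tag path, the positions where it occurs."""

    def __init__(self):
        super().__init__()
        self.stack = []                      # currently open tags
        self.counter = 0                     # running index of start tags
        self.positions = defaultdict(list)   # tag path -> occurrence positions

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        self.positions[path].append(self.counter)
        self.counter += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def tag_path_positions(html):
    """Return {tag path: sorted occurrence positions} for an HTML string."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return dict(collector.positions)


def segment_counts(a, b):
    """Count occurrences of b inside each gap between consecutive occurrences of a."""
    return [sum(1 for p in b if lo < p < hi) for lo, hi in zip(a, a[1:])]


def similarity(a, b):
    """Illustrative score in (0, 1]; higher means closer, more regular interleaving."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # a path that does not repeat cannot represent a list
    # Closeness: average distance to the nearest occurrence of the other path.
    a_to_b = sum(min(abs(p - q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(abs(q - p) for p in a) for q in b) / len(b)
    offset = (a_to_b + b_to_a) / 2
    # Regularity: low variance means each gap holds a similar number of occurrences.
    interleave = pvariance(segment_counts(a, b)) + pvariance(segment_counts(b, a))
    return 1.0 / (1.0 + offset * (1.0 + interleave))


# Hypothetical query result page with three records of two attributes each.
SAMPLE_HTML = """
<html><body><h1>Results</h1><ul>
<li><b>T1</b><i>P1</i></li>
<li><b>T2</b><i>P2</i></li>
<li><b>T3</b><i>P3</i></li>
</ul><p>footer</p></body></html>
"""

if __name__ == "__main__":
    signals = tag_path_positions(SAMPLE_HTML)
    for path, occurrences in signals.items():
        print(path, occurrences)
    print(similarity(signals["html/body/ul/li/b"], signals["html/body/ul/li/i"]))
```

In this sketch the two attribute paths under each list item score well above a path that occurs only once, such as the footer paragraph, which is the kind of signal a clustering step could use to group tag paths belonging to the same record list.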
Authors and Affiliations
J. KOWSALYA, K. DEEPA
Radio Interferences Performances in 750KV Transmission Line & 400KV Transmission Line of HVAC Transmission system by MATLAB program
This paper presents methodologies for measuring radio interference in electrical transmission lines and describes its effects, levels, rules, and design criteria. This paper also shows that 750kV and...
Design of Multi-Channel UART Controller Based On FIFO and FPGA
This paper presents a multi-channel UART controller based on an FPGA (Field Programmable Gate Array). The UART, a kind of serial communication circuit, is widely used. A universal asynchronous receiver/transmitter (UART) is an...
SECURING DISTRIBUTED ACCOUNTABILITY FOR DATA SHARING IN CLOUD COMPUTING
The growing trend towards grid computing and cloud computing provides enormous potential for allowing dynamic, distributed and data demanding applications such as sharing and processing of large-scale scientific data. Cl...
A Hybrid Colour Image Enhancement Technique Based on Contrast Stretching and Peak Based Histogram Equalization
Medical image enhancement technologies have attracted much attention since advanced medical equipment was put into use in the medical field. Enhanced medical images are desired by a surgeon to assist diagnosi...
Monitoring Data Integrity while using TPA in Cloud Environment
Cloud computing is an emerging technology that delivers software, platform, and infrastructure as a service over a network. The cloud reduces the burden on users by allowing them to store their data remotely and eliminat...