A shopping agent that automatically constructs wrappers for semi-structured online vendors

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

This paper proposes a shopping agent with a robust inductive learning method that automatically constructs wrappers for semi-structured online stores. The strong biases assumed in many existing systems are weakened so that real stores with reasonably complex document structures can be handled. Our method treats a logical line as the basic unit, and recognizes the position and structure of product descriptions by finding the most frequent pattern in the sequence of logical line information in output HTML pages. The method can analyze product descriptions that span multiple logical lines, even those with extra or missing attributes. Experimental tests on over 60 sites show that it constructs correct wrappers for most real stores.
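The idea of treating logical lines as units and looking for the most frequent pattern can be illustrated with a small Python sketch. This is not the paper's algorithm, whose details the abstract does not give. It is a minimal illustration under stated assumptions: the set of line-breaking tags, the three line categories (`LINK`, `TEXT`, `PRICE`), the dollar-only price pattern, and the scoring rule (non-overlapping occurrences times pattern length) are all hypothetical choices made here. The sketch also does not handle extra or missing attributes, which the paper's method is reported to cover.

```python
import re

# Assumption: these tags end a logical line; the paper may define lines differently.
BREAK_TAGS = re.compile(r"<\s*/?\s*(?:br|p|tr|li|div|hr)\b[^>]*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")
# Assumption: prices are written with a dollar sign.
PRICE = re.compile(r"\$\s*\d[\d,]*(?:\.\d+)?")


def logical_lines(html):
    """Split HTML at line-breaking tags and keep fragments that contain text."""
    parts = BREAK_TAGS.split(html)
    return [p.strip() for p in parts if ANY_TAG.sub("", p).strip()]


def categorize(line):
    """Map one logical line to a coarse, hypothetical category label."""
    text = ANY_TAG.sub("", line)
    if PRICE.search(text):
        return "PRICE"
    if re.search(r"<\s*a\b", line, re.IGNORECASE):
        return "LINK"
    return "TEXT"


def count_non_overlapping(seq, pattern):
    """Count left-to-right, non-overlapping occurrences of pattern in seq."""
    n, count, i = len(pattern), 0, 0
    while i + n <= len(seq):
        if tuple(seq[i:i + n]) == pattern:
            count += 1
            i += n
        else:
            i += 1
    return count


def most_frequent_pattern(categories, min_len=2, max_len=4):
    """Return (pattern, count) for the repeated n-gram covering the most lines."""
    best, best_score, best_count = None, 0, 0
    for n in range(min_len, max_len + 1):
        candidates = {tuple(categories[i:i + n])
                      for i in range(len(categories) - n + 1)}
        for pat in sorted(candidates):  # sorted for deterministic tie-breaking
            count = count_non_overlapping(categories, pat)
            score = count * n
            if count >= 2 and score > best_score:
                best, best_score, best_count = pat, score, count
    return best, best_count


# Made-up result page: a header, three product descriptions, and a footer.
SAMPLE = """
<p>Search results</p>
<li><a href="/a">Widget A</a><br>Blue, small<br>$10.00</li>
<li><a href="/b">Widget B</a><br>Red, large<br>$12.50</li>
<li><a href="/c">Widget C</a><br>Green<br>$8.99</li>
<p>Page 1 of 1</p>
"""

categories = [categorize(line) for line in logical_lines(SAMPLE)]
pattern, count = most_frequent_pattern(categories)
print(pattern, count)  # ('LINK', 'TEXT', 'PRICE') 3
```

On this made-up page the header and footer lines fall outside the repeated `LINK, TEXT, PRICE` pattern, so the pattern marks both where the product descriptions sit and how each one is laid out across three logical lines.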

Original language: English
Title of host publication: Intelligent Data Engineering and Automated Learning - IDEAL 2000
Subtitle of host publication: Data Mining, Financial Engineering, and Intelligent Agents - 2nd International Conference, Proceedings
Editors: Kwong Sak Leung, Lai-Wan Chan, Helen Meng
Publisher: Springer Verlag
Pages: 368-373
Number of pages: 6
ISBN (Print): 3540414509, 9783540414506
State: Published - 2000
Event: 2nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2000 - Shatin, N.T., Hong Kong
Duration: 13 Dec 2000 to 15 Dec 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1983
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2000
Country/Territory: Hong Kong
City: Shatin, N.T.
Period: 13/12/00 to 15/12/00
