An Overview of Web Data Clustering Practices

Title: An Overview of Web Data Clustering Practices
Publication Type: Conference Paper
Year of Publication: 2004
Abstract

Clustering is a challenging topic in the area of Web data management. Various forms of clustering are required in a wide range of applications, including finding mirrored Web pages, detecting copyright violations, and reporting search results in a structured way. Clustering can be performed either once offline (independently of search queries) or online (on the results of search queries). Important efforts have focused on mining Web access logs and on clustering search engine results on the fly. Online methods based on link structure and text have been applied successfully to finding pages on related topics. This paper presents an overview of the most popular methodologies and implementations for clustering either Web users or Web sources, and surveys the current status and future trends of clustering on the Web.
