|Title||Automatic extraction of structure, content and usage data statistics of web sites|
|Publication Type||Conference Paper|
|Year of Publication||2010|
|Authors||Paparrizos, Ioannis K., Vassiliki A. Koutsonikola, Lefteris Angelis, and Athena Vakali|
|Editor||Chignell, Mark H., and Elaine G. Toms|
|Keywords||classification, Crawling, Structure Content and Usage data, Web Mining Algorithm|
In this paper we present a web mining tool which automatically extracts the structure, content and usage data statistics of web sites. This work is inspired by the fact that web mining consists of three axes: web structure mining, web content mining and web usage mining. Each of these axes uses structure, content and usage data, respectively. The aim is to use the developed multi-threaded web crawler as a tool to automatically extract from web pages the data associated with each of the three axes, in order to subsequently compute several useful descriptive statistics and apply advanced mathematical and statistical methods. A description of our system is provided, as well as some experimental results.