Ontology-based automatic classification and ranking for web documents

Jun Fang, Lei Guo, Xiao Dong Wang, Ning Yang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

25 Scopus citations

Abstract

The process of web document classification involves calculating similarities between documents and categories by using the information extracted from them. In recent years, ontology-based web document classification has been introduced to address two weaknesses of traditional machine learning algorithms: the need for classifier training and the failure to consider semantic relations between words. However, previous work on ontology-based web document classification overlooks two important issues: automatic ontology construction and the ranking of classified documents. To address these problems, this paper proposes an ontology-based web document classification and ranking method. First, weighted term sets are extracted from web documents, and an ontology is built using an effective ontology construction method that clarifies and augments an existing ontology; then the similarity score between documents and the ontology is computed based on WordNet using the Earth Mover's Distance (EMD) method; finally, web documents are assigned to categories according to the similarity score, and a simple ranking method is used to sort the documents within the same category. The experimental results show that our classification algorithm achieves better precision and recall than the adaptive KNN method and is competitive with the SVM method; the ranking method also performs well.
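The final step described in the abstract (assigning documents to categories by similarity score, then sorting within each category) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the ranking method, so sorting by descending similarity score is assumed here, and the function name `classify_and_rank`, the `threshold` parameter, and the example scores are all hypothetical. The similarity scores are taken as precomputed inputs; the WordNet- and EMD-based computation is not reproduced.

```python
from collections import defaultdict


def classify_and_rank(scores, threshold=0.0):
    """Assign each document to its best-scoring category, then rank.

    scores: {doc_id: {category: similarity}} with precomputed similarities
            (hypothetical input; higher means more similar).
    threshold: assumed cut-off below which a document is left unassigned.
    Returns {category: [(doc_id, similarity), ...]} sorted by descending score.
    """
    by_category = defaultdict(list)
    for doc_id, category_scores in scores.items():
        if not category_scores:
            continue  # nothing to compare against
        # Pick the category with the highest similarity score.
        best_category = max(category_scores, key=category_scores.get)
        best_score = category_scores[best_category]
        if best_score >= threshold:
            by_category[best_category].append((doc_id, best_score))
    # Assumed ranking rule: most similar documents first within each category.
    return {
        category: sorted(docs, key=lambda pair: pair[1], reverse=True)
        for category, docs in by_category.items()
    }


# Made-up scores for illustration only.
example_scores = {
    "doc1": {"sports": 0.82, "finance": 0.31},
    "doc2": {"sports": 0.40, "finance": 0.77},
    "doc3": {"sports": 0.91, "finance": 0.12},
}
ranked = classify_and_rank(example_scores)
```

With these made-up scores, `doc3` and `doc1` land in `sports` (in that order) and `doc2` lands in `finance`. Keeping the score alongside each document makes it easy to swap in a different ranking rule if the paper's method differs.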

Original language: English
Title of host publication: Proceedings - Fourth International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2007
Pages: 627-631
Number of pages: 5
State: Published - 2007
Event: 4th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2007 - Haikou, China
Duration: 24 Aug 2007 to 27 Aug 2007

Publication series

Name: Proceedings - Fourth International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2007
Volume: 3

Conference

Conference: 4th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2007
Country/Territory: China
City: Haikou
Period: 24/08/07 to 27/08/07
