A Cluster Based Approach for Classification of Web Results
Apeksha Khabia and M. B. Chandak
Abstract
Nowadays a significant amount of information on the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, and web pages. As the number of documents grows, it becomes difficult to classify them into predefined categories. Clustering partitions data into clusters so that the data in each cluster share some common trait, often proximity according to some defined distance measure. The clusters learned under the guidance of the initial data set can, to some extent, depict the underlying distribution of that data set. Thus, clusters of documents can be employed to train a classifier by using defined features of those clusters. An important issue is also to classify text data from the web into different clusters by mining the knowledge it contains. Accordingly, this paper presents a review of most of the document clustering techniques and cluster-based classification techniques used so far. Pre-processing of the text data set and a document clustering method are also explained in brief.
Keywords
Text mining, clustering, classification, TF-IDF.
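The pipeline outlined in the abstract (TF-IDF weighting, document clustering, then classification driven by the learned clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the smoothed IDF formula, the cosine-based k-means, the nearest-centroid classifier, the toy documents, and all function names (`fit_idf`, `vectorise`, `kmeans`, `nearest`) are assumptions chosen only to keep the example short and self-contained.

```python
import math
from collections import Counter

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}

def normalise(vec):
    """Scale a sparse vector (dict) to unit length; empty if all zeros."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def vectorise(tokens, idf):
    """TF-IDF vector for one document; terms unseen in training are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return normalise({t: c * idf[t] for t, c in tf.items()})

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def nearest(vec, centroids):
    """Index of the centroid most similar to vec (first one wins ties)."""
    return max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))

def mean_vector(vectors):
    """Unit-length centroid of a group of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return normalise(dict(total))

def kmeans(vectors, seed_indices, iterations=10):
    """K-means with cosine similarity, seeded with the given documents."""
    centroids = [vectors[i] for i in seed_indices]
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_vector(members)
    return labels, centroids

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "stock market shares fall",
    "market investors buy shares",
    "football team wins match",
    "team scores goal in match",
]
tokens = [d.lower().split() for d in docs]
idf = fit_idf(tokens)
vectors = [vectorise(t, idf) for t in tokens]

# Cluster the documents, then reuse the centroids as a simple classifier.
labels, centroids = kmeans(vectors, seed_indices=[0, 2])
new_doc = vectorise("shares market crash".split(), idf)
predicted_cluster = nearest(new_doc, centroids)
```

Here the cluster centroids play the role of the "defined features" of the clusters: a new document is assigned to whichever centroid it is most similar to. A real system would add the pre-processing steps (stop-word removal, stemming) and a stronger classifier trained on the cluster assignments.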
Cite this article
Khabia A, Chandak MB. A Cluster Based Approach for Classification of Web Results. International Journal of Advanced Computer Research. 2014;4(17):934-938.