Volume 3, Issue 1, February 2014, Pages: 1-7
An Effective Cluster-Aware Labeling Method for Web Search Results Using Concordant Document Frequencies
Masafumi Matsuhara, Department of Software and Information Science, Iwate Prefectural University, Iwate, Japan
Toshihiro Yoshida, NTT Advanced Technology Corporation, Kanagawa, Japan
Received: Jan. 19, 2014; Published: Feb. 20, 2014
In recent years, the amount of information on the World Wide Web has exploded. Search engines are generally used for web searching; however, robot-type search engines have several problems. One is that it is difficult for a user to formulate a query that returns the intended search results. Another is that users struggle to understand the contents of the results, because a robot-type search engine outputs many results in a long list. To address these problems, many methods have been proposed that classify the results of a robot-type search engine into clusters, label each cluster, and present the labeled clusters to the user. To be effective, a cluster label must consist of words that appropriately describe the web sites within the cluster. In this study, we propose a labeling method using concordant document frequencies. The web search results of a query are first classified into clusters. For a cluster label, we then retrieve the set of web sites returned by an AND-query combining the original query word and the label. If this set has many members in common with the cluster, we say that the concordant document frequency is high, and the label is assigned a high weight. Because the weight reflects the members of the cluster itself, the proposed cluster-aware method can assign an appropriate label to each cluster. We demonstrate the effectiveness of the proposed method through simulation experiments.
Labeling, Clustering, Web Search
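
The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the overlap ratio used here is an assumption, and the names `concordance`, `best_label`, and the toy `index` that stands in for a search engine are hypothetical.

```python
def concordance(cluster_members, and_query_results):
    """Fraction of cluster members that also appear in the AND-query results.

    Assumption: this ratio stands in for the concordant document frequency;
    the paper's actual definition may differ.
    """
    cluster = set(cluster_members)
    if not cluster:
        return 0.0
    return len(cluster & set(and_query_results)) / len(cluster)


def best_label(query, cluster_members, candidates, search):
    """Score each candidate label by concordance and return the top one.

    `search` is a hypothetical callable mapping a query string to a list of
    result URLs; the AND-query is modeled as "<query> <label>".
    """
    scores = {
        label: concordance(cluster_members, search(f"{query} {label}"))
        for label in candidates
    }
    return max(scores, key=scores.get), scores


# Toy in-memory stand-in for a search engine (illustrative data only).
index = {
    "jaguar car": ["a.com", "b.com", "c.com"],
    "jaguar animal": ["d.com", "e.com"],
    "jaguar speed": ["a.com", "d.com"],
}
cluster = ["a.com", "b.com", "c.com"]

label, scores = best_label(
    "jaguar", cluster, ["car", "animal", "speed"],
    lambda q: index.get(q, []),
)
print(label, scores)
```

In this toy data the label "car" scores highest because its AND-query results coincide with the cluster members, while "speed" matches only one member and "animal" matches none.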
