University of Wolverhampton

Please use this identifier to cite or link to this item: http://hdl.handle.net/2436/27375



Title: Finding similar academic Web sites with links, bibliometric couplings and colinks
Authors: Thelwall, Mike; Wilkinson, David
Citation: Information Processing & Management, 40 (3): 515-526
Publisher: Elsevier
Journal: Information Processing & Management
Issue Date: 2004
URI: http://hdl.handle.net/2436/27375
DOI: 10.1016/S0306-4573(03)00042-6
Additional Links: http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6VC8-48WPPGX-1&_user=1644469&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000054077&_version=1&_urlVersion=0&_userid=1644469&md5=3426871bfba523a853ec684de16a9ecf
Abstract: A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web was taken and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increased likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage, in that a much larger number of pairs of sites are connected by these measures, rather than an increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.
Type: Article
Language: en
Keywords: Document clustering; Webometrics; Information retrieval; Academic websites
ISSN: 0306-4573
Appears in Collections: Statistical Cybermetrics Research Group

Files in This Item:

There are no files associated with this item.



All Items in WIRE are protected by copyright, with all rights reserved, unless otherwise indicated.

 
