University of Wolverhampton

Please use this identifier to cite or link to this item: http://hdl.handle.net/2436/4009
Title: A layered approach for investigating the topological structure of communities in the Web.
Authors: Thelwall, Mike
Citation: Journal of Documentation, 59: 4
Publisher: MCB UP Ltd
Issue Date: 2003
URI: http://hdl.handle.net/2436/4009
DOI: 10.1108/00220410310485703
Additional Links: http://www.emeraldinsight.com/10.1108/00220410310485703
Abstract: A layered approach for identifying communities in the Web is presented and explored by applying the Flake exact community identification algorithm to the UK academic Web. Although community or topic identification is a common task in information retrieval, a new perspective is developed by: the application of alternative document models, shifting the focus from individual pages to aggregated collections based upon Web directories, domains and entire sites; the removal of internal site links; and the adaptation of a new fast algorithm to allow fully automated community identification using all possible single starting points. The overall topology of the graphs in the three least-aggregated layers was first investigated and found to include a large number of isolated points but, surprisingly, with most of the remainder being in one huge connected component, the exact proportions varying by layer. The community identification process then found that the number of communities far exceeded the number of topological components, indicating that community identification is a potentially useful technique, even with random starting points. Both the number and the size of the communities identified were dependent on the parameter of the algorithm, with very different results being obtained in each case. In conclusion, the UK academic Web is embedded with layers of non-trivial communities and, if it is not unique in this, then there is the promise of improved results for information retrieval algorithms that can exploit this additional structure, and of applying the technique directly to partially automate Web metrics tasks such as finding all pages related to a given subject hosted by a single country's universities.
Type: Article
Language: en
Keywords: Information retrieval; Webometrics; Modelling; UK; Academic websites; Collaborative working
ISSN: 0022-0418
Appears in Collections: Statistical Cybermetrics Research Group

Files in This Item:
File: 2003 A layered approach preprint.pdf
Size: 329Kb
Format: Adobe PDF

All Items in WIRE are protected by copyright, with all rights reserved, unless otherwise indicated.
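
The layered document models mentioned in the abstract (pages aggregated into directories, domains and sites, with internal site links removed) can be illustrated with a small sketch. This is not the article's code and it does not implement the community identification algorithm itself; it only shows one plausible way to build a layer graph and count its topological components. The function names, the example URLs, and the rule that a site is the last three labels of the host name (for example wlv.ac.uk) are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def aggregate(url, layer):
    """Map a page URL to its node name in the chosen layer."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if layer == "page":
        return host + parts.path
    if layer == "directory":
        # Keep the path up to and including its last "/".
        return host + parts.path.rsplit("/", 1)[0] + "/"
    if layer == "domain":
        return host
    if layer == "site":
        # Assumption: a UK academic site is the last three host labels.
        return ".".join(host.split(".")[-3:])
    raise ValueError(f"unknown layer: {layer}")


def build_layer_graph(pages, links, layer):
    """Return (nodes, undirected adjacency) for one layer.

    Links whose two ends fall in the same site are dropped, mirroring
    the removal of internal site links described in the abstract.
    """
    nodes = {aggregate(p, layer) for p in pages}
    adj = defaultdict(set)
    for src, dst in links:
        if aggregate(src, "site") == aggregate(dst, "site"):
            continue  # internal site link
        a, b = aggregate(src, layer), aggregate(dst, layer)
        nodes.update((a, b))
        adj[a].add(b)
        adj[b].add(a)
    return nodes, adj


def components(nodes, adj):
    """List the connected components (isolated nodes included)."""
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            comp.add(n)
            for m in adj.get(n, ()):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        comps.append(comp)
    return comps


# Hypothetical pages and links, invented for this example.
PAGES = [
    "http://www.alpha.ac.uk/cs/index.html",
    "http://maths.alpha.ac.uk/staff/a.html",
    "http://www.beta.ac.uk/research/web.html",
    "http://www.gamma.ac.uk/index.html",
]
LINKS = [
    (PAGES[0], PAGES[1]),  # same site, so it is removed
    (PAGES[0], PAGES[2]),  # inter-site link, kept
]

for layer in ("directory", "domain", "site"):
    nodes, adj = build_layer_graph(PAGES, LINKS, layer)
    comps = components(nodes, adj)
    isolated = sum(1 for c in comps if len(c) == 1)
    print(layer, len(nodes), "nodes,", len(comps), "components,", isolated, "isolated")
```

A community identification step would then be run on each layer graph separately, so the same link data can give different communities at the directory, domain and site levels.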

 
