Brief Communication: The clustering power of low frequency words in academic Webs
Authors
Price, Liz; Thelwall, Mike
Issue Date
2005
Abstract
The value of low frequency words for subject-based academic Web site clustering is assessed. A new technique is introduced to compare the relative clustering power of different vocabularies. The technique is designed for word frequency tests in large document clustering exercises. Results for the Australian and New Zealand academic Web spaces indicate that low frequency words are useful for clustering academic Web sites along subject lines; removing low frequency words results in sites becoming, on average, less dissimilar to sites from other subjects.
Citation
Journal of the American Society for Information Science and Technology, 56 (8): 883-888
Type
Journal article
Language
en
ISSN
1532-2882
1532-2890