Brief Communication: The clustering power of low frequency words in academic Webs
Abstract
The value of low frequency words for subject-based academic Web site clustering is assessed. A new technique is introduced to compare the relative clustering power of different vocabularies. The technique is designed for word frequency tests in large document clustering exercises. Results for the Australian and New Zealand academic Web spaces indicate that low frequency words are useful for clustering academic Web sites along subject lines; removing low frequency words results in sites becoming, on average, less dissimilar to sites from other subjects.
Citation
Journal of the American Society for Information Science and Technology, 56 (8): 883-888
Publisher
Wiley
Journal
Journal of the American Society for Information Science and Technology
Additional Links
http://www3.interscience.wiley.com/journal/110435728/abstract
Type
Journal article
Language
en
ISSN
1532-2882
1532-2890
DOI
10.1002/asi.20177