Size Matters: A Quantitative Approach to Corpus Representativeness
Corpas Pastor, Gloria ; Seghiri Domínguez, Míriam
Issue Date
2010-06-01
Files
size matters.pdf
Adobe PDF, 401.43 KB
Abstract
We should always bear in mind that the assumption of representativeness ‘must be regarded largely as an act of faith’ (Leech 1991: 2), as at present we have no means of ensuring it, or even evaluating it objectively. (Tognini-Bonelli 2001: 57)

Corpus Linguistics (CL) has not yet come of age. It makes no difference whether we consider it a full-fledged linguistic discipline (Tognini-Bonelli 2000: 1) or a set of analytical techniques that can be applied to any discipline (McEnery et al. 2006: 7). The truth is that CL is still striving to solve thorny, central issues such as the optimum size, balance and representativeness of corpora (of the language as a whole or of some subset of the language). Corpus-driven and corpus-based studies rely on the quality and representativeness of each corpus as the foundation for producing valid results. This entails deciding on valid external and internal criteria for corpus design and compilation. A basic tenet is that corpus representativeness determines the kinds of research questions that can be addressed and the generalizability of the results obtained (cf. Biber et al. 1988: 246). Unfortunately, faith and beliefs do not seem to ensure quality.

In this paper we attempt to deal with these key questions. Firstly, we give a brief description of the R&D projects which originally served as the main framework for this research. Secondly, we focus on the complex notion of corpus representativeness and ideal size, from both a theoretical and an applied perspective. Finally, we describe a computer application which has been developed as part of the research. This software is used to verify whether a sample bilingual comparable corpus can be deemed representative.
Citation
Corpas Pastor, G. and Seghiri Domínguez, M. (2010) Size Matters: A Quantitative Approach to Corpus Representativeness, in Rabadán, R., Fernández López, M. and Guzmán González, T. (Eds.) Lengua, traducción, recepción en honor de Julio César Santoyo. León: Universidad de León Área de Publicaciones, pp. 111-145.
Type
Chapter in book
Language
en
ISBN
9788497735292