• Ngrams and Engrams: the use of structural and conceptual features to discriminate between English translations of religious texts

      Franklin, Emma; Oakes, Michael (Edinburgh University Press, 2017-01-11)
      In this paper, we present experiments using the Linguistic Inquiry and Word Count (LIWC) program, a ‘closed-class keyword’ (CCK) analysis and a ‘correspondence analysis’ (CA) to examine whether the Scientology texts of L. Ron Hubbard are linguistically and conceptually like those of other religions. A Kruskal–Wallis test comparing the frequencies of LIWC category words in the Scientology texts and the English translations of the texts of five other religions showed that the Scientology texts differed from the others in eighteen categories, while each of the other religions differed in between one and seventeen categories. In the CCK experiment, keywords typical of each religion were found, both by comparing the religious texts with one another and by comparing them with the Brown corpus of general English. The most typical keywords were looked up in a concordancer and manually coded with conceptual tags. The set of categories found for the Scientology texts showed little overlap with those found for the others. Our CA experiments produced fairly clear clusters of texts by religion. Scientology texts were seen at one pole on the first factor, with Christian and Islamic texts at the other. It appears that, in several ways, the Scientology texts are dissimilar to the texts of some of the world's major religions.
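      As a rough illustration of the Kruskal–Wallis comparison mentioned in the abstract, the sketch below computes the H statistic (with tie correction) for one category across several groups of texts. It is a minimal standard-library sketch, not the authors' procedure: the function names and the per-text frequencies in the usage note are hypothetical, and the abstract does not say which software or settings were used.

```python
from collections import Counter
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over tied value counts t.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction
```

      A call such as `kruskal_wallis_h([[1.2, 0.9, 1.5], [2.1, 2.4, 1.9], [0.4, 0.7, 0.5]])` would take one list of per-text frequencies (invented here) for each religion. Under the null hypothesis H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom, so a p-value would need a chi-squared routine that this sketch leaves out.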