Abstract
This paper discusses some of the methodological challenges of a first corpus-based analysis of so-called /h/ insertion in English, a feature that has been widely observed but not yet analysed empirically. We survey the existing literature and present what is known about historical variation and change. We then describe our data-driven approach to the Helsinki Corpus and the Corpus of Early English Correspondence and present first results on internal conditioning, restriction to word type, and overall frequency. We show that /h/ was inserted in native English words as well as French loanwords, and in nouns, adjectives, verbs, adverbs, and numerals. We also show that the same lexical items (able, am, it, and itself) occur with inserted /h/ in both corpora, which makes this a historically robust feature.
Keywords: /h/ insertion; corpus analysis; historical variation and change; Helsinki Corpus; Corpus of Early English Correspondence