• Accent Conversion via Formant-based Spectral Mapping and Pitch Contour Modification

      Dyke, D.W., Berryman, F., Morgan, C.; Zheng, Dang Cong (University of Wolverhampton, 2011)
      Accent conversion aims to change a speaker’s accent to a desired accent while preserving the speaker’s voice identity. The technology has a number of useful applications. For example, integrating accent conversion into a text-to-speech (TTS) system can produce a voice with a desired accent instantly and inexpensively. In the film industry, an actor’s or actress’s accent can be changed by modifying the accent of the film recordings, without the performer having to train hard to learn a new accent. As a foreign language learning tool, it could allow learners to listen to their own voice with a native speaker’s accent and to mimic that accent, thereby enhancing the learning experience and improving learning progress. In this dissertation, a new approach to both accent analysis and conversion is proposed. In contrast to previous approaches in accent-related research, such as regional or foreign accent classification and identification, where the databases are formed from large groups of single-accent speakers, this study uses data from an individual who can speak in two accents. This removes the effects of inter-speaker variability and facilitates efficient identification and analysis of the acoustic features of different accents. Two British regional accents that sound distinctly different to the human listener were used in this study as typical examples. Vowel-based acoustic analysis was carried out to investigate the acoustic characteristics of the two accents and to identify the prominent features that most influence accent variability. Acoustic characteristics such as formant frequencies, fundamental frequency and its variation slope, speech intensity, and phone duration were used for the acoustic analysis. Accent conversion via formant modification and pitch contour manipulation was then investigated.
Three different formant-based spectral mapping algorithms were investigated: mean-variance linear conversion, Nth-order non-linear conversion, and piece-wise linear transformation based on a Gaussian mixture model. Furthermore, the project implemented accent conversion on a general speech analysis and synthesis system; the output speech synthesized by the three mapping algorithms was assessed by objective and subjective evaluation. The effects of spectral conversion and pitch contour conversion on accent conversion were also evaluated. The results showed that accent conversion can be achieved to some degree via formant modification and pitch contour manipulation.
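
The abstract does not give the formulation of any of the three mapping algorithms. As a rough illustration only, mean-variance linear conversion is commonly written as y = mu_t + (sigma_t / sigma_s) * (x - mu_s), which shifts and scales a source-accent formant so that its mean and spread match those of the target accent. The sketch below assumes that textbook form; the function names and the formant values are hypothetical and are not taken from the dissertation.

```python
from statistics import mean, pstdev


def fit_mean_variance(source_formants, target_formants):
    """Estimate the statistics used by the mapping.

    Both arguments are lists of formant frequencies in Hz measured for the
    same vowel in the source and target accents (hypothetical inputs).
    """
    mu_s, sigma_s = mean(source_formants), pstdev(source_formants)
    mu_t, sigma_t = mean(target_formants), pstdev(target_formants)
    if sigma_s == 0:
        raise ValueError("source formants have zero variance; cannot scale")
    return mu_s, sigma_s, mu_t, sigma_t


def convert_formant(x, params):
    """Map one source formant value x (Hz) towards the target accent."""
    mu_s, sigma_s, mu_t, sigma_t = params
    return mu_t + (sigma_t / sigma_s) * (x - mu_s)


# Illustrative, invented F1 values in Hz (not data from the study).
source_f1 = [500.0, 600.0, 700.0]
target_f1 = [550.0, 700.0, 850.0]

params = fit_mean_variance(source_f1, target_f1)
converted = [convert_formant(x, params) for x in source_f1]
print(converted)
```

With these invented values the converted source formants take on the target mean and standard deviation. A mapping of this kind is linear by construction, which is presumably why the study also considers non-linear and piece-wise linear alternatives.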