Text-Independent Speaker Identification Using Vowel Formants
HDL Handle:
http://hdl.handle.net/2436/620725
Title:
Text-Independent Speaker Identification Using Vowel Formants
Authors:
Almaadeed, Noor; Aggoun, Amar; Amira, Abbes
Abstract:
Automatic speaker identification has become a challenging research problem due to its wide variety of applications. Neural networks and audio-visual identification systems can be very powerful, but they have limitations related to the number of speakers: performance drops gradually as more users are registered with the system. This paper proposes a scalable algorithm for real-time text-independent speaker identification based on vowel recognition. Vowel formants are unique across different speakers and reflect the vocal tract information of a particular speaker. The contribution of this paper is the design of a scalable system based on vowel formant filters and a scoring scheme for classification of an unseen instance. Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) are both analysed and compared as methods of extracting vowel formants from the windowed signal. All formants are filtered by known formant frequencies to separate the vowel formants for further processing. The formant frequencies of each speaker are collected during the training phase. A test signal is processed in the same way to find its vowel formants, which are compared with the saved vowel formants to identify the speaker of the current signal. A score-based scheme assigns the current signal to the speaker with the highest number of matching formants. This model requires less than 100 bytes of data to be saved for each speaker, and can identify the speaker within a second. Tests conducted on multiple databases show that this score-based scheme outperforms the back-propagation neural network and Gaussian mixture models. In general, the longer the speech files, the more significant the improvements in accuracy.
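The pipeline described in the abstract (LPC analysis of a windowed signal, extraction of formant frequencies, and a count-of-matching-formants score) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`vowel_formants`, `identify`), the LPC order, the 400 Hz bandwidth cutoff, and the 50 Hz matching tolerance are all assumed values chosen for the sketch.

```python
import numpy as np

def lpc(signal, order):
    """LPC coefficients via the autocorrelation method (Levinson-Durbin)."""
    n = len(signal)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(signal[:n - i], signal[i:]) for i in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this recursion step.
        k = -np.dot(a[:i], r[i:0:-1]) / err
        nxt = a.copy()
        for j in range(1, i + 1):
            nxt[j] = a[j] + k * a[i - j]
        a, err = nxt, err * (1.0 - k * k)
    return a

def vowel_formants(signal, fs, order=8, max_bw=400.0):
    """Estimate formants as the resonance frequencies (pole angles) of the LPC filter."""
    roots = np.roots(lpc(signal, order))
    roots = roots[np.imag(roots) > 0]            # one root per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)   # pole angle -> frequency in Hz
    bws = -np.log(np.abs(roots)) * fs / np.pi    # pole radius -> 3 dB bandwidth
    # Keep only sharp resonances in the speech range; discard broad/low poles.
    return sorted(f for f, b in zip(freqs, bws) if f > 90 and b < max_bw)

def identify(test_formants, enrolled, tol=50.0):
    """Score each enrolled speaker by counting test formants that fall within
    tol Hz of one of that speaker's stored formants; return the best scorer."""
    def score(stored):
        return sum(any(abs(f - g) <= tol for g in stored) for f in test_formants)
    return max(enrolled, key=lambda spk: score(enrolled[spk]))
```

The stored template per speaker is just a short list of formant frequencies, which is consistent with the abstract's claim of under 100 bytes per speaker; the scoring pass is a linear scan over enrolled speakers, which is what makes the scheme scale simply with the number of users.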
Citation:
Text-Independent Speaker Identification Using Vowel Formants. Journal of Signal Processing Systems, 2015, 82(3):345
Publisher:
Springer US
Journal:
Journal of Signal Processing Systems
Issue Date:
5-May-2015
URI:
http://hdl.handle.net/2436/620725
DOI:
10.1007/s11265-015-1005-5
Additional Links:
http://link.springer.com/10.1007/s11265-015-1005-5
Type:
Article
ISSN:
1939-8018; 1939-8115
Appears in Collections:
FSE

Full metadata record

DC Field | Value | Language
dc.contributor.author | Almaadeed, Noor | en
dc.contributor.author | Aggoun, Amar | en
dc.contributor.author | Amira, Abbes | en
dc.date.accessioned | 2017-10-04T08:48:03Z | -
dc.date.available | 2017-10-04T08:48:03Z | -
dc.date.issued | 2015-05-05 | -
dc.identifier.citation | Text-Independent Speaker Identification Using Vowel Formants. Journal of Signal Processing Systems, 2015, 82(3):345 | en
dc.identifier.issn | 1939-8018 | -
dc.identifier.issn | 1939-8115 | -
dc.identifier.doi | 10.1007/s11265-015-1005-5 | -
dc.identifier.uri | http://hdl.handle.net/2436/620725 | -
dc.description.abstract | Automatic speaker identification has become a challenging research problem due to its wide variety of applications. Neural networks and audio-visual identification systems can be very powerful, but they have limitations related to the number of speakers: performance drops gradually as more users are registered with the system. This paper proposes a scalable algorithm for real-time text-independent speaker identification based on vowel recognition. Vowel formants are unique across different speakers and reflect the vocal tract information of a particular speaker. The contribution of this paper is the design of a scalable system based on vowel formant filters and a scoring scheme for classification of an unseen instance. Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) are both analysed and compared as methods of extracting vowel formants from the windowed signal. All formants are filtered by known formant frequencies to separate the vowel formants for further processing. The formant frequencies of each speaker are collected during the training phase. A test signal is processed in the same way to find its vowel formants, which are compared with the saved vowel formants to identify the speaker of the current signal. A score-based scheme assigns the current signal to the speaker with the highest number of matching formants. This model requires less than 100 bytes of data to be saved for each speaker, and can identify the speaker within a second. Tests conducted on multiple databases show that this score-based scheme outperforms the back-propagation neural network and Gaussian mixture models. In general, the longer the speech files, the more significant the improvements in accuracy. | en
dc.publisher | Springer US | en
dc.relation.url | http://link.springer.com/10.1007/s11265-015-1005-5 | en
dc.rights | Archived with thanks to Journal of Signal Processing Systems | en
dc.subject | Vowel formants | en
dc.subject | Speaker identification | en
dc.subject | Vowel recognition | en
dc.subject | Linear predictive coding | en
dc.subject | Mel-frequency cepstral coefficients | en
dc.title | Text-Independent Speaker Identification Using Vowel Formants | -
dc.type | Article | en
dc.identifier.journal | Journal of Signal Processing Systems | en
All Items in WIRE are protected by copyright, with all rights reserved, unless otherwise indicated.