Speaker identification using multimodal neural networks and wavelet analysis

Hdl Handle:
http://hdl.handle.net/2436/620913
Title:
Speaker identification using multimodal neural networks and wavelet analysis
Authors:
Aggoun, Amar; Almaadeed, Noor; Amira, Abbes
Abstract:
Rapid technological progress in recent years has led to a sharp rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the spoken content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. The wavelet analysis stage comprises the discrete wavelet transform, wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regression, probabilistic and radial basis function neural networks, which form decisions through a majority voting scheme. The system was found to be competitive: it improved the identification rate by 15% compared with the classical MFCC and reduced the identification time by 40% compared with the back-propagation neural network, Gaussian mixture model and principal component analysis. Performance tests conducted on the GRID corpus show that this approach offers faster identification and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems.
Citation:
Speaker identification using multimodal neural networks and wavelet analysis 2015, 4 (1):18 IET Biometrics
Publisher:
IET
Journal:
IET Biometrics
Issue Date:
1-Mar-2015
URI:
http://hdl.handle.net/2436/620913
DOI:
10.1049/iet-bmt.2014.0011
Additional Links:
http://digital-library.theiet.org/content/journals/10.1049/iet-bmt.2014.0011
Type:
Article
Language:
en
ISSN:
2047-4938; 2047-4946
Appears in Collections:
FSE
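
The abstract states that the three neural networks form decisions through a majority voting scheme but gives no further detail. The following is a minimal sketch of such a scheme, not the authors' implementation. The function name `majority_vote`, the speaker labels and the tie-breaking rule (fall back to one chosen classifier) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions, fallback_index=0):
    """Return the speaker label chosen by most classifiers.

    predictions: one predicted label per classifier.
    fallback_index: which classifier to trust when the top labels tie
    (an assumption; the record does not say how ties are resolved).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions).most_common()
    # A tie exists when the two most frequent labels have equal counts.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return predictions[fallback_index]
    return counts[0][0]


# Hypothetical outputs of the three networks for one utterance.
votes = ["speaker_07", "speaker_07", "speaker_12"]
print(majority_vote(votes))
```

With three voters, a tie can only occur when all three disagree, so the fallback rule decides only those cases.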

Full metadata record

DC Field | Value | Language
dc.contributor.author | Aggoun, Amar | en
dc.contributor.author | Almaadeed, Noor | en
dc.contributor.author | Amira, Abbes | en
dc.date.accessioned | 2017-11-28T16:51:36Z | -
dc.date.available | 2017-11-28T16:51:36Z | -
dc.date.issued | 2015-03-01 | -
dc.identifier.citation | Speaker identification using multimodal neural networks and wavelet analysis 2015, 4 (1):18 IET Biometrics | en
dc.identifier.issn | 2047-4938 | -
dc.identifier.issn | 2047-4946 | -
dc.identifier.doi | 10.1049/iet-bmt.2014.0011 | -
dc.identifier.uri | http://hdl.handle.net/2436/620913 | -
dc.description.abstract | Rapid technological progress in recent years has led to a sharp rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the spoken content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. The wavelet analysis stage comprises the discrete wavelet transform, wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regression, probabilistic and radial basis function neural networks, which form decisions through a majority voting scheme. The system was found to be competitive: it improved the identification rate by 15% compared with the classical MFCC and reduced the identification time by 40% compared with the back-propagation neural network, Gaussian mixture model and principal component analysis. Performance tests conducted on the GRID corpus show that this approach offers faster identification and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems. | en
dc.language.iso | en | en
dc.publisher | IET | en
dc.relation.url | http://digital-library.theiet.org/content/journals/10.1049/iet-bmt.2014.0011 | en
dc.rights | Archived with thanks to IET Biometrics | en
dc.subject | principal component analysis | en
dc.subject | discrete wavelet transforms | en
dc.subject | text analysis | en
dc.subject | cepstral analysis | en
dc.subject | biometrics (access control) | en
dc.subject | backpropagation | en
dc.subject | Gaussian processes | en
dc.subject | radial basis function networks | en
dc.subject | audio databases | en
dc.subject | mixture models | en
dc.subject | speaker recognition | en
dc.title | Speaker identification using multimodal neural networks and wavelet analysis | en
dc.type | Article | en
dc.identifier.journal | IET Biometrics | en
All Items in WIRE are protected by copyright, with all rights reserved, unless otherwise indicated.