dc.contributor.authorAggoun, Amar
dc.contributor.authorAlmaadeed, Noor
dc.contributor.authorAmira, Abbes
dc.date.accessioned2017-11-28T16:51:36Z
dc.date.available2017-11-28T16:51:36Z
dc.date.issued2015-01-19
dc.identifier.citationAlmaadeed, N., Aggoun, A., and Amira, A. (2015) Speaker identification using multimodal neural networks and wavelet analysis, IET Biometrics, 4 (1), pp. 18-28
dc.identifier.issn2047-4938
dc.identifier.issn2047-4946
dc.identifier.doi10.1049/iet-bmt.2014.0011
dc.identifier.urihttp://hdl.handle.net/2436/620913
dc.description© 2014 The Authors. Published by IET. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1049/iet-bmt.2014.0011
dc.description.abstractThe rapid pace of technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. Wavelet analysis comprises discrete wavelet transform, wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regressive, probabilistic and radial basis function neural networks, forming decisions through a majority voting scheme. The system was found to be competitive, and it improved the identification rate by 15% compared with the classical MFCC. In addition, it reduced the identification time by 40% compared with the back-propagation neural network, Gaussian mixture model and principal component analysis. Performance tests conducted using the GRID database corpora have shown that this approach has a faster identification time and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems.
dc.language.isoen
dc.publisherIET
dc.relation.urlhttp://digital-library.theiet.org/content/journals/10.1049/iet-bmt.2014.0011
dc.subjectprincipal component analysis
dc.subjectdiscrete wavelet transforms
dc.subjecttext analysis
dc.subjectcepstral analysis
dc.subjectbiometrics (access control)
dc.subjectbackpropagation
dc.subjectGaussian processes
dc.subjectradial basis function networks
dc.subjectaudio databases
dc.subjectmixture models
dc.subjectspeaker recognition
dc.titleSpeaker identification using multimodal neural networks and wavelet analysis
dc.typeJournal article
dc.identifier.journalIET Biometrics
dc.date.accepted2014-09-11
dc.source.volume4
dc.source.issue1
dc.source.beginpage18
dc.source.endpage28
refterms.dateFOA2020-05-13T10:43:09Z


Files in this item

Name: Almaadeed_et_al_Speaker_identi ...
Size: 637.5Kb
Format: PDF
