Some acoustic effects of speaking style on utterances for automatic speaker verification

Published online by Cambridge University Press: 27 April 2009

Jana Dankovičová
Affiliation:
University College London, U.K. j.dankovicova@ucl.ac.uk
Francis Nolan
Affiliation:
University of Cambridge, U.K. fjnl@cus.cam.ac.uk

Extract

This paper reports the results of an experiment on the effects of six speaking styles on some of the acoustic properties of speech. The experiment was part of an exploration of within-speaker variation in connection with automatic speaker verification (ASV), pursuing the hypothesis that the elicitation of style variation in the training phase of an ASV system (‘structured training’) would enhance the performance of the system. Swedish-speaking subjects produced a digit sequence at varying speaking rates and loudness levels, and also with simulated denasality (pinched nose) and under cognitive stress. Duration of vowels and consonants, and formant frequencies of vowels, were measured. A number of consistent patterns of variation emerged for duration and vowel quality and are reported here. The discussion explores the relation between the patterns observed and the success, or in the case of speech under stress the failure, of structured training in reducing the error rates in ASV.

Type
Articles
Copyright
Copyright © Journal of the International Phonetic Association 1999

