Interspeech 2025 Keynotes
Prof. Roger K. Moore
ISCA Medalist 2025, Chair of Spoken Language Processing, Head of Speech & Hearing Research Group (SpandH), Vocal Interactivity Lab (VILab), Sheffield Robotics
School of Computer Science, University of Sheffield
Title: From Talking and Listening Devices to Intelligent Communicative Machines (ISCA Medal for Scientific Achievement Keynote)
Abstract: Having been 'in the business' of speech technology for over 50 years, I've had the pleasure of witnessing (and being involved first-hand in) many of the astounding developments that have led to the incredible solutions we have today. Indeed, my involvement in the field of spoken language has been something of a love affair, and it's been a huge honour and privilege to have been working with so many excellent researchers on "the most sophisticated behaviour of the most complex organism in the known universe"! Although I've always been heavily committed to the establishment of machine learning approaches to spoken language processing - including publishing one of the first papers on the application of artificial neural networks to automatic speech recognition - my approach has always been one of attempting to uncover the underlying mechanisms of 'intelligent' (speech-based) interaction, on the basis that living systems are remarkably data-efficient in their learning. This talk will both look back (rather a long way) and look forward, asking how we got here and where we are going. I hope that some of my insights may inspire others to follow a similar path.
Biography: Prof. Moore (http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/) has over 50 years’ experience in Speech Technology R&D and, although an engineer by training, much of his research has been based on insights from human speech perception and production. He studied Computer & Communications Engineering at the University of Essex and was awarded the B.A. (Hons.) degree in 1973. He subsequently received the M.Sc.(Res.) and Ph.D. degrees from the same university in 1975 and 1977 respectively, both theses being on the topic of automatic speech recognition. After a period of post-doctoral research in the Phonetics Department at University College London, Prof. Moore was recruited in 1980 to establish a speech recognition research team at the Royal Signals and Radar Establishment (RSRE) in Malvern. As Head of the UK Government's Speech Research Unit from 1985 to 1999, he was responsible for the development of the Aurix range of speech technology products and the subsequent formation of 20/20 Speech Ltd. Since 2004 he has been Professor of Spoken Language Processing at the University of Sheffield, and also holds Visiting Chairs at Bristol Robotics Laboratory and University College London Psychology & Language Sciences. Since joining Sheffield, his research has focused on understanding the fundamental principles of speech-based interaction, and in 2017 he initiated the first in the series of international workshops on ‘Vocal Interactivity in-and-between Humans, Animals and Robots' (VIHAR).
As President of both the European Speech Communication Association (ESCA) and the Permanent Council of the International Conference on Spoken Language Processing (PC-ICSLP) from 1997, Prof. Moore pioneered their integration to form the International Speech Communication Association (ISCA). He was subsequently General Chair for INTERSPEECH-2009 and ISCA Distinguished Lecturer during 2014-15. He has received several awards, including the UK Institute of Acoustics Tyndall Medal for "distinguished work in the field of speech research and technology", the NATO RTO Scientific Achievement Award for "repeated contribution in scientific and technological cooperation", the LREC Antonio Zampolli Prize for "Outstanding Contributions to the Advancement of Language Resources & Language Technology Evaluation within Human Language Technologies", and the ISCA Special Service Medal for "Service in the establishment, leadership and international growth of ISCA". Prof. Moore is the current Editor-in-Chief of Computer Speech & Language, and Associate Editor for Speech Communication, Languages, the Journal of Future Robot Life, and Frontiers in Robotics and AI (Computational Intelligence in Robotics).
Prof. Dr. Alex Waibel
Director of InterACT, Carnegie Mellon University & Interactive Systems Labs, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology (KIT)
Title: TO BE ANNOUNCED
Dr. Judith Holler
Donders Centre for Brain, Cognition and Behaviour, Radboud University & Max Planck Institute for Psycholinguistics
Title: Using and comprehending language in face-to-face conversation
Abstract: Face-to-face conversational interaction is at the very heart of human sociality and the natural ecological niche in which language has evolved and is acquired. Yet, we still know rather little about how utterances are produced and comprehended in this environment. In this talk, I will focus on how hand gestures, facial and head movements are organised to convey semantic and pragmatic meaning in conversation, as well as on how the presence and timing of these signals impact utterance comprehension and responding. Specifically, I will present studies based on complementary approaches, which feed into and inform one another. These include qualitative and quantitative multimodal corpus studies showing that visual signals indeed often occur early, and experimental comprehension studies, based on and inspired by the corpus results, that implement controlled manipulations to test for causal effects of visual bodily signals on comprehension processes and mechanisms. These experiments include behavioural and EEG studies, most of them using multimodally animated virtual characters. Together, the findings provide evidence for the hypothesis that visual bodily signals form an integral part of semantic and pragmatic meaning communication in conversational interaction, and that they facilitate language processing, especially due to their timing and the predictive potential they gain through their temporal orchestration.
Biography: Judith Holler is Associate Professor at the Donders Institute for Brain, Cognition & Behaviour, Radboud University, where she leads the research group Communication in Social Interaction, and a senior investigator at the Max Planck Institute for Psycholinguistics. Her research program investigates human language in the very environment in which it has evolved, is acquired, and is used most: face-to-face interaction. Within this context, Judith focuses on the semantics and pragmatics of human communication from a multimodal perspective, considering spoken language within the rich visual infrastructure that embeds it, such as manual gestures, head movements, facial signals, and gaze. She uses a combination of methods from different fields to investigate human multimodal communication, including quantitative conversational corpus analyses, in-situ eye-tracking, and behavioural and neurocognitive experimentation using multimodal language stimuli involving virtual animations. Her research has been supported by a range of prestigious research grants from funders including the European Research Council (EU), the Dutch Science Foundation (NWO), Marie Curie Fellowships (EU), the Economic & Social Research Council (UK), Parkinson's UK, the Leverhulme Trust (UK), the British Academy (UK), the Volkswagen Stiftung (Germany) and the German Science Foundation (DFG, Mercator Fellowships).
Prof. Carol Espy-Wilson
Electrical & Computer Engineering Department, Institute for Systems Research, University of Maryland College Park
Title: Speech Kinematic Analysis from Acoustics: Scientific, Clinical and Practical Applications
Abstract: Much of my research has involved studying how small changes in the spatiotemporal coordination of speech articulators affect variability in the acoustic characteristics of the speech signal. This interest in speech variability ultimately led me to develop a speech inversion (SI) system that recovers articulatory movements of the lips, tongue tip, and tongue body from the speech signal. Recently, we were able to extend the SI system to provide information about the velopharyngeal port opening (nasality) and will soon investigate a methodology to uncover information about the tongue root and the size of the glottal opening. Our SI system has proven to be speaker independent and generalizes well across acoustic databases. In this talk, I will explain how we developed the SI system, and ways in which we have used it to date: for clinical purposes in mental health and speech disorder assessment, in scientific analysis of cross-linguistic speech patterns, and for improving automatic speech recognition.
Biography: Carol Espy-Wilson is a full professor in the Electrical and Computer Engineering Department and the Institute for Systems Research at the University of Maryland, College Park. She received her BS in electrical engineering from Stanford University and her MS, EE and PhD degrees in electrical engineering from the Massachusetts Institute of Technology. Dr. Espy-Wilson is a Fellow of the Acoustical Society of America (ASA), the International Speech Communication Association (ISCA) and the IEEE. She was recently elected Vice President-Elect of the ASA and to the ISCA Advisory Board. She currently serves on the Editorial Board of Computer Speech and Language. She has been Chair of the Speech Communication Technical Committee of the ASA, an elected member of the Speech and Language Technical Committee of the IEEE, and an Associate Editor of the Journal of the Acoustical Society of America. Finally, at the National Institutes of Health, she has served on the Advisory Councils of the National Institute on Deafness and Other Communication Disorders and the National Institute of Biomedical Imaging and Bioengineering, on the Medical Rehabilitation Advisory Board of the National Institute of Child Health and Human Development, and as a member of the Language and Communication Study Section.
Carol directs the Speech Communication Lab, where her team combines digital signal processing, speech science, linguistics and machine learning to conduct research in speech communication. Current research projects include speech inversion; mental health assessment based on speech, video and text; speech recognition for elementary school classrooms; entrainment based on articulatory and facial gestures in unstructured conversations between neurotypical and neurodiverse participants; and speech enhancement. Her laboratory has received federal funding (NSF, NIH and DoD) and industry grants, and she holds 13 patents.
Interspeech 2025
PCO: TU Delft Events
Delft University of Technology
Communication Department
Prometheusplein 1
2628 ZC Delft
The Netherlands
Email: pco@interspeech2025.org
X (formerly Twitter): @ISCAInterspeech
Bluesky: @interspeech.bsky.social
Interspeech 2025 operates under the privacy policy of TU Delft.