Interspeech 2025 - Best Student Papers

Winners

"On the Relationship between Accent Strength and Articulatory Features", Kevin Huang, Sean Foley, Jihwan Lee, Yoonjeong Lee, Dani Byrd, Shrikanth Narayanan

"OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning", Yifan Peng, Muhammad Shakeel, Yui Sudo, William Chen, Jinchuan Tian, Chyi-Jiunn Lin, Shinji Watanabe

"Attention Models and Auditory Transduction Features for Noise Robustness", Cathal Ó Faoláin and Andrew Hines

Nominees

Piotr Masztalski, Michał Romaniuk, Jakub Żak, Mateusz Matuszewski, Konrad Kowalczyk, "Clustering-based hard negative sampling for supervised contrastive speaker verification".

Junqi Yang, Yuhong Yang, Weiping Tu, Xin Zhao, Cedar Lin, "Band-SCNet: A Causal, Lightweight Model for High-Performance Real-Time Music Source Separation".

Seungu Han, Sungho Lee, Juheon Lee, Kyogu Lee, "Few-step Adversarial Schrödinger Bridge for Generative Speech Enhancement".

Chao Yi-Wen, Yizhou Peng, Dianwen Ng, Yukun Ma, Chongjia Ni, Eng Siong Chng, Bin Ma, "A-SMiLE: Affective Sparse Mixture-of-Experts Adapter with Multi-Task Learning for Spoken Dialogue Models".

Tianhua Qi, Shiyan Wang, Cheng Lu, Tengfei Song, Hao Yang, Zhanglin Wu, Wenming Zheng, "PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts".

Yifan Peng, Muhammad Shakeel, Yui Sudo, William Chen, Jinchuan Tian, Chyi-Jiunn Lin, Shinji Watanabe, "OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning".

Minji Ryu, Ji-Hyeon Hur, Sung Heuk Kim, Gahgene Gweon, "Pitch Contour Model (PCM) with Transformer Cross-Attention for Speech Emotion Recognition".

Byeong Hyeon Kim, Hyungseob Lim, Inseon Jang, Hong-Goo Kang, "Towards an Ultra-Low-Delay Neural Audio Coding with Computational Efficiency".

Filippo Villani, Wai-Yip Chan, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, "Analysis and Extension of a Near-End Listening Enhancement Method Based on Long-Term Fractile Noise Statistics".

Wang Dong, Jiqing Han, Tieran Zheng, Guibin Zheng, Yongjun He, "Dual Orthogonality Sub-center Loss for Enhanced Anomalous Sound Detection".

Cathal Ó Faoláin, Andrew Hines, "Attention Models and Auditory Transduction Features for Noise Robustness".

Kevin Huang, Sean Foley, Jihwan Lee, Yoonjeong Lee, Dani Byrd, Shrikanth Narayanan, "On the Relationship between Accent Strength and Articulatory Features".

Le Xuan Chan, Annika Heuser, "Relative cue weighting in multilingual stop voicing production".

Jingran Xie, Xiang Li, Hui Wang, Yue Yu, Yang Xiang, Xixin Wu, Zhiyong Wu, "Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving".

Chetan Sharma, Vaishnavi Chandwanshi, Shreya Shrikant Karkun, Aditya Anand Gupta, Prasanta Kumar Ghosh, "A real-time MRI study on asymmetry in velum dynamics during VCV production with nasal sounds".

Interspeech 2025

PCO: TU Delft Events

Delft University of Technology

Communication Department

Prometheusplein 1

2628 ZC Delft

The Netherlands

Email: pco@interspeech2025.org

X (formerly Twitter): @ISCAInterspeech

Bluesky: @interspeech.bsky.social
