Ph.D. candidate,
The Chinese University of Hong Kong
Email: meng [at] link.cuhk.edu.hk
I am a Ph.D. candidate at the Human-Computer Communications Laboratory (HCCL), The Chinese University of Hong Kong (CUHK), supervised by Prof. Helen Meng. Previously, I received my M.Phil. degree from the Institute of Automation, Chinese Academy of Sciences (CASIA), where I was supervised by Prof. Jie Tian, and my Bachelor's degree from Harbin Institute of Technology (HIT). I was also a research intern at Microsoft Research Asia.
My research interests focus on language modeling for speech synthesis and the integration of speech with large language models. I am also working on speech processing and recognition.
🟢 I am currently on the job market. Please feel free to email me!
[Apr 2025]
Our [paper] is the only winner of the 2025 IEEE Ganesh N. Ramaswamy Memorial Student Grant
[Feb 2025]
Two papers have been submitted to ACL Rolling Review. Four papers have been submitted to ISCA INTERSPEECH 2025
[Jan 2025]
We present [ARLON] (ICLR 2025), boosting diffusion transformers with autoregressive models for long video generation
[Dec 2024]
Two papers have been accepted to IEEE ICASSP 2025, including one first-authored [paper]
[Dec 2024]
I participated in writing the excellent [survey] on Next Token Prediction Towards Multimodal Intelligence
[Jul 2024]
We present [MELLE], a concise, continuous-value token-based language model for TTS, forgoing cumbersome NAR steps
[Jun 2024]
We propose [WavLLM], a robust and adaptive Speech LLM achieving SOTA performance on various speech-related tasks
[Jun 2024]
Three papers have been accepted to ISCA INTERSPEECH 2024, including one first-authored [paper]
[Jan 2024]
Two papers have been accepted to IEEE ICASSP 2024
Since 2021, Ph.D. student, The Chinese University of Hong Kong (CUHK)
2018 - 2021, M.Phil., Pattern Recognition and Intelligent Systems, Institute of Automation, Chinese Academy of Sciences (CASIA)
2014 - 2018, B.Eng., Electrical Engineering, Harbin Institute of Technology (HIT)
MELLE: Autoregressive Speech Synthesis without Vector Quantization | [demo]
Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei
submitted, 2025
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | [code]
Lingwei Meng, Shujie Hu, Jiawen Kang, Zhaoqing Li, Yuejiao Wang, Wenxuan Wu, Xixin Wu, Xunying Liu, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System | [code]
Lingwei Meng, Jiawen Kang, Yuejiao Wang, Zengrui Jin, Xixin Wu, Xunying Liu, Helen Meng
ISCA INTERSPEECH, 2024
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One | [slides] [video]
Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator
Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng
ISCA INTERSPEECH, 2023
FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching
Hui Wang, Shujie Liu, Lingwei Meng, Jinyu Li, Yifan Yang, Shiwan Zhao, Haiyang Sun, Yanqing Liu, Haoqin Sun, Jiaming Zhou, Yan Lu, Yong Qin
arXiv, 2025
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang, Lingwei Meng, Mingyu Cui, Yuejiao Wang, Xixin Wu, Xunying Liu, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
(The only winner of the 2025 IEEE Ganesh N. Ramaswamy Memorial Student Grant)
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation | [demo]
Zongyi Li, Shujie Hu, Shujie Liu, Long Zhou, Jeongsoo Choi, Lingwei Meng, Xun Guo, Jinyu Li, Hefei Ling, Furu Wei
The Thirteenth International Conference on Learning Representations (ICLR), 2025
WavLLM: Towards Robust and Adaptive Speech Large Language Model | [code]
Shujie Hu, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Hongkun Hao, Jing Pan, Xunying Liu, Jinyu Li, Sunit Sivasankaran, Linquan Liu, Furu Wei
EMNLP Findings, 2024
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment | [demo]
Bing Han, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Yanming Qian, Yanqing Liu, Sheng Zhao, Jinyu Li, Furu Wei
arXiv, 2024
Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder
Yuejiao Wang, Xianmin Gong, Lingwei Meng, Xixin Wu, Helen Meng
ISCA INTERSPEECH, 2024
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition | [code]
Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization
Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Spoofing-Aware Speaker Verification by Multi-Level Fusion
Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng
ISCA INTERSPEECH, 2022
Exploring Linguistic Feature and Model Combination for Speech Recognition Based Automatic AD Detection
Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng
ISCA INTERSPEECH, 2022
The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
2D and 3D CT Radiomic Features Performance Comparison in Characterization of Gastric Cancer: a Multi-Center Study
Lingwei Meng, Di Dong, Xin Chen, Mengjie Fang, Rongpin Wang, Jing Li, Zaiyi Liu, Jie Tian
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2021 (Impact Factor: 6.7)
(ESI Highly Cited Paper)
A Deep Learning Prognosis Model Help Alert for COVID-19 Patients at High-Risk of Death: a Multi-Center Study
Lingwei Meng, Di Dong, Liang Li, Meng Niu, Yan Bai, Meiyun Wang, Xiaoming Qiu, Yunfei Zha, Jie Tian
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2020 (Impact Factor: 6.7)
Noninvasive Model for Predicting Future Ischemic Strokes in Patients with Silent Lacunar Infarction Using Radiomics
Lingwei Meng#, Jiehua Su#, Di Dong#, Wenyan Zhuo, Jianming Wang, Libin Liu, Yi Qin, Ye Tian, Jie Tian, Zhaohui Li
BMC Medical Imaging, 2020 (# denotes co-first authorship) (Impact Factor: 2.9)
Serving as a reviewer of
SEEM 3440, Operations Research II
AIST 3510 / SEEM 3510, Human-Computer Interaction
Some awards in control algorithm and circuit design competitions: