Li Sheng 李勝

Researcher Number	70840940
Other IDs	https://orcid.org/0000-0001-7636-3797
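Affiliation (Current)	2025: Institute of Science Tokyo, School of Engineering, Assistant Professor
Affiliation (based on the past Project Information)	2021 – 2024: National Institute of Information and Communications Technology, Universal Communication Research Institute, Advanced Speech Translation Research and Development Promotion Center, Researcher / 2020: National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher / 2019: National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher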
Review Section/Research Field	Principal Investigator: Basic Section 61030: Intelligent informatics-related / Basic Section 61050: Intelligent robotics-related / 1002: Human informatics, applied informatics and related fields ; Except Principal Investigator: Sections That Are Subject to Joint Review: Basic Section 60030: Statistical science-related, Basic Section 61030: Intelligent informatics-related / Basic Section 60030: Statistical science-related / Basic Section 61030: Intelligent informatics-related
Keywords	Principal Investigator: speech recognition / quality estimation / federated learning / Low-resource / Multilingual / Multimodal / Multitask / Deep neural network / Adversarial attack / Dialogue robotic system / Speech recognition / speech enhancement / adversarial attacks / spoken dialogue system / privacy preserving / security / spoken dialogue / deepfake detection / adversarial attack / speaker diarization / end-to-end / code-switched / disordered speech / language identification / multi-unit modeling / speech translation / low-resourced modeling / multilingual modeling / End-to-End / articulation / multilingual ; Except Principal Investigator: meta-intervention / speech translation / emotional speech recognition / multilingual dialogue / intent understanding / speech dialogue translation

Creation of Foundational Technologies for Speech Dialogue Translation That Accurately Conveys Intent
- Principal Investigator
  
  チョシンキ
- Project Period (FY)
  2023 – 2026
- Research Category
  
  Grant-in-Aid for Scientific Research (B)
- Review Section
  
  Basic Section 61030:Intelligent informatics-related
  Basic Section 60030:Statistical science-related
  Sections That Are Subject to Joint Review: Basic Section60030:Statistical science-related , Basic Section61030:Intelligent informatics-related
- Research Institution
  Kyoto University
M3OLR: Towards Effective Multilingual, Multimodal and Multitask Oriental Low-resourced Language Speech Recognition (Principal Investigator)
- Principal Investigator
  
  李勝
- Project Period (FY)
  2023 – 2025
- Research Category
  
  Grant-in-Aid for Scientific Research (C)
- Review Section
  
  Basic Section 61030:Intelligent informatics-related
- Research Institution
  National Institute of Information and Communications Technology
Phantom in the Opera: the Vulnerabilities of Speech Interface for Robotic Dialogue System (Principal Investigator)
- Principal Investigator
  
  Li Sheng
- Project Period (FY)
  2021 – 2022
- Research Category
  
  Grant-in-Aid for Early-Career Scientists
- Review Section
  
  Basic Section 61050:Intelligent robotics-related
- Research Institution
  National Institute of Information and Communications Technology
Next generation multilingual End-to-End speech recognition (from G30 to G200) (Principal Investigator)
- Principal Investigator
  
  Li Sheng
- Project Period (FY)
  2019 – 2020
- Research Category
  
  Grant-in-Aid for Research Activity Start-up
- Review Section
  
  1002:Human informatics, applied informatics and related fields
- Research Institution
  National Institute of Information and Communications Technology


[Book] Phantom in the Opera: The Vulnerabilities of Speech-based Artificial Intelligence Systems (2022)
- Author(s)
  Sheng Li
- Total Pages
  110
- Publisher
  NICT
- ISBN
  9784904020265
- Data Source
  KAKENHI-PROJECT-21K17837
[Book] Voices of the Himalayas: Investigation of Speech Recognition Technology for the Tibetan Language (2022)
- Author(s)
  Sheng Li
- Total Pages
  112
- Publisher
  NICT
- ISBN
  9784904020289
- Data Source
  KAKENHI-PROJECT-21K17837
[Book] Automatic speech recognition (2020)
- Author(s)
  X. Lu, S. Li, M. Fujimoto
- Total Pages
  18
- Publisher
  Springer Singapore
- ISBN
  9789811505959
- Data Source
  KAKENHI-PROJECT-19K24376
[Journal Article] Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network (2024)
- Author(s)
  Li Nan、Wang Longbiao、Ge Meng、Unoki Masashi、Li Sheng、Dang Jianwu
- Journal Title
  
  Speech Communication
  
  Volume: 157 Pages: 103024-103024
- DOI
  10.1016/j.specom.2023.103024
- Peer Reviewed / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Journal Article] Phantom in the opera: adversarial music attack for robot dialogue system (2024)
- Author(s)
  Li Sheng、Li Jiyi、Cao Yang
- Journal Title
  
  Frontiers in Computer Science, 15 February 2024
  
  Volume: 6 Pages: 1-9
- DOI
  10.3389/fcomp.2024.1355975
- Peer Reviewed / Open Access
- Data Source
  KAKENHI-PROJECT-23K11227
[Journal Article] Disordered speech recognition considering low resources and abnormal articulation (2023)
- Author(s)
  Lin Yuqin、Dang Jianwu、Wang Longbiao、Li Sheng、Ding Chenchen
- Journal Title
  
  Speech Communication
  
  Volume: 155 Pages: 103002-103002
- DOI
  10.1016/j.specom.2023.103002
- Peer Reviewed / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Journal Article] Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings (2023)
- Author(s)
  Soky Kak、Li Sheng、Chu Chenhui、Kawahara Tatsuya
- Journal Title
  
  International Journal of Asian Language Processing
  
  Volume: 33 Issue: 04 Pages: 2350024-2350024
- DOI
  10.1142/s2717554523500248
- Peer Reviewed
- Data Source
  KAKENHI-PROJECT-23K11227, KAKENHI-PROJECT-23K28144
[Journal Article] Cross-Lingual Transfer Learning for End-to-End Speech Translation (2022)
- Author(s)
  Shimizu Shuichiro、Chu Chenhui、Li Sheng、Kurohashi Sadao
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 29 Issue: 2 Pages: 611-637
- DOI
  10.5715/jnlp.29.611
- ISSN
  1340-7619, 2185-8314
- Language
  English
- Peer Reviewed / Open Access
- Data Source
  KAKENHI-PROJECT-21K17837
[Journal Article] TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies (2022)
- Author(s)
  Soky Kak、Mimura Masato、Kawahara Tatsuya、Chu Chenhui、Li Sheng、Ding Chenchen、Sam Sethserey
- Journal Title
  
  International Journal of Asian Language Processing
  
  Volume: 31 Issue: 03n04 Pages: 1-21
- DOI
  10.1142/s2717554522500072
- Peer Reviewed / Open Access
- Data Source
  KAKENHI-PROJECT-20H00602, KAKENHI-PROJECT-21K17837, KAKENHI-PROJECT-21H05054
[Journal Article] Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling (2022)
- Author(s)
  Qin Siqing、Wang Longbiao、Li Sheng、Dang Jianwu、Pan Lixin
- Journal Title
  
  EURASIP Journal on Audio, Speech, and Music Processing
  
  Volume: 2022 Issue: 1 Pages: 1-10
- DOI
  10.1186/s13636-021-00233-4
- Peer Reviewed / Open Access / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-20K11883, KAKENHI-PROJECT-21K17837
[Journal Article] Adversarial Attack and Defense on Deep Neural Network-Based Voice Processing Systems: An Overview (2021)
- Author(s)
  Chen Xiaojiao、Li Sheng、Huang Hao
- Journal Title
  
  Applied Sciences
  
  Volume: 11 Issue: 18 Pages: 8450-8450
- DOI
  10.3390/app11188450
- Peer Reviewed / Open Access / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Journal Article] Knowledge Distillation-based Representation Learning for Short-Utterance Spoken Language Identification (2020)
- Author(s)
  P. Shen, X. Lu, S. Li, H. Kawai
- Journal Title
  
  IEEE/ACM Trans. Audio, Speech & Language Process.
  
  Volume: 28 Pages: 2674-2683
- DOI
  10.1109/taslp.2020.3023627
- Peer Reviewed
- Data Source
  KAKENHI-PROJECT-19K24376
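[Patent] Inference device and training method for the inference device (2020)
- Inventor(s)
  李勝、ルーシュガン、河井恒
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent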
- Industrial Property Number
  2020-059962
- Filing Date
  2020
- Data Source
  KAKENHI-PROJECT-19K24376
[Patent] Inference device, inference program, and training method (2019)
- Inventor(s)
  李勝、ルーシュガン、丁塵辰、河原達也、河井恒
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  2019-163555
- Filing Date
  2019
- Data Source
  KAKENHI-PROJECT-19K24376
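[Patent] Method and apparatus for training a language identification model, and computer program therefor (2019)
- Inventor(s)
  沈鵬、ルーシュガン、李勝、河井恒
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent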
- Industrial Property Number
  2019-086005
- Filing Date
  2019
- Acquisition Date
  2020
- Data Source
  KAKENHI-PROJECT-19K24376
[Patent] Inference device, training method, and training program (2019)
- Inventor(s)
  李勝、ルーシュガン、ダブレラジ、河井恒
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  2019-051008
- Filing Date
  2019
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Cross-lingual Mapping for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition (2024)
- Author(s)
  Zhengdong Yang, Qianying Liu, Sheng Li, Chenhui Chu, Fei Cheng, Sadao Kurohashi
- Organizer
  The 150th Meeting of the Acoustical Society of Japan (Autumn 2023)
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Investigating Effective Methods for Combining Large Language Model with Speech Recognition System (2024)
- Author(s)
  Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Hisashi Kawai
- Organizer
  The 151st Meeting of the Acoustical Society of Japan (Spring 2024)
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] Investigating effective methods for combining large language model with speech recognition system (2024)
- Author(s)
  李勝, 楊正東, 周汪勁, Chenhui Chu, 河井恒
- Organizer
  The 151st Meeting of the Acoustical Society of Japan (Spring 2024)
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Combining Large Language Model with Speech Recognition System in Low-resource Settings (2024)
- Author(s)
  李勝, 楊正東, 周汪勁, Chenhui Chu, Chen Chen, Chng Eng Siong, 河井恒
- Organizer
  The 30th Annual Meeting of the Association for Natural Language Processing
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction (2024)
- Author(s)
  Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara
- Organizer
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Combining Large Language Model with Speech Recognition System in Low-resource Settings (2024)
- Author(s)
  Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Chen Chen, Eng Siong Chng, Hisashi Kawai
- Organizer
  The 30th Annual Meeting of the Association for Natural Language Processing
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition (2023)
- Author(s)
  Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
- Organizer
  In Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System (2023)
- Author(s)
  Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He
- Organizer
  ACM Multimedia Asia
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Kyoto Speech-to-Speech Translation System for IWSLT 2023 (2023)
- Author(s)
  Zhengdong Yang, Shuichiro Shimizu, Zhou Wangjin, Sheng Li, Chenhui Chu
- Organizer
  In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023). pp.357-362
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] Dialogue State Tracking with Sparse Local Slot Attention (2023)
- Author(s)
  Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
- Organizer
  ACL 2023 Workshop on NLP for Conversational AI
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis (2023)
- Author(s)
  Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu
- Organizer
  ACM Multimedia Asia Workshops 2023
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Self-Supervised Learning MOS Prediction with Listener Enhancement (2023)
- Author(s)
  Sheng Li
- Organizer
  VoiceMOS mini workshop
- Invited / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Towards Speech Dialogue Translation Mediating Speakers of Different Languages (2023)
- Author(s)
  Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
- Organizer
  In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume. pp.1122-1134
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition (2023)
- Author(s)
  Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
- Organizer
  2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization (2023)
- Author(s)
  Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He
- Organizer
  ACM Multimedia Asia
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Video-Helpful Multimodal Machine Translation (2023)
- Author(s)
  Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li
- Organizer
  In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). pp.4281-4299
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] The Kyoto Speech-to-Speech Translation System for IWSLT 2023 (2023)
- Author(s)
  Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu
- Organizer
  International Conference on Spoken Language Translation (IWSLT) 2023
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention (2023)
- Author(s)
  Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
- Organizer
  In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Towards Speech Dialogue Translation Mediating Speakers of Different Languages (2023)
- Author(s)
  Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
- Organizer
  In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition (2023)
- Author(s)
  Chao Tan, Yang Cao, Sheng Li and Masatoshi Yoshikawa
- Organizer
  ICASSP
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-based Speech Recognition of Low-resource Language (2023)
- Author(s)
  Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
- Organizer
  In Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement (2023)
- Author(s)
  Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu
- Organizer
  IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimer's Speech Detection (2023)
- Author(s)
  Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li
- Organizer
  IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis (2023)
- Author(s)
  Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu
- Organizer
  In Proceedings of ACM Multimedia Asia Workshop of Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K28144
[Presentation] Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-based Speech Recognition of Low-resource Language (2023)
- Author(s)
  Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
- Organizer
  ICASSP
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition (2023)
- Author(s)
  Sheng Li, Jiyi Li
- Organizer
  Artificial Neural Networks and Machine Learning (ICANN) 2023
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-23K11227
[Presentation] Compressing Transformer-based ASR Model by Task-driven Loss and Attention-based Multi-level Feature Distillation (2022)
- Author(s)
  Y. Lv, L. Wang, M. Ge, S. Li, C. Ding, L. Pan, Y. Wang, J. Dang, K. Honda
- Organizer
  in Proc. IEEE-ICASSP, pp. 7992-7996, 2022.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism (2022)
- Author(s)
  Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara
- Organizer
  INTERSPEECH 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Mining Hard Samples Locally and Globally for Improved Speech Separation (2022)
- Author(s)
  K. Wang, Y. Peng, H. Huang, Y. Hu, and S. Li
- Organizer
  in Proc. IEEE-ICASSP, pp. 6037-6041, 2022.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] The System Description for VoiceMOS Challenge 2022 (KK team, main/ood tasks) (2022)
- Author(s)
  S. Li, R. Dabre, R. Raphael, W. Zhou, Z. Yang, C. Chu, Y. Zhao
- Organizer
  VoiceMOS Challenge 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network (2022)
- Author(s)
  Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki
- Organizer
  30th European Signal Processing Conference (EUSIPCO)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection (2022)
- Author(s)
  Longfei Yang, Wenqing Wei, Sheng Li, Jiyi Li, Takahiro Shinozaki
- Organizer
  INTERSPEECH 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Subband-based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches (2022)
- Author(s)
  Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
- Organizer
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model (2022)
- Author(s)
  Z. Gong, D. Saito, L. Yang, T. Shinozaki, S. Li, H. Kawai and N. Minematsu
- Organizer
  ISCA-Odyssey (The Speaker and Language Recognition Workshop)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection (2022)
- Author(s)
  S. Li, J. Li, Q. Liu and Z. Gong
- Organizer
  LREC (Language Resources and Evaluation Conference)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling (2022)
- Author(s)
  Zhuo Gong, Saito Daisuke, Sheng Li, Hisashi Kawai, Minematsu Nobuaki
- Organizer
  Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] NICT-Tib1: A Public Speech Corpus of Lhasa Dialect for Benchmarking Tibetan Language Speech Recognition Systems (2022)
- Author(s)
  Kak Soky, Zhuo Gong, Sheng Li
- Organizer
  25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Fusion of Self-supervised Learned Models for MOS Prediction (2022)
- Author(s)
  Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao
- Organizer
  INTERSPEECH 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Multi-Domain Dialogue State Tracking with Top-k Slot Self Attention (2022)
- Author(s)
  Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
- Organizer
  SIGdial Meeting Discourse & Dialogue 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction (2022)
- Author(s)
  Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
- Organizer
  INTERSPEECH 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection (2022)
- Author(s)
  Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki
- Organizer
  INTERSPEECH 2022
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] An investigation of using hybrid modeling units for improving End-to-End speech recognition systems (2021)
- Author(s)
  S. Chen, X. Hu, S. Li, X. Xu
- Organizer
  IEEE-ICASSP, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Adversarial Attack and Defense on Deep Neural Network-based Voice Processing Systems: An Overview (2021)
- Author(s)
  X. Chen, H. Huang, and S. Li
- Organizer
  National Conference on Man-Machine Speech Communication (NCMMSC), 2021. (report is selected to publish in Applied Sciences, Special Issues of Machine Speech Communication)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Robust voice activity detection using a masked auditory encoder based convolutional neural network (2021)
- Author(s)
  N. Li, L. Wang, M. Unoki, S. Li, R. Wang, M. Ge, J. Dang
- Organizer
  IEEE-ICASSP, 2021
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Speech Dereverberation Based on Scale-aware Mean Square Error Loss (2021)
- Author(s)
  L. Qiang, H. Shi, M. Ge, H. Yin, N. Li, L. Wang, S. Li and J. Dang
- Organizer
  International Conference on Neural Information Processing (ICONIP2021), pp 55-63, Springer, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] System description of Alzheimer's disease early detection (Silk-road team, short speech track) (2021)
- Author(s)
  W. Wei, R. Wong, S. Li, Y. Guo and H. Huang
- Organizer
  In special session of NCMMSC2021 (Alzheimer's disease detection challenge), 2021
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Spectrograms Fusion-based End-to-End Robust Automatic Speech Recognition (2021)
- Author(s)
  H. Shi, L. Wang, S. Li, C. Fan, J. Dang, and T. Kawahara
- Organizer
  In Proc. APSIPA ASC, pp. 438-442, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Encoder-Decoder based pitch tracking and joint model training for Mandarin tone classification (2021)
- Author(s)
  H. Huang, K. Wang, Y. Hu, S. Li
- Organizer
  IEEE-ICASSP, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS (2021)
- Author(s)
  D. Liu, L. Wang, S. Li, H. Li, C. Ding, J. Zhang and J. Dang
- Organizer
  International Conference on Neural Information Processing (ICONIP2021), pp 110-118, Springer, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] The RoyalFlush-NICT System Description for AP21-OLR Challenge (Silk-road team, full tasks) (2021)
- Author(s)
  D. Wang, S. Ye, X. Hu, S. Li
- Organizer
  OLR2021 (oriental language recognition challenge)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework (2021)
- Author(s)
  Y. Peng, J. Zhang, H. Zhang, H. Xu, H. Huang, S. Li, and E.S. Chng
- Organizer
  In Proc. APSIPA ASC, pp. 1043-1048, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora (2021)
- Author(s)
  K. Soky, S. Li, M. Mimura, C. Chu, and T. Kawahara
- Organizer
  In Proc. APSIPA ASC, pp. 433-437, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model (2021)
- Author(s)
  D. Wang, S. Ye, X. Hu, S. Li, and X. Xu
- Organizer
  in Proc. INTERSPEECH, pp. 3266-3270, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain (2021)
- Author(s)
  K. Wang, H. Huang, Y. Hu, Z. Huang, and S. Li
- Organizer
  in Proc. INTERSPEECH, pp. 3046-3050, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Simultaneous Progressive Filtering-based Monaural Speech Enhancement (2021)
- Author(s)
  H. Yin, L. Qiang, H. Shi, L. Wang, S. Li, M. Ge, G. Zhang and J. Dang
- Organizer
  International Conference on Neural Information Processing (ICONIP2021), pp 213-221, Springer, 2021.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-21K17837
[Presentation] Comparison of End-to-End Models for Joint Speaker and Speech Recognition (2021)
- Author(s)
  K. Soky, S. Li, M. Mimura, C. Chu, T. Kawahara
- Organizer
  IEICE-SP, 2021.
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Improvement of x-vector for short utterance spoken language identification (2020)
- Author(s)
  P. Shen, X. Lu, K. Sugiura, S. Li, H. Kawai
- Organizer
  Acoustical Society of Japan, spring, 2020.
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation (2020)
- Author(s)
  H. Shi, L. Wang, M. Ge, S. Li, and J. Dang
- Organizer
  IEEE-ICASSP
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data and mask embedding (2020)
- Author(s)
  S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda
- Organizer
  Interspeech 2020 Satellite Workshop (SLIMTS2020)
- Invited / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Multilingual transformer training for Khmer automatic speech recognition (2020)
- Author(s)
  K. Soky, S. Li, T. Kawahara, S. Seng
- Organizer
  Interspeech 2020 Satellite Workshop (SLIMTS2020)
- Invited / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] System Description for Voice Privacy Challenge (Kyoto Team) (2020)
- Author(s)
  Y. Han, S. Li, Y. Cao, M. Yoshikawa
- Organizer
  In special session of INTERSPEECH 2020 (VoicePrivacy challenge 2020).
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription (2020)
- Author(s)
  Y. Lin, L. Wang, S. Li, J. Dang, and C. Ding
- Organizer
  In Proc. INTERSPEECH, 2020 (Travel Granted by ISCA).
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy Preserving Speech Data Release (2020)
- Author(s)
  Y. Han, S. Li, Y. Cao, Q. Ma and M. Yoshikawa
- Organizer
  IEEE-ICME
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Compensation on x-vector for short utterance spoken language identification (2020)
- Author(s)
  P. Shen, X. Lu, K. Sugiura, S. Li and H. Kawai
- Organizer
  ISCA-Odyssey (The Speaker and Language Recognition Workshop)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] VOIS: The First Speech Therapy App in the World for Myanmar Hearing-Impaired Children (2020)
- Author(s)
  A. Thida, N. Han, S. Oo, S. Li and C. Ding
- Organizer
  In Proc. O-COCOSDA, 2020.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition (2020)
- Author(s)
  S. Li, C. Ding, X. Lu, P. Shen and H. Kawai
- Organizer
  Acoustical Society of Japan, spring, 2020.
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] End-to-End Speech Translation with Cross-lingual Transfer Learning (2020)
- Author(s)
  S. Shimizu, C. Chu, S. Li, S. Kurohashi
- Organizer
  NLP, 2021.
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data (2020)
- Author(s)
  S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda
- Organizer
  In Proc. ICONIP, 2020.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Singing Voice Extraction with Attention based Spectrograms Fusion (2020)
- Author(s)
  H. Shi, L. Wang, S. Li, C. Ding, M. Ge, N. Li, J. Dang, and H. Seki
- Organizer
  In Proc. INTERSPEECH, 2020 (Travel Granted by ISCA).
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Voice-Indistinguishability: Protecting Voiceprint with Differential Privacy under an Untrusted Server (2020)
- Author(s)
  Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa
- Organizer
  ACM conference on Computer and Communications Security (CCS), demo, 2020.
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Joint Training End-to-End Speech Recognition Systems with Speaker Attributes (2020)
- Author(s)
  S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
- Organizer
  ISCA-Odyssey (The Speaker and Language Recognition Workshop)
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] A Mixture of Character and Word End-to-End System for Keyword Spotting (2020)
- Author(s)
  H. Zhang, S. Ueno, M. Mimura, S. Li, W. Zhang, T. Kawahara
- Organizer
  Interspeech 2020 Satellite Workshop (SLIMTS2020)(full paper).
- Invited / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release (2020)
- Author(s)
  Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa
- Organizer
  Interspeech 2020 Satellite Workshop (SLIMTS2020) (invited report).
- Invited / Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Phantom in the Opera: Effective Adversarial Music Attack on Keyword Spotting Systems (2020)
- Author(s)
  H. Zhang, S. Li, X. Ma, Y. Zhao, Y. Cao, T. Kawahara
- Organizer
  IEEE-SLT, 2021
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] Joint Training End-to-End Systems for Speech and Speaker Recognition with Speaker Attributes (2020)
- Author(s)
  S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
- Organizer
  Acoustical Society of Japan, spring, 2020.
- Data Source
  KAKENHI-PROJECT-19K24376
[Presentation] End-To-End Articulatory Modeling for Dysarthria Articulatory Attribute Detection (2020)
- Author(s)
  Y. Lin, L. Wang, J. Dang, S. Li, and C. Ding
- Organizer
  IEEE-ICASSP
- Int'l Joint Research
- Data Source
  KAKENHI-PROJECT-19K24376

1. チョシンキ (70784891)

# of Collaborated Projects: 2 results

# of Collaborated Products: 18 results
2. 李吉屹 (30726667)

# of Collaborated Projects: 1 results

# of Collaborated Products: 20 results
3. TOU Takeshi

# of Collaborated Projects: 0 results

# of Collaborated Products: 1 results
4. 河原達也

# of Collaborated Projects: 0 results

# of Collaborated Products: 1 results
