
Li Sheng  李 勝

Researcher Number 70840940
Other IDs
  • ORCID: https://orcid.org/0000-0001-7636-3797
Affiliation (Current) 2025: Institute of Science Tokyo, School of Engineering, Assistant Professor
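Affiliation (based on the past Project Information) 2021 – 2024: National Institute of Information and Communications Technology, Universal Communication Research Institute, Advanced Speech Translation Research and Development Promotion Center, Researcher
2020: National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher
2019: National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher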
Review Section/Research Field
Principal Investigator
Basic Section 61030:Intelligent informatics-related / Basic Section 61050:Intelligent robotics-related / 1002:Human informatics, applied informatics and related fields
Except Principal Investigator
Sections That Are Subject to Joint Review: Basic Section 60030:Statistical science-related, Basic Section 61030:Intelligent informatics-related / Basic Section 60030:Statistical science-related / Basic Section 61030:Intelligent informatics-related
Keywords
Principal Investigator
speech recognition / quality estimation / federated learning / Low-resource / Multilingual / Multimodal / Multitask / Deep neural network / Adversarial attack / Dialogue robotic system / Speech recognition / speech enhancement / adversarial attacks / spoken dialogue system / privacy preserving / security / spoken dialogue / deepfake detection / adversarial attack / speaker diarization / end-to-end / code-switched / disordered speech / language identification / multi-unit modeling / speech translation / low-resourced modeling / multilingual modeling / End-to-End / articulation / multilingual
Except Principal Investigator
meta-intervention / speech translation / emotional speech recognition / multilingual dialogue / intent understanding / speech dialogue translation
  • Research Projects

    (4 results)
  • Research Products

    (92 results)
  • Co-Researchers

    (4 People)
  •  Creation of Fundamental Technologies for Speech Dialogue Translation That Accurately Conveys Intent

    • Principal Investigator
      Chu Chenhui (チョ シンキ)
    • Project Period (FY)
      2023 – 2026
    • Research Category
      Grant-in-Aid for Scientific Research (B)
    • Review Section
      Basic Section 61030:Intelligent informatics-related
      Basic Section 60030:Statistical science-related
      Sections That Are Subject to Joint Review: Basic Section 60030:Statistical science-related, Basic Section 61030:Intelligent informatics-related
    • Research Institution
      Kyoto University
  •  M3OLR: Towards Effective Multilingual, Multimodal and Multitask Oriental Low-resourced Language Speech Recognition

    • Principal Investigator
      Li Sheng
    • Project Period (FY)
      2023 – 2025
    • Research Category
      Grant-in-Aid for Scientific Research (C)
    • Review Section
      Basic Section 61030:Intelligent informatics-related
    • Research Institution
      National Institute of Information and Communications Technology
  •  Phantom in the Opera: the Vulnerabilities of Speech Interface for Robotic Dialogue System

    • Principal Investigator
      Li Sheng
    • Project Period (FY)
      2021 – 2022
    • Research Category
      Grant-in-Aid for Early-Career Scientists
    • Review Section
      Basic Section 61050:Intelligent robotics-related
    • Research Institution
      National Institute of Information and Communications Technology
  •  Next generation multilingual End-to-End speech recognition (from G30 to G200)

    • Principal Investigator
      Li Sheng
    • Project Period (FY)
      2019 – 2020
    • Research Category
      Grant-in-Aid for Research Activity Start-up
    • Review Section
      1002:Human informatics, applied informatics and related fields
    • Research Institution
      National Institute of Information and Communications Technology


  • [Book] Phantom in the Opera: The Vulnerabilities of Speech-based Artificial Intelligence Systems (2022)

    • Author(s)
      Sheng Li
    • Total Pages
      110
    • Publisher
      NICT
    • ISBN
      9784904020265
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Book] Voices of the Himalayas: Investigation of Speech Recognition Technology for the Tibetan Language (2022)

    • Author(s)
      Sheng Li
    • Total Pages
      112
    • Publisher
      NICT
    • ISBN
      9784904020289
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Book] Automatic speech recognition (2020)

    • Author(s)
      X. Lu, S. Li, M. Fujimoto
    • Total Pages
      18
    • Publisher
      Springer Singapore
    • ISBN
      9789811505959
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Journal Article] Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network (2024)

    • Author(s)
      Li Nan, Wang Longbiao, Ge Meng, Unoki Masashi, Li Sheng, Dang Jianwu
    • Journal Title

      Speech Communication

      Volume: 157 Pages: 103024-103024

    • DOI

      10.1016/j.specom.2023.103024

    • Peer Reviewed / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Journal Article] Phantom in the opera: adversarial music attack for robot dialogue system (2024)

    • Author(s)
      Li Sheng, Li Jiyi, Cao Yang
    • Journal Title

      Frontiers in Computer Science, 15 February 2024

      Volume: 6 Pages: 1-9

    • DOI

      10.3389/fcomp.2024.1355975

    • Peer Reviewed / Open Access
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Journal Article] Disordered speech recognition considering low resources and abnormal articulation (2023)

    • Author(s)
      Lin Yuqin, Dang Jianwu, Wang Longbiao, Li Sheng, Ding Chenchen
    • Journal Title

      Speech Communication

      Volume: 155 Pages: 103002-103002

    • DOI

      10.1016/j.specom.2023.103002

    • Peer Reviewed / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Journal Article] Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings (2023)

    • Author(s)
      Soky Kak, Li Sheng, Chu Chenhui, Kawahara Tatsuya
    • Journal Title

      International Journal of Asian Language Processing

      Volume: 33 Issue: 04 Pages: 2350024-2350024

    • DOI

      10.1142/s2717554523500248

    • Peer Reviewed
    • Data Source
      KAKENHI-PROJECT-23K11227, KAKENHI-PROJECT-23K28144
  • [Journal Article] Cross-Lingual Transfer Learning for End-to-End Speech Translation (2022)

    • Author(s)
      Shimizu Shuichiro, Chu Chenhui, Li Sheng, Kurohashi Sadao
    • Journal Title

      Journal of Natural Language Processing

      Volume: 29 Issue: 2 Pages: 611-637

    • DOI

      10.5715/jnlp.29.611

    • ISSN
      1340-7619, 2185-8314
    • Language
      English
    • Peer Reviewed / Open Access
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Journal Article] TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies (2022)

    • Author(s)
      Soky Kak, Mimura Masato, Kawahara Tatsuya, Chu Chenhui, Li Sheng, Ding Chenchen, Sam Sethserey
    • Journal Title

      International Journal of Asian Language Processing

      Volume: 31 Issue: 03n04 Pages: 1-21

    • DOI

      10.1142/s2717554522500072

    • Peer Reviewed / Open Access
    • Data Source
      KAKENHI-PROJECT-20H00602, KAKENHI-PROJECT-21K17837, KAKENHI-PROJECT-21H05054
  • [Journal Article] Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling (2022)

    • Author(s)
      Qin Siqing, Wang Longbiao, Li Sheng, Dang Jianwu, Pan Lixin
    • Journal Title

      EURASIP Journal on Audio, Speech, and Music Processing

      Volume: 2022 Issue: 1 Pages: 1-10

    • DOI

      10.1186/s13636-021-00233-4

    • Peer Reviewed / Open Access / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-20K11883, KAKENHI-PROJECT-21K17837
  • [Journal Article] Adversarial Attack and Defense on Deep Neural Network-Based Voice Processing Systems: An Overview (2021)

    • Author(s)
      Chen Xiaojiao, Li Sheng, Huang Hao
    • Journal Title

      Applied Sciences

      Volume: 11 Issue: 18 Pages: 8450-8450

    • DOI

      10.3390/app11188450

    • Peer Reviewed / Open Access / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Journal Article] Knowledge Distillation-based Representation Learning for Short-Utterance Spoken Language Identification (2020)

    • Author(s)
      P. Shen, X. Lu, S. Li, H. Kawai
    • Journal Title

      IEEE/ACM Trans. Audio, Speech & Language Process.

      Volume: 28 Pages: 2674-2683

    • DOI

      10.1109/taslp.2020.3023627

    • Peer Reviewed
    • Data Source
      KAKENHI-PROJECT-19K24376
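  • [Patent] Inference Device and Training Method for Inference Device (2020)

    • Inventor(s)
      Sheng Li, Xugang Lu, Hisashi Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent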
    • Industrial Property Number
      2020-059962
    • Filing Date
      2020
    • Data Source
      KAKENHI-PROJECT-19K24376
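  • [Patent] Inference Device, Inference Program, and Training Method (2019)

    • Inventor(s)
      Sheng Li, Xugang Lu, Chenchen Ding, Tatsuya Kawahara, Hisashi Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent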
    • Industrial Property Number
      2019-163555
    • Filing Date
      2019
    • Data Source
      KAKENHI-PROJECT-19K24376
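  • [Patent] Method and Apparatus for Training a Language Identification Model, and Computer Program Therefor (2019)

    • Inventor(s)
      Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent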
    • Industrial Property Number
      2019-086005
    • Filing Date
      2019
    • Acquisition Date
      2020
    • Data Source
      KAKENHI-PROJECT-19K24376
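  • [Patent] Inference Device, Training Method, and Training Program (2019)

    • Inventor(s)
      Sheng Li, Xugang Lu, Raj Dabre, Hisashi Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent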
    • Industrial Property Number
      2019-051008
    • Filing Date
      2019
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Cross-lingual Mapping for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition (2024)

    • Author(s)
      Zhengdong Yang, Qianying Liu, Sheng Li, Chenhui Chu, Fei Cheng, Sadao Kurohashi
    • Organizer
      150th Meeting of the Acoustical Society of Japan (Autumn 2023)
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Investigating Effective Methods for Combining Large Language Model with Speech Recognition System (2024)

    • Author(s)
      Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Hisashi Kawai
    • Organizer
      151st Meeting of the Acoustical Society of Japan (Spring 2024)
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] Investigating effective methods for combining large language model with speech recognition system (2024)

    • Author(s)
      Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Hisashi Kawai
    • Organizer
      151st Meeting of the Acoustical Society of Japan (Spring 2024)
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Combining Large Language Model with Speech Recognition System in Low-resource Settings (2024)

    • Author(s)
      Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Chen Chen, Chng Eng Siong, Hisashi Kawai
    • Organizer
      The 30th Annual Meeting of the Association for Natural Language Processing
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction (2024)

    • Author(s)
      Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara
    • Organizer
      IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Combining Large Language Model with Speech Recognition System in Low-resource Settings (2024)

    • Author(s)
      Sheng Li, Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Chen Chen, Eng Siong Chng, Hisashi Kawai
    • Organizer
      The 30th Annual Meeting of the Association for Natural Language Processing
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition (2023)

    • Author(s)
      Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
    • Organizer
      In Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System (2023)

    • Author(s)
      Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He
    • Organizer
      ACM Multimedia Asia
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Kyoto Speech-to-Speech Translation System for IWSLT 2023 (2023)

    • Author(s)
      Zhengdong Yang, Shuichiro Shimizu, Zhou Wangjin, Sheng Li, Chenhui Chu
    • Organizer
      In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023). pp.357-362
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] Dialogue State Tracking with Sparse Local Slot Attention (2023)

    • Author(s)
      Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
    • Organizer
      ACL 2023 Workshop on NLP for Conversational AI
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis (2023)

    • Author(s)
      Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu
    • Organizer
      ACM Multimedia Asia Workshops 2023
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Self-Supervised Learning MOS Prediction with Listener Enhancement (2023)

    • Author(s)
      Sheng Li
    • Organizer
      VoiceMOS mini workshop
    • Invited / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Towards Speech Dialogue Translation Mediating Speakers of Different Languages (2023)

    • Author(s)
      Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
    • Organizer
      In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume. pp.1122-1134
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition (2023)

    • Author(s)
      Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
    • Organizer
      2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization (2023)

    • Author(s)
      Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He
    • Organizer
      ACM Multimedia Asia
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Video-Helpful Multimodal Machine Translation (2023)

    • Author(s)
      Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li
    • Organizer
      In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). pp.4281-4299
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] The Kyoto Speech-to-Speech Translation System for IWSLT 2023 (2023)

    • Author(s)
      Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu
    • Organizer
      International Conference on Spoken Language Translation (IWSLT) 2023
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention (2023)

    • Author(s)
      Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
    • Organizer
      In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Towards Speech Dialogue Translation Mediating Speakers of Different Languages (2023)

    • Author(s)
      Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
    • Organizer
      In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition (2023)

    • Author(s)
      Chao Tan, Yang Cao, Sheng Li and Masatoshi Yoshikawa
    • Organizer
      ICASSP
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-based Speech Recognition of Low-resource Language (2023)

    • Author(s)
      Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
    • Organizer
      In Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement (2023)

    • Author(s)
      Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu
    • Organizer
      IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimer's Speech Detection (2023)

    • Author(s)
      Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li
    • Organizer
      IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis (2023)

    • Author(s)
      Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu
    • Organizer
      In Proceedings of ACM Multimedia Asia Workshop of Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K28144
  • [Presentation] Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-based Speech Recognition of Low-resource Language (2023)

    • Author(s)
      Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
    • Organizer
      ICASSP
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition (2023)

    • Author(s)
      Sheng Li, Jiyi Li
    • Organizer
      Artificial Neural Networks and Machine Learning (ICANN) 2023
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-23K11227
  • [Presentation] Compressing Transformer-based ASR Model by Task-driven Loss and Attention-based Multi-level Feature Distillation (2022)

    • Author(s)
      Y. Lv, L. Wang, M. Ge, S. Li, C. Ding, L. Pan, Y. Wang, J. Dang, K. Honda
    • Organizer
      in Proc. IEEE-ICASSP, pp. 7992-7996, 2022.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism (2022)

    • Author(s)
      Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara
    • Organizer
      INTERSPEECH 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Mining Hard Samples Locally and Globally for Improved Speech Separation (2022)

    • Author(s)
      K. Wang, Y. Peng, H. Huang, Y. Hu, and S. Li
    • Organizer
      in Proc. IEEE-ICASSP, pp. 6037-6041, 2022.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] The System Description for VoiceMOS Challenge 2022 (KK team, main/ood tasks) (2022)

    • Author(s)
      S. Li, R. Dabre, R. Raphael, W. Zhou, Z. Yang, C. Chu, Y. Zhao
    • Organizer
      VoiceMOS Challenge 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network (2022)

    • Author(s)
      Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki
    • Organizer
      30th European Signal Processing Conference (EUSIPCO)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection (2022)

    • Author(s)
      Longfei Yang, Wenqing Wei, Sheng Li, Jiyi Li, Takahiro Shinozaki
    • Organizer
      INTERSPEECH 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Subband-based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches (2022)

    • Author(s)
      Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
    • Organizer
      Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model (2022)

    • Author(s)
      Z. Gong, D. Saito, L. Yang, T. Shinozaki, S. Li, H. Kawai and N. Minematsu
    • Organizer
      ISCA-Odyssey (The Speaker and Language Recognition Workshop)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection (2022)

    • Author(s)
      S. Li, J. Li, Q. Liu and Z. Gong
    • Organizer
      LREC (Language Resources and Evaluation Conference)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling (2022)

    • Author(s)
      Zhuo Gong, Saito Daisuke, Sheng Li, Hisashi Kawai, Minematsu Nobuaki
    • Organizer
      Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] NICT-Tib1: A Public Speech Corpus of Lhasa Dialect for Benchmarking Tibetan Language Speech Recognition Systems (2022)

    • Author(s)
      Kak Soky, Zhuo Gong, Sheng Li
    • Organizer
      25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Fusion of Self-supervised Learned Models for MOS Prediction (2022)

    • Author(s)
      Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao
    • Organizer
      INTERSPEECH 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Multi-Domain Dialogue State Tracking with Top-k Slot Self Attention (2022)

    • Author(s)
      Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
    • Organizer
      SIGdial Meeting Discourse & Dialogue 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction (2022)

    • Author(s)
      Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
    • Organizer
      INTERSPEECH 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection (2022)

    • Author(s)
      Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki
    • Organizer
      INTERSPEECH 2022
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] An investigation of using hybrid modeling units for improving End-to-End speech recognition systems (2021)

    • Author(s)
      S. Chen, X. Hu, S. Li, X. Xu
    • Organizer
      IEEE-ICASSP, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Adversarial Attack and Defense on Deep Neural Network-based Voice Processing Systems: An Overview (2021)

    • Author(s)
      X. Chen, H. Huang, and S. Li
    • Organizer
      National Conference on Man-Machine Speech Communication (NCMMSC), 2021. (report is selected to publish in Applied Sciences, Special Issues of Machine Speech Communication)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Robust voice activity detection using a masked auditory encoder based convolutional neural network (2021)

    • Author(s)
      N. Li, L. Wang, M. Unoki, S. Li, R. Wang, M. Ge, J. Dang
    • Organizer
      IEEE-ICASSP, 2021
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Speech Dereverberation Based on Scale-aware Mean Square Error Loss (2021)

    • Author(s)
      L. Qiang, H. Shi, M. Ge, H. Yin, N. Li, L. Wang, S. Li and J. Dang
    • Organizer
      International Conference on Neural Information Processing (ICONIP2021), pp 55-63, Springer, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] System description of Alzheimer's disease early detection (Silk-road team, short speech track) (2021)

    • Author(s)
      W. Wei, R. Wong, S. Li, Y. Guo and H. Huang
    • Organizer
      In special session of NCMMSC2021 (Alzheimer's disease detection challenge), 2021
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Spectrograms Fusion-based End-to-End Robust Automatic Speech Recognition (2021)

    • Author(s)
      H. Shi, L. Wang, S. Li, C. Fan, J. Dang, and T. Kawahara
    • Organizer
      In Proc. APSIPA ASC, pp. 438-442, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Encoder-Decoder based pitch tracking and joint model training for Mandarin tone classification (2021)

    • Author(s)
      H. Huang, K. Wang, Y. Hu, S. Li
    • Organizer
      IEEE-ICASSP, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS (2021)

    • Author(s)
      D. Liu, L. Wang, S. Li, H. Li, C. Ding, J. Zhang and J. Dang
    • Organizer
      International Conference on Neural Information Processing (ICONIP2021), pp 110-118, Springer, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] The RoyalFlush-NICT System Description for AP21-OLR Challenge (Silk-road team, full tasks) (2021)

    • Author(s)
      D. Wang, S. Ye, X. Hu, S. Li
    • Organizer
      OLR2021 (oriental language recognition challenge)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework (2021)

    • Author(s)
      Y. Peng, J. Zhang, H. Zhang, H. Xu, H. Huang, S. Li, and E.S. Chng
    • Organizer
      In Proc. APSIPA ASC, pp. 1043-1048, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora (2021)

    • Author(s)
      K. Soky, S. Li, M. Mimura, C. Chu, and T. Kawahara
    • Organizer
      In Proc. APSIPA ASC, pp. 433-437, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model (2021)

    • Author(s)
      D. Wang, S. Ye, X. Hu, S. Li, and X. Xu
    • Organizer
      in Proc. INTERSPEECH, pp. 3266-3270, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain (2021)

    • Author(s)
      K. Wang, H. Huang, Y. Hu, Z. Huang, and S. Li
    • Organizer
      in Proc. INTERSPEECH, pp. 3046-3050, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Simultaneous Progressive Filtering-based Monaural Speech Enhancement (2021)

    • Author(s)
      H. Yin, L. Qiang, H. Shi, L. Wang, S. Li, M. Ge, G. Zhang and J. Dang
    • Organizer
      International Conference on Neural Information Processing (ICONIP2021), pp 213-221, Springer, 2021.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-21K17837
  • [Presentation] Comparison of End-to-End Models for Joint Speaker and Speech Recognition (2021)

    • Author(s)
      K. Soky, S. Li, M. Mimura, C. Chu, T. Kawahara
    • Organizer
      IEICE-SP, 2021.
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Improvement of x-vector for short utterance spoken language identification (2020)

    • Author(s)
      P. Shen, X. Lu, K. Sugiura, S. Li, H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020.
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation (2020)

    • Author(s)
      H. Shi, L. Wang, M. Ge, S. Li, and J. Dang
    • Organizer
      IEEE-ICASSP
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data and mask embedding (2020)

    • Author(s)
      S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020)
    • Invited / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Multilingual transformer training for Khmer automatic speech recognition (2020)

    • Author(s)
      K. Soky, S. Li, T. Kawahara, S. Seng
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020)
    • Invited / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] System Description for Voice Privacy Challenge (Kyoto Team) (2020)

    • Author(s)
      Y. Han, S. Li, Y. Cao, M. Yoshikawa
    • Organizer
      In special session of INTERSPEECH 2020 (VoicePrivacy challenge 2020).
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription (2020)

    • Author(s)
      Y. Lin, L. Wang, S. Li, J. Dang, and C. Ding
    • Organizer
      In Proc. INTERSPEECH, 2020 (Travel Granted by ISCA).
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy Preserving Speech Data Release (2020)

    • Author(s)
      Y. Han, S. Li, Y. Cao, Q. Ma and M. Yoshikawa
    • Organizer
      IEEE-ICME
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Compensation on x-vector for short utterance spoken language identification (2020)

    • Author(s)
      P. Shen, X. Lu, K. Sugiura, S. Li and H. Kawai
    • Organizer
      ISCA-Odyssey (The Speaker and Language Recognition Workshop)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] VOIS: The First Speech Therapy App in the World for Myanmar Hearing-Impaired Children (2020)

    • Author(s)
      A. Thida, N. Han, S. Oo, S. Li and C. Ding.
    • Organizer
      In Proc. O-COCOSDA, 2020.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition (2020)

    • Author(s)
      S. Li, C. Ding, X. Lu, P. Shen and H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020.
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] End-to-End Speech Translation with Cross-lingual Transfer Learning (2020)

    • Author(s)
      S. Shimizu, C. Chu, S. Li, S. Kurohashi
    • Organizer
      NLP, 2021.
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data (2020)

    • Author(s)
      S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda.
    • Organizer
      In Proc. ICONIP, 2020.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Singing Voice Extraction with Attention based Spectrograms Fusion (2020)

    • Author(s)
      H. Shi, L. Wang, S. Li, C. Ding, M. Ge, N. Li, J. Dang, and H. Seki.
    • Organizer
      In Proc. INTERSPEECH, 2020 (Travel Granted by ISCA).
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint with Differential Privacy under an Untrusted Server (2020)

    • Author(s)
      Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa.
    • Organizer
      ACM Conference on Computer and Communications Security (CCS), demo, 2020.
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Joint Training End-to-End Speech Recognition Systems with Speaker Attributes (2020)

    • Author(s)
      S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
    • Organizer
      ISCA-Odyssey (The Speaker and Language Recognition Workshop)
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] A Mixture of Character and Word End-to-End System for Keyword Spotting (2020)

    • Author(s)
      H. Zhang, S. Ueno, M. Mimura, S. Li, W. Zhang, T. Kawahara
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020) (full paper).
    • Invited / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release (2020)

    • Author(s)
      Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa.
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020) (invited report).
    • Invited / Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Phantom in the Opera: Effective Adversarial Music Attack on Keyword Spotting Systems (2020)

    • Author(s)
      H. Zhang, S. Li, X. Ma, Y. Zhao, Y. Cao, T. Kawahara
    • Organizer
      IEEE-SLT, 2021
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] Joint Training End-to-End Systems for Speech and Speaker Recognition with Speaker Attributes (2020)

    • Author(s)
      S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020.
    • Data Source
      KAKENHI-PROJECT-19K24376
  • [Presentation] End-To-End Articulatory Modeling for Dysarthria Articulatory Attribute Detection (2020)

    • Author(s)
      Y. Lin, L. Wang, J. Dang, S. Li, and C. Ding.
    • Organizer
      IEEE-ICASSP
    • Int'l Joint Research
    • Data Source
      KAKENHI-PROJECT-19K24376
  • 1.  チョ シンキ (70784891)
    # of Collaborated Projects: 2 results
    # of Collaborated Products: 18 results
  • 2.  李 吉屹 (30726667)
    # of Collaborated Projects: 1 result
    # of Collaborated Products: 20 results
  • 3.  TOU Takeshi
    # of Collaborated Projects: 0 results
    # of Collaborated Products: 1 result
  • 4.  河原 達也
    # of Collaborated Projects: 0 results
    # of Collaborated Products: 1 result
