UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022 T Saeki, D Xin, W Nakata, T Koriyama, S Takamichi, H Saruwatari arXiv preprint arXiv:2204.02152, 2022 | 122 | 2022 |
Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings W Nakata, T Koriyama, S Takamichi, N Tanji, Y Ijima, R Masumura, ... Proc. 11th ISCA Speech Synthesis Workshop (SSW 11), 211-215, 2021 | 14 | 2021 |
Coco-Nut: Corpus of Japanese utterance and voice characteristics description for prompt-based control A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 9 | 2023 |
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis S Takamichi, W Nakata, N Tanji, H Saruwatari arXiv preprint arXiv:2201.10896, 2022 | 8 | 2022 |
Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis W Nakata, T Koriyama, S Takamichi, Y Saito, Y Ijima, R Masumura, ... Proc. Interspeech 2022, 4551-4555, 2022 | 8 | 2022 |
UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge W Nakata, K Yamauchi, D Yang, H Hyodo, Y Saito arXiv preprint arXiv:2403.13720, 2024 | 1 | 2024 |
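J-KAC: Japanese audiobook and kamishibai (picture-story show) narration speech corpus S Takamichi, W Nakata, T Koriyama, N Tanji, Y Ijima, R Masumura, H Saruwatari IPSJ SIG Technical Reports (MUS) 2021 (14), 1-4, 2021 | 1 | 2021 |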
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech K Baba, W Nakata, Y Saito, H Saruwatari arXiv preprint arXiv:2409.09305, 2024 | | 2024 |
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling W Nakata, K Seki, H Yanaka, Y Saito, S Takamichi, H Saruwatari arXiv preprint arXiv:2407.15828, 2024 | | 2024 |
Building speech corpus with diverse voice characteristics for its prompt-based representation A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari arXiv preprint arXiv:2403.13353, 2024 | | 2024 |
An Empirical Study of Self-Supervised Learning Model Features for Speech Waveform Reconstruction W Nakata, T Saeki, Y Saito, S Takamichi, H Saruwatari Proceedings of the Acoustical Society of Japan Meeting (CD-ROM) 2023, 2-29, 2023 | | 2023 |
Analysis of urgency of evacuation announcement speech and its application to text-to-speech S Harada, W Nakata, S Takamichi, Y Saito, Y Saito, ... Proceedings of the Acoustical Society of Japan Meeting (CD-ROM) 2022, 2-41, 2022 | | 2022 |
Audiobook Speech Synthesis Based on Character Embedding for Distinguishable Character Acting W Nakata, T Koriyama, S Takamichi, Y Saito, Y Ijima, ... Proceedings of the Acoustical Society of Japan Meeting (CD-ROM) 2022, 3-3, 2022 | | 2022 |
Multi-speaker audiobook speech synthesis based on character acting styles acquired by VQVAE W Nakata, T Koriyama, S Takamichi, Y Saito, Y Ijima, R Masumura, H Saruwatari IEICE Technical Report 121 (282), 42-47, 2021 | | 2021 |