JVS corpus: free Japanese multi-speaker voice corpus S Takamichi, K Mitsui, Y Saito, T Koriyama, N Tanji, H Saruwatari arXiv preprint arXiv:1908.06248, 2019 | 73 | 2019 |
JSUT and JVS: Free Japanese voice corpora for accelerating speech synthesis research S Takamichi, R Sonobe, K Mitsui, Y Saito, T Koriyama, N Tanji, ... Acoustical Science and Technology 41 (5), 761-768, 2020 | 61 | 2020 |
End-to-end text-to-speech based on latent representation of speaking styles using spontaneous dialogue K Mitsui, T Zhao, K Sawada, Y Hono, Y Nankaku, K Tokuda Proceedings of the INTERSPEECH 2022 Conference, 2328-2332, 2022 | 18 | 2022 |
An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition Y Hono, K Mitsuda, T Zhao, K Mitsui, T Wakatsuki, K Sawada arXiv preprint arXiv:2312.03668, 2023 | 11 | 2023 |
Release of pre-trained models for the Japanese language K Sawada, T Zhao, M Shing, K Mitsui, A Kaga, Y Hono, T Wakatsuki, ... arXiv preprint arXiv:2404.01657, 2024 | 8 | 2024 |
Towards human-like spoken dialogue generation between AI agents from written dialogue K Mitsui, Y Hono, K Sawada arXiv preprint arXiv:2310.01088, 2023 | 8 | 2023 |
A jerk-based algorithm ACCEL for the accurate classification of sleep–wake states from arm acceleration KL Ode, S Shi, M Katori, K Mitsui, S Takanashi, R Oguchi, D Aoki, ... iScience 25 (2), 103727, 2022 | 8 | 2022 |
Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation K Mitsui, T Koriyama, H Saruwatari Speech Communication 132, 132-145, 2021 | 7 | 2021 |
Multi-speaker text-to-speech synthesis using deep Gaussian processes K Mitsui, T Koriyama, H Saruwatari Proceedings of the INTERSPEECH 2020 Conference, 2032-2036, 2020 | 6 | 2020 |
UniFLG: Unified Facial Landmark Generator from Text or Speech K Mitsui, Y Hono, K Sawada Proceedings of the INTERSPEECH 2023 Conference, 5501-5505, 2023 | 5 | 2023 |
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems K Mitsui, K Mitsuda, T Wakatsuki, Y Hono, K Sawada arXiv preprint arXiv:2406.12428, 2024 | 2 | 2024 |
Text-Guided Scene Sketch-to-Photo Synthesis AP MaungMaung, M Shing, K Mitsui, K Sawada, F Okura arXiv preprint arXiv:2302.06883, 2023 | | 2023 |
SLEEP-WAKEFULNESS DETERMINATION DEVICE AND PROGRAM H Ueda, K Ode, S Shi, K Mitsui, M Katori US Patent App. 17/623,952, 2022 | | 2022 |
MSR-NV: Neural vocoder using multiple sampling rates K Mitsui, K Sawada Proceedings of the INTERSPEECH 2022 Conference, 798-802, 2021 | | 2021 |
Application of Deep Gaussian Process to Multi-Speaker Text-to-Speech Synthesis using Speaker Codes K Mitsui, T Koriyama, H Saruwatari IEICE Technical Report 119 (398 (SP2019 44-49)(Web)), 31-36, 2020 | | 2020 |