GPT-4 Technical Report. J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, et al. arXiv preprint arXiv:2303.08774, 2023. Cited by 7510*.
PaLM: Scaling Language Modeling with Pathways. A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, et al. arXiv preprint arXiv:2204.02311, 2022. Cited by 5143.
Scaling Instruction-Finetuned Language Models. HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, et al. arXiv preprint arXiv:2210.11416, 2022. Cited by 2943.
Large language models encode clinical knowledge. K Singhal, S Azizi, T Tu, SS Mahdavi, J Wei, HW Chung, N Scales, et al. Nature 620 (7972), 172-180, 2023. Cited by 2044.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, et al. arXiv preprint arXiv:2211.05100, 2022. Cited by 1629.
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. S Longpre, L Hou, T Vu, A Webson, HW Chung, Y Tay, D Zhou, QV Le, et al. arXiv preprint arXiv:2301.13688, 2023. Cited by 611.
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. M Suzgun, N Scales, N Schärli, S Gehrmann, Y Tay, HW Chung, et al. arXiv preprint arXiv:2210.09261, 2022. Cited by 561.
Adversarial Attacks Against Medical Deep Learning Systems. SG Finlayson, HW Chung, IS Kohane, AL Beam. arXiv preprint arXiv:1804.05296, 2018. Cited by 293.
UL2: Unifying language learning paradigms. Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, et al. ICLR 2023 (preprint 2022). Cited by 280.
Energy consumption in desalinating produced water from shale oil and gas extraction. GP Thiel, EW Tow, LD Banchik, HW Chung, JH Lienhard. Desalination 366, 94-112, 2015. Cited by 258.
Language Models are Multilingual Chain-of-Thought Reasoners. F Shi, M Suzgun, M Freitag, X Wang, S Srivats, S Vosoughi, HW Chung, et al. arXiv preprint arXiv:2210.03057, 2022. Cited by 228.
Energy efficiency of permeate gap and novel conductive gap membrane distillation. J Swaminathan, HW Chung, DM Warsinger, FA AlMarzooqi, HA Arafat. Journal of Membrane Science 502, 171-178, 2016. Cited by 185.
Energy efficiency of membrane distillation up to high salinity: Evaluating critical system size and optimal membrane thickness. J Swaminathan, HW Chung, DM Warsinger. Applied Energy 211, 715-734, 2018. Cited by 181.
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? T Wang, A Roberts, D Hesslow, TL Scao, HW Chung, I Beltagy, J Launay, et al. arXiv preprint arXiv:2204.05832, 2022. Cited by 168.
Scaling up models and data with t5x and seqio. A Roberts, HW Chung, A Levskaya, G Mishra, J Bradbury, D Andor, et al. arXiv preprint arXiv:2203.17189, 2022. Cited by 154.
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization. Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung, D Bahri, Z Qin, et al. arXiv preprint arXiv:2106.12672, 2021. Cited by 142.
Rethinking embedding coupling in pre-trained language models. HW Chung, T Févry, H Tsai, M Johnson, S Ruder. arXiv preprint arXiv:2010.12821, 2020. Cited by 141.
Multistage vacuum membrane distillation (MSVMD) systems for high salinity applications. HW Chung, J Swaminathan, DM Warsinger. Journal of Membrane Science 497, 128-141, 2016. Cited by 133.
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers. Y Tay, M Dehghani, J Rao, W Fedus, S Abnar, HW Chung, S Narang, et al. arXiv preprint arXiv:2109.10686, 2021. Cited by 132.
Membrane distillation model based on heat exchanger theory and configuration comparison. J Swaminathan, HW Chung, DM Warsinger. Applied Energy 184, 491-505, 2016. Cited by 127.