OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, TJ Hua, Z Cheng, D Shin, ... NeurIPS 2024, 2024 | 80* | 2024 |
MoELoRA: Contrastive learning guided mixture of experts on parameter-efficient fine-tuning for large language models T Luo, J Lei, F Lei, W Liu, S He, J Zhao, K Liu arXiv preprint arXiv:2402.12851, 2024 | 29 | 2024 |
Answering numerical reasoning questions in table-text hybrid contents with graph-based encoder and tree-based decoder F Lei, S He, X Li, J Zhao, K Liu COLING 2022, 2022 | 25 | 2022 |
MenatQA: A new dataset for testing the temporal comprehension and reasoning abilities of large language models Y Wei, Y Su, H Ma, X Yu, F Lei, Y Zhang, J Zhao, K Liu EMNLP 2023, 2023 | 20 | 2023 |
Competition-level problems are effective LLM evaluators Y Huang, Z Lin, X Liu, Y Gong, S Lu, F Lei, Y Liang, Y Shen, C Lin, ... ACL 2024, 2023 | 18 | 2023 |
S3Eval: A synthetic, scalable, systematic evaluation suite for large language models F Lei, Q Liu, Y Huang, S He, J Zhao, K Liu NAACL 2024, 2023 | 17 | 2023 |
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? R Cao, F Lei, H Wu, J Chen, Y Fu, H Gao, X Xiong, H Zhang, Y Mao, W Hu, ... NeurIPS 2024, 2024 | 15* | 2024 |
S3HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering F Lei, X Li, Y Wei, S He, Y Huang, J Zhao, K Liu ACL 2023, 2023 | 14 | 2023 |
Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent X Yu, T Luo, Y Wei, F Lei, Y Huang, P Hao, L Zhu EMNLP 2024, 2024 | 11 | 2024 |
Multi-view graph representation learning for answering hybrid numerical reasoning question Y Wei, F Lei, Y Zhang, J Zhao, K Liu arXiv preprint arXiv:2305.03458, 2023 | 11 | 2023 |
Assessing knowledge editing in language models via relation perspective Y Wei, X Yu, H Ma, F Lei, Y Weng, R Song, K Liu arXiv preprint arXiv:2311.09053, 2023 | 10 | 2023 |
MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images W Liu, F Lei, T Luo, J Lei, S He, J Zhao, K Liu arXiv preprint arXiv:2309.04790, 2023 | 9 | 2023 |
TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering F Lei, T Luo, P Yang, W Liu, H Liu, J Lei, Y Huang, Y Wei, S He, J Zhao, ... arXiv preprint arXiv:2310.15075, 2023 | 8 | 2023 |
Spider 2.0: Evaluating language models on real-world enterprise text-to-SQL workflows F Lei, J Chen, Y Ye, R Cao, D Shin, H Su, Z Suo, H Gao, W Hu, P Yin, ... ICLR 2025, 2024 | 6 | 2024 |
Answer-based entity extraction and alignment for visual text question answering J Yu, M Jing, W Liu, T Luo, B Zhang, K Lu, F Lei, J Sun, J Liang ACM MM 2023, 2023 | 4 | 2023 |
HRoT: Hybrid prompt strategy and retrieval of thought for table-text hybrid question answering T Luo, F Lei, J Lei, W Liu, S He, J Zhao, K Liu arXiv preprint arXiv:2309.12669, 2023 | 2 | 2023 |
Teaching Small Language Models to Reason for Knowledge-Intensive Multi-Hop Question Answering X Li, S He, F Lei, JY JunYang, T Su, K Liu, J Zhao ACL 2024, 2024 | 1 | 2024 |
DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models Y Huang, J Luo, Y Yu, Y Zhang, F Lei, Y Wei, S He, L Huang, X Liu, ... EMNLP 2024, 2024 | | 2024 |