DALL-Eval: Probing the reasoning skills and social biases of text-to-image generation models J Cho, A Zala, M Bansal Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 141 | 2023 |
DALL-Eval: Probing the reasoning skills and social biases of text-to-image generative transformers J Cho, A Zala, M Bansal arXiv preprint arXiv:2202.04053 2 (7), 9, 2022 | 113 | 2022 |
Visual programming for step-by-step text-to-image generation and evaluation J Cho, A Zala, M Bansal Advances in Neural Information Processing Systems 36, 2024 | 52 | 2024 |
VideoDirectorGPT: Consistent multi-scene video generation via LLM-guided planning H Lin, A Zala, J Cho, M Bansal arXiv preprint arXiv:2309.15091, 2023 | 44 | 2023 |
Hierarchical video-moment retrieval and step-captioning A Zala, J Cho, S Kottur, X Chen, B Oguz, Y Mehdad, M Bansal Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 43 | 2023 |
FixMyPose: Pose correctional captioning and retrieval H Kim, A Zala, G Burri, M Bansal Proceedings of the AAAI Conference on Artificial Intelligence 35 (14), 13161 …, 2021 | 21 | 2021 |
Arramon: A joint navigation-assembly instruction interpretation task in dynamic environments H Kim, A Zala, G Burri, H Tan, M Bansal arXiv preprint arXiv:2011.07660, 2020 | 17 | 2020 |
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model H Lin, J Cho, A Zala, M Bansal arXiv preprint arXiv:2404.09967, 2024 | 6 | 2024 |
EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents A Zala, J Cho, H Lin, J Yoon, M Bansal arXiv preprint arXiv:2403.12014, 2024 | 5 | 2024 |
CoSIm: commonsense reasoning for counterfactual scene imagination H Kim, A Zala, M Bansal arXiv preprint arXiv:2207.03961, 2022 | 5 | 2022 |
DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning A Zala, H Lin, J Cho, M Bansal arXiv preprint arXiv:2310.12128, 2023 | 3 | 2023 |
MIRACLE: An Online, Explainable Multimodal Interactive Concept Learning System A Blume, KD Nguyen, Z Wang, Y Chen, M Shlapentokh-Rothman, X Jin, ... Proceedings of the 32nd ACM International Conference on Multimedia, 11252-11254, 2024 | | 2024 |
Supplementary Materials for DALL-EVAL: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models J Cho, A Zala, M Bansal | | |
Supplementary Material for Hierarchical Video-Moment Retrieval and Step-Captioning A Zala, J Cho, S Kottur, X Chen, B Oguz, Y Mehdad, M Bansal | | |