About me
Hello there! My name is Linyong Nan, and I’m currently in the final year of my Ph.D. studies in Computer Science at Yale University. I have the pleasure of working under the guidance of Prof. Arman Cohan.
My research explores the dynamic field of text generation from multi-modal knowledge sources. These sources include texts, web tables, databases, knowledge graphs, images and more. I’ve delved into several exciting topics in this field: Augmented LLMs/Agents (like RAG and tool-augmented LLMs), Question Answering from tables/databases, Code Generation, Data-to-Text and Summarization.
A pivotal aspect of my work is enhancing the faithfulness and reliability of LLMs/Agents augmented with retrieval and tools. This is particularly crucial for reasoning-intensive tasks that require retrieving extensive, often time-sensitive knowledge from verified sources. I’m particularly drawn to the neural-symbolic approach for achieving these enhancements, which involves generating formal language (code) as intermediate steps. This allows for greater transparency in how LLMs perform information retrieval, reasoning and aggregation, ultimately fostering human trust in the generated content.
My journey at Yale began under the mentorship of the remarkable Prof. Dragomir Radev, who sadly is no longer with us. Drago was the one who recognized and nurtured my passion for NLP research, bringing me into the Ph.D. program. During my first three years at Yale, Drago provided me with invaluable guidance, warmth, unwavering support, and inspiration that have deeply influenced my academic and personal growth. Reflecting on those transformative years, my heart is filled with deep gratitude for Drago. His influence extends far beyond the academic, touching the very core of who I am today. I am profoundly grateful for his wisdom, kindness, and the indelible mark he left on my life and career.
Before Yale, I obtained my M.S. in Computer Science at Columbia University SEAS, advised by Prof. Michael Collins. My undergraduate years were spent at the College of William & Mary, where I graduated summa cum laude in 2018 with a double major in Mathematics and Computer Science.
Recent News
[03/2024] Area Chair for ACL Rolling Review (ARR), 2023-Present
[10/2023] 3 papers accepted to EMNLP 2023.
Selected Work
For a full list, please refer to my Google Scholar.
On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering
Linyong Nan, Ellen Zhang, Weijin Zou, Yilun Zhao, Wenfei Zhou, Arman Cohan.
ArXiv 2023.
Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies
Linyong Nan, Yilun Zhao, Weijin Zou, Narutatsu Ri, Jaesung Tae, Ellen Zhang, Arman Cohan, Dragomir Radev.
Findings of the Association for Computational Linguistics: EMNLP 2023.
R2D2: Robust Data-to-Text with Replacement Detection [code]
Linyong Nan, Lorenzo Jaime Flores, Yilun Zhao, Yixin Liu, Luke Benson, Weijin Zou, Dragomir Radev.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
FeTaQA: Free-form Table Question Answering [dataset]
Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev.
Transactions of the Association for Computational Linguistics (TACL), 2022.
DART: Open-Domain Structured Data Record to Text Generation [dataset]
Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2021.
Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages
Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu, Mona Diab, Kathleen McKeown.
Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020.
Work Experience
- Microsoft Research, Redmond, WA, Summer 2023
  Research Intern
- Amazon.com, Inc., New York, NY, Summer 2022
  Research Intern
- Microsoft Research Asia, Beijing, China, Summer 2021
  Research Intern
- Yale University, New Haven, CT, 2021-2022
  Teaching Fellow
- Harvard University, Boston, MA, Summer 2019
  Research Assistant
- Columbia University, New York, NY, 2018-2020
  Research Assistant
Miscellaneous
When I’m not immersed in research, I find joy in various hobbies. I’m an avid reader and love exploring new places through travel. Music is another passion of mine, and I enjoy playing the piano, especially classical and jazz pieces.
I grew up bilingual, speaking both Mandarin Chinese and Korean.
Hackathons are another area where I love to engage and challenge myself:
- At PennApps XVIII in Philadelphia, PA, 2018, our team won the Best VR/AR Hack. We developed ARound, an iOS app that enhances city exploration through a phone camera with Augmented Reality (AR) features.
- We were the First Runner-up at TribeHacks IV in Williamsburg, VA, 2018, with our innovative project Cockpit. This program enables real-time drone control using hand gestures.