Interview with Muhao Chen (CCG 2019-2020)

Muhao Chen joined the Cognitive Computation Group in 2019 as a postdoctoral researcher, and he still collaborates with us from time to time.

Muhao Chen at a sign marking the Arctic Circle, positioned as though he's holding up the earth.

He is currently an Assistant Professor in the Department of Computer Science at UC Davis, where he directs the Language Understanding and Knowledge Acquisition (LUKA) Lab. Muhao’s research focuses on robust and minimally supervised data-driven machine learning for Natural Language Processing. Most recently, his group’s research has focused on accountability and security problems of large language models and multi-modal language models. Previously, Muhao was a Postdoctoral Fellow at UPenn from 2019 to 2020. He received his Ph.D. from the Department of Computer Science at UCLA in 2019. Before joining UCLA as a Ph.D. student, he graduated with a Bachelor’s degree from Fudan University in 2014.

Hi, Muhao! Glad to catch up with you. Please tell me where you’re living and what you’re up to these days.

I’m living in Sacramento, the capital city of California. As usual, aside from working, I still enjoy traveling (especially driving to different places for road trips). Luckily, from Sac or the Bay Area it is easy to reach most places in the country by direct flight or by driving.

That’s excellent! I’m glad you’ve been getting to travel.
What are the most rewarding things about your current work?

What I find most rewarding is having built a well-established NLP group since I started working as a faculty member. All of my students have done excellent work building strong academic records of their own. Over a year ago, the first batch of PhD students graduated, and they have become very successful researchers in industry. A few more are coming up and are looking for (or will soon be looking for) faculty positions (wish them the best of luck!). I hope that one day the group can be as successful as CCG, with a lot of successful alumni.

The most surprising? And yes, best of luck to them!

Most of my group members have pets (mostly cats), as we list in one of the sections here: https://luka-group.github.io/people.html. I recently got my own pets (two Roborovski hamsters):

They’re very sweet. I like that the nonhumans get special recognition in your lab. (-:

How connected is your work now with what you did in our group?

NLP has been moving way too fast nowadays. But quite a few things we’ve done recently, especially those related to LLM reasoning and indirect supervision, are closely related to what I did at CCG. In fact, I still collaborate with Dan and other (past or current) CCG members like Ben, Wenpeng, Haoyu, Hongming, and Qiang on these topics. We have been giving tutorials every year since 2020 about this research.

What new development(s) in the field of NLP are you excited about right now?

Our group has been focusing on machine learning robustness since it was founded. In particular, we have recently been very interested in the safety issues of LLMs. We build systems that automatically identify safety issues in LLMs, safeguard LLMs from malicious use, and protect LLMs from threats and vulnerabilities when they interact with complex environments. This area is particularly important now, considering that LLMs are becoming the backbones of more and more intelligent systems and are starting to handle thousands of tasks in the real world.

I’m glad to hear you’re focusing on safety issues as LLMs grow.
Thoughts about the state of AI? 

This is an exciting time in which AI researchers are building larger and larger learning-based systems that not only solve daily-life problems but even help with frontier scientific discovery in many other fields, like biology, medicine, chemistry, and food science. On the one hand, it is a good time for us to work with other fields of study on the many scenarios where AI can contribute. On the other hand, it is an important time for academia to collaborate more closely with industry, as the AI systems we seek to build now require significantly more computing and data resources.

How are things outside of work?

I just finished my checklist of all the US national parks that can be reached by driving. Last summer I drove the Dalton Highway to reach the Gates of the Arctic.


Congratulations! That’s fantastic. So, how many national parks have you visited? And which have made a particular impression on you?

I’ve been to 56 national parks (only counting true “National Parks,” not national monuments or national historical parks, though I’ve been to many of those as well). There are 7 national parks I still haven’t visited (3 in Alaska, 1 island in California, 1 island in Florida, 1 in American Samoa, and 1 in the Virgin Islands), because all of these must be reached by air taxi or cruise ship; I have now finished all of the parks that can be reached by driving. I really love the US national park system because almost every park is different from the others, with unique scenery to see and roads to drive.

Favorite park: if I had to pick just one, it would definitely be Yellowstone, which stands out from all the rest. But I’ve been asked to pick my top 5 in the past, and I eventually picked a top 6: Yellowstone (WY), Death Valley (CA), Arches (UT), Carlsbad Caverns (NM), Redwood (CA), and Badlands (SD).

Excellent! Do you have a memory to share from your time with the group?

It was the campus lockdown period in 2020, but a few of us had hotpot every Friday at my apartment. In fact, a few of us still spent time together in the 3401 Walnut building during the lockdown. A lot of fun happened during that time. There were times when we stayed late in the building before a paper deadline. There were also times when we brought game consoles to play in the room where Dan used to host his group meetings. Most of them have graduated now (except for Haoyu).

Any advice for the current students and postdocs in the group?

One important thing I learned from Dan is to develop a good research taste. Doing meaningful research is not about publishing more and more papers. In fact, only the first paper, the best paper, and probably also the last paper on a topic are the memorable ones.

Thanks for this insight. And thank you so much for this interview!

For more information on Muhao Chen’s research at UC Davis, please visit his website.

Just looked this up: the 414-mile Dalton Highway in Alaska, including the 100+ mile stretch Muhao would have driven to reach the Arctic Circle!

Interview with Celine Lee (CCG 2019-2020)

Celine Lee, PhD student


Celine Lee joined the Cognitive Computation Group as an undergraduate/masters student researcher in 2019 and graduated from Penn in 2020. She is now a PhD candidate at Cornell Tech. Celine explores questions in structured language, particularly problems in programming language semantics and reasoning.

Hi, Celine! Please tell me where you’re living and what you’re doing these days.

I live in New York City, working on my PhD at Cornell Tech, the campus on Roosevelt Island.

What are the most rewarding things about your current work?

The most fun thing about research is how big the search space of problems is. I get to spend every day thinking about where the interesting open problems are, then talking and working with some of the brightest minds in the field to devise experiments that address them. Most days don’t look the same, because I can’t predict exactly what the path to a solution will look like.

The most surprising?

Something that still surprises me every day is how small this community really is. I’ll meet a friend of a friend or join some colleagues for lunch, and suddenly I’m putting all these new faces to the names on papers that I have been reading for years. And everyone is so excited to talk about what we’re all obsessed with: our shared research interests!

That’s great! I remember your passion for research from your work with us at Penn. How did you originally get involved with the group?  I remember you participated in the Google Explore Research program in early 2020.

This is correct! I started working with Dan after taking his machine learning course, then got involved with the Google Explore Program soon after.

How connected is your work now with what you did in our group?

At CCG, I was working on semantic role labeling systems. Now I’m continuing my work on structured language tasks, but the grammar is that of computer programming languages. The tension between differing levels of ambiguity, from natural language to high-level programming languages, down the compute stack to compiler IRs and all the way to bits, leads to interesting questions about the correctness, scalability, and adaptability of automatic programming systems.

What are your thoughts about the state of AI?

Many brilliant people are asking and answering many questions that make computers more adept than I ever imagined possible. I was skeptical at first, then a bit scared, then ultimately excited, because now I have more powerful tooling to think bigger and tackle some crazier ideas.

That does sound exciting!
I know you also parlay your varied interests into creative work alongside your academic work. Will you talk a bit about your writing?

Over the pandemic, I found myself with an unprecedented abundance of time to explore topics only barely related to my work. This coincided with my increasing involvement with NLP research, through which I (1) learned a structured methodology for asking and answering questions, and (2) became extra interested in language. So I wrote and put out my first few blog posts, which turned out to be surprisingly super fun.

snippet from “Donut Wheel”

Fast forward through the past few years, and this hobby has spiraled out into various formats: academically leaning blog posts, short and silly illustrated zines, personal musings and essays… A side benefit of writing as a personal creative endeavor: writing papers for work is much less intimidating now.


We have a number of undergrad and master’s students joining us as interns this summer.  Any advice or thoughts about working with the group?

I have two primary pieces of advice. One is that Professor Roth’s expertise and experience make not only him, but also all the other people around you in the lab, some of the best people you could work with and learn from. It would be wise to talk to everyone and learn their specialties so that you can make the most of the resources around you.

The other piece of advice is to practice storytelling as much as possible: who are you as a researcher? Why is your work a compelling piece of science amid today’s massive volume of NLP work? I think you should be able to convince someone who isn’t personally invested in you, but is interested in machine learning, to root for your success.

That’s excellent advice.  Thank you so much for this interview!
To learn more about Celine’s research and creative pursuits, please visit her website.

snippet from “Please Be Seated”

Interview with Wenpeng Yin (CCG 2017-2019)

Dr. Wenpeng Yin joined the Cognitive Computation Group as a postdoctoral researcher in 2017. 

Dr. Wenpeng Yin

He is currently a tenure-track Assistant Professor at Penn State University, heading the AI4Research lab. In between, he served as an Assistant Professor at Temple University (2022) and a Senior Research Scientist at Salesforce (2019-2021). He received his Ph.D. from the University of Munich, Germany, in 2017.

Wenpeng’s research interests span AI for Research, Human-Centered AI, Large Language Models & NLP & Computer Vision, and general machine learning algorithms. He has served as a Senior Area Chair for NAACL 2021, ACL Rolling Review, IJCNLP-AACL 2023, LREC-COLING 2024, and EACL 2024.

Hi, Wenpeng! Great to hear from you. Please tell me where you’re living and what you’re doing these days.

I live in Berwyn, PA, conveniently close to King of Prussia (KOP), maintaining a balanced lifestyle that blends my dedication to research with the joy of cleaning my yard and preparing it to embrace the warm spring.

Nice! What kinds of plants grow in your yard?

I want to grow watermelons (we harvested three big watermelons with yellow flesh last summer), tomatoes, and cucumbers in the backyard, and some tulips in the front yard…but I just found deer have eaten all the new leaves of the tulips.

Aw, I’m sorry to hear about the tulips. Hope they recover, and wishing you all the best with your garden!

What are the most rewarding things about your current work?

Two dimensions: i) as a supervisor, I take immense pride in witnessing the remarkable growth of PhD students who begin their journey in NLP research and eventually evolve into independent contributors to research projects and champions in disseminating our findings; ii) recognizing the impact of our work on industry, evidenced by companies reaching out for commercial applications or collaborative endeavors, underscores our tangible contribution to both industry and society at large.

The most surprising?

NLP research at Penn State only started in 2016, when Prof. Rebecca Jane Passonneau joined, and Penn State does not even have an undergraduate-level NLP course, which is exactly the one I have recently proposed.

I wish you all the best with putting that course together! 
How connected is your work now with the work you did in our group?

I’ve been extending my recent work, building upon my projects at CCG. Initially, my focus at CCG was on textual entailment, which played a pivotal role as indirect supervision for various NLP tasks. One prominent thread in my recent research (Arxiv 2024, CoNLL’22, TACL’22, ACL’23 Tutorial) represents a natural extension of this earlier work. Additionally, my involvement in the LORELEI project at CCG, which centered around low-resource language translation, further enriched my research portfolio, including our work on machine translation evaluation (ICLR’24) and one of my main research directions, “Human-Centered AI”.

What new development(s) in the field of NLP are you excited about right now?

Yeah, NLP has come a long way, especially with these large language models (LLMs) making waves. But what really gets me pumped are these four things:

i) “NLP for Other Disciplines”: Some folks thought NLP research was done for when LLMs came on the scene, rocking super high performance on tasks we’ve been wrestling with for ages. Surprise, surprise: turns out, now everyone thinks NLP is the bee’s knees. It’s like this golden era where all sorts of disciplines are jumping on the NLP train, not just regular folks but also researchers from other fields who are using it to automate their research game. NLP’s never been in the spotlight like this before.

ii) “NLP with Cross-Modalities”: NLP has become more effortlessly integrated with various modalities. It signifies that we’ve discovered a way to seamlessly blend knowledge across different modes, allowing information to flow smoothly between them. This was something hard to fathom just a couple of years ago.

iii) “LLM + Agents”: LLM-plus-agent combos are shaping up to be the next big thing. Even though universal LLMs are hogging the limelight, it turns out we still need specialized systems for specific domains.

iv) “Open Source”: Open source is the rockstar of NLP research, making things zoom ahead and keeping everything out in the open. It’s like the norm now, making research faster and more transparent.

Thoughts about the state of AI?

Let me first look at the positive side of things: the behavior of AI systems today is nothing short of mind-blowing compared to just a couple of years ago. We’re witnessing an influx of potential applications that were once beyond imagination, opening up exciting possibilities.

Now, let’s explore the downsides: i) Inequality is widening across the globe. Different fields benefit from AI unevenly, and people in different geographical areas have unequal access to the latest AI products and infrastructure. The dominance of top AI products by a handful of companies and a select few countries contributes to this disparity. ii) Security concerns are intensifying, with issues like forged images and videos becoming more prevalent. While AI systems often showcase unprecedented performance, it’s crucial to acknowledge that researchers still grapple with understanding the inner workings of these systems. The interpretability and control of AI systems remain challenging, leaving room for potential misuse. iii) In academia, there’s a heavy focus on studying Large Language Models (LLMs) and constructing benchmarks to evaluate their performance. Unfortunately, much of this research is dominated by data-intensive and computation-intensive LLMs, which limits the resources available to researchers for delving into the true nature of intelligence.

How are things outside of work?

We’re managing quite well. Our days are primarily occupied with shuttling the kids to a variety of clubs—soccer, dance, piano, gymnastics, and more. Surprisingly, weekends prove to be even busier than weekdays. Fortunately, our proximity to Philadelphia adds a delightful dimension to our lives, offering a diverse array of places like parks, museums, and various activities to explore regularly.

Excellent!  I was glad to hear when you returned to the area, and it’s nice knowing that you’re still nearby.  Do you have a memory to share from your time with the group?

Honestly, one memory that sticks with me from my time at CCG is when my daughter was diagnosed with a brain tumor just five months after joining. It meant spending practically every day at CHOP for about six months. It was a tough period, but what made it bearable was the incredible support from my CCG colleagues. I’m really grateful for their kindness and understanding, especially Dan, who was so flexible with my work during that challenging time. Those experiences, along with the group activities Dan organized, have had a big impact on how I now manage my own group at PSU.

I remember that time well.  I’m glad you felt so supported, and that you’re passing that on!

Please tell me about something you’ve read recently that you would recommend.

My wife and I recently embarked on a shared literary adventure, immersing ourselves in the book ‘A Woman Makes a Plan: Advice for a Lifetime of Adventure, Beauty, and Success’ by Maye Musk. While the computer community recognizes Elon Musk for his groundbreaking ventures like OpenAI, Tesla, PayPal, and SpaceX, we were intrigued by the legendary status of his mother, Maye Musk. Her story fascinated us, and we eagerly sought wisdom from her book, uncovering the depth of her experience as a woman and her approach to parenting and educating children. Maye Musk’s experiences and insights transcend borders, proving that legends can be forged regardless of one’s country of origin, gender, or age.

Any advice for the current students and postdocs in the group?

It’s a bit tricky to say whether the current students and postdocs are in the best era (thanks to cool stuff like LLMs) or the toughest one (e.g., publishing papers is getting trickier). But the big lesson from Dan that sticks with me is this: think about what kind of AI/NLP system you want to create, instead of just following the research of others. By figuring out your own research tastes and goals and sticking to them, you’re on the best path to standing out in this community.

That’s great.  Thank you so much for this interview!
For more information on Dr. Wenpeng Yin, please visit his website.

A flowerbed with a row of daffodils in the background and featuring tulips in shades of yellow, pale pink, and deep pink.