
Here in the deep snows of winter (okay, one snow, not very deep, but 5 inches is not bad these days), we offer a throwback to summertime, to share the experiences of this year’s summer interns. Mentored by PhD students Yu Feng and Xingyu Fu and postdoc Tomer Wolfson, with input from former postdoc Vivek Gupta, the six students had come to Philadelphia from Arizona, California, China, and Israel to spend some or all of their summer with us, working on their various projects. It was great to have them join our ranks and participate in our research. I’ve asked them to give their thoughts on the experience and what they learned; here are five responses.

[Some responses were compiled or polished using AI.]
Rohit Khoja
Master’s student at Arizona State University
This summer, I worked on improving retrieval and question answering over both unstructured and tabular data. My main focus was on enhancing retrieval precision and LLM reasoning accuracy on the OTT-QA and STaRK datasets, particularly for questions that require multi-step reasoning. As part of this, I built a GraphRAG system from scratch. Instead of relying on conventional knowledge graphs at the entity level, we constructed a chunk-level graph, where each node represented a chunk of 7–8 sentences and edges captured relationships between chunks. This approach led to improvements in retrieval quality and reasoning performance.
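To make the idea concrete, here is a minimal, hypothetical sketch of chunk-level graph retrieval in the spirit Rohit describes. This is illustrative Python, not the actual system: the real project used 7–8-sentence chunks and richer relations, while here chunks are tiny and edges are plain word-overlap similarity.

```python
# Hypothetical sketch of chunk-level graph retrieval (not the actual CCG code).
# Documents are split into multi-sentence chunks; chunks become graph nodes,
# and edges connect chunks whose word overlap exceeds a threshold. At query
# time we score chunks directly, then expand to graph neighbors so that a
# multi-step question can pull in related context.
from collections import Counter
from itertools import combinations
import math, re

def sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk(text, size=3):  # the interns used 7-8 sentences; a small size keeps the demo short
    sents = sentences(text)
    return [" ".join(sents[i:i + size]) for i in range(0, len(sents), size)]

def bow(text):
    # Bag-of-words vector; a real system would use a learned embedding here.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def build_graph(chunks, threshold=0.2):
    vecs = [bow(c) for c in chunks]
    edges = {i: set() for i in range(len(chunks))}
    for i, j in combinations(range(len(chunks)), 2):
        if cosine(vecs[i], vecs[j]) >= threshold:
            edges[i].add(j)
            edges[j].add(i)
    return vecs, edges

def retrieve(query, chunks, vecs, edges, k=2):
    q = bow(query)
    seeds = sorted(range(len(chunks)), key=lambda i: cosine(q, vecs[i]), reverse=True)[:k]
    hits = set(seeds)
    for s in seeds:  # one-hop expansion over the chunk graph
        hits |= edges[s]
    return [chunks[i] for i in sorted(hits)]
```

The one-hop expansion is what distinguishes this from plain retrieval: a chunk that never matches the query directly can still be surfaced because it is linked to a chunk that does.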

Beyond the technical work, I really enjoyed my time in Philadelphia, the beautiful summer weather, exploring different places around the city, and especially the collaborative atmosphere in the lab. Prof. Dan, Tomer, and Jennifer were supportive and always ready to help, and the convenience of the Penn Transit service around campus was great.
During this internship, I learned a lot about the retrieval field and recent advancements in it, gaining a deeper understanding of why retrieval is such a critical component of modern AI systems. I also learned how to handle large datasets efficiently on GPUs and optimize data processing pipelines for large-scale experiments.
Prasham Titiya
Master’s student at Arizona State University

This summer, I had the wonderful opportunity to be a research intern with the Cognitive Computation Group at the University of Pennsylvania. I worked on developing information retrieval systems for structured, semi-structured, and unstructured data, focusing on how building a knowledge graph can improve retrieval performance and help with more efficient multi-hop reasoning. The project was rewarding: I explored how semantic and lexical relationships can be represented and leveraged to make retrieval more accurate and context-aware.
I learned a lot throughout the project, both technically and personally. Meetings and discussions with Prof. Vivek Gupta, Dr. Tomer Wolfson, and especially Prof. Dan Roth were incredibly insightful and formative. Prof. Roth’s feedback and perspective on approaching this problem helped me think more critically and systematically about my work. I also had the chance to meet PhD students and researchers from other labs, which broadened my horizons and gave me a better understanding of the variety of work happening in this field. I am especially grateful to Jennifer Sheffield for being so proactive and helpful throughout the course of my internship.
Outside of research, I really enjoyed my time in Philadelphia. The city was lively and full of great places to explore. The UPenn campus was very beautiful, filled with greenery, and had a historic charm that made it a wonderful place to spend the summer. Overall, it was an amazing experience and definitely one of the highlights of my year. I had a great time learning, working with incredible people, and exploring a new city.
Terry Tong
Undergrad at the University of California, Davis
Hi! I’m Terry, a rising senior at UC Davis. Over the summer, I worked at CCG with my mentors Yu Feng and Prof. Dan Roth. When deciding on a project, I wanted to challenge myself to work on research that is more theoretical in nature, settling on a project on the theory behind neuro-symbolic integration in reasoning.

I learned a lot with Dan and Yu. Dan is really good at pruning the research idea search space given his decades of experience in the field, which saved us from many dead ends. We had a discussion on the characteristics that made LMs amenable to tool learning, and I vividly remember Dan bringing up ‘teachability of models’ and how people used to research this, but stopped for good reason. We stopped in our tracks there and pivoted right away. He’d also bring up relevant papers from long ago (like before I was born) that guided the field into what it is now, e.g. one of his seminal ‘Learning to Reason’ papers. This has always helped me gain perspective on what’s important. While Dan is a busy person, whenever we did meet, I always found it helpful to answer some of the ‘big picture’ questions he asked. I felt challenged to step back from whatever low-level details I was implementing and think critically about what we should prioritize, which has improved my research decision-making skills overall.
While I periodically met with Dan, I got a lot of help from Yu. My previous mentors had been more hands-off, so when Yu would challenge some of my ideas, I found I actually preferred this type of back and forth. She would tease out details that another researcher might ask about and flesh out low-level ideas, which really complemented Dan’s style of high-level advising. I used to think research was all technical math and derivations, but I found that scientific communication was really important, especially when time is limited and you have to pitch an idea or get feedback on an experiment design. Making sure the other party knows exactly what you’re talking about helps decision making and ultimately the efficiency of the project.
Personally, this was the first time I got to do research full time, outside of classes. I’ve always struggled with context switching between research and classes, so it was rewarding to just have a big chunk of time to let ideas flow. I think I nurtured a habit of trying to understand things more deeply, to spend time digging into neat ideas and deriving equations from scratch. It was really cool to reuse things like learning theory, or theory of computation, that I’d glossed over in my undergrad classes thinking I’d never use them again. Most importantly, this gave me more time to develop my research skills. I’d be able to reflect on what went well during research and just do ‘film study’ (see Jacob Steinhardt’s blog on this) and become a better researcher.
I’m grateful to both Yu and Dan for this opportunity, and all the other CCG members who made my time more enjoyable. The outings we would have with Jen to the Penn orchard or the Museum, the coffee runs I’d have with Tomer, and lunches with Alon all helped keep me a happy researcher.


Altar Horowitz
Undergrad at Tel Aviv University

Hey! My name is Altar, and I’m a second-year Bioinformatics student at Tel Aviv University. This summer, I had the privilege of working on a project in Professor Dan Roth’s lab, alongside my incredible research partner, Guy Kouchly.
Our project had two main parts. The first involved building an online tool that allows AI researchers to compare distances between embeddings of different sentences, based on their chosen embedding type. The second part focused on exploring whether prompt enrichment improves AI retrieval performance from a database.
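A toy sketch of the comparison logic such a demo might contain follows. This is hypothetical code: the real tool's embedding models aren't shown here, so a simple word-hashing function stands in for a sentence encoder, and any callable with the same shape could be swapped in.

```python
# Hypothetical sketch of an embedding-comparison utility (not the actual tool).
# Given an embedding function, it reports cosine and Euclidean distances
# between two sentences, so different embedding types can be compared
# side by side by passing different `embed` callables.
import math, re

def word_hash_embedding(text, dim=64):
    # Toy stand-in for a real sentence encoder: each word is bucketed by the
    # sum of its character codes. A real demo would call an actual model.
    vec = [0.0] * dim
    for w in re.findall(r"\w+", text.lower()):
        vec[sum(map(ord, w)) % dim] += 1.0
    return vec

def cosine_distance(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - (num / den if den else 0.0)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compare(s1, s2, embed=word_hash_embedding):
    e1, e2 = embed(s1), embed(s2)
    return {"cosine": cosine_distance(e1, e2),
            "euclidean": euclidean_distance(e1, e2)}
```

The design point such a tool illustrates is that "distance" is not one thing: two sentences can look close under cosine distance but far under Euclidean distance (or under a different embedding type), which is exactly what makes a side-by-side comparison useful to researchers.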
For me, this was a very special experience – it was the first time I built an entire tool from scratch, which, as anyone who’s done this knows, is a truly unique and educational process. Moreover, being part of such a high-level laboratory and creating a tool that can be used by some of the best scientists in the world was incredibly empowering.
Another highlight of the summer was the honor of working with the amazing Dr. Tomer Wolfson, who dedicated so much of his time to advising and helping us. Overall, this was one of the most meaningful experiences I’ve had, and it definitely strengthened my motivation to keep working hard and pursue a path in the academic world!
Guy Kouchly
Undergrad at Ben Gurion University of the Negev

During the summer, I worked together with my research partner, Altar, on developing a demo for comparing text embeddings and visualizing the distances between them under different models. Later on, we joined another project under Tomer’s guidance, focusing on improving retrieval methods for large language models using prompt engineering. We experimented with a subset of questions from the OTT-QA dataset and evaluated GPT’s ability to retrieve the corresponding “gold” documents. Our approach involved generating a fictional document (using GPT) for each gold document and using it as a prompt. While this method didn’t yet improve results, Tomer believes there’s still potential – especially with more challenging datasets.
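The evaluation loop Guy describes could be sketched roughly like this. This is hypothetical code: the GPT generation step is stubbed out as a plain `expand` callable, and retrieval is reduced to simple word overlap rather than whatever the real experiments used.

```python
# Hypothetical sketch of a prompt-expansion retrieval evaluation (stubbed,
# no API calls). For each question, an `expand` callable stands in for the
# GPT step that writes a fictional document; the expanded prompt is then
# used to score documents, and we measure how often the gold document
# lands in the top-k results.
import re

def recall_at_k(questions, corpus, gold_ids, expand, k=1):
    hits = 0
    for q, gold in zip(questions, gold_ids):
        prompt = q + " " + expand(q)          # question + fictional document
        q_words = set(re.findall(r"\w+", prompt.lower()))
        scores = [len(q_words & set(re.findall(r"\w+", d.lower())))
                  for d in corpus]            # plain word-overlap scoring
        top = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:k]
        hits += gold in top
    return hits / len(questions)
```

Running this with `expand=lambda q: ""` gives the no-expansion baseline, so the same harness measures whether the fictional-document idea actually moves retrieval recall.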
I really enjoyed working on these projects this summer. NLP is new to me, and I’m grateful for the chance to gain hands-on experience so early in my studies. Just as importantly, the lab atmosphere was wonderful – I always felt comfortable asking for help, and everyone was incredibly kind, patient, and welcoming.
In terms of what I’ve learned—almost everything was new! On the theoretical side, I got to explore concepts like embeddings, dimensionality reduction (PCA), and retrieval-based reasoning. On the practical side, I learned about building demos, using APIs, and the general workflow of conducting research.
Many thanks to Dan, Tomer, Terry, Rohit, Prasham, and you, Jen, for all the support and for making this such a meaningful experience.