When students returned to class after the COVID-19 pandemic, the lecture halls filled once again, but fewer questions were directed at professors, TAs, and tutors. Increasingly, those questions were fielded by AI chatbots.
Prashanth Arun witnessed this transition from the pre- to post-large language model (LLM) era firsthand as an instructional support assistant for undergraduate computer science courses.
“During my time as a tutor… I realized that one of the biggest benefits… was being able to sit down one-on-one with a student and slowly walk them through a concept or slowly help them understand exactly where they were falling short,” he recounts. “Through that I was [able to] understand it a lot better… The next time somebody came up to me and asked the same question, I was able to provide more creative explanations.”
As a graduate student, Arun wondered whether he could invent a new use case for AI, combining the benefits of an LLM’s personalized responses with the benefits he experienced answering student questions. This led to the creation of Chrysalis, a teachable LLM companion that acts as your very own AI protégé.

The software is built around a “learning by teaching framework” – by teaching their course content to the LLM, students reinforce their own understanding of the material. The students’ knowledge is assessed through the LLM’s subsequent performance on a quiz. Chrysalis explains the reasoning behind its answers using what it was taught, indirectly revealing both strengths and potential gaps in the student’s understanding.
Professor Pascal Poupart, a computer science professor specializing in AI, machine learning, and LLMs, saw Chrysalis’ potential and joined as a project lead. He reasons that “education, typically, is very stable and things have not changed in decades, if not centuries, for how teaching is done. But with the rise of LLMs, there’s all sorts of new opportunities.”
Software developer Robert Cai, research associate Arezoo Alipanah, and PhD student Haolin Yu were later welcomed onto the team. Cai had participated in the first trial of the prototype, Alipanah had tried multiple learning apps that failed to live up to her standards, and Yu desired to translate his theoretical research into a “project that helps people in a straightforward way and has a direct impact on the real world.”
A breakthrough occurred when the team was approached by Faculty of Environment instructor Christine Barbeau, who expressed enthusiasm about incorporating the tool into her class. Pilot studies have also been conducted in two other courses, and Chrysalis is now fully integrated into Professor Poupart’s CS 486 class in the form of “chat assignments.” Over 30 instructors showed interest at the team’s latest workshop.
“I personally thought that for instructors it would be a harder sell,” admits Professor Poupart, referencing the controversial status of AI in education. “…sometimes they might be a little conservative and not everyone is interested in trying something new… some people hate AI because now their usual way of teaching is disturbed.”
According to Arun, the attention represents “really good traction for EdTech especially… there are a lot of well-established systems that a lot of people use. That makes it hard for new learning systems like ours to gain adoption.”
Of course, creating a tool that flips the script in this way comes with its challenges. For one, getting their model to behave has been an uphill battle since their very first prototype.
“We are trying to get a large language model to do what it wasn’t really designed to do,” explains Arun. “Large language models are designed to provide answers, designed to provide user satisfaction, provide as accurate of a response as possible. [They are] not designed to act ignorant… to withhold information… to act as a less knowledgeable anything, really. Basically, everything that we’re trying is going against the model’s training.”
The project is at the heart of several ongoing research efforts, from memory models that graphically display what is in the LLM’s “brain” to constraints forcing it to forget concepts on the fly. “All of the techniques that we’re working on right now stem from the fact that ultimately, we’re trying to fight the model at the same time as we’re trying to get it to help… students,” says Arun.
Chrysalis joined the Teaching Innovation Incubator in January 2025, with the first minimum viable product being deployed in July of the same year.
Cai describes the benefits and drawbacks of their short timeline: “We were able to quickly iterate through things. We were able to quickly build things… making new radical ideas. But at the same time, a lot of things can’t be rushed… a rushed product… it’s not always the most stable.” To him, “the challenge and the opportunity come hand in hand.”
The team has big plans for future iterations of the software.
Alipanah is hoping to program Chrysalis to play the role of learners with different learning styles and levels of engagement, adding a unique element of challenge. Gamification could engage students in competition to see who can teach most effectively. Yu wants Chrysalis to “notice [false or contradictory information] during the conversation, hint at the users, and allow them to reflect on what they may have taught incorrectly” before it surfaces in the quiz. Future techniques could even allow users to interact with the LLM through other mechanisms, not just chat.
“Collectively, we’re working towards making a teachable agent, not just a simulated agent, but an AI that can actually learn from you interactively,” Arun summarizes.
“I truly believe that we are going through some special era where the world of education is going to change rapidly… In the same way that smartphones went from [allowing us to] talk to somebody elsewhere by calling… [now] we have smartphones assisting us for pretty much everything.” Professor Poupart adds, “I think we’re going to see the emergence of some supertools… one of my dream[s] here is that with the team, we can essentially position Chrysalis as this ultimate assistant.”
Reflecting on the journey to creating a user-facing product, Alipanah reveals, “Many of the research areas that I was tackling [were] very new to me. That’s the biggest thing that I have learned… But the other thing was that… when there is a product that will be shown to people, defining good evaluation metrics for that, it’s not as simple as ‘yeah, it works really well and that’s fine.’”
Arun echoes this sentiment, concluding, “There’s a lot more to software than just the code that is running it… I think above everything, you learn a lot about yourself when you try and go all out to make the software.”