Startup update 9: Curriculum design

Originally posted 2025-06-23

Tagged: cartesian_tutor, llms

Obligatory disclaimer: all opinions are mine and not of my employer


Progress update

Thank you to everyone who helped test the bot! I triaged and fixed a variety of bugs - weirdness around navigation of the homepage, the “usage exceeded” messaging, and more. I’ve upgraded a bunch of accounts to “unlimited user” status.

I also implemented a “view as” feature that lets any admin pretend to be a user (from the API’s perspective) and navigate the entire site as anyone else. Using this feature, I was able to look through everyone’s conversations in the native UI without needing awkward Postgres querying hacks. This solution should keep scaling as the app grows, since it lets me view every future app feature from the user’s perspective without any additional implementation effort.
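To make the “view as” idea concrete, here is a minimal sketch of how such a check could work on the API side. This is an illustration only, not the actual implementation: the `User` class, the `resolve_effective_user` function, and the `view_as_id` parameter are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    """Hypothetical user record; real accounts would carry more fields."""
    user_id: str
    is_admin: bool = False


def resolve_effective_user(
    authenticated: User,
    view_as_id: Optional[str],
    users_by_id: dict[str, User],
) -> User:
    """Return the user whose data the API should serve for this request."""
    if view_as_id is None:
        # Normal request: serve the caller's own data.
        return authenticated
    if not authenticated.is_admin:
        # Only admins may impersonate someone else.
        raise PermissionError("only admins may use view-as")
    if view_as_id not in users_by_id:
        raise KeyError(f"unknown user: {view_as_id}")
    return users_by_id[view_as_id]


# Usage: an admin browsing the site as a regular user.
users = {"admin": User("admin", is_admin=True), "parent1": User("parent1")}
effective = resolve_effective_user(users["admin"], "parent1", users)
print(effective.user_id)
```

If every endpoint reads from the resolved user rather than the authenticated one, new features pick up “view as” support without extra work, which is the property described above.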

Curriculum design

Now that I’ve viewed everyone’s interactions with the AI, I can see a common trend: people don’t quite know what to put into that blank box when they want to start a conversation.

Even if you dumped a lot of detail in there, it still often wasn’t enough to generate a precise conversation teaching you what you wanted to learn. The conversation wandered around and didn’t really have an “aha” moment it was driving towards. Oftentimes, the vibe of the conversation was one of “unprepared tutor improvising on the spot” - which… well, actually, that’s exactly what this system is set up as right now!

The conclusion: I need a curriculum generation step.

It is not yet clear to me what “curriculum” means in the context of an AI tutor. Is it just a bulleted list/outline of topics to cover? Is it a series of curated problems? Is it a bulleted list/outline of techniques/insights to cover? Is it a detailed script, complete with “how to explain X?”

It is also not yet clear to me how much user involvement I need in curriculum generation. There are two ways this could go:

  • Freeform: Get the AI tutor to interactively work with the parent to scope out what they want their kid to learn.
  • Curated: I develop a series of set curricula (e.g. “AP Chemistry” or “AMC 10 prep”), each with a fixed list of ideas/techniques/topics to be covered.

I am leaning heavily towards Curated right now, because I fundamentally don’t believe that non-experts like the average parent can “vibecode” their way to a quality curriculum. Additionally, Curated has the advantage that it results in a concrete set of artifacts that are searchable and can become ad/search engine targets. With LLM assistance I think I can probably create a compelling AP Chemistry course in 2-3 weeks or so. That would involve doing background research to figure out what topics need to be covered and what problems kids need to be able to solve, collecting practice problems, building LLM tooling to process PDFs, extract problems, and turn them into LaTeX snippets, and so on. Subsequent courses would probably take 1-2 weeks to curate. This feels like a big investment to me, given that I’ve only been at this game for two months now, and yet in the grand scheme of things it is kind of trivially short!

As a reference point, when I created ChemWOOT with Jacob Sanders over the course of 2-3 years, back in 2013, it took us maybe a collective ~500-1000 hours to create detailed scripts, compile practice problems, write mock exams, and do other miscellany like creating the diagnostic test. ChemWOOT is a 12 + 12 lesson sequence administered over the course of two years, with weekly lessons over the 3 months leading up to the USNCO locals + nationals, so it’s about twice as long as an AP Chemistry course. So my 2-3 week estimate amounts to roughly a 5x speedup over the time it previously took to construct a comparable course.
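As a rough sanity check on that speedup figure, the arithmetic can be sketched as follows. The 40-hour week is an assumption made purely for illustration (the post does not say how many hours a week go into this), and `speedup_range` is a hypothetical helper; the other numbers come from the paragraphs above.

```python
def speedup_range(hours_per_week: float = 40.0) -> tuple[float, float]:
    """Return (low, high) speedup of the 2-3 week estimate vs. ChemWOOT."""
    chemwoot_hours = (500.0, 1000.0)  # collective hours, from the post
    # ChemWOOT is about twice as long as an AP Chemistry course,
    # so halve the hours to get an AP-sized equivalent.
    ap_equivalent = (chemwoot_hours[0] / 2, chemwoot_hours[1] / 2)
    # 2-3 weeks of work; hours_per_week is an assumption, not from the post.
    estimate_hours = (2 * hours_per_week, 3 * hours_per_week)
    low = ap_equivalent[0] / estimate_hours[1]   # least favorable pairing
    high = ap_equivalent[1] / estimate_hours[0]  # most favorable pairing
    return low, high


low, high = speedup_range()
print(f"speedup between {low:.2f}x and {high:.2f}x")
```

Under that assumption the speedup lands somewhere between roughly 2x and 6x, so a figure around 5x sits in the upper part of the range; fewer working hours per week would push the whole range higher.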

One way that Freeform could succeed is if I manually work with ~10 parents to interactively scope out what they want their kids to learn, and then distill that experience into a Process that an LLM can administer to draw out, piece by piece, the key parameters of what the parent is looking for. A second Process would then use those parameters to craft a curriculum. (When I say “Process”, I mean a system prompt or tool-calling agent setup tailored to a specific goal-oriented scenario.) But I think that if I go the Curated route, that work will also inform what these Processes could look like.
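Here is a minimal sketch of what two chained Processes might look like, under several assumptions: the `LLM` type is a hypothetical stand-in for whatever chat-completion call is actually used, the scoping step is collapsed into a single call rather than an interactive back-and-forth, and the `Curriculum` shape (a title plus a list of topics) is just one guess at what a curriculum could be, since that question is still open.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable

# Hypothetical stand-in for an LLM call: (system_prompt, user_input) -> text.
LLM = Callable[[str, str], str]


@dataclass(frozen=True)
class Process:
    """A system prompt tailored to one goal-oriented scenario."""
    name: str
    system_prompt: str

    def run(self, llm: LLM, user_input: str) -> str:
        return llm(self.system_prompt, user_input)


@dataclass
class Curriculum:
    """Assumed shape: the real artifact might hold problems or scripts too."""
    title: str
    topics: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialized so a human can review it before any tutoring starts.
        return json.dumps(asdict(self), indent=2)


SCOPING = Process("scoping", "Summarize what the parent wants their kid to learn.")
DRAFTING = Process("drafting", "Draft a curriculum as JSON with keys 'title' and 'topics'.")


def build_curriculum(llm: LLM, parent_notes: str) -> Curriculum:
    parameters = SCOPING.run(llm, parent_notes)       # first Process
    raw = json.loads(DRAFTING.run(llm, parameters))   # second Process
    return Curriculum(title=raw["title"], topics=list(raw["topics"]))


def stub_llm(system_prompt: str, user_input: str) -> str:
    """Canned responses so the sketch runs without any real model."""
    if "JSON" in system_prompt:
        return json.dumps({"title": "AMC 10 prep", "topics": ["counting", "number theory"]})
    return f"Goal: {user_input}"


curriculum = build_curriculum(stub_llm, "My kid wants to prepare for the AMC 10.")
print(curriculum.to_json())
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider, and it means the same two Processes can be exercised with canned responses before they ever touch a real model.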

I think this mostly agrees with Andrej Karpathy’s recent talk - he has a 15-second aside where he talks about his own AI + education efforts and says that curriculum creation and administration are two separate phases, with an auditable artifact (the curriculum) in between. (I wonder if there’s a world where I work with Andrej on AI tutoring?)