Startup update 11: Bad Claude, bad!
Originally posted 2025-07-07
Tagged: cartesian_tutor, llms
Obligatory disclaimer: all opinions are mine and not of my employer
Progress update
Not much interesting to report this week, continuing the same work. Multitasking between content/tech, implementing features and revising the AP chemistry curriculum while Claude churns out code. I’ve tried taking some of my own lessons and there’s definitely some work to be done, both on the system prompting side and on the raw lesson plan development.
I’ve built out nearly all of the backend features (Unit/Topic/Conversation hierarchy, file storage backend, LLM file attachments, course signup + starting/completing Units and Topics), as well as 90% of the UI features built atop this backend. As predicted last week, the camera upload is the only part I haven’t finished yet :D but I’m hopeful I can get this done soon.
On the content side, I built a build caching system: when I change only one unit of content, cached LLM responses mean that I can just run “./compile_course.sh ap_chemistry” and it will recompile only the bits necessary. Over time, I suspect this will turn into something like an ad-hoc recompilation system with a UI that allows a user to edit a bunch of fields and have them automatically compile and show up in the window. The AP chemistry content is, overall, starting to get obviously good enough that I’m planning to charge money for this as soon as I launch it. I also want to launch chemistry olympiad level content (a la ChemWOOT).
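A minimal sketch of the general idea behind this kind of cache, assuming responses are keyed by a hash of the model name and prompt and stored as files on disk. The names here (cache_key, cached_generate, the generate callback) are hypothetical and for illustration only; this is not the actual code behind compile_course.sh.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(model: str, prompt: str) -> str:
    """Derive a stable key from everything that affects the LLM output."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_generate(
    cache_dir: Path,
    model: str,
    prompt: str,
    generate: Callable[[str], str],  # hypothetical stand-in for a real LLM call
) -> str:
    """Return the cached response if present; otherwise generate and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(model, prompt)}.txt"
    if path.exists():
        # Inputs are unchanged since the last build: reuse the old response.
        return path.read_text(encoding="utf-8")
    result = generate(prompt)
    path.write_text(result, encoding="utf-8")
    return result
```

With a scheme like this, editing one unit changes only that unit's prompt and therefore only its key, so every other unit hits the cache and only the edited one triggers a new LLM call.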
On the coding agent experience side, I cursed out Claude for the first time ever - it had gotten itself into a confusing database migration state, and decided that it would just overwrite the database’s “current_migration” field to try and force it into a better state. I luckily caught it in time as I was eyeing its commands. I had to go in and manually fix the migration state and tell it to never run migrations again, with some colorful swear words thrown in. Simon Willison’s comment on this is just so on point.
OTOH, Claude knows the gcloud tool’s command line by heart. It pulled out a number of incantations to debug my CLI’s login state, which project I was configured on, which auth method I was using, which auth method my app was using, and what sorts of permissions were set on various accounts and buckets, and eventually helped diagnose why my app was failing to upload content to my GCP bucket, despite my gcloud commands correctly uploading content. This is something that would have easily taken a whole day of grumbling, momentum-killing slog in the past, and it mostly just took an hour this time.
This YouTube talk by Armin Ronacher is another good watch from this week.