I have long been a skeptic about using LLMs for anything more than the most basic tasks. They were best suited to natural language searches, but you shouldn’t trust them with anything important. Never was this more apparent than when I tried to work with them for web development. Small, simple things could be done fairly easily, but once you stepped outside that scope, LLMs went downhill fast: hallucinating, regressing previously solved problems, and spiraling into debugging loops that claimed to solve the problem while steadily expanding the scope of the “changes required,” until you ended up refactoring the whole thing without ever fixing your bug.

I started using Claude Desktop this year, and it surprised me with its capabilities. It was deeply web-enabled, had a logical and well-organized interface, and seemed better suited for actual development use. The more I used it, the more I noticed that it really excelled in one key area: contextual inference of the next steps needed. The number of times it suggested a relevant but not directly related set of actions was downright spooky.

But it wasn’t long before I started running into a lot of the same problems. Start a fresh chat to free up the context history that was slowing it down, and it forgets everything. Stay in the same chat for a long set of tasks, and it overruns its context window, freezing mid-task and hanging, or resetting the conversation entirely. And don’t get me started on the symptom-chasing nature of its default debugging approach. So, on a whim, I stopped asking it to do things, and started asking it how it could be trained to do things better.

And as it turned out, it had a lot to say about that.

Over the following months, I started taking advantage of the fact that Claude is actually extensible in some pretty meaningful ways, and began training it locally to accomplish a few key things:

  • Maintain learned knowledge and context awareness across many different chat sessions.
  • (Somewhat) enforce engineering processes and philosophies.
  • Use Claude to train Claude in highly specified skills for global access.

I also learned some key things myself:

  • Prompt ambiguity will bite you in the ass, and the first prompts that you give when creating a project may have a lot to do with how successful the outcome is. Specificity in language matters, especially at the beginning.
  • Chats are short-term memory. They feel permanent when you start using them, but the nuance is lost the second you enter another chat. YOU have to be the brain of the project: structure a project architecture that uses Claude’s tools for project-wide constants, and a process for maintaining ongoing and evolving information.
  • You should take a defensive posture when working with Claude, or any LLM, really. They can be agreeable on the surface while doing the wrong thing entirely underneath. It’s important to challenge it periodically and to spot-check its work, or the process it followed in generating output. Assume it’s taking the shortest and simplest path possible, even if you’ve told it to do otherwise.

While the blog is very specifically Claude-focused, there are a number of high-level principles that I think will be useful with most LLMs, and I hope you find these posts helpful.

Something additional to note:

Part of this project was creating a writing and editing workflow using Claude itself to demonstrate how that can be achieved for content management. All posts on this site aside from this one, which I have wholly written myself, are drafted by Claude from our sessions, and then edited by me for publishing.

When I am interpolating additional content in paragraph form, I will write in italics to make clear that it is an editorial addition. All other content may be safely assumed to have been drafted by Claude. I’m not going to bother noting small changes I make within Claude-generated paragraphs; just assume the language and tone have been lightly polished here and there, or a word choice corrected to my own preference.

I’m also using Claude to generate the featured images for these posts, something it very much does not enjoy doing, but which I think is a useful exercise. While Claude is not an image model (and will readily tell you that), it IS capable of generating various graphical abstracts, and we’ll explore that more in future posts.

And with that, I’m hitting publish, and making this officially the first post of this site.