I Let Claude Code Audit My Own 374 Sessions. The Report Was Inconveniently Honest.

When the tool you use every day turns the lens on you, the results are specific enough to sting.

6 min read

AI models keep getting better. And so do I, apparently. (Not saying I'm in love with Claude. But at 143 messages a day, the line gets blurry.)

The report arrived without warning. I typed /insights between two sessions, while waiting for a build. What I got back five minutes later was 28 days of my own behavior summarized in an HTML file I hadn't asked to critique me. It did anyway.

374 sessions. 3,163 messages. 143 messages per day on average.

And under the "what's hindering you" line: 73 cases of wrong approach. Seventy-three. Not estimates. Sessions analyzed, categorized, counted.

TL;DR: /insights doesn't teach you how to use Claude Code better. It teaches you how you use it. That's a lot more uncomfortable. Here are 3 things the report found that I'd never have seen on my own, and what I changed since.

Professional workspace with code analysis charts, heroic figure revealing software audit insights
When AI turns therapist and your code's dirty laundry gets aired.

The Number That Made Me Close the Tab (And Then Reopen It)

73 wrong_approach out of 374 sessions. Not bugs in Claude's output. Not bad prompts on isolated days. Wrong strategy before writing a single line. Picking the wrong approach, repeatedly, at scale, across SaaS builds and DevOps work (Astro, n8n, VPS, dashboards).

What made the number uncomfortable wasn't its size. It was that none of those 73 sessions felt wrong while they were happening. Each one seemed reasonable. Each decision made sense in the moment.

That's the mechanism. You can't observe your own patterns while you're inside them. Each session is too close. You're making decisions, pushing forward, iterating. The session looks fine from the inside. The aggregate is what accuses.

What /insights does, and what your own memory structurally cannot do, is create analytical distance. 28 days flattened into one report. The patterns that individual sessions masked become visible.

28-Day Friction Breakdown
28-Day Friction Breakdown

73 sessions later, I had a name for something I'd been doing for months without noticing.

You can't spot your own patterns while you're running them.

This number could have stayed abstract. The report broke it into three distinct patterns. Here's the first one, and the most expensive in time.

Blind Spot #1 — I Was Over-Engineering Everything

I asked Claude Code to delete a directory.

I got a four-step plan. Analysis of the directory structure. Dependency check. A proposed backup strategy. Then, eventually, the rm -rf.

(Seven minutes for a four-second operation.)

That's not a one-time failure. The report tagged this pattern as "over-engineering and excessive planning instead of executing" across the entire 28-day period. Fix an endpoint route? Three files modified, architecture note added, folder structure reconsidered. Add a missing env variable? Refactor incoming.

Why does this happen? Short, unscoped requests are structurally ambiguous for an LLM. "Fix the redirect" could mean fix the line, fix the module, reconsider the routing strategy. When scope isn't pinned, Claude Code defaults to thoroughness. That's not a bug. That's the model being careful with incomplete information.

What I changed: a scope-lock prefix on targeted tasks.

Fix ONLY the redirect in /routes/dashboard.ts. No other files. Minimal diff.

Not elegant. Works. The response becomes surgical. Less planning, more executing.

One caveat: constraining too early on genuinely ambiguous tasks produces incomplete fixes. This approach works for scoped, targeted requests. Complex refactoring, let it roam.

Blind Spot #2 — I Was Running a Parallel Operation I Didn't Know Had a Name

Multi-clauding isn't a technique I consciously adopted. It emerged.

One session running a migration. Another handling the frontend bug. A third checking the VPS config. Because waiting felt like wasting.

I didn't call it anything. I just did it.

The report: 35% of my messages were sent while other sessions were actively running. 219 overlap events across 28 days. It called this "multi-clauding."

Right. Apparently that has a name.

This pattern makes sense for high-velocity dev. Latency is real. Long tasks exist. Running parallel sessions extracts productivity from wait time. But the hidden cost isn't in the sessions. It's in the person running them. Each session jump requires a mental context reload. The sessions stay separate. The brain doesn't.

At 35% overlap, that cognitive overhead adds up in ways no single session review would catch.

If you're pushing this further, the tradeoffs of orchestrating multiple Claude instances without losing context control are worth understanding before the overlap rate climbs higher.

Blind Spot #3 — My Projects Were Bleeding Into Each Other

One of my projects is a physical product. QR codes on the packaging redirect to specific pages. Pages only accessible through that QR code. Not through search, not through navigation, not through any public link.

Claude added a "Buy on Amazon" button.

To a page that only people who already bought the product can reach.

(Technically coherent. Completely wrong.)

37 misunderstood_request events across 28 days. The common thread wasn't bad prompts or missing context inside the sessions. It was absent project declaration. Sessions opened cold. Claude inferred the project from recent history, which meant it occasionally imported assumptions from a completely different project.

Each individual session seemed reasonable. Together, the pattern was clear: I wasn't pinning context at session start.

What I changed: a project declaration block at the top of each new session.

Project: [product name]. Physical product. Pages are QR-code-only access. 
No public indexing. No e-commerce flow.

Three sentences. Eliminated the category of wrong that was bleeding between projects. The logic is identical to the prompt contracts approach, defining scope explicitly before writing a single line of code but applied to context instead of task.

(Context pinning works at session start, but doesn't hold through long multi-task sessions. Anything past two hours on mixed tasks: re-pin at the halfway point.)

Thirty-seven times Claude understood the task perfectly. Just in the wrong project.

What I Changed, What I Ignored, and What /insights Gets Wrong by Design

Three changes applied after the report.

Scope-lock prefix on targeted tasks. Dropped over-engineering friction noticeably.

Project declaration block at session start. Eliminated most of the misunderstood_request pattern.

Dedicated sessions per project. No more mixing VPS config and SaaS frontend in the same Claude session.

Results: the category of "this went sideways for reasons I can't explain" dropped. That's qualitative. It's also real.

Now the part the report got wrong.

It calculated my satisfaction rate at 83%. That number is probably off. Not because the measurement is bad, but because it can't distinguish two very different session endings.

Session ends because you hit a wall, you're stuck, you abandon. Genuine friction.

Session ends because you spotted an error fifteen messages in, killed it fast, opened a clean session. That's a good decision, made fast.

/insights counts both as friction. It has to. It doesn't have access to your intent. So 83% is a floor, not a ceiling. Some of my "frustrated" sessions were deliberate corrections I'd make again.

This is a real limit of any behavioral analysis tool. The report sees everything you do. It cannot see why you do it.

I use Claude Code to audit code every day. The idea that it could audit my way of coding had never crossed my mind. /insights doesn't change the tool. It changes the angle of observation. That's not the same thing.

Sometimes being criticized isn't comfortable. But if I set my ego aside and admit that AI is better than me (ugh) at certain things, I win across the board.

Run /insights. Read the report without defending yourself. Find the number that bothers you most. That's probably the only one that matters. Mine was 73. Yours will be different.

But it exists.

In six months, these reports will be more granular, more personal, more uncomfortable. The industry will probably find a way to gamify it. A dev leaderboard. A "zero wrong approach" badge. A weekly score newsletter. Neat.

The builders who looked at their 73 without closing the tab already have a head start.

The best tools are the ones that show you what you didn't want to see.

Sources: Claude Code documentation/insights command reference.

If you build with AI daily and want takes that skip the hype, follow along. New articles when something actually worth writing about happens.

(*) The cover is AI-generated. Prompted into existence — the uncomfortable robot looking at its own reflection felt accurate enough to keep.


Curious how AI can audit your own coding patterns with brutal honesty? My latest deep dive reveals unexpected insights about developer workflows and AI interaction.

Get the Production AI Newsletter