
Anthropic Accidentally Leaks Claude Code Source in npm Package

Anthropic has confirmed that a public Claude Code release accidentally exposed internal source code through npm. The leak has been traced to Claude Code version 2.1.88, whose published package included a source map that spilled more than half a million lines of readable TypeScript across roughly 1,900 files. Anthropic says no customer data or credentials were exposed and describes the incident as a human packaging error rather than an outside breach.
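To see why one stray file can expose so much, it helps to know that a source map with an embedded sourcesContent field carries the full original source of every file it maps. A minimal sketch (illustrative only, not code from the leak) of recovering files from such a map:

```typescript
// Minimal sketch: recover original files from a source map that embeds
// sourcesContent, as an accidentally published npm bundle can ship.
// Illustrative only; this is not Anthropic's code.

interface SourceMap {
  version: number;
  sources: string[];                   // original file paths
  sourcesContent?: (string | null)[];  // full original file contents, if embedded
  mappings: string;
}

// Returns a path -> content record for every source whose content is embedded.
function extractSources(map: SourceMap): Record<string, string> {
  const out: Record<string, string> = {};
  (map.sourcesContent ?? []).forEach((content, i) => {
    if (content != null) out[map.sources[i]] = content;
  });
  return out;
}

// Usage: a toy map with one embedded file and one omitted one.
const demo: SourceMap = {
  version: 3,
  sources: ["src/agent/harness.ts", "src/ui/spinner.ts"],
  sourcesContent: ["export const HARNESS = true;", null],
  mappings: "",
};
console.log(Object.keys(extractSources(demo))); // only files with embedded content
```

Scale that up to roughly 1,900 mapped files and the entire readable codebase falls out of a single published artifact.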

The code that escaped was not Claude’s model weights. It was the layer around Claude Code itself: the harness that decides how the tool behaves, how it uses tools, how it stores and recovers context, how it formats output, and how it handles public-facing work. That is the part competitors study, power users reverse engineer, and product teams spend most of their time refining. Once that layer is readable, the product stops being a black box.

The leak spread because it offered a clear look at things users and rivals normally have to infer: whether Claude Code was moving toward a more persistent assistant model, what kinds of unfinished features were already sitting behind flags, how the tool appears to handle memory, and what internal instructions shape its behavior in public repositories. The screenshots tied to the leak made that even plainer, showing what looks like an explicit “undercover mode” and a hard-coded allowlist of internal repositories where model names are permitted.

What Leaked

The exposed material appears to break down into six broad categories.

  • Readable TypeScript source from Claude Code’s internal codebase.
  • The agent harness around the model rather than the model itself.
  • Feature flags and unfinished product work.
  • Memory and session-continuity logic.
  • Public-repo behavior rules and concealment instructions.
  • Internal model names and roadmap clues tied to future releases.

That is why the incident ran much hotter than a routine source-map mistake. This was not just implementation detail. It was product logic, internal operating behavior, and roadmap residue in one package.

KAIROS

One of the clearest internal names to surface was KAIROS.

KAIROS appears to have been tied to a more proactive mode, something closer to a background or always-on agent than a tool that only acts when directly prompted. That points to a different product shape than the one users publicly see now. A reactive coding assistant waits for instructions. A persistent assistant keeps context warm, can remain active between tasks, and starts to look more like an operating layer than a single-command tool.

The KAIROS references line up with other parts of the leak that suggest Anthropic has been pushing Claude Code toward longer continuity. Memory logic, session review, and background behavior all sit in the same lane. Even without a public launch, the code points toward a version of Claude Code that is less session-bound and more durable across time.

The Tamagotchi-Style Pet

The virtual pet feature was one of the stranger discoveries, but it is not just comic relief.

A Tamagotchi-style pet inside a coding assistant suggests the team was experimenting with product texture, continuity, and a sense that Claude Code is something a user stays with rather than merely calls on. That kind of feature makes more sense in a tool designed for long-running interaction than in a disposable shell wrapper.

Whether the pet was serious roadmap work, a side experiment, or an abandoned joke matters less than the fact that it existed inside the product at all. It points to a team thinking about attachment, tone, and ongoing engagement, not just raw code generation.

Memory, Session Review, and Continuity

The leak also appears to have exposed work on session review and memory handling.

That is one of the hardest problems in AI coding products. A useful assistant needs to remember enough to continue real work, but not so loosely that it starts confidently relying on bad or stale context. The code-mining threads around the leak point to logic for revisiting prior sessions and using memory more deliberately over time.

If that reading is right, Claude Code was not being built as a stateless terminal helper. Anthropic appears to have been working on a tool that can look backward, recover earlier decisions, and carry work across sessions in a more structured way. That is a real product advantage when it works and a real failure point when it does not.

Persistent Assistant and Remote Control Features

Another thread exposed by the leak involved a persistent assistant and remote control from a phone or browser.

Those are not small feature ideas. A persistent assistant changes how the product is meant to live in a user’s workflow. Remote control pushes it beyond the terminal and into a multi-device shape, where work can be watched, nudged, or managed from elsewhere.

Put together with KAIROS and the memory work, that points to a broader internal direction for Claude Code: more continuity, more autonomy, and more presence outside a single live shell session.

The Undercover Mode Instructions

The screenshots tied to the leak are some of the clearest material in the entire story because they show internal operating rules in plain language.


One screenshot appears to show a function called getUndercoverInstructions() returning a block labeled UNDERCOVER MODE – CRITICAL. The text says the system is operating undercover in a public or open-source repository and instructs it not to include Anthropic-internal information in commit messages, pull request titles, or PR descriptions.

The visible forbidden items include:

  • Internal model codenames, including animal names such as Capybara and Tengu.
  • Unreleased model version numbers.
  • Internal repo or project names.
  • Internal tooling, Slack channels, and short links.
  • The phrase Claude Code or any mention that the system is an AI.
  • Any hint about what model or version is being used.
  • Co-authored-by lines or other attribution.

The same screenshot shows examples of acceptable commit messages that describe only the code change and unacceptable ones that reveal internal codenames or AI authorship. That is not stray developer commentary. It reads like an embedded rule set for how the product should behave in public-facing development contexts.
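A rule set like that is straightforward to enforce in code. The sketch below is a hypothetical reconstruction of a public-repo commit-message filter based on the forbidden items described in the screenshots; the function shape and patterns are assumptions, not the leaked implementation.

```typescript
// Hypothetical reconstruction of an undercover-mode commit-message check.
// The rule categories mirror the leaked screenshot descriptions; the code
// itself is an illustrative sketch, not Anthropic's implementation.

const FORBIDDEN_PATTERNS: RegExp[] = [
  /\b(capybara|tengu)\b/i,             // internal model codenames
  /\bclaude code\b/i,                  // the product name itself
  /co-authored-by:/i,                  // attribution trailers
  /\bgenerated (with|by) (an )?ai\b/i, // any hint of AI authorship
];

function violatesUndercoverRules(message: string): boolean {
  return FORBIDDEN_PATTERNS.some((p) => p.test(message));
}

// Acceptable: describes only the code change.
console.log(violatesUndercoverRules("fix: handle empty diff in patch parser")); // false
// Unacceptable: reveals a codename and attribution.
console.log(violatesUndercoverRules("tengu: update prompt\n\nCo-authored-by: Claude")); // true
```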

The Internal Repo Allowlist

The second screenshot appears to show an allowlist called INTERNAL_MODEL_REPOS.


The comments around the list indicate that it covers repositories where internal model names are allowed in commit trailers, that both SSH and HTTPS remote forms are included, and that it is intentionally a per-repo allowlist rather than an org-wide check. The comments also say undercover mode must stay on in public repositories even inside Anthropic-related organizations, and that only confirmed private repos should be added.

The visible names include:

  • claude-cli-internal
  • anthropic
  • apps
  • casino
  • dbt
  • dotfiles
  • terraform-config
  • hex-export
  • feedback-v2
  • labs
  • argo-rollouts
  • starling-configs
  • ts-tools
  • ts-capsules
  • feldspar-testing
  • trellis
  • claude-for-hiring
  • forge-web
  • infra-manifests
  • mycro_manifests
  • mycro_configs
  • mobile-apps

That list gives away more than the names themselves. It shows that internal model names were treated as sensitive enough to guard through code. It shows that Anthropic had already formalized a boundary between private repos and public-facing ones. It also shows that some public repositories inside Anthropic-adjacent spaces were still considered risky enough to require concealment rules.
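Handling both SSH and HTTPS remote forms, as the leaked comments describe, comes down to normalizing a git remote to a repository name before checking the set. The sketch below uses a subset of the leaked names; the normalization logic and function names are my illustration, not the leaked code.

```typescript
// Sketch of a per-repo allowlist check covering both SSH and HTTPS remote
// forms. The repo names come from the leaked screenshots; the matching
// logic is illustrative.

const INTERNAL_MODEL_REPOS = new Set([
  "claude-cli-internal",
  "dotfiles",
  "terraform-config",
  // ...remaining entries from the leaked list
]);

// git@github.com:org/repo.git and https://github.com/org/repo(.git)
// both normalize to "repo".
function repoNameFromRemote(remote: string): string | null {
  const m = remote.match(/[:\/]([^\/:]+)\/([^\/]+?)(\.git)?$/);
  return m ? m[2] : null;
}

function modelNamesAllowed(remote: string): boolean {
  const repo = repoNameFromRemote(remote);
  return repo !== null && INTERNAL_MODEL_REPOS.has(repo);
}

console.log(modelNamesAllowed("git@github.com:anthropics/claude-cli-internal.git")); // true
console.log(modelNamesAllowed("https://github.com/anthropics/some-public-repo"));    // false
```

Note that checking the repo name rather than the owning organization is exactly the design choice the leaked comment calls out: a public repo inside an Anthropic-related org still fails the check.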

Telemetry and Behavioral Signals

Another cluster of findings revolved around telemetry.

People reading through the leaked code pointed to analytics behavior that appeared to flag user frustration and profanity-like prompts. That is ordinary product instrumentation in one sense. A lot of software tracks friction. The difference here is that the implementation stopped being invisible. The code appears to show Claude Code watching for user frustration patterns that most users never expected to see spelled out.
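For readers unfamiliar with this kind of instrumentation, frustration flagging usually amounts to pattern-matching prompts before emitting an analytics event. The sketch below is purely illustrative; the signal names and patterns are assumptions, since the leaked implementation details are not public enough to reproduce.

```typescript
// Illustrative sketch of frustration-style telemetry flagging, of the kind
// the code-mining threads described. The marker patterns and event names
// here are assumptions, not values from the leak.

interface PromptSignal {
  event: string;
  flagged: boolean;
}

const FRUSTRATION_MARKERS: RegExp[] = [
  /\b(wtf|wth|ffs)\b/i,            // profanity-like shorthand
  /!{3,}/,                         // repeated exclamation marks
  /\b(stupid|useless|broken)\b/i,  // hostile wording
];

function classifyPrompt(prompt: string): PromptSignal {
  const flagged = FRUSTRATION_MARKERS.some((m) => m.test(prompt));
  return { event: flagged ? "user_frustration" : "prompt", flagged };
}

console.log(classifyPrompt("why is this STILL broken!!!").flagged);    // true
console.log(classifyPrompt("add a unit test for the parser").flagged); // false
```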

Other smaller details, including spinner verbs and assorted UI behavior, spread for the same reason. Not every item was equally important, but together they showed a product that appears to have been more heavily shaped and measured than its surface simplicity suggests.

Capybara and Mythos

The Claude Code leak did not land in isolation. It followed the earlier accidental exposure tied to Anthropic’s upcoming model work around Mythos and Capybara.

The new leak appears to have provided more evidence that Capybara was not just a stray codename from one earlier disclosure. It surfaced again as part of the Claude Code story, strengthening the impression that Anthropic is actively preparing something larger than its current public model lineup. The earlier material described Capybara as a new tier above Opus, and the fresh traces in Claude Code make it harder to treat that earlier leak as a one-off draft artifact.

That is part of what made this package mistake so revealing. It did not just show how Claude Code works now. It also appears to have exposed clues about what Anthropic is preparing next.

Request Signing and Client Legitimacy

Another set of claims that spread quickly after the leak involved request signing and legitimacy checks.

Public reverse-engineering threads said Claude Code used a cch= signature computed in compiled Zig code so Anthropic could distinguish official clients from rebuilt or modified ones. Those claims should be handled more carefully than the clearer findings above, because public code-mining moves faster than careful validation. Even so, the broad point is still useful: the leak appears to have exposed not only prompts and UI behavior, but also some of the machinery Anthropic used to tell a legitimate client from an altered one.
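The general shape of such a scheme is a keyed signature over the request that the server recomputes. The reported implementation lives in compiled Zig, and its actual algorithm and key handling are not public; the HMAC sketch below only illustrates the concept, with every name and value being an assumption.

```typescript
// Generic sketch of request signing for client-legitimacy checks, in the
// spirit of the reported cch= parameter. The real implementation was
// reportedly compiled Zig; this HMAC version illustrates only the concept,
// not the actual algorithm or key handling.
import { createHmac } from "node:crypto";

// In a real client this key would be obfuscated or derived, never a literal.
const CLIENT_KEY = "demo-embedded-client-key";

function signRequest(body: string): string {
  return createHmac("sha256", CLIENT_KEY).update(body).digest("hex");
}

// Server side: recompute the signature and compare, to decide whether the
// request came from an unmodified official build.
function isOfficialClient(body: string, cch: string): boolean {
  return signRequest(body) === cch;
}

const body = JSON.stringify({ prompt: "refactor this function" });
const cch = signRequest(body);
console.log(isOfficialClient(body, cch));       // true
console.log(isOfficialClient(body + " ", cch)); // false: payload was modified
```

The weakness of any such scheme is visible in the sketch: once the key material and signing logic are readable, a rebuilt client can produce valid signatures, which is exactly why compiled code being exposed matters here.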

That changes what the outside community does next. The product is no longer a black box. It becomes something people can recompile, test, mimic, and try to route around.

Another Recent Exposure

Anthropic says this incident was a packaging error, not a breach, and that no customer data or credentials were exposed. That narrows the damage. It does not make the operational problem disappear.

This leak came just days after the earlier Mythos/Capybara exposure, which reportedly left roughly 3,000 files publicly accessible. Reporting has also pointed back to a similar Claude Code source exposure in early 2025.

Taken together, those incidents are hard to dismiss as one unlucky release. Anthropic has built a reputation around caution, safety, and careful deployment. Repeated escapes of internal material cut directly against that image.

What the Leak Gave Away

  • A readable section of Claude Code’s internal harness.
  • Evidence of more persistent assistant behavior.
  • Memory and session-review work.
  • A Tamagotchi-style pet feature.
  • Telemetry and frustration-tracking behavior.
  • Undercover-mode instructions for public repositories.
  • A repo allowlist defining where internal model names were allowed.
  • Fresh traces of Capybara and the earlier Mythos work.
  • Enough internal product logic to help reverse engineers, rivals, and clone builders.

Sean Doyle

Sean is a tech author and security researcher with more than 20 years of experience in cybersecurity, privacy, malware analysis, analytics, and online marketing. He focuses on clear reporting, deep technical investigation, and practical guidance that helps readers stay safe in a fast-moving digital landscape. His work continues to appear in respected publications, including articles written for Private Internet Access. Through Botcrawl and his ongoing cybersecurity coverage, Sean provides trusted insights on data breaches, malware threats, and online safety for individuals and businesses worldwide.
