Conceptual Integrity in the Age of LLMs
As we automate more of the mechanical aspects of knowledge work, conceptual integrity becomes more critical than ever as a quality metric. Keeping naming, abstractions, and design decisions coherent matters even more when machines help us program.
I was first introduced to the idea of conceptual integrity by Aslam Khan, an influential figure early in my career as a developer. In a talk on software quality he said that “maintaining the conceptual integrity of a codebase is one of a developer’s main responsibilities”. At the time, I nodded along, thinking I understood. Only years later did I realize the depth of what he meant.
What is conceptual integrity?
Conceptual integrity is not a new concept. Fred Brooks wrote as far back as 1975:
I will contend that Conceptual Integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.
Twenty years later, he remained convinced:
I am more convinced than ever. Conceptual Integrity is central to product quality. Having a system architect is the most important single step toward conceptual integrity…
In essence, conceptual integrity is about coherence. It’s how accurately the code models concepts from the problem domain, how consistent the abstractions remain, and whether the system reflects a unified vision rather than a patchwork of disconnected ideas.
If conceptual integrity is not maintained over time, I like to call the resulting phenomenon “concept drift”. Just as in machine learning, where the statistical properties of the target variable can change over time, in software the original design ideas can drift apart as existing features are retrofitted to accommodate pivots in the early product direction.
To borrow a more familiar term from the Ruby world, “concept drift” is the code smell in which originally tight ideas become looser: the same name starts to mean one thing in one module and something else in another.
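A minimal sketch of what that drift looks like in practice. All names here are hypothetical, invented purely for illustration:

```python
# In the billing module, "active" originally meant: the current
# period has been paid for. (Dates simplified to integers here.)
def is_active_billing(subscription):
    """Billing's notion: active == paid through today."""
    return subscription["paid_through"] >= subscription["today"]

# After a product pivot, a second "active" quietly appeared elsewhere,
# meaning: the user has logged in recently.
def is_active_engagement(user):
    """Engagement's notion: active == seen in the last 30 days."""
    return user["days_since_login"] <= 30

# The same word now answers two different questions, so a reader must
# rebuild the concept from scratch every time they cross a module boundary.
print(is_active_billing({"paid_through": 5, "today": 3}))   # paid up
print(is_active_engagement({"days_since_login": 45}))       # gone quiet
```

The fix is rarely clever: pick one meaning per name, and rename the other concept.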
The LLM challenge: quality at scale
With the advent of large language models, we can now generate more code, faster than ever before. This amplifies both the opportunity and the risk. An agent can help us maintain patterns and conventions, but it can just as easily introduce subtle inconsistencies that erode conceptual integrity over time.
Conceptual integrity has always been a crucial quality metric, but in an era where code generation is increasingly automated, I wonder whether LLMs will make the same mistake developers make: letting a single symbol (a constant or method) mean two different things in one codebase. We need ways to measure and maintain the coherence of our systems as they grow, not just through human hands but through AI assistance.
The polysemy problem
Natural language processing researchers have long grappled with polysemy—the phenomenon where a single word has multiple meanings. “Bank” can mean a financial institution or the side of a river. “Pitch” could refer to a musical note, a sales presentation, or the angle of a roof. Context disambiguates meaning.
Code suffers from an analogous problem. When we use names frivolously—giving different meanings to the same identifier in different contexts, or worse, inconsistent names to the same concept—we create cognitive overhead. A User in one module might represent an authenticated account holder, while User elsewhere might mean anyone who interacts with the system, authenticated or not. This semantic drift confuses both human readers and AI models trying to understand or extend the codebase.
The cost is high: engineers must constantly context-switch, rebuilding mental models as they navigate between modules. What should be a straightforward concept becomes a puzzle to decode. This is where conceptual integrity breaks down—not through dramatic failures, but through a thousand small inconsistencies.
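The User example above can be made concrete. In this hedged sketch both classes would be called User inside their own modules; they are renamed here only so the two definitions can coexist in one file:

```python
from dataclasses import dataclass
from typing import Optional

# auth's notion of a User: an authenticated account holder.
@dataclass
class AuthUser:
    account_id: str  # always present: no account, no user

# analytics' notion of a User: anyone who interacts with the system.
@dataclass
class AnalyticsUser:
    session_id: str
    account_id: Optional[str] = None  # None for anonymous visitors

# Code written against one definition silently breaks against the other:
# an analytics "User" may have no account_id at all.
visitor = AnalyticsUser(session_id="s-123")
holder = AuthUser(account_id="acct-42")
```

One common remedy is to give the broader concept its own name (say, Visitor) and reserve User for a single, precise meaning.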
Context and audience matter
Of course, conceptual integrity is not absolute. Context and audience can override the goal of making code universally understood in favor of optimizing for the engineers who work in a particular domain.
Consider machine learning codebases, where X in Python almost universally represents input features and y represents target labels. To an outsider, this violates conventional naming rules—capital letters typically denote classes or constants. But within the ML community, the convention is so well established that using inputFeatures instead of X would actually reduce clarity. The shared context of the domain makes the terse notation more readable, not less. I have had this exact experience myself.
This illustrates an important point: conceptual integrity is about consistency within a domain or context, not rigid adherence to universal rules. A codebase should optimize for the attention and understanding of the engineers who maintain it, which sometimes means adopting domain-specific conventions that would seem opaque to outsiders.
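The X/y convention, sketched without any libraries (the fit function here is a stand-in for an estimator's fit(X, y), not a real implementation):

```python
# X is capitalized because it is a 2-D matrix of input features;
# y is lowercase because it is a 1-D vector of target labels.
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]   # three samples, two features each
y = [0, 1, 1]      # one label per sample

def fit(X, y):
    """Placeholder for an estimator's fit(X, y); only validates shapes."""
    assert len(X) == len(y), "need exactly one label per row of features"
    return {"n_samples": len(X), "n_features": len(X[0])}

model = fit(X, y)
```

Within the ML domain this reads instantly; outside it, the same names would rightly fail a code review.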
A timeless principle, rediscovered
Conceptual integrity is not a novel idea. In many ways, it’s a higher-level formulation of the single responsibility principle. Where SRP tells us that a class should have one reason to change, conceptual integrity tells us that a system should reflect one coherent vision. Both principles fight against the entropy that naturally accumulates in software systems.
Through this lens, conceptual integrity has been with us all along, under different names and at different levels of abstraction. It applies not just to code, but to any complex system where multiple people contribute over time: organizations, processes, products, even writing.
The advent of LLMs simply makes the principle more urgent. As we gain the power to create more, faster, we must be more deliberate about maintaining coherence. The tools may be new, but the wisdom is old: a system that reflects one set of ideas will outlast and outperform one cobbled together from many good but disconnected ones.
Questions? I’d be humbled and happy to help.