
Business leaders are actively and passively choosing commodification across every layer of the stack. And it might be the best thing to happen to this industry.
That feels counterintuitive, but as we sprint toward the boardroom’s utopia of faster time-to-production and lower human capital costs, the cybersecurity landscape is heading toward a distinction without a difference. Every vendor, every tool, and every output is converging on the same center.
We’re already seeing it. Every knowledge worker recognizes the pattern in AI-generated output. We’ve even coined a term for it: AI slop. A bit of a glib phrase, to be sure, but it carries the weight of something we all sense and whose impact we haven’t yet triangulated.
Regression to the Mean
Generative AI is inherently prone to regression toward the mean. This is to be expected: models trained on aggregated human data, constrained by capacity limits, compress the distribution of their predictions, a phenomenon known as mode collapse. Researchers at Carnegie Mellon University confirmed this after running nearly 1,100 prompts through a benchmark exercise called NoveltyBench. And while they admit their methodology was prone to outliers, they concluded that all frontier models, especially larger ones, underperform humans on distributional diversity.
There’s a proposed academic name for this: Galton’s Law of Mediocrity. Through his experiments in heredity, Sir Francis Galton found that extreme characteristics in progenitor peas tended to regress toward the population average in their offspring. The term was rechristened for LLMs in an October 2025 paper by researchers at 55mv Research Lab, Monash University, and Western Sydney University.
Their study examined LLM creativity in a domain where creativity is non-negotiable: advertising. Using a two-phase evaluation framework, they found that models would first shed creative elements and drive toward succinct articulation of product points, then expand with lexically dense responses that appeared novel but lacked substance. In most cases, the models lost sight of the original messaging, and vivid ideas were distilled into generic facts. The researchers concluded that LLMs appear to inhibit human creativity rather than expand it.
Both studies found that even with divergent prompting, this regression is a fundamental design characteristic. Training practices, large corpora, and decoding that favors higher-likelihood tokens all point to the same culprit: how we build the models. That may not be problematic in isolation, but the practices it enables might be.
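To make that mechanism concrete, here’s a minimal Python sketch of my own, not code from either study: the four-token vocabulary and logit values are invented, but the math is the standard temperature-scaled softmax used in decoding. As the temperature drops toward greedy decoding, probability mass piles onto the already-likely token and the entropy of the output distribution collapses.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates; the "safe" choice already scores highest.
vocab = ["generic", "novel", "vivid", "bland"]
logits = [2.0, 0.5, 0.3, 1.5]

for t in (1.5, 1.0, 0.5, 0.2):
    probs = softmax(logits, t)
    entropy = -sum(p * math.log2(p) for p in probs)
    top_p, top_token = max(zip(probs, vocab))
    print(f"temperature={t}: p({top_token})={top_p:.2f}, entropy={entropy:.2f} bits")
```

At a temperature of 1.5 the model still spreads its bets; by 0.2 the “generic” token swallows more than 90 percent of the probability mass. Scale that across billions of decoding steps and you get the homogenization both papers describe.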
The Race to the Bottom
Where regression manifests most visibly is in code. I’m not arguing that any two codebases are identical—I’m arguing that the playing field has leveled. AI is simultaneously raising the floor and lowering the ceiling, enabling the average user to ship software in hours instead of months.
Tools like Malus.sh now allow users to create “clean room” versions of proprietary software via AI, free from copyright infringement. Virtually any application can be open-sourced overnight. And the question this raises isn’t so much about startup viability as it is about what differentiation means when anyone can reproduce your feature set over their morning coffee.
Personally, I believe the answer is operational rigor. A sole proprietor with a Claude Max subscription can ship fast and match features, but will struggle to meet industry-standard SLAs and compliance requirements. And if they turn to AI to solve those problems, they’ll just arrive at the same solutions as everyone else.
This extends beyond code. Participation in MITRE ATT&CK vendor evaluations is dropping. Forrester notes that differences in the EDR space are becoming increasingly marginal. Deepak Gupta traces the same pattern in benchmarks: traditional evaluations drive vendors toward a homogenization that mirrors Galton’s Law as applied to LLM outputs. Initial threat response, as IBM and Palo Alto Networks both report, now happens at line speed, so humans no longer need to operate at the response layer.
The security industry has gone through this before. When antivirus signature databases became commoditized inputs, most AV companies were absorbed by larger players with EDR solutions or pivoted into multi-product plays. But something else happened in that transition. The professionals who could read the context around an alert and make a call about business risk became more valuable, not less. The ones who simply ran the tools became replaceable. AI is accelerating that pattern across the enterprise.
The Judgment Gap
Further research on this phenomenon converges on a paradox I touched on in my last article—the paradox of skill. As AI gets embedded deeper into business processes, the effects on human dependency hinge on whether automation substitutes for low-expertise or high-expertise tasks.
An MIT working paper by David Autor and Neil Thompson suggests that when AI lowers expertise requirements, wages fall but more workers enter those roles. When AI raises the expertise requirement, wages rise but the qualified candidate pool shrinks. So what becomes the premium for human talent in an AI-augmented economy?
It appears to be judgment. Prasad Setty, former Head of Google People Analytics and a Stanford researcher, proposed at the Valence AI & The Workforce Summit that organizations are creating a judgment gap that AI simply cannot fill. His theory is that the routine work AI automates is precisely the work where humans build pattern recognition, confidence, and professional instinct. When those jobs are offloaded to machines, the developmental pipeline for decision-making collapses. Therefore, the value humans bring to the table shifts from intellectual capacity to judgment quality.
The National Bureau of Economic Research describes this as the skill premium: automating a high-value bottleneck skill enhances the productivity of workers with more common skills, making those workers more valuable.
Now, here’s the paradox: the NBER also concluded that AI augmentation discourages human learning, depleting the general stock of knowledge over time. AI dependency erodes cognitive confidence and creates further dependency. The more you need human judgment, the harder it becomes to develop, so existing deep expertise becomes scarcer by the day.
A Different Kind of Work
If outputs are regressing to the mean and AI augmentation is hollowing out the developmental pipeline for judgment, what kind of work are we actually doing?
In cybersecurity, that answer is shifting. The job most certainly becomes less about triaging incidents, determining response, and executing remediation, and more about ensuring organizational compliance through automation, architecting that automation, and aligning both to business continuity. In other words, management of systems, management of outputs, and management of the gap between what machines produce and what the business needs.
This is where judgment stops being abstract. When automation triages a thousand alerts and resolves nine hundred of them, the remaining hundred require a human who understands the business well enough to determine which ones represent actual risk to the organization. That’s less a technical skill than a contextual judgment call no model can make, because the model doesn’t carry the organizational history, the regulatory obligations, or the risk appetite that inform the decision.
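Here’s a hypothetical sketch of that split, assuming a simple confidence score and a single piece of organizational context (regulatory exposure); none of this is any vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    id: str
    confidence: float             # automation's confidence the alert is benign or known
    touches_regulated_data: bool  # one stand-in for organizational context

def triage(alerts, auto_close_threshold=0.9):
    """Split alerts into a machine-resolved queue and a human-judgment queue."""
    resolved, escalated = [], []
    for alert in alerts:
        # Business context overrides model confidence: anything touching
        # regulated data goes to a human, no matter how sure the model is.
        if alert.confidence >= auto_close_threshold and not alert.touches_regulated_data:
            resolved.append(alert)
        else:
            escalated.append(alert)
    return resolved, escalated

alerts = [
    Alert("a-001", 0.97, False),  # confidently benign: machine closes it
    Alert("a-002", 0.97, True),   # confident, but regulated data: human call
    Alert("a-003", 0.42, False),  # low confidence: human call
]
resolved, escalated = triage(alerts)
print([a.id for a in resolved])   # ['a-001']
print([a.id for a in escalated])  # ['a-002', 'a-003']
```

The interesting line isn’t the threshold; it’s the override. Deciding which context outranks model confidence is exactly the judgment work this section describes.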
Any CIO will tell you their budget isn’t dominated by tools; it’s dominated by labor: MSPs, MDR providers, consultants, retainers. As tool capabilities regress to the mean, any reduction in the cost to deliver outcomes through AI forces a reprice of the services layer. When vendors can’t compete on features, they compete on outcomes. And outcomes demand the judgment that will grow harder to develop.
The Price of Judgment
So, we see humans move up the stack. Now, what does that demand of them?
Risk tolerance can’t be calculated in tokens. Trust models can’t be enforced by machines that don’t understand why they matter. We buy tools based not only on whether they work, but on whether we trust the people who built them. We maintain vendor relationships based on how they show up on a bad day. So, as tools and outputs homogenize, these judgments become the only variables that aren’t regressing.
Autor and Thompson predict this bifurcation in their research. The proverbial middle compresses from both directions until it disappears. Those who understand how outputs are generated, what they mean, and how they should influence decisions—and who can articulate that to a boardroom or to a machine—will command premium positions. Those who offload their thinking will fill the rest.
This is the net positive. Not because commodification is comfortable, but because it forces clarity. When the playing field levels for tools, code, and outputs, the only differentiator left is the one that can’t be reproduced: judgment from context, experience, and the willingness to be wrong. The market hasn’t priced that correctly in a long time, but it’s about to have to.
The irony here isn’t that machines are replacing us. It’s that the work they can’t do is the work we are being encouraged to stop practicing. That gap gets smaller every day, and ever more expensive to maintain.