AI Didn’t Break Assessment — It Exposed What Was Already Fragile
For the past two years, tertiary education has been flooded with variations of the same question:
“How do we stop students using AI to cheat?”
Underneath that question sits a growing layer of institutional anxiety.
Educators are increasingly uncertain about what they are actually looking at when they assess learner work. Moderators are encountering evidence that feels harder to interpret confidently. Policy teams are trying to keep pace with technologies evolving faster than governance cycles. Meanwhile, learners are using AI tools in ways that range from legitimate support through to heavy substitution — often somewhere in between.
In many cases, the operational reality has already shifted well ahead of the systems surrounding it.
But there is a deeper issue emerging beneath the immediate AI conversation.
Generative AI may not have broken assessment.
It may have exposed assumptions that were already becoming fragile.

The Hidden Assumption Stack
For a long time, many assessment systems operated with a relatively linear logic chain:
submitted work
→ authorship
→ understanding
→ capability
Not perfectly. Not universally. But often implicitly.
A learner submitted an essay, report, portfolio, workbook, reflection, or project. The artefact was assessed against criteria. From there, capability was inferred.
Under earlier conditions, this worked reasonably well much of the time.
The problem is not that educators were naïve. Nor is it that assessment systems were inherently flawed. Most were designed within environments where producing substantial written outputs still required a meaningful degree of learner effort, synthesis, interpretation, and communication.
That environment has changed rapidly.
Today, high-quality outputs can increasingly be:
- scaffolded
- rewritten
- expanded
- summarised
- translated
- polished
- or generated entirely
often within minutes.
The challenge is not simply that AI can produce text.
It is that the relationship between artefact production and underlying capability has become far less stable than many systems assumed.
AI Accelerated Visibility
There is a temptation to frame this as a sudden collapse caused entirely by AI.
That interpretation is probably too simple.
Generative AI did not invent:
- ghostwriting
- over-scaffolded learning
- shallow compliance
- performative assessment
- proxy evidence
- outsourced thinking
Those dynamics already existed in parts of the system.
What AI did was industrialise ambiguity.
It dramatically lowered the effort required to produce convincing outputs while simultaneously making that assistance harder to detect consistently.
In doing so, it exposed something many educators were already quietly sensing:
high-quality artefacts do not always equal high-confidence evidence of capability.
That distinction matters.
Because the issue is no longer confined to academic misconduct.
It increasingly affects how institutions establish confidence in what learners genuinely know, understand, and can do.
The Real Pressure Point
Much of the current public conversation still centres on cheating.
But the deeper institutional pressure point may be trust.
More specifically:
what conclusions educational evidence genuinely supports.
If a polished submission can now emerge from a complex blend of:
- learner thinking
- AI prompting
- AI drafting
- editing support
- external assistance
- iterative refinement
then interpreting capability becomes more complicated.
This does not mean learners are not learning.
Nor does it mean AI use is automatically inappropriate.
In many contexts, AI tools can genuinely support:
- comprehension
- accessibility
- communication
- confidence
- drafting
- idea development
The challenge is subtler than prohibition.
The challenge is distinguishing:
support from substitution,
surface fluency from deeper understanding,
and convincing performance from reliable capability.
That ambiguity becomes especially significant in:
- fully online environments
- asynchronous assessment models
- scaled marking systems
- text-heavy programmes
- workplace-facing qualifications
particularly where direct observation of learner capability is limited.
Why This Matters Beyond Education
The implications extend beyond classrooms and assessment policy.
Credentials ultimately function as trust signals.
Employers, professions, industries, and communities rely on them as indicators that a person can:
- apply knowledge
- exercise judgement
- communicate effectively
- perform reliably
- operate safely
- adapt in real-world contexts
If confidence weakens around what credentials actually verify, pressure eventually flows outward into workforce trust itself.
This is particularly important in vocational, professional, and capability-based contexts where performance matters more than artefact production alone.
The strategic issue is not whether AI exists in professional life. It already does.
The issue is whether educational systems can still reliably determine when meaningful capability is genuinely present beneath increasingly sophisticated outputs.
What The Sector May Be Misreading
One of the risks right now is that institutions respond to an infrastructure problem as though it were only a policy problem.
More rules alone are unlikely to stabilise confidence if the underlying evidence assumptions remain uncertain.
Similarly, detection technologies may help in some situations, but they are unlikely to function as a complete long-term trust architecture. AI systems are evolving too quickly, usage patterns are too varied, and false positives carry their own risks.
At the same time, unrestricted “AI everywhere” approaches can create different forms of ambiguity if institutions lose clarity around what capability standards still matter and how they are verified.
This is why many providers are currently operating in mixed-mode uncertainty:
- partial redesign
- uneven experimentation
- inconsistent guidance
- fragmented educator capability
- localised workarounds
Much of the sector is still trying to reconcile older assessment assumptions with a fundamentally altered evidence environment.
Quiet Adaptation Is Already Happening
Interestingly, some of the most promising responses are not entirely new.
Across parts of tertiary and vocational education, educators are increasingly experimenting with:
- professional conversations
- oral verification
- demonstrations
- staged evidence collection
- reflective explanation
- applied assessment
- process visibility
- iterative submissions
These approaches already existed in many places.
What may be changing is their strategic importance.
As AI increases uncertainty around standalone artefacts, confidence may increasingly emerge through:
- multiple forms of evidence
- contextual judgement
- observable performance
- explanation
- interaction
- and triangulation
In other words:
the system may gradually shift from relying primarily on outputs alone toward building stronger confidence in capability itself.
That is a different orientation.
And potentially a significant one.
A More Useful Framing
None of this means:
- assessment is obsolete
- online learning has failed
- writing no longer matters
- educators should become investigators
- or AI should be banned outright
Those framings are unlikely to help.
The more useful interpretation may be simpler:
the environment changed faster than the assumptions underneath many assessment systems.
What we are now seeing is not just a technology challenge.
It is a trust and evidence challenge.
Educational institutions are increasingly being asked to answer a more difficult question than before:
How do we confidently recognise capability under AI-assisted conditions?
That question is still emerging.
But it may become one of the defining tertiary challenges of the next decade.
If these tensions are surfacing in your organisation as well, I’d be interested in hearing what patterns you’re seeing across assessment, moderation, capability verification, or workforce readiness.
The next post in this series explores why the deeper issue may not actually be cheating — but capability trust itself.
Graeme Smith is the founder of Te Aho Lab and creator of Tertiary Signals, exploring capability, trust, and verification under AI-assisted conditions across tertiary education and workforce systems in Aotearoa New Zealand.