AI-proof your questions. Integrate AI into the task. Reform your grading with oral defense and process artifacts. A complete, research-backed playbook for the modern classroom.
The goal isn't to outrun AI — it's to assess what AI can't do: your students' reasoning, voice, judgement, and ability to defend their choices in real time. This masterclass shows you how, in 5 practical modules.
The single most important move: tell students exactly how much AI is allowed — on each task, in writing, before they start. "No AI" and "AI everywhere" are both policies. Ambiguity is not.
Based on the framework by Perkins, Furze, Roe & MacVaugh (2024). Pick one level per assessment and name it on the task sheet.
Assessment completed entirely without AI assistance. Protects foundational skill demonstration.
Formative checks, foundational literacy/numeracy, invigilated exams, early-unit diagnostics.
In-class, handwritten, or on a locked-down device. Never set this as a take-home task.
AI for ideation, outlines, feedback on drafts. The final product must be the student's own words.
Brainstorming, structuring arguments, grammar checks, summarising source material.
Not permitted: generating sentences or paragraphs that appear in the final work. Students must submit a log of their prompts.
AI co-authors specific sections. Student directs, edits, and is accountable for every claim.
AI-generated drafts the student substantially rewrites, code scaffolds, translation, image generation.
Citation of AI use + reflection on what was accepted, rejected, and revised.
AI does most of the production. Student is assessed on prompting, curation, and critique.
Prompt quality, selection between AI outputs, fact-checking, ethical judgement.
"Use AI to draft 3 marketing pitches — critique which is strongest and why, in 400 words of your own."
Students build with AI — custom GPTs, agents, multi-tool workflows — as the subject of assessment itself.
Capstone projects, EPQ, IB extended essay, senior design, AP research.
Originality of the system designed, not the AI output it produces.
No AI permitted. Use of AI is academic misconduct. Often used for in-class exams and foundational skill checks.
AI permitted for specified steps only — brainstorming, editing, language help. All uses must be cited and reflected on.
AI use expected and encouraged. Graded on how well students direct, critique, and build upon AI outputs.
Never leave AI policy to "department norms" — put the traffic-light colour, a one-sentence explanation, and the citation requirement on every single assignment brief. Students should not have to guess.
AI-detection software is unreliable and biased against multilingual students. Stop trying to catch AI. Design tasks where AI can't silently do the work for students in the first place.
Require references to a specific Tuesday lab, Mr. Ahmed's anecdote, or slide 14 of the deck. AI can't know what happened in your room.
Tie prompts to your city, school sports day, yesterday's headline, or data your students collected themselves.
Grade the revision history — Google Docs version log, drafts, marginal notes — not just the final file.
Students take home a draft; next class they revise it in the room based on peer feedback. Stage 2 is the graded one.
For high-stakes skill checks: return to paper. Pair with short, frequent assessments rather than one big exam.
Ask for a physical model, labelled sketch, whiteboard photo, or hand-annotated diagram as a required component.
"Connect this concept to a decision you made last term." An LLM can't convincingly fabricate lived experience.
Ask for direct quotes with page numbers from specific texts. Fabricated ("hallucinated") citations are the #1 AI tell.
Base prompts on last month's news, a recent school event, or data you generated yesterday. LLMs fumble on very recent context.
Any written submission is followed by a 3-5 min interview. See Module 4 for the full viva voce playbook.
"Why did you choose approach A over B?" AI produces answers; students need to defend choices.
Group projects with individual contribution tracking (Docs suggestion mode, GitHub commits). Each student defends their slice.
Tools like Turnitin's AI detector and GPTZero produce false positives — especially against non-native English writers and neurodivergent students. Use them only as a conversation starter, never as proof of misconduct.
Some of the most powerful assessments in 2026 require AI. The skill being tested is not production — it's direction, critique, and judgement. Here are 9 assessment patterns to use today.
The patterns below are not workarounds. They demand more critical thinking than the essays they replace — students must evaluate, refute, fact-check, and reason against a confident, articulate adversary on every task. Designed well, AI doesn't kill critical thinking. It puts students in a daily debate with one of the most persuasive arguers on the planet.
Give students an AI-generated essay, code, or lab report with deliberate errors. They annotate errors, explain why they're wrong, and rewrite correctly.
Domain knowledge, critical reading, and error detection — higher-order Bloom's.
Students submit a problem + their best prompt + the resulting AI output + a justification of why this prompt worked better than a naive one.
Clarity of instruction, task decomposition, specificity — real professional skills.
Generate 3 AI outputs on the same problem. Students rank them, justify the ranking against a rubric, and improve the best one.
Evaluative judgement, rubric literacy, revision skills.
Students use AI to produce a research brief, then must verify every single citation against primary sources — flagging hallucinations with evidence.
Information literacy, source evaluation, epistemic caution.
Students submit their full conversation with the AI as the primary artefact — you grade the questions they asked, how they pushed back, and what they rejected.
Metacognition, iterative thinking, learning-by-dialogue.
Students design a custom GPT / Gem / project that teaches a younger student, tutors a concept, or solves a niche problem. Grade the system design and instructions.
Pedagogical thinking, systems design, audience awareness, original application.
Student picks a position. AI is prompted to argue the strongest possible counter. Student submits the full debate transcript + a 200-word post-mortem on which AI arguments forced a concession, which they rebutted, and where the AI used sophistry.
Argumentation, intellectual humility, rebuttal craft, recognising fallacies.
Students systematically probe an AI for failure modes — surfacing one verifiable hallucination, one bias (gender, cultural, geographic), and one factual error in their subject area, with primary-source evidence for each. This is the work AI labs literally pay people to do.
Domain expertise, ethical reasoning, evidence collection, healthy scepticism.
Project the AI on the board. Give it the assignment live. Students annotate the streaming output in real time — flagging weak claims, missing evidence, biased framing, and great phrasing. Run it as a 10-minute timed sprint; the student who flags the most defensible issues wins.
Speed of critical reading, content mastery, recognition of AI tells (over-confidence, fake citations, vague hedging).
If your task requires AI, you must provide equal access. Use school-licensed tools, provide shared accounts, or give in-class time. Never assume students have paid plans at home.
Borrow from Italian universities, PhD defenses, and IB orals: the viva voce (Latin for "living voice") — a short interview where students defend their work aloud. It's the single most effective anti-AI measure — and it turns weak work into a coaching moment.
Student sits with you (or a panel), you ask unprepared questions about their submitted work, and you grade the conversation — not just the paper.
Even if AI wrote the essay, the student must understand it to defend it. A 60-second follow-up question separates learning from outsourcing.
Treat the written submission as the starter. The viva makes the mark. Students who can't defend their work don't own it.
Pick 3-5 of these at random per student. Keep the tone warm but precise.
| Criterion | Emerging (1-2) | Proficient (3) | Exemplary (4) |
|---|---|---|---|
| Ownership of content | Recites memorised phrases; confused when probed. | Explains main ideas confidently in own words. | Re-explains, reframes, and improves on the fly. |
| Evidence & sourcing | Cannot locate or justify sources. | Names sources and summarises them accurately. | Weighs sources against each other; flags weaknesses. |
| Response to challenge | Becomes defensive; repeats earlier claims. | Accepts feedback; adjusts position reasonably. | Generates counterarguments unprompted; thinks aloud. |
| AI transparency | Vague or evasive about AI use. | Names tools and steps where AI was used. | Explains what AI got wrong and how they fixed it. |
5-10 min. Best for capstones, IB orals, senior projects.
3-min walkthrough of their work + one surprise question by reply.
Student defends to 3 peers using your rubric. Scales to 30+ students.
Pull 5 students at random for a 2-min mini-viva after each submission.
Don't try to viva every student on every task. Viva one in three, randomly chosen after submissions are in. The possibility of being called changes the behaviour of the whole class.
Every time a student uses AI, they should do two things: cite it (like a source) and reflect on it (like a co-author). Reflections themselves must be AI-proofed — otherwise students will just ask the AI to write the reflection.
In-text: (OpenAI, 2026). Include the prompt in an appendix.
Quote the prompt, name the tool version, date it, link it.
A structured 150-word statement students attach to their work. Use these four prompts:
Paste your best 2-3 prompts verbatim.
Which AI suggestions made it into the final piece unchanged?
Name one thing the AI got wrong, misleading, or biased.
One sentence on how using AI changed your thinking about the topic.
Students will absolutely use AI to write an "AI reflection" if you let them. Use one of these four strategies:
Take the first 10 minutes of the lesson after the submission is due. Students handwrite the reflection on an index card. No devices.
Student records a 60-second Flip/Loom video answering the four prompts — verbal fillers and pauses reveal real thinking.
Students submit dated screenshots of their AI chat(s). Chat history on ChatGPT, Claude, and Gemini is timestamped — very hard to fake.
Students interview each other for 3 minutes using the four prompts, then write each other's reflection. Surfaces understanding, not just output.
Everything from this masterclass, on one card. Put it above your desk while you redesign your assessments.
Declare AI level (1-5) and traffic-light colour on every task.
Anchor tasks to this week's class, local context, and personal experience.
Assess prompting, critique, and fact-checking — not production.
Viva voce 1-in-3 students at random. Weight it 30-50% of the grade.
Cite AI + 150-word handwritten reflection on ask / accept / reject / learn.