AI in Re-imagining Assessments

Author: Akshit Mathur

Team name
JGU 5
Team members (First name, LAST NAME, University)
Akshit Mathur, Vaibhav Kumar, Kashish Goel, Sarang Phulwaria, Keertika Saini, O.P JINDAL GLOBAL UNIVERSITY
What area does your use case primarily fall under?
Training / education / pedagogy
The AI use case you are working on
Universities and their regulatory bodies continue to rely on plagiarism and academic misconduct frameworks designed long before generative AI made it possible to produce original, undetectable academic work at scale. Higher education's entire evaluation architecture was built on the assumption that submitted work reflects genuine human understanding. Generative AI has made that assumption untenable, yet no institutional system has been redesigned to account for this shift. As a result, evaluation no longer reliably measures learning, but continues to certify it. This situation is neither hypothetical nor distant. It is happening inside every lecture hall and examination centre in Indian higher education right now, largely unacknowledged and entirely unregulated.

Consider a student enrolled in a law programme at a mid-tier Indian university, assigned a research paper on constitutional interpretation, due in seventy-two hours. The student opens an AI model, types the assignment question, and receives a structured, well-argued response within seconds. After minor edits, the submission is uploaded. The faculty member finds no red flags on Turnitin, as the content was generated, not copied, and awards a first-division grade. The student has demonstrated no independent legal reasoning, engaged with no primary sources, and formed no original argument. The credential, however, reflects otherwise.

This is not an isolated incident. It is the default operating mode of a significant and growing proportion of student submissions across law, management, humanities, social sciences, and increasingly the sciences. What makes this particularly serious is not the use of AI itself, but its structural invisibility within current evaluation systems. The student is often not caught. The faculty member has no mechanism to accurately detect what happened. The institution, even with a policy in place, finds it largely futile.
And regulatory bodies, the UGC, BCI, and NAAC, have issued no enforceable guidance that would change any part of this chain.

The actors span the entire academic ecosystem. At the centre is the student, not necessarily acting dishonestly, but responding rationally to a system that rewards output quality over demonstrated understanding. Surrounding them is the faculty member, simultaneously under-resourced and undertrained in AI literacy. Critically, AI meets the student at the very end of the academic process, the moment of evaluation. It does not function as a learning tool, a tutor, or a research guide within the current structure. It functions as a substitute for the intellectual work that evaluation was designed to measure. The grade, the transcript, and the degree then certify something that never happened.

This reveals a profound structural failure. The problem is not that AI exists, or even that students use it, but that the system provides no mechanism to distinguish between demonstrated understanding and generated output. This use case can be described precisely: a student inputs an assignment prompt into a generative AI system, performs minimal edits on the output, and submits it into an evaluation system that assesses only the final text. Similarity detection finds no violation. A high grade is assigned. At no point is the student required to demonstrate cognitive authorship of their own submission.
Why this use case matters
The evaluation architecture of Indian higher education rests on a foundational assumption that has already collapsed: that submitted work reflects the mind of the student who submitted it. Generative AI has exposed an old vulnerability: assessment systems designed to certify understanding can, without structural reform, be made to certify nothing at all.

The primary casualty is the diagnostic function of evaluation itself. Academic assessment serves two simultaneous purposes: it produces a grade and it produces information. This information is about what the student knows, where comprehension has failed, and what instruction must still do. When a student submits AI-generated work, both functions break down simultaneously. The grade is still awarded, but the information is absent. The faculty member proceeds on false data, unable to identify gaps in understanding, adjust instruction, or meaningfully gauge the effectiveness of their teaching. This is not a marginal distortion of pedagogy. It is the systematic removal of the feedback mechanism through which teaching and learning are aligned.

Further, writing, synthesis, and structured argument are not merely outputs of education; they are its method. The capacities to construct an argument from primary sources, to locate the point where two precedents conflict, and to reason from principle to application are all developed through the discomfort of doing these things badly and being corrected. A student who consistently outsources this work during their training will graduate having satisfied every evaluation criterion without undergoing the formation that those criteria were designed to produce. The transcript will certify what the student cannot demonstrate. For faculty, the structural consequence is equally serious.
At present, faculty are expected to uphold academic integrity using plagiarism detection tools built for a pre-generative-AI world: instruments that measure copying, not generation, and that return clean results on submissions that are AI-produced but lightly humanised. Without updated policy frameworks, enforceable institutional standards, or the time to investigate every submission that passes automated scrutiny, the burden falls on individual faculty judgment in a system that provides no support for exercising it. The result is a predictable institutional equilibrium: suspicion becomes endemic on one side, strategic concealment rational on the other, and the pedagogical relationship between teacher and student is quietly restructured around detection and evasion rather than inquiry and growth.

The equity implications must also be explored. Access to premium AI tools tracks closely with socioeconomic advantage. A student with reliable internet, an unshared device, and the financial means to afford a sophisticated subscription will consistently produce more polished AI-assisted submissions than a peer without those resources. The IAMAI-Kantar Internet in India Report 2025 documents rural internet adoption at only 12%, a figure that is not merely a statistic, but a map of who is and is not competing on the same terms. In a system that rewards the quality of the final output without examining how it was produced, this is not a digital divide. It is an evaluation regime that systematically advantages those who already hold structural advantages, while presenting itself as a meritocracy. AI, under current conditions, does not democratise higher education. It amplifies existing inequalities while generating the appearance of a level playing field.
The signalling function of educational credentials, which is the mechanism through which institutions communicate student capability to employers, bar councils, and graduate admissions bodies, faces a parallel erosion. Grades and degrees function as information. When the information they carry decouples from the understanding they were designed to certify, their value to every downstream institution diminishes. Employers have begun developing informal screening mechanisms. Alternative credentials have proliferated to fill the gap. The burden of evaluation has begun migrating beyond universities into spaces that are less regulated, less equitable, and less accountable. This is already visible in the hiring practices of law firms and consulting firms that have introduced assessment exercises precisely because they no longer treat transcripts as sufficient evidence of competence. The consequence is not merely a problem for higher education. It is a structural shift in how merit is assessed and who bears the cost of assessing it. The opportunity this disruption creates is specific. The breakdown of existing evaluation systems does not require the elimination of AI from academic life. It requires the construction of an assessment architecture in which demonstrated, real-time human understanding is the irreducible condition for credit. Every grade, every credential, every selection within higher education must be backed by an expression of understanding that cannot be delegated. This is not because AI assistance is inherently illegitimate, but because a system that cannot distinguish between the two has ceased to function as an evaluative system at all. The Integrated Evaluation Matrix our proposal describes is built on that premise: not to ban AI from the academic process, but to ensure that the process always culminates in a moment of genuine human authorship that no model can produce on a student's behalf.
Your team's motivation and learning objectives
As students, we are not just users of AI, or its data subjects, but also subjects of the AI policies framed by universities and regulators. To address this deeply structural and foundational problem, we aim to be active participants in shaping the contours of those policies. The growing misalignment between formal education and the realities of practice, especially in the age of AI, is something we experience first-hand: in each internship, in each assignment, in each grade. Our lived experience, through all these different lenses, allows us not only to identify gaps but also to contribute meaningfully to how they are addressed. Law in particular is an ever-changing field, and legal education is struggling to keep up. From our vantage point as law students today, we want to understand how the delivery of legal education, including examinations and grading systems, will shape the lawyers and leaders we become tomorrow.

Imagine a first-generation student from a tier-2 city in India. She excels in her coursework but finds herself studying from an outdated syllabus that ignores the pace at which law evolves in practice and across borders. She regularly encounters cases involving AI and technology, yet has no structured way to engage with them. Without access to internships, mentors who can guide her, or even paid courses, she cannot bridge this gap. Her gender, her geographical location, and her limited resources become compounding barriers. While her peers in better-connected institutions build specialised skills, she graduates with the same degree but never the same opportunity.

What we as students stand to gain from this opportunity is, first, practical insight into designing real-world AI interventions that are also scalable. Second, international exposure to interdisciplinary collaboration at the intersection of policy, technology, and education.
Third, the experience of moving beyond mere ideation to translating ideas into implementable, high-impact solutions. Our engagement is not confined to the duration of the challenge, but is intended to extend beyond it through sustained effort and continued development.

This is not an exceptional story. It is the default experience of a significant proportion of students across India, and it is made structurally worse by the current moment in AI. The students who stand to lose most from the collapse of meaningful evaluation are not those with the resources to navigate it strategically. They are the ones for whom a credible degree was the primary instrument of social mobility, and who are now competing, without knowing it, against peers whose submissions are indistinguishable from their own on the surface, but were produced through entirely different means.

We are not approaching this challenge as outside observers of a system we have studied. We are inside it. We have sat in examinations designed for a pre-AI world. We have submitted assignments assessed by faculty who have no institutional mechanism to distinguish genuine understanding from generated output. We have watched peers who engage seriously with course material receive the same grades as those who do not, because the evaluation system cannot tell the difference. This is not an abstract concern about the future of education. It is the present condition of our own formation as professionals, and it is one we are in a position to describe with a specificity that neither regulators nor institutions have yet demonstrated.

What we bring to this challenge, beyond our analysis of the problem, is a perspective that is systematically absent from the policy conversations shaping AI's role in education. Regulators consult institutions. Institutions consult faculty. Faculty consult research.
Nowhere in that chain does the voice of the student, particularly the student from an under-resourced institution, consistently appear. The frameworks being built now, the guidelines being drafted by bodies like the UGC, the integrity policies being adopted by universities, will govern the academic lives of millions of students who had no seat at the table when those frameworks were designed. We intend to change that. What we stand to gain from this opportunity is specific and cumulative. A practical insight into designing AI interventions that are not only conceptually sound but institutionally scalable, capable of functioning within the resource constraints of mid-tier Indian universities, not just the well-funded exceptions. International exposure to interdisciplinary collaboration at the intersection of policy, technology, and education, working alongside peers and practitioners who bring frameworks we do not yet have access to. And the experience of translating a diagnosis into a deployable solution, moving from identifying what is broken to building something that works in its place. Our engagement is not confined to the duration of this challenge. The Integrated Evaluation Matrix we are proposing is a framework we intend to develop, test, and advocate for beyond this process. The problem we are addressing will not resolve itself when the competition concludes. Neither will our commitment to addressing it.
Your initial contribution
PROPOSAL

1. What is the situation or context you are addressing?

The situation we are addressing concerns (a) how examinations are conducted, (b) how they are perceived, and (c) why they need to be reconsidered. University students are currently in a transitional phase where artificial intelligence is no longer external to the learning process but embedded within it. AI tools are routinely used across stages of academic work, ranging from research to drafting to submission. However, this shift has not been matched by a corresponding institutional response. The evaluative framework remains largely unchanged, which creates a structural imbalance between how students produce work and how that work is assessed. More fundamentally, the issue is not limited to how student work is evaluated, but extends to how assessment itself is designed. Assessment design determines what is incentivised, what is measured, and ultimately what is learned. When the design assumes that output reflects independent cognition, but the environment allows that output to be externally generated, the entire evaluative mechanism fails at its foundation.

To resolve the problem at hand, we ask certain questions. How should submissions and examinations be evaluated in an AI-integrated environment? What constitutes “originality” when generative models can produce novel, undetectable outputs? Can existing definitions of academic integrity withstand this shift? More fundamentally, is it normatively defensible to penalise students for using AI when such tools are already embedded across professional domains, including law, finance, and medicine? The absence of clear institutional answers to these questions reflects a systemic failure to realign evaluation with the realities of contemporary learning. At its core, the problem can be stated precisely.
Current evaluation systems measure the quality of submitted output without verifying whether that output reflects the student’s own cognitive work. In an environment where generative AI can produce high-quality, original responses on demand, this results in a structural failure: institutions are certifying performance without evidence of understanding.

To properly understand the issue, it is necessary to begin from the perspective of the student. Today, artificial intelligence is no longer external to academic work but is embedded within it at multiple stages, including learning, research, and the production of assignments. In this context, the question is no longer whether artificial intelligence should be excluded from education, because such exclusion is neither practical nor desirable. Instead, the focus must shift toward how students can be prepared to engage with and navigate AI-integrated environments that increasingly define professional and academic spaces at a global level. This shift, however, has not been matched by a corresponding transformation in how students are evaluated. Existing assessment methods, particularly written submissions, continue to operate on the assumption that the final output reflects the independent intellectual effort of the student. That assumption is no longer tenable. As a result, current evaluation systems are unable to reliably determine whether a student has genuinely understood the underlying concepts or has relied substantially on AI-generated outputs. This creates a fundamental misalignment between the process of learning and the mechanisms used to assess it.

This misalignment is produced and sustained by the interaction of multiple actors within the academic ecosystem. Students respond rationally to incentives that reward polished output over demonstrated understanding. Faculty operate with limited time, limited training in AI literacy, and tools that are no longer fit for purpose.
Institutions prioritise scalable assessment formats that can be administered across large cohorts, even when those formats no longer measure learning effectively. Regulatory bodies such as the UGC, BCI, and NAAC have not issued updated, enforceable standards that reflect the realities of generative AI. Employers, in turn, are beginning to compensate for this breakdown by developing parallel evaluation mechanisms outside the university system. The problem is therefore not located in any single actor, but in the structure that connects them.

If the problem lies not only in how submissions are evaluated, but in how assessments themselves are designed, then reform must target the architecture of assessment rather than its outputs alone. Assessment design determines what is rewarded, what is measured, and therefore what students learn. In a context where written outputs can be externally generated, any design that treats such outputs as evidence of understanding ceases to function as a meaningful evaluative mechanism. This shift requires moving toward forms of assessment in which understanding is demonstrated in real time rather than inferred from submitted text. Methods such as structured oral evaluation, Socratic questioning, and academic debate become relevant in this context not as pedagogical preferences, but as mechanisms that allow institutions to directly observe reasoning, rather than assume it from output.

We have to reshape student skill sets so that students are not dependent on AI but instead use it as a tool, the way a professional uses any instrument: in service of their own thinking rather than as a replacement for it. A lawyer who cannot reason without AI is not a lawyer. A doctor who cannot think through a diagnosis independently is not a doctor. The degree has to certify that the person can actually do the work, and that certification has to be built on evidence that goes beyond a submitted document.
For professors, it is equally important to innovate in how learning happens inside the classroom. Evaluation formats that require students to articulate, defend, and adapt their reasoning in real time, such as structured academic debate, become increasingly relevant in this context. These formats move assessment away from static written outputs toward the direct observation of reasoning under scrutiny, allowing institutions to evaluate understanding as it is demonstrated rather than inferred from submitted work.

None of this is possible, however, without investing in the faculty who are being asked to facilitate it. Shifting from written submission to Socratic dialogue or structured debate requires a different set of skills from a professor than marking an essay does. It requires training in facilitation, in asking questions that genuinely probe rather than lead, and in evaluating a student's reasoning in real time rather than on paper. Institutions cannot ask faculty to adopt new pedagogical methods without giving them the support, the time, and the professional development infrastructure to do it well. The reform of assessment is also, unavoidably, a reform of what it means to teach.

The key area where students and institutions must find common ground is this: the goal is not to keep AI out of education. The goal is to ensure that AI does not become a substitute for the development of the human mind. Every assessment reform, every pedagogical innovation, every regulatory intervention should be measured against that standard. Does it produce a student who can think? Or does it produce a student who can manage a tool that thinks for them? The first produces a graduate. The second produces a dependency. What emerges from this analysis is not a marginal problem of academic misconduct, but a systemic breakdown in how higher education defines, measures, and certifies learning.
The integration of artificial intelligence has not created this fragility, but has made it visible and impossible to ignore. Any meaningful response must therefore move beyond surface-level regulation toward a reconstruction of assessment itself, where the demonstration of understanding becomes the irreducible condition for academic credit. Without such a shift, the gap between what education claims to certify and what students are actually able to demonstrate will continue to widen, with consequences that extend far beyond the university.

2. What is your critical analysis of this situation?

Students across Indian higher education are using generative AI at every stage of their academic work. They use it to find sources, structure arguments, summarise material, improve their language, and in many cases generate entire drafts of assignments. This is not limited to a few students at elite institutions. Most students today report using AI tools, and the scale of the problem is unlike anything academic institutions have faced before. It goes beyond mere cheating: the consequences are serious for both students and society at large. In many cases, AI begins to substitute for, rather than support, higher-order thinking, ideation, and independent reasoning. Faculty are also far behind in responding. Some have begun using AI detection software, while others have added brief AI policy statements to their syllabi. Most are operating without any clear institutional guidance on what counts as acceptable use, how to evaluate submissions differently, or how to handle suspected misuse. The gap between how students are using AI and how institutions are responding to it is wide and getting wider. At the regulatory level in India, this gap is even more visible. UGC's 2018 plagiarism regulations remain the governing standard with no formal update to address AI-generated content, even as institutions across the country are clearly grappling with the problem.
The All India Council for Technical Education (AICTE), which regulates technical institutions in India, has moved faster by treating unacknowledged AI use as a form of plagiarism. However, the University Grants Commission (UGC), which governs the broader university sector in India, has not adopted a similar approach. This means a student at an engineering college faces a different accountability standard than a student studying law or commerce at a state university. The regulatory landscape is fragmented, and that fragmentation sends a signal to institutions that deferring action is acceptable.

This fragmentation is further compounded by disparities in AI literacy across institutions. While some universities have begun integrating AI into coursework through workshops, training sessions, and guided use in research and writing, many others have not. As a result, students are unevenly equipped not only in their subject knowledge, but also in their ability to effectively use and critically engage with AI tools. The gap is therefore not limited to access, but extends to differences in exposure, training, and familiarity with AI-assisted workflows. This creates an uneven playing field in which some students develop the ability to use AI as an aid to thinking, while others remain either dependent on it or excluded from its effective use altogether.

The Student Has Quietly Checked Out

There is a behavioural shift happening inside classrooms that does not show up in submission data or detection reports. Students are disengaging from the intellectual work of their degrees. When any assignment can be resolved in ten minutes with the right prompt, the incentive to actually sit with a problem, struggle through it, and build understanding from that struggle disappears. Conventional assignments (essays, case analyses, research papers, problem sets) were already easy to game before AI arrived. AI has simply removed the last remaining friction.
The student does not need to understand the material to produce a document that looks like they do. And once that is true, many students stop trying to understand it at all. This is not laziness in the way the word is usually used. It is a rational response to a broken incentive structure. If the grade comes from the document and the document can be generated, then engaging deeply with the subject has no practical payoff. Students are not becoming less capable because they lack ambition. They are becoming less capable because the system has stopped asking them to be capable. Assessment was intended to create the intellectual pressure through which learning occurs. Without that pressure, the learning does not happen.

The consequences of this are already visible in classrooms. Faculty across disciplines report students who cannot explain the arguments presented in their own submissions or articulate the reasoning behind them, cannot answer basic follow-up questions on topics they have just been assessed on, and struggle to engage in any discussion that requires thinking on their feet. Higher-order thinking skills (the ability to analyse, question, challenge assumptions, and construct original positions) are not being developed, because the conventional assignment structure never really demanded them in the first place. AI has simply made that pre-existing failure impossible to ignore.

What is genuinely concerning is the long-term trajectory. A student who outsources three years of academic thinking to a language model graduates with a credential but without the intellectual muscle that credential is supposed to represent. They have not learned how to sit with complexity, tolerate ambiguity, or reason their way through a problem they have not seen before. These are not soft skills. They are the foundation of professional competence in every field, from medicine to law to engineering to public policy. The degree says one thing. The graduate can do another.
That gap will not stay inside the university. It will show up in every institution and industry that hires from it.

What issues and tensions does this create? What transformations does it lead us to consider?

1. Assessment has stopped working as a measurement tool

A submitted essay is supposed to show that a student can construct an argument. A research paper is supposed to demonstrate their capacity for independent inquiry. When a machine can produce that same output in minutes, the instrument stops measuring what it was designed to measure. Generative AI allows students to complete assessment tasks without demonstrating genuine capability, and most institutional responses focus on communicating rules to students rather than redesigning how assessments actually work. A policy that tells students how much AI they may use does not fix the underlying problem. The assessment still does not tell you whether the student understood anything.

2. Detection as a process has already failed as a response

We feel that institutions reached for detection as their first answer, but it has not held up. Paraphrasing attacks, where words are replaced with synonyms, can bypass several detectors, and tools specifically designed to make AI text undetectable are widely available online. More fundamentally, AI detection produces probabilistic estimates that do not meet the evidentiary standard required for academic misconduct proceedings. This creates a genuinely unfair situation. An institution that uses detection scores to penalise students risks punishing students who did not use AI while failing to catch students who did. It is technically unreliable and also procedurally indefensible. AI models are also updated continuously, and each update makes generated text harder to flag. Detection is in a permanent race it cannot win.
We also feel that the assignment becomes an exercise in AI-detection evasion rather than an active engagement with the true spirit of the work that was assigned.

3. The professor-student relationship has also eroded

As student numbers in Indian universities grow, institutions have shifted toward scalable written assessments because they were easier to administer. Tutorial-based learning, oral examination, structured dialogue between a student and their professor, and the Socratic tradition of learning through questioning were all progressively sidelined. AI did not directly cause this erosion, but it has made the cost of it visible and urgent. The Socratic method is the most resilient form of assessment available right now precisely because it cannot be delegated to a machine. A student who submitted AI-generated work cannot, in real time, defend the reasoning behind it, respond to a challenge, or apply the argument to a new scenario. Research shows a significant positive correlation between exposure to Socratic questioning and performance on higher-order reasoning tasks. Socratic dialogue moves students from basic recall toward analysis, synthesis, and evaluation, with measurable improvements in their performance on assessments requiring critical thinking. The conversation between a student and a faculty member is not a relic. It is the one form of evaluation AI genuinely cannot replace.

4. Equity is more complicated than it looks

AI tools are often described as democratising because they are widely available. This framing is too simple. A student at a well-resourced private institution with access to a premium AI model is not using the same tool as a student at a state college relying on a free-tier platform with unreliable internet. The gap is not just about access to the tools. It is also about the capacity to use them well.
Evaluating AI output critically, identifying its errors, and building genuine reasoning on top of it are skills that track closely with the quality of a student's prior schooling. Research on AI policy in universities identifies disparities in tool access and uneven enforcement as among the most prominent student-facing harms of poorly designed integration frameworks. A framework that permits AI use without equalising the conditions of that use does not reduce inequality. It formalises the advantage of those who were already ahead.

5. AI companies have interests that institutions are not accounting for

The adoption of AI tools in academic settings is not a neutral pedagogical choice. It is entry into a commercial relationship with companies whose business models depend on data accumulation and user acquisition. Student submissions, reasoning patterns, and academic work are commercially valuable training data. The Digital Personal Data Protection Act 2023 is directly relevant to platforms processing student data, but the subordinate regulations needed to govern AI use in educational contexts have not yet been developed. When UGC or BCI adopts an AI integration framework without accompanying data governance requirements, it does not just permit this dynamic. It gives it institutional legitimacy at scale.

6. The degree is losing its signal value

All of these tensions converge on one outcome. When grades increasingly reflect AI-assisted output rather than a student's own reasoning, employers, postgraduate institutions, and professional bodies lose the ability to trust what a degree actually represents. Faculty have already begun resorting to informal viva-style conversations to check whether a student actually understands what they submitted, precisely because submitted work alone no longer tells them. That informal workaround points toward the structural reform the system needs. The problem is that it has not yet been formalised, standardised, or scaled.
The core tension in this situation is the gap between what assessment is supposed to do and what it is currently doing. Assessment is supposed to verify that a student has genuinely understood something and can use that understanding independently. What it is currently doing is verifying that a student has submitted a document. Those two things are no longer the same. That is the situation that demands a structural response.

3. What perspectives were discussed and how were they debated or arbitrated within your team?

The discussion within the team was structured around two competing paradigms: integration as regulation and prohibition as equalisation. Rather than treating these as mutually exclusive policy preferences, the deliberation focused on identifying which framework more accurately responds to the structural transformation introduced by generative AI in higher education.

1. Integration as a Structural Necessity

One line of argument within the team positioned AI integration not as a policy choice but as a descriptive necessity. This perspective emphasised that generative AI tools such as GPT-4 have already entered the epistemic core of academic work, with students using them across research, drafting, and analytical tasks. From this standpoint, prohibition was characterised as normatively ineffective because it attempts to regulate a practice that is already widespread yet difficult to detect or enforce. This position further argued that integration enables the reconstruction of assessment frameworks, shifting evaluation away from output-based metrics toward process-based and performance-based verification. Mechanisms such as staged submissions, reasoning disclosures, and viva voce examinations were seen as restoring the evidentiary value of assessment by directly testing understanding rather than relying on written artefacts that AI can replicate.
The argument was therefore not merely in favour of AI use, but in favour of restructuring academic evaluation around demonstrable cognition.

2. Prohibition as a Claim to Equity

A competing perspective within the team advanced the claim that prohibition offers a more equitable framework by creating a uniform baseline. This argument rested on the observation that access to AI is stratified across three axes: tool quality, AI literacy, and institutional capacity. Premium models outperform free-tier tools, students differ in their ability to critically engage with AI outputs, and the capacity to evaluate AI-assisted work is uneven across institutions. From this perspective, integration, particularly when paired with disclosure requirements, was seen as insufficient because it renders inequality visible without correcting it. The concern was that formalising AI use risks entrenching advantage for already well-resourced students, while overburdened faculty in large public institutions lack the capacity to meaningfully assess disclosures or distinguish between assisted and independent work.

3. Contestation on the Question of Equity

The central point of debate between these perspectives was the meaning of equity in an AI-mediated academic environment. The prohibition-oriented view treated equity as formal uniformity, where identical rules apply irrespective of background conditions. In contrast, the integration-oriented view defined equity as substantive fairness, requiring that disparities be made visible and subject to institutional correction. The team's internal arbitration resolved this tension by recognising that prohibition does not, in fact, eliminate unequal advantage. Instead, it pushes AI use into informal and unregulated domains, where differences in access and capability continue to shape outcomes without institutional oversight.
Integration, while not eliminating inequality, was seen as creating the only viable framework within which disparities can be addressed through standardised access, AI literacy training, and redesigned assessment methods.

4. Commercial and Data Governance Concerns

Another axis of debate concerned the commercial implications of AI integration. One strand of argument highlighted that integrating AI tools effectively places institutions in vendor relationships with private companies, raising concerns about data extraction, consent, and monetisation. The example of Turnitin was invoked to demonstrate how educational platforms can accumulate and commercially leverage student-generated data without meaningful transparency or compensation. This concern was not dismissed but reframed within the integration model. The counter-position argued that regulatory leverage is only possible under conditions of formal integration. Legal frameworks such as the Digital Personal Data Protection Act 2023 can only be operationalised when institutions explicitly recognise and govern AI use. Prohibition, by contrast, was seen as relinquishing institutional control and leaving students subject to private contractual regimes without oversight.

5. Pedagogical and Professional Alignment

A further dimension of the discussion focused on the purpose of higher education, particularly within professional fields such as law. One perspective cautioned that AI risks displacing higher-order thinking, leading to intellectual dependency. However, the integration-oriented response reframed this concern by locating the problem in assessment design rather than tool availability. The team ultimately converged on the view that integration allows institutions to align academic training with professional reality, where AI is already embedded in practice.
The objective shifts from restricting tool use to developing the capacity to critically engage with AI outputs, thereby preserving the centrality of human reasoning while acknowledging technological change.

6. Final Position and Internal Resolution

The internal debate concluded with a clear preference for the integration framework, not because it is free from limitations, but because it offers a structurally responsive model. Prohibition was recognised as normatively appealing in its promise of equality but practically unsustainable and analytically insufficient in addressing the core problem: the breakdown of assessment as a measure of understanding. Integration, by contrast, was adopted as the team's position on the grounds that it:

- aligns policy with the empirical reality of AI use;
- enables reconstruction of assessment mechanisms;
- provides a framework for addressing, rather than obscuring, inequality; and
- retains institutional and regulatory control over data governance and commercial engagement.

The debate within the team was therefore not resolved by dismissing opposing concerns, but by demonstrating that those concerns are better addressed within an integration framework than through prohibition.

4. What contribution are you proposing, and under what conditions could it be implemented?

Our proposal is the Situated Learning Assessment Model (SLAM): a governance framework built on a single premise that the current system has never adequately operationalised, namely that what a credential certifies is not the ability to produce a document, but the capacity to demonstrate genuine understanding. SLAM does not modify how submissions are structured, staged, or disclosed. It moves evaluation away from submissions entirely, and rebuilds it around in-context, application-based demonstration of understanding, calibrated to the subject being studied, the year of the student's development, and the intellectual direction the student has chosen.
This distinction matters because every reform that has attempted to work within the submission model has failed on the same ground. Staged drafts can be staged with AI. Annotated bibliographies can be annotated with AI. Disclosure logs record what a student says they did, not what they actually did. The submission, in whatever form it takes, remains an artefact, and an artefact can be generated. SLAM's departure from the existing framework is therefore not incremental. It is structural. The unit of evaluation shifts from what a student produces outside the room to what a student can demonstrate inside it.

The Organising Logic: Start From the Learner, Not the Framework

Most evaluation reform in higher education follows the same sequence: design a universal framework, then insert flexibility into it. SLAM inverts this. It begins from the learner, including their year of study, their subject, and their chosen area of study, and builds the evaluation method upward from there. The result is not a single system applied uniformly across a degree programme. It is a principled architecture within which different subjects, different years, and different learners are assessed through mechanisms genuinely suited to what they are being asked to demonstrate. This is not administrative complexity for its own sake. It reflects a reality that uniform evaluation frameworks have always obscured: the intellectual demands placed on a first-year student building foundational knowledge of constitutional law are fundamentally different from those placed on a fifth-year student analysing the contested boundaries of technology regulation. Asking them to demonstrate that understanding in the same format is not fairness. It is a refusal to take either of their learning seriously. The same logic applies to language.
A student who received their foundational education in Hindi, Tamil, or Bengali and now reasons through complex legal problems in English is not demonstrating lesser understanding when they hesitate over formal academic register: they are navigating a medium that was never designed around how they think. SLAM addresses this not through remediation but through structural design:

- Before enrolment, institutions map the medium of prior schooling for each student, identifying where a gap between schooling language and instruction language exists.
- In the first semester, all students, not only those flagged, participate in academic discourse coaching through credited or co-curricular workshops on formal academic conversation, designed as universal preparation rather than a deficit intervention.
- Before any graded viva, faculty run low-stakes practice sessions that normalise the format across the cohort, removing the compounding disadvantage of unfamiliarity for students with less prior exposure to formal oral academic settings.
- Throughout the programme, peer mentoring pairs connect senior and junior students across language backgrounds, building oral academic communication as a shared institutional culture rather than an individual burden.
- Within the viva itself, students may initially articulate their reasoning in their first language before self-translating, because the argument, not the language of its first expression, is what is being assessed, and faculty training must explicitly reflect this.
- Where language anxiety remains significant, viva questions are made available thirty minutes before the session, removing vocabulary retrieval pressure without reducing the spontaneity of reasoning being tested.
- After every viva, faculty are required to give feedback that separates content understanding from expression quality, so that students can distinguish what they know from how fluently they currently articulate it, and develop both on their own terms.
The end goal is not simply a transition to AI integration in academics; it is an overhaul of the existing structures to better adapt to, and help students utilise and learn from, a new pedagogy.

Differentiation by Subject Type

SLAM classifies subjects into three broad categories, each requiring a distinct evaluative logic.

Foundational and theory-based subjects, which establish the conceptual vocabulary and doctrinal architecture of a discipline, are best assessed through in-classroom application tasks. A student is given a problem, a scenario, or a set of facts they have not seen before and is asked to reason through it in real time, in the room, without external assistance. The task is not designed to test memory. It is designed to test whether the student can take what they know and use it: whether they can identify what principle applies, why it applies, and what it requires in this specific situation. This format is structurally resistant to AI substitution not through detection but through design. The reasoning happens in the student's mind, in the present moment. It cannot be produced in advance and carried into the room.

Applied and clinical subjects, which concern professional competence and disciplinary practice, demand evaluation through performance: moot court exercises, case simulations, structured problem-solving under observation, live client scenarios. These formats make understanding visible through action. A student cannot delegate a live advocacy performance. A student cannot outsource the reasoning a clinical scenario demands in real time. The assessment instrument is the performance itself, and performance, by definition, cannot be generated in advance.

Elective and specialisation subjects, which a student selects on the basis of intellectual interest or professional direction, require a different kind of evaluation altogether.
Here, the appropriate mechanism is structured faculty-led inquiry conducted within the classroom or in a defined academic setting, where the student is asked to engage with the material in the specific domain they have chosen. A student who has elected to focus on technology law should be assessed on their capacity to reason within that field. This need not be done through a submitted essay, but through a directed intellectual exchange in which the faculty member probes the depth and independence of their thinking in real time.

Differentiation by Year of Study

Across all subject types, SLAM requires that evaluation design account explicitly for where a student is in their intellectual development.

In the early years of a degree programme, the evaluative priority is conceptual grounding. The student needs to demonstrate that they have genuinely internalised the foundational logic of their discipline, not just produced sophisticated arguments about it. In-class application tasks at this stage should be designed around core knowledge: can this student identify the relevant principle, apply it correctly, and explain why it leads to the outcome they have reached? The question is not complex. The test is whether the answer is genuinely theirs.

In the middle years, the priority shifts to analytical development. Students are beginning to form positions, engage with competing arguments, and reason across different areas of doctrine. Evaluation at this stage should require the student to do more than apply a rule correctly: it should require them to explain why one position is stronger than another, what the argument they find most persuasive cannot account for, and where the reasoning breaks down under pressure. These are precisely the questions AI can simulate in a document but cannot answer when a faculty member pursues them in real time.

In the final years, the priority is intellectual ownership.
A student who has spent years building a degree should be assessed not just on what the discipline requires of everyone, but on what they have made of it. SLAM proposes that final-year evaluation operate on two tracks simultaneously. The first track tests shared disciplinary competence: the core knowledge and analytical capability that every graduate of the programme should be able to demonstrate, regardless of their chosen direction. The second track is learner-specific: it opens up around the student's chosen elective focus, their area of specialisation, the intellectual territory they have elected to develop. A student interested in technology law is assessed on their capacity to reason within that field. A student focused on real estate law is assessed on theirs. The evaluation instrument in both tracks is the same, namely in-context, real-time demonstration of understanding, but the content of what is being demonstrated reflects the actual intellectual work the student has done.

This dual-track model is not a concession to student preference. It is a recognition that genuine intellectual engagement is most verifiable when the evaluation is directed at what the student actually knows, rather than at a generalised syllabus that no individual student has engaged with identically. A student who has spent a year thinking seriously about technology regulation will demonstrate their analytical capability more authentically when assessed within that domain than when assessed on a topic they have only superficially encountered.

The Fairness Question

The central challenge this framework must answer honestly is also the one it cannot afford to evade: how do you justify giving two students different evaluation formats and still claim the resulting grades mean the same thing? SLAM resolves this through a distinction between format and standard.
The format of assessment, meaning the specific task, scenario, or inquiry through which a student demonstrates understanding, is variable and context-responsive. The standard against which understanding is evaluated, meaning the cognitive depth, analytical rigour, and disciplinary competence expected at a given year level, is fixed and institutionally defined. Regulatory guidance from UGC, NAAC, and BCI must articulate year-level competency standards for each discipline, specifying what a second-year law student should be able to demonstrate and what a fourth-year student should be able to demonstrate, irrespective of the format through which they demonstrate it. Two students assessed through different formats are not being held to different standards. They are reaching the same standard through paths appropriate to their subject and their stage of development. Different roads to the same destination is not inequity. It is a more honest account of how genuine understanding is diverse in its expression.

There is a second fairness concern the framework must address directly. In-classroom application tasks and real-time faculty-led inquiry can disadvantage students who process more slowly, who experience anxiety in high-pressure academic settings, or whose prior schooling did not prepare them for this mode of intellectual engagement. SLAM does not treat these as edge cases. Faculty development must include explicit training in designing application tasks that allow for variation in response pace and style without reducing the rigour of what is being tested. The goal is demonstrated understanding. The format is the means to that goal, not the goal itself, and where a particular format creates an inequitable barrier to demonstrating genuine understanding, the institution has an obligation to address it.
Conditions for Implementation

SLAM is realistic under the following conditions:

- regulatory definition of discipline-specific, year-level competency standards by UGC, NAAC, and BCI;
- institutional investment in faculty development for application-based and inquiry-led evaluation design;
- equity audits conducted at each phase of implementation to identify where differentiated formats are producing unintended disadvantage; and
- compliance with the Digital Personal Data Protection Act 2023 governing any digital tools used within the evaluation process.

Implementation must be phased. In the first year, pilot institutions implement subject-type differentiated evaluation in two disciplines, with mandatory outcome reporting. In the second year, the framework extends across all year levels at pilot institutions, with equity audits conducted before further expansion. In the third year, regulatory guidance enables national rollout with institutional adaptation flexibility built in: the framework's logic scales, but its specific mechanisms must be calibrated to disciplinary and institutional context.

Limitations

SLAM's most significant constraint is institutional capacity. Designing and facilitating in-classroom application tasks and real-time inquiry-based evaluation requires a different set of skills from grading a submitted essay, and at institutions where faculty manage cohorts of over a hundred students, the resource demands are genuine. The phased implementation timeline and equity audit requirements exist precisely to surface where the framework is straining before expansion proceeds. These are constraints the proposal does not pretend to resolve in advance. They are constraints that honest implementation must confront directly.

What This Asks of Higher Education

SLAM does not ask higher education to keep AI out of the learning process.
It asks higher education to build an evaluation architecture in which AI's presence in that process is simply irrelevant to the question the assessment is designed to answer. The question is not whether a student has produced a document that appears to demonstrate understanding. It is whether the student can, in that moment, demonstrate that understanding themselves. A graduate who can answer that question across their foundational subjects, their applied practice, and their chosen area of specialisation is a graduate whose credential means something. That is what the degree is supposed to certify. That is what SLAM is designed to make possible.