
7 tips to close the AI trust gap in testing

Sean Gasperson

September 6, 2024

We are at a pivotal moment in the adoption of artificial intelligence (AI) tools in assessment. AI innovations hold enormous potential across the assessment lifecycle – from new job analysis methodologies and standard setting in test design, to new item types and AI classifications in test development, to personalized practice tests and AI-enhanced proctoring options in test delivery.

However, our industry is still emerging from a period of distrust around the use of AI, largely born from the rapid and, at times, ill-considered adoption of fully automated remote proctoring during the pandemic. This legacy, combined with high-profile stories about the impact of AI models built on unrepresentative and biased data, means we have some work to do to close the AI trust gap in testing.

The adoption of AI tools in assessment carries many ethical and practical implications, but taking steps in seven key areas will help testing organizations close the gap.

1. Transparency and explainability

Building trust with AI starts with transparency and explainability. Clearly communicate to test takers and stakeholders how AI is being used throughout the assessment lifecycle, including the role AI plays in test content development, AI-enhanced test delivery, and assessing test outcomes, as well as what data is collected and how it is processed. Transparency demystifies AI, making it feel less like a ‘black box’ and more like a supportive tool.

Explainability is equally important. Testing bodies should ensure that decisions made by AI – such as flagging unusual test taker behaviors or scoring responses – can be understood and justified. Use language that’s accessible to non-experts, outlining how AI helps maintain test security, fairness, and accuracy.

When people understand how AI works in testing, they’re more likely to trust that it’s being used ethically and fairly. Openness in these areas reassures test takers that AI is a tool for enhancing – not compromising – the integrity of the certification or license they’ve worked so hard to earn.

2. Human oversight and accountability

While AI can enhance efficiency and accuracy, it should never operate without human intervention, especially in high-stakes environments that might affect a test taker’s future. Testing organizations should implement clear protocols around when and how human reviewers will intervene in AI-driven processes, such as reviewing flagged anomalies or finalizing test outcomes.

Accountability mechanisms also need to be established, ensuring AI decisions can be traced and explained. This includes documenting the rationale behind AI-generated outcomes and providing channels for test takers to challenge or appeal decisions.

Emphasize that human experts are always involved in critical decisions, ensuring the final judgment is both fair and contextually appropriate. By blending AI with human expertise, testing organizations will foster a more balanced, trustworthy system where AI serves as an aid rather than a sole decision-maker.
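To make this concrete, here is a minimal sketch of the kind of audit trail and human-review gate described above, written in Python. The event fields, flag descriptions, and workflow names are illustrative assumptions rather than a reference to any particular proctoring or scoring system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class FlaggedEvent:
    """One AI-generated flag, with space for a documented human decision."""
    session_id: str
    ai_flag: str                  # e.g. "unusual gaze pattern"
    ai_rationale: str             # plain-language reason the AI raised the flag
    flagged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    reviewer_id: Optional[str] = None
    reviewer_decision: Optional[str] = None   # "upheld", "dismissed", or None (pending)
    reviewer_notes: Optional[str] = None

def record_human_review(event: FlaggedEvent, reviewer_id: str,
                        decision: str, notes: str) -> FlaggedEvent:
    """Attach a human reviewer's decision and rationale to an AI flag."""
    event.reviewer_id = reviewer_id
    event.reviewer_decision = decision
    event.reviewer_notes = notes
    return event

def can_finalize_outcome(event: FlaggedEvent) -> bool:
    """No test outcome is finalized until a human has reviewed the AI flag."""
    return event.reviewer_decision is not None

# Usage: an AI flag alone never determines the result.
event = FlaggedEvent("session-123", "unusual gaze pattern",
                     "Gaze left the screen for much of section 2")
assert not can_finalize_outcome(event)
record_human_review(event, "reviewer-07", "dismissed",
                    "Test taker used approved scratch paper on desk")
assert can_finalize_outcome(event)
```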

3. Bias mitigation

AI systems, like any tool, are only as fair as the data and algorithms they rely on. No matter how advanced the technology, flawed source data leads to flawed outputs. With this in mind, start by rigorously evaluating and curating the data used to train your AI models, ensuring it represents a diverse range of test taker demographics. Regular audits of AI outputs are also crucial to identify and correct any unintended biases that could disadvantage certain groups.

Collaborating with diverse experts, including psychometricians and AI ethics experts, can further help in designing AI systems that promote equity. It’s also important to openly share the steps taken to address bias with stakeholders, demonstrating a commitment to fairness. By proactively managing bias, testing organizations will reassure test takers that AI is being used to enhance fairness and inclusivity, rather than perpetuate existing inequities.
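As one illustration of what a routine bias audit can look like, the short Python sketch below compares pass rates across demographic groups and applies the four-fifths (80%) rule, a common screening heuristic in employment and credentialing testing. The data and group labels are invented for the example; a real audit would draw on your own operational data and psychometric review.

```python
from collections import defaultdict

def pass_rates_by_group(results):
    """results: iterable of (group_label, passed_bool) pairs."""
    passed, total = defaultdict(int), defaultdict(int)
    for group, did_pass in results:
        total[group] += 1
        passed[group] += int(did_pass)
    return {g: passed[g] / total[g] for g in total}

def adverse_impact_flags(rates, threshold=0.8):
    """Flag groups whose pass rate falls below `threshold` times the
    highest group's pass rate (the four-fifths rule)."""
    reference = max(rates.values())
    return {g: r / reference for g, r in rates.items() if r / reference < threshold}

# Illustrative data: (demographic group, passed)
results = ([("A", True)] * 80 + [("A", False)] * 20 +
           [("B", True)] * 55 + [("B", False)] * 45)
rates = pass_rates_by_group(results)
print(rates)                         # {'A': 0.8, 'B': 0.55}
print(adverse_impact_flags(rates))   # {'B': 0.6875} -> below 0.8, needs investigation
```

A flag from a check like this is a prompt for expert review, not a verdict; psychometricians would still need to investigate whether the disparity reflects the model, the items, or the sample.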

Read our guide to diversity, equity and inclusion across the assessment lifecycle.

4. Robust testing and validation

Before deployment, AI models must undergo rigorous testing under various scenarios to assess their performance across different populations and conditions. This includes stress-testing the AI with edge cases and diverse test taker profiles to ensure consistent results.

Validation doesn’t stop at initial implementation. Ongoing monitoring and recalibration are essential to address any emerging issues or shifts in data patterns. To increase robustness, testing organizations can also engage independent experts to review and validate AI models, providing an additional layer of scrutiny. Again, transparency about these validation processes is key – communicate the thoroughness of your approach to stakeholders, reinforcing their confidence in the AI’s reliability.
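For example, ongoing monitoring can include a simple drift check on score distributions. The Python sketch below uses the Population Stability Index (PSI) to compare a baseline window against recent results; the bin edges, sample data, and alert threshold are illustrative assumptions, not recommended values.

```python
import math

def psi(baseline, current, bin_edges):
    """Population Stability Index between two score samples binned on the
    same edges: sum((p_cur - p_base) * ln(p_cur / p_base)) over bins."""
    def proportions(scores):
        counts = [0] * (len(bin_edges) - 1)
        for s in scores:
            for i in range(len(bin_edges) - 1):
                if bin_edges[i] <= s < bin_edges[i + 1]:
                    counts[i] += 1
                    break
        n = max(sum(counts), 1)
        # Small floor avoids division by zero for empty bins.
        return [max(c / n, 1e-4) for c in counts]

    p_base = proportions(baseline)
    p_cur = proportions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(p_base, p_cur))

# Usage: trigger a recalibration review when drift exceeds a threshold.
edges = [0, 20, 40, 60, 80, 101]           # score bins on a 0-100 scale
baseline_scores = [72, 65, 80, 58, 90, 75, 68, 83, 77, 61]
recent_scores = [45, 52, 60, 38, 55, 49, 63, 41, 57, 50]
drift = psi(baseline_scores, recent_scores, edges)
if drift > 0.25:                           # common rule-of-thumb alert level
    print(f"PSI = {drift:.2f}: score distribution has shifted, review the model")
```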

5. Ethical governance and oversight

Individual testing organizations and the testing industry as a whole need to establish clear ethical guidelines that govern how AI is developed, deployed, and monitored throughout the assessment process. To aid in this effort, industry-leading associations such as the Association of Test Publishers (ATP) have published guidelines for organizations seeking to establish policies around AI implementation and use.

For organizations, this might include creating an ethics board or committee responsible for overseeing AI practices, ensuring they align with broader ethical standards and respect for test taker rights. These bodies should regularly review AI systems to ensure they are not only compliant with legal requirements but also ethically sound, considering the potential impact on all stakeholders. Whether it is through an ethics board or another governance structure, there needs to be a mechanism in place that continues to ask three key questions: Is it legal? Is it ethical? Is it moral?

Open communication about these governance structures is crucial – clearly articulate your commitment to ethical AI use, including how you handle sensitive data, mitigate risks, and protect the interests of test takers.

6. Stakeholder engagement and education

It’s important to proactively involve key stakeholders, including Subject Matter Experts (SMEs) and test takers, in discussions about AI integration. By seeking input and addressing concerns early in the process, organizations will build a sense of collaboration and transparency.

Education plays a crucial role in demystifying AI. Providing clear, accessible resources that explain how AI is used in testing, the benefits it offers, and the safeguards in place will help to alleviate fears and misconceptions. Offering workshops, webinars, or informational sessions tailored to different stakeholder groups can further enhance understanding and trust. For example, you might include a section in your SME training on how generative AI tools such as ChatGPT should and shouldn’t be used in the item writing process.

By actively involving stakeholders and equipping them with the knowledge they need, testing organizations not only foster trust but also empower stakeholders to confidently engage with AI-enhanced tests. This collaborative approach ensures all parties feel informed, respected, and included in the evolution of testing practices.

Listen to our podcast exploring the future of AI in assessment with Kara McWilliams.

7. Continuous improvement and adaptation

The landscape of AI in testing is rapidly evolving, making it crucial for testing organizations to stay ahead of new developments and challenges. This means regularly updating AI systems to incorporate the latest advancements and best practices, ensuring they remain accurate, fair, and secure.

Feedback loops are key to this process. Organizations should actively gather input from test takers and other stakeholders to identify areas for improvement. Additionally, ongoing research and development efforts should focus on refining AI models, addressing any emerging biases, and enhancing overall system performance.

By committing to a culture of continuous improvement, testing organizations will demonstrate their dedication to maintaining the highest standards in AI-powered assessment. This proactive approach not only strengthens the reliability of their services but also reinforces stakeholder confidence in their ability to adapt to the future needs of the industry.

Read our guide to quality assurance in testing.

Constant questioning

Despite the immense potential for creating efficiencies, AI still lacks the consistency, depth, and reasoning needed to address the complex needs of our industry. That’s why it’s important to be clear with our test takers and other stakeholders that when it comes to testing, AI is capable of augmenting humans but not replacing them.

In addition, we need to clearly communicate our ongoing commitment to constantly monitor the impact and integrity of the tests we deliver. And this includes constantly evaluating the validity and accuracy of any AI predictions. Equally, our use of AI must always respect individual rights and privacy, avoiding unintended harm. These are shared goals and challenges for the testing industry – we need to work together to close the AI trust gap in testing.

