Hybrid AI
Interactive Inspection: Enhancing the full inspection lifecycle through safe, trustworthy collaboration between inspectors and AI.
At its core, the inspectorate’s role is to assess whether institutions comply with legal and regulatory standards. This can involve on-site inspections (such as visits to ships, companies, or schools), document analysis, or a combination of both. This project focuses on harnessing AI technologies, particularly Large Language Models (LLMs), to support inspectors and analysts in their work. The emphasis is on interactive, hybrid collaboration in which AI augments, rather than replaces, human expertise, enhancing the quality, efficiency, and trustworthiness of inspections.
Key Challenges in this Domain
One of the key challenges for inspectorates is working with sensitive data, which often includes personal or confidential company information. While this does not rule out the use of AI, it does impose significant constraints, particularly in light of the EU AI Act. For example, sending sensitive data to external APIs, such as those hosted by large tech providers, is often neither permissible nor advisable. This project therefore also explores the use of locally deployable AI models that align with legal and ethical requirements. Beyond model capabilities, we examine how well the datasets currently used by inspectorates support these approaches, and whether results from academic research translate effectively to real-world conditions. Finally, a central theme is the human-AI interaction itself: how inspectors engage with AI systems, how their workflows evolve, and how risks such as overreliance can be mitigated, so that humans remain in control and use AI as a tool rather than a replacement for people.
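To illustrate what a locally deployable setup can look like, the sketch below runs a small open-weight instruction-tuned model entirely on local hardware with the Hugging Face transformers library, so no inspection text is sent to an external API. The model identifier, prompt, and document excerpt are illustrative assumptions, not the models or tasks used in this project.

```python
from transformers import pipeline

# Minimal sketch of on-premise inference: the model weights are downloaded once
# and generation then runs entirely on local hardware, so no text leaves the
# inspectorate's own infrastructure. The model identifier is an example; any
# small open-weight instruction-tuned model that fits the hardware would do.
generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",
    device_map="auto",
)

# Placeholder excerpt standing in for a sensitive inspection document.
report_excerpt = "During the visit, the required safety documentation could not be produced."

prompt = (
    "You are assisting an inspector. List possible compliance issues "
    "in the following excerpt:\n\n" + report_excerpt
)

result = generator(prompt, max_new_tokens=200, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])
```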
Research Questions
- How can locally deployable LLMs be used effectively within inspectorates while preserving data privacy and complying with legal frameworks such as the EU AI Act?
- To what extent do existing datasets used by inspectorates support the effective application of LLMs?
- How well do findings from controlled, academic AI research generalize to real-world inspection contexts involving complex documents and operational constraints?
- How does the integration of AI systems into inspection workflows affect human decision-making, task focus, and role perception?
- What are effective interface or interaction designs that help mitigate overreliance on AI while still leveraging its strengths?
- How can trust in AI support systems be calibrated to reflect actual performance, especially when AI suggestions may appear confident but be incorrect?
- What are the measurable impacts of AI assistance on inspection efficiency, accuracy, and consistency across different types of tasks?
- How do inspectors perceive and adapt to hybrid collaboration with AI tools before, during, and after an inspection?
Solutions
The first paper (currently work in progress) explores how language models of varying sizes can collaborate with humans to solve a text-based task. We evaluate both the models' performance and the human user experience during the experiment. The key objective is to identify a level of “good-enough” performance: how much model performance users are willing to trade off while still achieving acceptable results. All models used are deployable either locally or on lightweight infrastructure, such as that available to inspectorates via Microsoft Azure.
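As a rough illustration of how such a size-versus-performance trade-off could be probed, the sketch below runs the same toy text task against locally hosted models of increasing size and reports a simple score per model. The model identifiers, the task, and the scoring function are placeholder assumptions for illustration, not the setup or data of the paper.

```python
from transformers import pipeline

# Illustrative open-weight models of increasing size; not necessarily those
# used in the study, but all runnable locally or on lightweight infrastructure.
MODEL_NAMES = [
    "Qwen/Qwen2.5-0.5B-Instruct",
    "Qwen/Qwen2.5-1.5B-Instruct",
    "Qwen/Qwen2.5-3B-Instruct",
]

# Toy text task as (prompt, expected answer) pairs. A real evaluation would use
# the inspectorate's own task data and task-specific metrics.
TASKS = [
    ("Answer with one word, 'yes' or 'no': is 'The report is missing.' "
     "a complete sentence?", "yes"),
]

def contains_answer(prediction: str, reference: str) -> float:
    """Crude check whether the expected answer appears in the model output."""
    return float(reference in prediction.strip().lower())

for model_name in MODEL_NAMES:
    generator = pipeline("text-generation", model=model_name, device_map="auto")
    score = sum(
        contains_answer(
            generator(prompt, max_new_tokens=10, do_sample=False,
                      return_full_text=False)[0]["generated_text"],
            reference,
        )
        for prompt, reference in TASKS
    )
    print(f"{model_name}: {score / len(TASKS):.2f}")
```

In a study like the one described, a score table of this kind would be complemented by user-experience measurements to locate the point at which a smaller, cheaper model is still "good enough" for inspectors.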
In the second phase, the focus shifts from model performance to the data itself, examining how the quality, structure, and availability of data affect outcomes. The third phase integrates insights from both earlier phases, with the aim of developing an AI assistant that supports inspectors in their tasks in a practical and sustainable way.
Meet the researcher
Lennard Froma
Leiden University
In 2023, I obtained my master’s degree in Artificial Intelligence at the Rijksuniversiteit Groningen. Before joining the AI4Oversight lab in 2024, I worked for one year as a data scientist at the RDI (Rijksinspectie Digitale Infrastructuur), a Dutch inspectorate focused on digital infrastructure (e.g., telecommunication, satellite communication, and cybersecurity).
Results
Inspectorate Use Cases
Description of the use cases that have been executed within this work package
Publications
Check out the publications related to Hybrid AI