Let’s get straight to the point: evaluating your AI systems is a necessity. It’s as simple as that. We’ve chosen to implement a framework for assessing the quality of AI systems in our projects, based on my work over the last few years with ISO standards. But don’t worry, it’s not black magic, nor is it a bottomless pit.
As you know, at Moov AI, everything we do is concrete. We don’t build castles in the air. This is also the case when it comes to delivering quality AI systems. That way, we avoid too much red tape and keep it practical!
Why Moov AI decided to implement a framework for assessing the quality of AI systems
We firmly believe that artificial intelligence can transform businesses, dramatically improving their efficiency and boosting their productivity and competitiveness. However, for this transformation to be successful, it is crucial to ensure that AI systems meet the highest quality requirements.
This is why we decided to implement a framework for assessing the quality of AI systems. In this article, I explain the reasons that led us to make this decision. You will also find the different stages of our framework and the benefits your company can gain from applying these principles.
AI system quality: an unavoidable necessity
The quality of AI systems is an unavoidable necessity. I obviously have a bias given my role as chair of the Canadian ISO/IEC delegation on artificial intelligence. I’ve been involved in creating standards since 2020 to guarantee the reliability and robustness of AI systems.
The quality of an AI system is defined by the degree to which the system’s features (software, AI model and data) meet customer requirements. There are many ways of estimating quality, but it is often very complex since it involves an analysis of all of an organization’s processes in creating and using AI.
I prefer to use a much more tactical notion, that of simply evaluating system requirements. This more tactical framework is called SQuaRE, which stands for Systems and software Quality Requirements and Evaluation. Roughly speaking, the ISO standards define around 11 main quality characteristics and propose methods for assessing how well each one is met, using techniques such as testing and benchmarking.
In non-technical language, here’s how we go about assessing system quality.
The steps in our framework for assessing the quality of AI systems
Our approach to assessing the quality of AI systems breaks down into several key steps:
1. Define the most important features
The first step is to define the key features for a project from the outset. This can be done at a workshop or even before the project is launched. The characteristics defined can include aspects such as:
- Achievement of functional expectations: ensure that all functionalities required by the customer are present and operational.
- Performance: evaluate the speed and efficiency of the AI system in performing tasks.
- Compatibility: check that the AI system works correctly with other systems and environments.
- Security: ensure that the AI system protects data and information against unauthorized access and cyber-attacks.
- Maintainability: ensure that the AI system can be easily updated and modified to correct errors and improve performance.
- Reliability: measure the ability of the AI system to operate without failure over a given period.
- Usability: assess the ease with which users can interact with the AI system.
- Portability: verify that the AI system can be transferred and used in different environments without requiring major modifications.
- Efficiency: measure the use of resources (time, memory, etc.) by the AI system to accomplish its tasks.
- User satisfaction: gather feedback from users to ensure that the AI system meets their expectations and needs.
- Compliance: check that the AI system complies with current standards and regulations.
2. Generate an evaluation plan
Once the main features have been defined, we generate an evaluation plan. This plan details the activities required to evaluate each feature. For example, to evaluate functional completeness, we can use unit tests and integration tests. To evaluate performance, we can use benchmarks and load tests.
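As a rough illustration, an evaluation plan can be written down as a simple mapping from each prioritized characteristic to the activities that will evaluate it. The Python sketch below is only an assumption about how such a plan might be recorded: the names EvaluationPlan, add and uncovered are hypothetical and are not part of our framework or of the ISO standards. It reuses the two examples above and flags any prioritized characteristic that has no activity planned yet.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationPlan:
    """Hypothetical container: characteristic name -> evaluation activities."""
    activities: dict[str, list[str]] = field(default_factory=dict)

    def add(self, characteristic: str, *activities: str) -> None:
        # Append activities to the characteristic, creating the entry if needed.
        self.activities.setdefault(characteristic, []).extend(activities)

    def uncovered(self, prioritized: list[str]) -> list[str]:
        # Characteristics that were prioritized but have no activity planned yet.
        return [c for c in prioritized if not self.activities.get(c)]


# Characteristics chosen in step 1 (illustrative selection only).
prioritized = ["functional completeness", "performance", "security"]

plan = EvaluationPlan()
plan.add("functional completeness", "unit tests", "integration tests")
plan.add("performance", "benchmarks", "load tests")

print(plan.uncovered(prioritized))  # ['security']
```

Checking for uncovered characteristics before the project starts is one simple way to make sure nothing prioritized in the workshop is left without an evaluation activity.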
3. Integrate evaluation activities into the customer backlog
Evaluation plan activities are then added to the customer backlog, either as stories or as conditions of satisfaction (CoS). Adding these items to our process ensures that evaluation activities are integrated into the development process, and that they are carried out at the right time. No AI system should see the light of day without evaluation.
4. Measure each element of the evaluation plan
Finally, each element of the evaluation plan is measured at the appropriate time. The results of these measurements are analyzed to identify the strengths and weaknesses of the AI system. Weak points are then corrected to ensure that the system meets the defined quality requirements.
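To make this step more tangible, here is a minimal Python sketch of comparing measurements against targets and listing the weak points. Everything in it is an assumption made for illustration: the Measurement class, the weak_points helper, the metrics and the numbers are hypothetical and do not come from a real project or from the ISO standards.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Hypothetical record of one measured element of the evaluation plan."""
    characteristic: str
    metric: str
    value: float
    target: float
    higher_is_better: bool = True

    def passed(self) -> bool:
        # A metric passes when it is on the right side of its target.
        if self.higher_is_better:
            return self.value >= self.target
        return self.value <= self.target


def weak_points(measurements: list[Measurement]) -> list[Measurement]:
    # Keep only the measurements that miss their target.
    return [m for m in measurements if not m.passed()]


# Illustrative values only.
results = [
    Measurement("functional completeness", "share of tests passing", 0.98, 0.95),
    Measurement("performance", "p95 latency in seconds", 1.8, 1.0,
                higher_is_better=False),
]

for m in weak_points(results):
    print(f"{m.characteristic}: {m.metric} = {m.value} (target {m.target})")
```

In this example only the performance measurement misses its target, so it is the one that would go back to the team for correction.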
Benefits for our customers
The implementation of this AI system quality assessment framework offers many benefits for our customers:
1. Improved reliability and performance
By systematically assessing the quality of AI systems, we can identify and correct problems before they affect end-users. This improves system reliability and performance, leading to greater customer satisfaction.
2. Risk reduction
Quality assessment also helps to reduce the risks associated with the implementation of AI systems. By identifying weak points early on, we can take steps to correct them before they become problematic.
3. Compliance with international standards
By following the ISO/IEC 25059 and ISO/IEC TS 25058 standards, we ensure that your AI systems comply with international best practices. These standards are also referenced as an assessment methodology by management system standards such as ISO/IEC 42001, and even under certain laws such as the European AI Act. This reinforces the credibility of your solutions and reassures your end-users about the quality of your AI systems.
A crucial step for AI adoption
Implementing a quality assessment framework is a crucial step in ensuring that AI systems meet the highest quality requirements. At Moov AI, we are committed to providing our customers with AI solutions of the highest quality, compliant with international standards and capable of transforming their businesses. By following our framework, our customers can rest assured that their AI systems are reliable, high-performing and ready to meet the challenges ahead.
Moov AI uses generative AI to create this blog.
We’ve used generative AI to speed up the production of this blog.
For the text, Olivier used internal documentation explaining our framework that he had previously created. Using Copilot Work, he rephrased the internal documentation into a blog. Here’s his prompt: “Using the following documentation , as well as what you know about Moov AI and our projects, can you write a the basis of a blog post that summarizes why we decided to implement a framework for assessing the quality of AI systems, as well as defining the different stages of our framework, as well as the benefits for our customers.”
Olivier then enriched the text by hand, adding context (introduction, conclusion, etc.), missing elements, details and his legendary style.
For the images, the marketing team used MidJourney to generate the header image using this prompt: « An abstract illustration used for a header of a blog about a framework for assessing the quality of artificial intelligence systems. Use overlapping translucent geometrical shapes symbolize the interconnected, multi-faceted nature of the framework. Use simple graphics. The mood is bright and optimistic. --ar 2:1 »
Olivier is co-founder and VP of decision science at Moov AI. He is the editor of the international ISO standard that defines the quality of artificial intelligence systems, where he leads a team of 50 AI professionals from around the world. His cutting-edge AI and machine learning knowledge has led him to implement a data culture in various industries.