Automating data quality monitoring at scale through LLMs and causal inference

Coordinator Validio AB
Funding from Vinnova SEK 1 950 000
Project duration May 2024 - March 2025
Status Completed
Venture Ground-breaking technology solutions
Call Groundbreaking and scalable technology solutions in 2024

Important results from the project

The project largely met its goals, delivering significant advances in automated data quality monitoring. Achievements include enhanced ML models for anomaly detection and streamlined user workflows ("one-click" setup), which make data quality checks faster to set up and in turn improve data reliability. Other important results include a major optimisation of the code to enable deeper root cause analysis, and novel LLM-based methods for outlier detection and forecasting.

Expected long term effects

The project's long-term effects include democratizing data quality management by making it accessible to a more diverse range of users. It promotes sustainability by identifying and removing false and unused data, which reduces storage, compute, and carbon emissions. The project seeks to enhance data reliability and reduce bias in data-driven decisions, contributing to fairer outcomes (UN SDGs 5, 10, 12, 13). Additionally, it aims to strengthen Validio's position as a Nordic and European AI leader in this field.

Approach and implementation

The 10-month Agile project was planned in five work packages: AI R&D, integration, UX, testing, and management. The initial focus on custom LLMs for automatic setup and root cause analysis (RCA) was adapted based on pilot feedback. Execution prioritized optimizing existing ML models for anomaly detection and enhancing user workflows (one-click setup, bulk check creation), while LLM research continued for forecasting and description generation. Resource allocation increased in the final months to complete the work packages.

The project description has been provided by the project members themselves and the text has not been looked at by our editors.

Last updated 18 April 2025

Reference number 2024-00505