EQS AI Benchmark Volume 2: Latest Frontier Models Make Agentic Compliance Workflows a Practical Reality
Second benchmark edition shows major gains in open-ended compliance work, shifting the focus from model choice to real-world deployment
MUNICH, DE / ACCESS Newswire / May 11, 2026 / AI has crossed a practical threshold in compliance & ethics. The EQS AI Benchmark Volume 2 shows that the latest generation of AI models not only improves performance, but can now reliably handle multi-step compliance workflows - a capability that was out of reach just six months ago.
Building on the first volume published in October 2025, EQS Group tested four newly released frontier AI models on the same set of 120 real-world compliance tasks. The updated benchmark, created in collaboration with the German association Berufsverband der Compliance Manager e.V. (BCM), now compares a total of ten leading models, providing a direct view of how the latest generation performs against last year's frontier.
Frontier models converge at the top
In Volume 2, OpenAI's GPT-5.4 leads the benchmark with a score of 87.6%, closely followed by Google's Gemini 3.1 Pro (87.4%) and Anthropic's Claude Opus 4.6 (86.1%). The top three models are now separated by just 1.5 percentage points. This clustering signals a clear shift: while performance gains continue, leading models are approaching a practical ceiling for general compliance tasks, making deployment strategy more important than marginal differences in model capability.
Biggest gains in open-ended compliance work
The most meaningful improvements are seen in open-ended tasks such as drafting reports, policies, or investigation plans - tasks that closely mirror the work compliance teams deliver to internal stakeholders, management, and regulators. Across all vendors, performance in these tasks increased significantly, with improvements of as much as 17 to 18 percentage points compared to the first report, moving outputs from "usable with heavy editing" to "usable with light review."
Agentic compliance workflows cross a key threshold
The most important finding of the benchmark lies beyond individual task performance: AI models are now approaching the capability needed to support multi-step compliance workflows end-to-end. In a simulated Conflict of Interest process - covering classification, risk assessment, review routing, and mitigation - a single frontier model (GPT-5.4) achieved above 90% performance across each individual workflow step. While the benchmark did not test a fully connected agentic workflow, the results indicate that such workflows are becoming significantly more feasible than they were just six months ago.
"The benchmark shows how quickly AI is becoming a real driver of innovation in Compliance," said Dr. Martin Benda, President of BCM. "The opportunity now is to translate these capabilities into practical applications - in a way that strengthens both effectiveness and responsible oversight."
"Six months ago, the question was whether AI could support real compliance work. Today, the question is how we design workflows around it," said Moritz Homann, Head of AI at EQS Group. "Agentic compliance is no longer a question of feasibility, but of design, especially where to place the right human oversight. The latest models are strong enough to handle multi-step processes, but the real differentiator is the context around them: the tools and checkpoints that make AI reliable in practice."
From model performance to real-world deployment
The findings in Volume 2 point to a broader shift for compliance teams: improvements in model capability are becoming incremental, while the biggest gains now come from how AI is deployed.
The results suggest that context, system integration, and workflow design are becoming more important than the choice of model itself. Organizations that embed AI into real processes - with the right data, tools, and oversight - will see significantly stronger results than those treating it as a standalone tool.
Practical recommendations for compliance teams
The findings translate into a clear set of priorities for compliance teams - not to continue experimenting in isolation, but to operationalize AI responsibly within real compliance processes:
Move from pilots to production for proven use cases
Select models based on task fit, not just leaderboard rankings
Invest not only in prompts, but in the broader AI "harness" - including context, systems, tools, and workflow orchestration
Design human checkpoints deliberately around escalation, judgment, and employee-impacting decisions
Start designing agentic workflows for structured, high-volume processes
Continuously reassess capabilities, as model performance evolves rapidly
The full EQS AI Benchmark Report Volume 2 is available to download here: https://www.eqs.com/compliance-wpapers/eqs-ai-benchmark-report-vol-2/
Methodology
The EQS AI Benchmark evaluates leading AI models on 120 tasks across ten core Compliance & Ethics domains, including risk assessment, policy development, investigations, and reporting.
The benchmark combines structured and open-ended tasks based on real-world documents provided by customers, with open-ended outputs evaluated by a human jury of Compliance professionals, including members of the Berufsverband der Compliance Manager (BCM).
Press contact
Christina Jahn
Tel.: +49 89 444430133
E-Mail: [email protected]
About EQS Group
EQS Group is a leading international cloud provider for compliance & ethics, data privacy, sustainability management, and investor relations. More than 14,000 companies across the world use EQS Group's products to build trust by reliably and securely meeting complex regulatory requirements, minimizing risks and transparently reporting on business performance and its impact on society and the environment.
EQS Group's solutions are bundled in a cloud-based platform. This allows compliance processes for whistleblower protection and case handling, policy management, and approval processes to be managed just as professionally as business partners, third parties and risks, insider lists and reporting obligations. In addition, EQS Group provides software to fulfill human rights due diligence requirements across corporate supply chains, ensure compliance with regulations such as the GDPR and the EU AI Act, and support efficient ESG management and compliant sustainability reporting. Listed companies also benefit from a global newswire, investor targeting and contact management, as well as IR websites and webcasts for efficient and secure investor communication.
EQS Group was founded in Munich in 2000. Today, the group employs around 600 professionals worldwide.
About the BCM
As the leading professional association exclusively for in-house compliance officers from companies, associations, and other organizations, the BCM represents the interests of its members in dealings with policymakers, business, and society. The BCM focuses on providing information, fostering networks, and strengthening the compliance profession. It offers a wide range of free services designed to keep members informed about current compliance issues and to promote and continuously develop knowledge-sharing within its network.
SOURCE: EQS Group GmbH
F.Thill--RTC