Test Automation in Regulated Banking: Staying Audit-Ready Under DORA and FINMA

Roman Kirchmeier - Autemos

Since 17 January 2025, the EU Digital Operational Resilience Act (DORA) has been binding on financial entities (ECB, 2025). In parallel, FINMA Circular 2023/1 has tightened expectations around ICT risk and operational resilience since 1 January 2024 (Grant Thornton, 2024). For banks, that changes what testing is: no longer just an engineering concern, but evidence presented to a regulator. In this environment, automated testing needs more than green pipelines. It needs traceability.
TL;DR: In regulated banks, test automation has to do two things at once: test faster and more reliably, and document every test as auditable evidence. DORA and FINMA demand recurring resilience testing and end-to-end traceability. AI helps with speed, but it raises new audit questions that a platform must answer.

Why regulated banks are a different playing field
In an ordinary SaaS team, test coverage decides product quality. In a bank, it also decides regulatory standing. DORA requires financial entities to run a documented, recurring resilience-testing programme; significant entities must perform threat-led penetration testing (TLPT) at least every three years under the TIBER-EU framework, and use external testers for every third test (EBA, 2024).
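The TLPT cadence described above lends itself to a simple planning check. The following is a minimal sketch, not a legal interpretation: it assumes the three-year interval is counted from the completion date of the last test, and it reads "external testers for every third test" as "if the two most recent tests were run internally, the next one must be external". The names `TlptRun` and `next_tlpt` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TlptRun:
    """One completed threat-led penetration test (hypothetical record shape)."""
    completed_on: date
    external_testers: bool


def next_tlpt(history: list[TlptRun], max_interval_years: int = 3) -> tuple[date, bool]:
    """Return (latest due date, must_use_external_testers) for the next TLPT.

    Assumption: the interval runs from the completion date of the last test.
    """
    if not history:
        raise ValueError("no completed TLPT on record")
    runs = sorted(history, key=lambda r: r.completed_on)
    last = runs[-1].completed_on
    try:
        due = last.replace(year=last.year + max_interval_years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        due = last.replace(year=last.year + max_interval_years, day=28)
    # Assumption: two internal runs in a row force an external third run.
    recent = runs[-2:]
    must_be_external = len(recent) == 2 and not any(r.external_testers for r in recent)
    return due, must_be_external
```

A check like this belongs in planning tooling rather than in the test suite itself; the binding dates and tester requirements come from the competent authority, not from the code.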
Layered on top is the EU AI Act, which entered into force on 1 August 2024 and applies in phases: prohibited practices since February 2025, most high-risk obligations from August 2026 (European Commission). The moment AI writes or repairs tests, the question of how that AI component is classified becomes real.
FINMA, for its part, still treats outsourcing of significant functions, including public cloud, as one of the most important operational risks. Roughly one in five supervised institutions already outsources significant data or functions to public-cloud providers (FINMA, 2024). Where your test environments and test tooling run is therefore itself a regulatory decision.
What is at stake: the cost of testing gaps

The consequences of inadequate testing are expensive and public in financial services. TSB's failed 2018 IT migration disrupted service for much of its 5.2 million customers, and normal operations were not restored until December 2018. In December 2022, the FCA and PRA jointly fined TSB £48.65 million, on top of £32.7 million in customer redress (FCA, 2022).
Outages add up in day-to-day operations too. According to the Observability Forecast for financial services, high-impact IT outages cost financial firms an average of $1.8 million per hour, and 29 percent of respondents report high-impact outages at least weekly (New Relic, 2026).
Behind this is a familiar pattern: the later a defect is found, the more it costs. The often-cited IBM Systems Sciences Institute model puts the relative fix-cost curve at roughly 1x in design rising to 60–100x after release (Functionize). The exact multipliers are debated and come from older sources, but the direction is not: catching defects early lowers cost and lowers the probability of the kind of operational-resilience incident that produced the TSB fine.
Manual testing does not scale for regulated release cycles
DORA does not ask for a one-off audit but for an ongoing programme: annual basic testing plus periodic TLPT for significant entities (ECB, 2025). At modern release frequencies, that expectation is hard to meet manually without turning the test organisation into a bottleneck.
At the same time, maintaining existing tests already absorbs much of the available capacity. Industry analyses put test-maintenance effort at around 40 percent of QA time and report a rising share of teams hitting flaky tests; these figures come from lower-authority secondary sources and should be read as directional. The World Quality Report 2024-25 is more robust: 68 percent of organisations use generative AI in quality engineering or have it on a roadmap after successful pilots, and 72 percent report accelerated automation from AI (Capgemini, 2024).
The message is clear: banks need more test coverage, not more testers. That is exactly where AI-assisted test automation comes in, turning a requirement or a Jira ticket into an executable test without manual scripting.
AI-assisted test automation, but auditable

The hard part is not speed, it is auditability. As soon as an AI generates tests or repairs them through self-healing, a non-deterministic actor enters the validation chain. Yet both FINMA 2023/1 and DORA demand traceability of changes. The decisive question becomes: who tests the AI tester?
In a regulated environment, AI-generated tests must satisfy the same properties as any other auditable artifact:
Human approval: every AI suggestion is confirmed by a person rather than executed blindly. Human-in-the-loop here is not a convenience feature, it is an audit requirement.
Versioning and exportability: tests must be versionable, readable, and exportable, with no proprietary format. Only then does it stay clear what was tested, when, and why.
Self-healing, but documented: when the UI changes and [locators](/en/features/self-healing) are stabilised automatically, that adjustment must be logged rather than disappearing into the background.
A complete audit trail: every change to a test belongs in the same change-management chain as any other regulated change.
That turns the apparent risk of AI into a compliance advantage: tests are created faster and still stand up as evidence within the DORA resilience programme.
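To make the properties listed above concrete, here is a minimal sketch of a tamper-evident log for self-healing changes. It is an illustration under stated assumptions, not how any particular platform implements its audit trail: the class `AuditTrail`, the method `record_healing`, and the field names are hypothetical, and an in-memory list stands in for whatever write-once storage a bank would actually use. Each entry requires a named human approver and is chained to the previous entry by a SHA-256 hash, so a later edit to any entry is detectable.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first entry in the chain


def _digest(payload: dict, prev_hash: str) -> str:
    """Hash the canonical JSON form of an entry together with its predecessor."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditTrail:
    """Hash-chained log of self-healing changes (hypothetical sketch)."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record_healing(self, test_id, old_locator, new_locator, approved_by, at=None):
        # Human-in-the-loop: refuse to log a change nobody signed off.
        if not approved_by:
            raise PermissionError("self-healing change needs a named human approver")
        payload = {
            "test_id": test_id,
            "kind": "self_healing",
            "old_locator": old_locator,
            "new_locator": new_locator,
            "approved_by": approved_by,
            "at": (at or datetime.now(timezone.utc)).isoformat(),
        }
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {**payload, "prev_hash": prev_hash, "hash": _digest(payload, prev_hash)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            payload = {k: v for k, v in entry.items() if k not in ("prev_hash", "hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(payload, prev_hash):
                return False
            prev_hash = entry["hash"]
        return True
```

The chain makes tampering evident, not impossible. In practice the entries would also be written to the same change-management system that holds every other regulated change, so that the log is evidence and not just a local file.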
On-premise and data residency as a compliance feature
A common misconception is that test environments and test data are less sensitive than production. The EDPB Guidelines 01/2025 make clear that pseudonymised data still counts as personal data and remains fully in scope of the GDPR (EDPB, 2025). Copying production data into test environments therefore carries the full residency and security obligations with it, including any AI that processes that data.
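A short sketch shows why pseudonymised test data stays in scope. It assumes keyed hashing (HMAC-SHA256) as the pseudonymisation technique, which is one common choice and not something the guidelines prescribe; the function names and the sample IBAN are illustrative only. The pseudonym is stable, so records remain linkable across tables, and anyone holding the key can match a pseudonym back to a known customer value.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def pseudonymise(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash (assumed technique: HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(pseudonym: str, candidates: Iterable[str], key: bytes) -> Optional[str]:
    """Whoever holds the key can test known identifiers against a pseudonym."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymise(candidate, key), pseudonym):
            return candidate
    return None


if __name__ == "__main__":
    key = b"kept-separately-from-the-test-data"  # illustrative key only
    iban = "CH9300762011623852957"               # sample value for illustration
    token = pseudonymise(iban, key)
    # The token hides the IBAN from the test team, but the mapping is reversible
    # for the key holder, which is why the data remains personal data.
    print(reidentify(token, ["DE89370400440532013000", iban], key) == iban)
```

The practical consequence is the one stated above: key custody, access control, and residency rules follow the data into the test environment.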
Combined with FINMA's focus on cloud-concentration risk, it becomes clear why deployment matters. From FINMA's perspective, a cloud-based AI testing platform is an outsourcing decision under Circulars 2018/3 and 2023/1. An on-premise or data-resident option is therefore not a mere technical detail but a compliance feature. For Swiss banks, that can be exactly what decides whether a platform is even viable.
What an audit-ready platform must deliver
When evaluating test automation for a regulated bank, look beyond green dashboards and check for:
Tests from natural language, a recorder, or a specification, so business and engineering can work together on visual test workflows.
Self-healing for stable tests despite UI changes, with the adjustment logged rather than hidden.
Human-in-the-loop approval for every AI-generated step.
Exportable, versionable code with no lock-in, so existing Playwright or Selenium suites survive.
One platform for web, mobile, API, and desktop instead of four separate tools.
A choice of cloud or on-premise, with role-based access and an audit trail.
These are exactly the requirements Autemos is built for: AI speed in test creation, stability through self-healing, and the traceability a regulated operation needs.
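Independent of any product, the traceability these criteria aim at can be sketched as a small requirement-to-evidence mapping. This is a minimal sketch under assumptions: the record shape `EvidenceRecord`, the ticket keys, and the helper names are hypothetical and do not describe a real export format. It shows two things an auditor typically asks for: which requirements have no passing test, and a plain, versionable export of what was run, when, against which commit, and who approved it.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One executed test linked to the requirement it covers (hypothetical shape)."""
    requirement_id: str  # e.g. a ticket key such as "PAY-101" (illustrative)
    test_id: str
    commit: str          # version of the test code that was executed
    executed_at: str     # ISO 8601 timestamp
    passed: bool
    approved_by: str     # human who approved the test


def uncovered(requirements: list[str], evidence: list[EvidenceRecord]) -> list[str]:
    """Requirements without at least one passing test, in their original order."""
    covered = {e.requirement_id for e in evidence if e.passed}
    return [r for r in requirements if r not in covered]


def export_evidence(evidence: list[EvidenceRecord]) -> str:
    """Serialise evidence as sorted, diff-friendly JSON for version control."""
    ordered = sorted(evidence, key=lambda e: (e.requirement_id, e.test_id))
    return json.dumps([asdict(e) for e in ordered], indent=2, sort_keys=True)
```

Sorting the records and the keys keeps the export stable between runs, so a diff in version control shows only what actually changed.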
Regulatory requirements at a glance

| Framework | In force since | Core testing requirement |
|---|---|---|
| EU DORA | 17 Jan 2025 | Recurring resilience testing, TLPT |
| FINMA Circular 2023/1 | 1 Jan 2024 | ICT risk, scenario testing, traceability |
| EU AI Act | from Aug 2024 | Risk tiers for AI tooling |
| GDPR / EDPB | 2025 | Pseudonymised test data is still personal data |
Frequently asked questions
Does AI in testing make compliance harder?
Not necessarily. AI becomes a problem when it operates as a black box. With human approval, versioning, and an audit trail, an AI-generated test becomes auditable evidence that meets DORA and FINMA expectations for traceability.
Do we have to throw away our existing tests because of DORA?
No. Existing Playwright, Selenium, and Appium suites can be reused. What matters is that tests stay exportable and versionable and sit inside a documented, recurring testing programme.
Why is on-premise relevant for banks?
Because test data derived from production still counts as personal data even when pseudonymised, and FINMA treats cloud outsourcing as a significant operational risk. An on-premise or data-resident deployment reduces outsourcing and residency risk.
What does self-healing mean for auditability?
Self-healing stabilises tests automatically when the UI changes. In a regulated environment, each such adjustment must be logged so it stays part of the change-management chain rather than happening invisibly in the background.
Conclusion
For regulated banks, test automation has become the interface between engineering and the regulator. DORA and FINMA demand recurring, documented testing; the AI Act and the GDPR draw clear lines around AI and data. A platform that combines AI speed with human-in-the-loop, self-healing, exportable code, and an on-premise option turns that obligation into an advantage. See in a short demo how Autemos delivers audit-ready test automation in practice.


