End-to-End Testing: Secure Full User Flows

Jun 3, 2026

8 min

Roman Kirchmeier - Autemos

End-to-end testing checks a complete business process the way a real user experiences it: from the first input to the visible result, across UI, backend, database and connected third-party systems. That breadth is what makes E2E tests valuable – and fragile. They run slowly, break on the smallest UI change and tend toward flakiness. Flaky tests already consume at least 2.5% of productive development time (TUM/CQSE, ICST, 2024). This article shows where E2E sits in the ISTQB model, how it differs from integration and system testing, and how AI-driven methods tame its typical weaknesses.

In brief: End-to-end testing is a test approach, not a separate level in the ISTQB level model. It verifies complete user flows across system and application boundaries. Its biggest problems – slow, brittle, flaky, high maintenance – can be tackled at the root with self-healing locators and Vision-AI.

Layered diagram of an end-to-end test: user input flowing through the UI (web/mobile/desktop), backend, database and on to third-party systems like payment and core banking, ending in the result.

Figure 1: What an end-to-end test verifies across every layer.

What is end-to-end testing?

End-to-end testing verifies a complete business process from end-user input to final result, across all layers and systems. That includes the UI (web, mobile, desktop), backend, database and connected third-party and integration systems such as payment, core banking or CRM. The perspective is the user's, not the technology's.

This sets end-to-end testing apart from an isolated functional test. The point is not a single screen or interface, but a bigger question: does the entire process work in the real system landscape? A classic banking example is the bank transfer – more on that below.

For the full taxonomy of test types, see the pillar article on software testing types.

Is E2E a separate test level in the ISTQB model?

No. The ISTQB glossary does not list E2E as its own test level – unlike component, integration and system testing. E2E is a test approach, or test type, typically carried out at the system or system integration test level. This precision separates serious QA practice from loose terminology.

In practice, many vendors equate E2E with system testing outright. That is a shortcut. A system test checks one application against its specification. An E2E test follows a real user flow that may well touch several applications and external integrations. The level (system or system integration) stays the scaffolding – the approach defines *how* you test.

Why does this distinction matter beyond semantics? Because it governs what you actually secure. Treating E2E as a system test overlooks the transitions between systems, which is precisely where many production defects surface.

How does E2E differ from integration and system testing?

Comparison table of integration test, system test and end-to-end by focus and scope; the end-to-end column is highlighted.

Figure 2: Integration test, system test and E2E compared side by side.

The core difference is scope: integration testing checks technical interfaces, system testing checks a complete application, and end-to-end testing checks a continuous user flow across several systems. The table below contrasts the three and surfaces each perspective: technical, functional or business-process oriented.

Criterion	Integration test	System test	End-to-end
Focus	interplay of components/interfaces	entire integrated system vs. spec	complete real user flow across system and application boundaries
Scope	technical interfaces	one application	multiple systems + external integrations
Perspective	technical	functional	end user / business process
Banking example	booking API ↔ payment	online banking meets requirements	login → account → transfer → OTP → confirmation → receipt

For a deeper look at interface testing, see the sibling article on integration testing.

Horizontal vs. vertical E2E tests

E2E tests run in two directions – and each view speaks to a different audience:

Horizontal: a business process runs across several systems (for example frontend, payment gateway and core banking). This view matters to CTOs and QA leaders because it reflects end-to-end accountability across system boundaries.
Vertical: a path runs through every layer of *one* application – from UI through API and service layer down to the database. This view is primarily relevant for architects.

In banking reality, the most interesting defects are usually horizontal: they emerge in the handover between systems that each work correctly on their own.

Why are E2E tests so hard to automate?

Three metrics on the cost of flaky tests: 2.5 percent of development time, a rise from 10 to 26 percent of teams, 5.67 dollars per manual re-run.

Figure 4: What flaky tests cost (TUM/CQSE 2024, Bitrise 2025).

E2E tests are widely considered the slowest and least stable layer of test automation because they traverse the entire system. Four core problems recur, and they reinforce each other as the suite grows.

The first problem is slowness: every test case runs through the full chain of UI, backend, database and third-party systems. The second is brittleness: even small DOM or UI changes break hard-coded locators. The third is flakiness – tests pass or fail unpredictably depending on timing, network or test data. The fourth is maintenance effort, which grows disproportionately with suite size.

Recent figures show how costly this becomes. The share of teams with flaky tests rose from 10% to 26% – a 160% jump in three and a half years (Bitrise Mobile Insights, 2025), based on more than 10 million builds. Each manual investigation of a failed run costs around $5.67 versus $0.0002 for an automatic re-run (TUM/CQSE, ICST, 2024).
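The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the numbers quoted above. The function names are illustrative, and the volume of 1,000 failed runs is a made-up input, not a figure from either study.

```python
def relative_increase(old: float, new: float) -> float:
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100


def total_cost(failed_runs: int, cost_per_run: float) -> float:
    """Total cost of handling a number of failed runs at a given unit cost."""
    return failed_runs * cost_per_run


# Share of teams with flaky tests: 10% -> 26% (Bitrise Mobile Insights, 2025)
jump = relative_increase(10, 26)

# Unit costs as quoted from TUM/CQSE (ICST, 2024)
MANUAL_COST = 5.67
AUTO_RERUN_COST = 0.0002

# Hypothetical volume (assumption for illustration only)
FAILED_RUNS = 1000
manual_total = total_cost(FAILED_RUNS, MANUAL_COST)
auto_total = total_cost(FAILED_RUNS, AUTO_RERUN_COST)

print(f"Relative increase: {jump:.0f}%")
print(f"Manual: ${manual_total:,.2f} | automatic re-run: ${auto_total:,.2f}")
```

The point of the comparison is the gap between the two unit costs: at any realistic volume, the automatic re-run is negligible next to manual investigation.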

How do AI and self-healing solve the flakiness problem?

AI-driven methods address the root of the four core problems rather than masking symptoms. Two mechanisms are central: self-healing locators against brittleness, and Vision-AI where the DOM offers no stable reference. Both attack the costliest item – disproportionately growing maintenance.

Self-healing locators re-resolve element references automatically when the UI changes. If an attribute changes or an element shifts in the DOM, the healer finds a valid alternative instead of letting the test break. That tackles the *brittle* problem directly. The article on self-healing locators describes how this works technically.
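The basic idea can be sketched as a fallback chain. This is a deliberately simplified illustration, not the Autemos implementation or any specific tool's API: the page is modelled as a list of attribute dictionaries, and `HealingLocator`, its fields and the sample attributes are all hypothetical. Real healers typically score candidates rather than walking a fixed list.

```python
from dataclasses import dataclass
from typing import Optional

Element = dict[str, str]  # simplified stand-in for a DOM node's attributes


@dataclass
class HealingLocator:
    """Ordered (attribute, value) candidates; the first entry is the primary."""
    name: str
    candidates: list[tuple[str, str]]
    healed_with: Optional[tuple[str, str]] = None  # set when a fallback was used

    def resolve(self, page: list[Element]) -> Element:
        for index, (attr, value) in enumerate(self.candidates):
            matches = [el for el in page if el.get(attr) == value]
            if len(matches) == 1:  # accept only unambiguous matches
                if index > 0:
                    self.healed_with = (attr, value)
                return matches[0]
        raise LookupError(f"No candidate matched for locator '{self.name}'")


submit = HealingLocator(
    name="submit-transfer",
    candidates=[
        ("id", "btn-submit"),                # primary, hard-coded id
        ("data-testid", "transfer-submit"),  # fallback 1
        ("text", "Send transfer"),           # fallback 2
    ],
)

# After a release the id was renamed, but the test id and label survived.
page_after_release = [
    {"id": "btn-cancel", "text": "Cancel"},
    {"id": "btn-send-v2", "data-testid": "transfer-submit", "text": "Send transfer"},
]

element = submit.resolve(page_after_release)
print(element["id"], submit.healed_with)
```

Recording `healed_with` matters in practice: a healed locator should be reported so the team can review and update the primary reference instead of relying on the fallback silently.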

Vision-AI steps in where DOM-based methods fail. A convolutional neural network (CNN) locates elements purely visually – for instance in canvas renderings or with dynamically generated attributes that lack stable IDs. Details in the post on visual testing with Vision-AI.

In customer projects, the pattern is clear: the biggest lever is not the single repaired locator, but the disappearance of manual triage. The market is shifting accordingly: use of synthetic test data rose from 14% to 25%, 29% of organizations have fully integrated GenAI, and a further 42% are evaluating it (Capgemini World Quality Report, 2025). Autemos applies these mechanisms across web, mobile, API and desktop, including a visual recorder for horizontal E2E flows spanning multiple channels.

Which E2E scenarios are critical in banking?

Six-step banking E2E flow of a bank transfer: login, account, transfer, OTP, confirmation, receipt.

Figure 3: The bank transfer as a critical horizontal E2E flow in banking.

In banking, a few business-critical user flows decide security and compliance – and they justify every E2E investment. Two scenarios stand out: the bank transfer and KYC onboarding. Both touch several systems and external integrations, making them classic horizontal E2E cases.

The bank transfer runs through a long chain: login → account overview → transfer form → OTP approval → confirmation → receipt. Every transition can break, for instance between frontend and payment gateway or between the OTP service and core banking. A pure system test of the online banking application would not cover these transitions at all.
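To make the chain concrete, here is a minimal sketch of the six steps as an ordered flow that reports which transition broke. Everything in it is an assumption for illustration: `FakeBank` is an in-memory stand-in for the real systems, and the account IDs, amounts and step functions are invented. A real E2E test would drive the actual UI and services instead.

```python
from typing import Any, Callable, Optional

Context = dict[str, Any]
Step = tuple[str, Callable[[Context], None]]


class FakeBank:
    """In-memory stand-in for core banking (hypothetical)."""

    def __init__(self) -> None:
        self.balances = {"CH-001": 500, "CH-002": 0}
        self.receipts: list[dict[str, Any]] = []


def login(ctx: Context) -> None:
    ctx["session"] = "user-1"


def account_overview(ctx: Context) -> None:
    if "session" not in ctx:
        raise RuntimeError("not logged in")
    ctx["balance_before"] = ctx["bank"].balances["CH-001"]


def transfer_form(ctx: Context) -> None:
    ctx["order"] = {"from": "CH-001", "to": "CH-002", "amount": 120}


def otp_approval(ctx: Context) -> None:
    ctx["otp_approved"] = True


def confirmation(ctx: Context) -> None:
    # Handover from OTP service to core banking: book only if approved.
    if not ctx.get("otp_approved"):
        raise RuntimeError("transfer not approved")
    order, bank = ctx["order"], ctx["bank"]
    bank.balances[order["from"]] -= order["amount"]
    bank.balances[order["to"]] += order["amount"]


def receipt(ctx: Context) -> None:
    ctx["bank"].receipts.append(dict(ctx["order"]))


TRANSFER_FLOW: list[Step] = [
    ("login", login),
    ("account_overview", account_overview),
    ("transfer_form", transfer_form),
    ("otp_approval", otp_approval),
    ("confirmation", confirmation),
    ("receipt", receipt),
]


def run_flow(steps: list[Step], ctx: Context) -> Optional[str]:
    """Run steps in order; return the name of the first failing step, or None."""
    for name, action in steps:
        try:
            action(ctx)
        except Exception:
            return name
    return None


bank = FakeBank()
failed_at = run_flow(TRANSFER_FLOW, {"bank": bank})
print("failed at:", failed_at, "| balances:", bank.balances)
```

Naming each step pays off when a run fails: the report points at the transition (for example between OTP approval and confirmation) rather than at an anonymous line in a long script.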

KYC onboarding is equally integration-heavy: identity verification, document upload, credit check and account creation interlock across several services. Flows like these are candidates for the thin E2E tip of the test pyramid: not every click, but the few paths whose failure causes real harm.

How many end-to-end tests belong in the test pyramid?

Few. E2E belongs at the thin tip of the test pyramid and should be limited to essential flows – checkout, login, registration and critical integrations (Frontiers, "Test Pyramid 2.0", 2025). The broad base consists of fast, stable unit tests, with integration tests above them.

This distribution is not dogma but an economic consequence. Because E2E tests are slow and maintenance-heavy, their count scales poorly. A pyramid with a bloated tip – sometimes mocked as an "ice-cream cone" – produces exactly the flakiness and cost problems we saw above.
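A quick way to see where a suite stands is to compare the size of its layers. The sketch below is a rough heuristic under stated assumptions: the function names and the strict ordering rule are illustrative, and the source does not prescribe any specific ratio.

```python
def e2e_share(unit: int, integration: int, e2e: int) -> float:
    """Fraction of all tests that are E2E tests (0.0 for an empty suite)."""
    total = unit + integration + e2e
    return e2e / total if total else 0.0


def suite_shape(unit: int, integration: int, e2e: int) -> str:
    """Classify a suite by the relative size of its layers."""
    if unit > integration > e2e:
        return "pyramid"
    if e2e > integration > unit:
        return "ice-cream cone"
    return "irregular"


# Hypothetical test counts for two suites
print(suite_shape(800, 150, 20), f"{e2e_share(800, 150, 20):.1%}")
print(suite_shape(50, 100, 400), f"{e2e_share(50, 100, 400):.1%}")
```

Tracking the E2E share over time is often more useful than a single snapshot, because it shows whether the tip is growing faster than the base.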

So the right question is not "how many E2E tests can we manage?" but "which flows must never fail?". In banking, that means transfers, onboarding and critical payment integrations – kept compact, yet made reliable through AI stabilization.

FAQ

Is end-to-end testing the same as system testing?

No. System testing is an ISTQB test level and checks one application against its specification. E2E is a test approach that follows a real user flow across multiple systems and external integrations. E2E is often performed at the system or system integration level, but it is not itself a separate level in the ISTQB model.

Why are E2E tests so often flaky?

E2E tests traverse the entire system and therefore depend on timing, network and test data. The share of teams with flaky tests rose to 26% (Bitrise Mobile Insights, 2025). Hard-coded locators make it worse, because even small UI changes break them.

What do flaky tests actually cost?

Flaky tests tie up at least 2.5% of productive development time – split into investigation (1.1%), fixing (1.3%) and tooling (0.1%) (TUM/CQSE, ICST, 2024). On top of that, each manually investigated failed run costs around $5.67 versus $0.0002 for an automatic re-run.

Does AI help against brittle locators?

Yes. Self-healing locators re-resolve element references automatically when the UI changes, reducing maintenance. Vision-AI adds visual recognition where the DOM fails – for instance with canvas or dynamic attributes. The post on self-healing locators explains both methods.

How many E2E tests make sense?

As few as possible, as many as necessary. E2E belongs at the thin tip of the test pyramid and covers only essential flows (Frontiers, "Test Pyramid 2.0", 2025). In banking, that typically means transfers, KYC onboarding and critical payment integrations.

Conclusion

End-to-end tests secure what users actually do – complete business processes across system and application boundaries. The clean classification stays essential: E2E is a test approach, not a separate ISTQB test level. That precision keeps teams from overlooking the critical transitions between systems.

The four weaknesses – slow, brittle, flaky, maintenance-heavy – are real and expensive. But they are not a law of nature. Self-healing locators and Vision-AI work at the root, keep the thin E2E tip stable and free teams from manual triage. A fragile suite becomes a reliable safety net for transfers, onboarding and critical integrations.

Want to secure your critical user flows with stability and low maintenance? Talk to our team about AI-driven E2E automation with Autemos.
