
Flaky Test

A flaky test is an automated test that produces inconsistent results when executed multiple times against unchanged code, sometimes passing and sometimes failing without any actual defects present. This intermittent behavior undermines the reliability of test results and creates false positives that waste development time and erode confidence in the entire test suite.

Flaky tests represent one of the most significant threats to effective website QA automation. Unlike deterministic tests that consistently pass or fail based on actual code behavior, flaky tests introduce randomness that makes it impossible to distinguish between genuine failures and test infrastructure problems. They typically stem from timing dependencies, asynchronous operations that complete at different speeds, shared test data that creates interdependencies between test runs, or reliance on external systems like third-party APIs that may be temporarily unavailable. In web applications, common sources include JavaScript execution timing, DOM rendering delays, network latency variations, and browser-specific behaviors that manifest differently across test environments.
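The timing race described above can be reduced to a small, browser-free sketch. The snippet below is a hypothetical illustration (the function and variable names are invented, and a background thread stands in for an asynchronous render): a test that sleeps for a fixed interval races against an operation with variable latency and passes only by luck, while a test that polls the actual condition is deterministic.

```python
import random
import threading
import time

def start_async_render(state, delay):
    """Simulate a DOM render that completes after a variable delay."""
    def worker():
        time.sleep(delay)
        state["rendered"] = True
    threading.Thread(target=worker, daemon=True).start()

def wait_until(predicate, timeout=2.0, interval=0.02):
    """Poll until predicate() is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()

# Flaky pattern: a fixed sleep races against the variable delay.
state = {"rendered": False}
start_async_render(state, delay=random.uniform(0.01, 0.2))
time.sleep(0.05)                  # passes only when the render beat the sleep
flaky_result = state["rendered"]  # True or False depending on timing

# Deterministic pattern: wait on the condition itself, not on the clock.
state2 = {"rendered": False}
start_async_render(state2, delay=random.uniform(0.01, 0.2))
stable_result = wait_until(lambda: state2["rendered"])
```

This is the same principle behind the explicit-wait utilities in browser automation tools (e.g. Selenium's `WebDriverWait` or Playwright's built-in auto-waiting): assertions should be gated on observable application state, never on elapsed time.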

For website QA teams, flaky tests create a cascade of operational problems that directly impact delivery velocity and product quality. When tests fail intermittently, teams must investigate each failure to determine if it represents a real issue or a test artifact, consuming valuable QA resources that should focus on actual defects. More dangerously, teams often develop tolerance for test failures, creating an environment where genuine bugs get dismissed as likely flakiness. This is particularly problematic for e-commerce sites where payment processing failures or checkout flow disruptions could go undetected, and for regulated industries where compliance violations might slip through compromised test coverage.

The most common mistake teams make is accepting flaky tests as an inevitable consequence of web automation complexity. Many organizations set arbitrary failure retry limits or ignore tests with known flakiness rather than investing in proper remediation. Another critical error is treating flakiness as purely a technical problem when it often reflects broader issues with test design, environment management, or deployment practices. Teams frequently underestimate the cumulative cost of flaky tests, focusing on individual test fixes rather than systematic improvements to test architecture and data management strategies.

Flaky tests fundamentally undermine the feedback loop that enables rapid, confident deployment of website changes. When QA teams cannot trust their automated test results, they must rely more heavily on manual verification, slowing release cycles and increasing the risk of production issues. This creates particular challenges for continuous deployment pipelines where automated test success gates determine whether changes reach users. The resulting loss of confidence often leads teams to reduce test coverage or abandon automation entirely for critical user journeys, paradoxically making websites less reliable in pursuit of more predictable test results.

Why It Matters for QA Teams

Flaky tests train teams to ignore failures. Left unchecked, they undermine the entire purpose of automated testing and let real defects reach production undetected.

Example

An e-commerce team's automated checkout flow test intermittently fails during payment processing validation, passing roughly 70% of the time on the same code. Investigation reveals the test clicks the 'Place Order' button immediately after entering credit card details, but the payment gateway integration includes client-side validation that takes varying amounts of time to complete depending on network conditions. On faster connections or when the gateway responds quickly, the validation completes before the button click and the test passes. On slower connections, the click occurs before validation finishes, triggering an error message that causes the test to fail. The QA team initially dismisses these failures as infrastructure issues, but when a code change introduces a real bug that prevents payment validation from ever completing, the genuine failure gets ignored because the team assumes it's another flaky result. The bug reaches production, causing checkout failures for customers with slower internet connections until customer support reports identify the pattern.
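The fix for this class of flakiness is to make the test wait for the validation state itself before clicking. The sketch below models the scenario with invented names (`CheckoutPage`, `validation_complete`, `place_order`) and a timer standing in for the payment gateway's client-side validation; it is an illustration of the pattern, not the team's actual test code.

```python
import random
import time

class CheckoutPage:
    """Toy model of the checkout page: validation finishes after a variable delay."""
    def __init__(self, validation_delay):
        self._validation_done_at = time.monotonic() + validation_delay

    def validation_complete(self):
        return time.monotonic() >= self._validation_done_at

    def place_order(self):
        # Clicking before validation finishes is the intermittent failure.
        if not self.validation_complete():
            raise RuntimeError("payment validation still running")
        return "order placed"

def wait_for(predicate, timeout=2.0, interval=0.02):
    """Poll a condition instead of guessing at a delay."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()

page = CheckoutPage(validation_delay=random.uniform(0.0, 0.3))

# Flaky version: page.place_order() immediately after entering card details.
# Robust version: explicitly wait for the validation state first.
wait_for(page.validation_complete)
result = page.place_order()
```

Note that the robust version also fails correctly: if a code change prevents validation from ever completing (the real bug in the example), the wait times out on every run rather than only on slow connections, so the failure is consistent and cannot be dismissed as flakiness.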