Introduction
You already know why E2E tests matter. You also know why your team doesn't write them -- they're slow, flaky, and the frameworks are annoying. Playwright fixes the "annoying" part. The other two are on you.
So skip the pitch. This covers the hard parts: authentication across tests, dealing with flaky selectors, mocking APIs, visual regression, and CI that doesn't take 40 minutes. Simple examples stay short. The complicated stuff gets the space it deserves.
Setting Up Playwright
One command. Pick TypeScript when it asks.
# Create a new Playwright project
npm init playwright@latest
# Or add Playwright to an existing project
npm install -D @playwright/test
npx playwright install

You get a playwright.config.ts at the root. Here's the part that actually matters:
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
testDir: './tests',
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
reporter: 'html',
use: {
baseURL: 'http://localhost:3000',
trace: 'on-first-retry',
screenshot: 'only-on-failure',
video: 'retain-on-failure',
},
projects: [
{
name: 'chromium',
use: { ...devices['Desktop Chrome'] },
},
{
name: 'firefox',
use: { ...devices['Desktop Firefox'] },
},
{
name: 'webkit',
use: { ...devices['Desktop Safari'] },
},
{
name: 'mobile-chrome',
use: { ...devices['Pixel 5'] },
},
],
webServer: {
command: 'npm run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
},
});

fullyParallel runs individual tests within a file in parallel, not just files. The webServer block auto-starts your dev server -- no more "forgot to boot the app" failures. And trace: 'on-first-retry' captures a full execution trace when a flaky test fails. Every network request, DOM snapshot, console log. On a timeline. Best debugging tool in the ecosystem.
npx playwright test --ui opens the visual runner. Worth keeping on a second monitor while writing tests.
Page Interactions and Locators
This is where most E2E suites break. Not in the test logic. In the selectors.
Locators are lazy -- they don't touch the DOM until you act on them or assert against them. So they always reflect current page state, even after navigations. The real question is which locator strategy you pick. Role-based and label-based locators survive refactors. CSS class selectors don't. A migration from Material UI to Radix barely touches Playwright tests when they locate by role rather than by .MuiButton-root.
import { test, expect } from '@playwright/test';
test('user can search for products and add to cart', async ({ page }) => {
await page.goto('/products');
// Prefer role-based locators - they mirror how users interact
const searchInput = page.getByRole('searchbox', { name: 'Search products' });
await searchInput.fill('wireless headphones');
await searchInput.press('Enter');
// Use getByText for visible text content
await expect(page.getByText('3 results found')).toBeVisible();
// Use getByRole with name to find specific buttons
const firstProduct = page.getByRole('article').first();
await firstProduct.getByRole('button', { name: 'Add to cart' }).click();
// Navigate via an accessible link name
await page.getByRole('link', { name: 'Cart (1)' }).click();
// Use getByTestId only as a last resort for elements without semantic roles
const cartTotal = page.getByTestId('cart-total');
await expect(cartTotal).toContainText('$79.99');
// Chain locators to narrow scope - find within a specific region
const cartItem = page.getByRole('region', { name: 'Shopping cart' });
await expect(cartItem.getByRole('heading')).toContainText('Wireless Headphones');
// Filter locators to pick one from many matches - scope to the row first,
// then find the button inside it
const targetRow = page.getByRole('listitem')
.filter({ hasText: 'Wireless Headphones' });
await targetRow.getByRole('button', { name: 'Remove' }).click();
});

.filter() is worth learning well. When several elements share a role, filtering by surrounding text or child elements pins down the one you need. No fragile nth-child selectors. No XPath.
Unfamiliar codebase? npx playwright codegen records browser interactions and outputs locator code. Click through the flow once, clean up the generated selectors. Faster than inspecting the DOM manually.
Assertions and Auto-Waiting
Explicit waits, sleep calls, retry loops. That's Selenium. Playwright replaces all of it.
A click() waits for the element to be visible, enabled, stable. An expect() retries until it passes or hits the timeout. These are not synchronous Jest assertions. They poll the DOM repeatedly, so your tests naturally tolerate loading spinners and network latency.
import { test, expect } from '@playwright/test';
test('dashboard loads and displays user data correctly', async ({ page }) => {
await page.goto('/dashboard');
// Auto-retrying assertions - Playwright polls until pass or timeout
await expect(page).toHaveTitle('Dashboard | MyApp');
await expect(page).toHaveURL(/\/dashboard/);
// Element visibility assertions
const welcomeBanner = page.getByRole('heading', { name: /welcome back/i });
await expect(welcomeBanner).toBeVisible();
// Text content assertions with regex support
const statsCard = page.getByTestId('revenue-card');
await expect(statsCard).toContainText(/\$[\d,]+\.\d{2}/);
// Attribute and CSS assertions
const statusIndicator = page.getByTestId('server-status');
await expect(statusIndicator).toHaveAttribute('data-status', 'healthy');
await expect(statusIndicator).toHaveCSS('background-color', 'rgb(34, 197, 94)');
// Count assertions - wait until the list has the expected items
const notifications = page.getByRole('listitem');
await expect(notifications).toHaveCount(5);
// Negation - confirm elements are NOT present
await expect(page.getByText('Error loading data')).not.toBeVisible();
// Soft assertions - continue test even if this fails
await expect.soft(page.getByText('Last updated')).toBeVisible();
await expect.soft(page.getByText('v2.4.1')).toBeVisible();
});

Soft assertions record failures but keep running. You see everything broken on a page in one pass instead of fixing one thing, re-running, finding the next. Good for dashboard smoke tests where you want the full picture.
Default timeout is five seconds. Override per-assertion with { timeout: 15000 }. But if you're routinely setting timeouts above 10 seconds, the problem is the application. Fix the slow endpoint. I've seen suites where every assertion had a 30-second timeout and the full CI run took 45 minutes. The tests weren't slow. The app was.
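When a single page is legitimately slow -- a heavy report, a cold cache -- scope the longer budget to that one assertion instead of raising the global default. A minimal sketch; the route and heading are hypothetical:

```typescript
import { test, expect } from '@playwright/test';

test('report eventually renders', async ({ page }) => {
  await page.goto('/reports/annual'); // hypothetical slow route

  // Only this assertion gets the extended budget;
  // every other assertion keeps the 5-second default
  await expect(page.getByRole('heading', { name: 'Annual Report' }))
    .toBeVisible({ timeout: 15000 });
});
```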
Authentication and State Management
This is the thing that determines whether your E2E suite actually scales. If every test starts by filling in credentials, you've already lost. The suite will be slow. Flaky too, because now every test depends on the login page working.
Playwright's answer: authenticate once, save cookies and local storage to a JSON file, reuse it everywhere. A dedicated setup project runs first, logs in, saves the state. All other projects load that file and start already authenticated.
import { test as setup, expect } from '@playwright/test';
import path from 'path';

const authFile = path.join(__dirname, '../.playwright/.auth/user.json');
setup('authenticate as standard user', async ({ page }) => {
// Navigate to the login page
await page.goto('/login');
// Fill in credentials - pull the password from the environment,
// never from a committed file
await page.getByLabel('Email address').fill('[email protected]');
await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!);
await page.getByRole('button', { name: 'Sign in' }).click();
// Wait for the redirect to confirm login succeeded
await page.waitForURL('/dashboard');
await expect(page.getByRole('heading', { name: /welcome/i })).toBeVisible();
// Save the authenticated state to a file
await page.context().storageState({ path: authFile });
});
// In playwright.config.ts, reference this setup:
// projects: [
//   { name: 'setup', testMatch: /.*\.setup\.ts/ },
//   {
//     name: 'chromium',
//     use: {
//       ...devices['Desktop Chrome'],
//       storageState: '.playwright/.auth/user.json',
//     },
//     dependencies: ['setup'],
//   },
// ]

dependencies: ['setup'] enforces ordering. Storage state lives on disk and survives across test files and parallel workers.
Multiple user roles? Separate setup file for each. Save to different JSON paths. Tests declare their role by belonging to the right project.
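One way that can look in playwright.config.ts -- a sketch assuming hypothetical user.setup.ts and admin.setup.ts files that save to separate JSON paths:

```typescript
// playwright.config.ts (fragment, sketch) - one setup project per role,
// one browser project per role depending on it
projects: [
  { name: 'setup-user', testMatch: /user\.setup\.ts/ },
  { name: 'setup-admin', testMatch: /admin\.setup\.ts/ },
  {
    name: 'user-tests',
    testMatch: /.*\.user\.spec\.ts/, // hypothetical naming convention
    use: { ...devices['Desktop Chrome'], storageState: '.playwright/.auth/user.json' },
    dependencies: ['setup-user'],
  },
  {
    name: 'admin-tests',
    testMatch: /.*\.admin\.spec\.ts/,
    use: { ...devices['Desktop Chrome'], storageState: '.playwright/.auth/admin.json' },
    dependencies: ['setup-admin'],
  },
],
```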
Token expiration will bite you. If your staging environment issues JWTs with a short lifetime and your suite runs longer than that, tests fail midway through with auth errors. Either configure the test environment to issue longer-lived tokens, or log in via an API call so re-authenticating is cheap enough to repeat. Playwright replays saved cookies exactly as-is. It won't refresh tokens for you.
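If your backend exposes a login endpoint, the setup step can skip the browser entirely and authenticate through Playwright's request fixture. A sketch, assuming a hypothetical POST /api/login that sets a session cookie:

```typescript
import { test as setup } from '@playwright/test';

const authFile = '.playwright/.auth/user.json';

setup('authenticate via API', async ({ request }) => {
  // Hypothetical endpoint and payload - adjust to your backend
  const response = await request.post('/api/login', {
    data: {
      email: process.env.TEST_USER_EMAIL,
      password: process.env.TEST_USER_PASSWORD,
    },
  });
  if (!response.ok()) {
    throw new Error(`Login failed with status ${response.status()}`);
  }
  // The request context now holds the session cookie; persist it
  // in the same storage-state format the browser projects consume
  await request.storageState({ path: authFile });
});
```

No page load, no form fill -- the whole setup is one HTTP round trip.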
Keep test credentials in environment variables. Use process.env.TEST_USER_PASSWORD. Never hard-code passwords in committed files.
API Mocking and Network Interception
route() intercepts, fulfill() fakes the response. That's it.
How does the UI handle a 500? An empty dataset? A four-second response? You can't trigger these through the real server reliably. So you mock them.
import { test, expect } from '@playwright/test';
test('displays products from mocked API response', async ({ page }) => {
// Intercept the API call and return mock data
await page.route('**/api/products', async (route) => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify([
{ id: 1, name: 'Keyboard', price: 129.99, inStock: true },
{ id: 2, name: 'Mouse', price: 59.99, inStock: true },
{ id: 3, name: 'Monitor', price: 449.00, inStock: false },
]),
});
});
await page.goto('/products');
await expect(page.getByRole('article')).toHaveCount(3);
await expect(page.getByText('Out of stock')).toBeVisible();
});
test('shows error state when API returns 500', async ({ page }) => {
await page.route('**/api/products', async (route) => {
await route.fulfill({
status: 500,
contentType: 'application/json',
body: JSON.stringify({ error: 'Internal server error' }),
});
});
await page.goto('/products');
await expect(page.getByRole('alert')).toContainText('Something went wrong');
await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
});
test('shows loading skeleton during slow responses', async ({ page }) => {
// Simulate a delayed response
await page.route('**/api/products', async (route) => {
await new Promise((resolve) => setTimeout(resolve, 3000));
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify([{ id: 1, name: 'Keyboard', price: 129.99 }]),
});
});
await page.goto('/products');
// Verify the loading state appears while waiting
await expect(page.getByTestId('product-skeleton')).toBeVisible();
// Then verify it resolves to real content
await expect(page.getByText('Keyboard')).toBeVisible({ timeout: 5000 });
});

The slow response test. Most teams skip it and later regret it. Loading states and skeleton screens are features users actually see, but they're nearly impossible to test without controlling response timing.
You can also modify real responses. Call route.fetch() to get the actual server response, tweak one field in the JSON, fulfill with the modified version. Useful when you need realistic data but want to set a date to today for time-sensitive logic. Or abort third-party scripts to measure core app performance in isolation.
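A sketch of both patterns -- the endpoint, the createdAt field, and the analytics domain are hypothetical:

```typescript
// Pass-through with a patch: hit the real backend, tweak one field
await page.route('**/api/orders', async (route) => {
  const response = await route.fetch();
  const json = await response.json();
  // Pin a date to today for time-sensitive UI logic (hypothetical field)
  json[0].createdAt = new Date().toISOString();
  // Passing the original response preserves its status and headers
  await route.fulfill({ response, json });
});

// Abort third-party scripts to isolate core app performance
await page.route('**/*analytics*.js', (route) => route.abort());
```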
Visual Regression Testing
Functional tests miss CSS regressions entirely. Visual tests catch them by comparing screenshots pixel by pixel against approved baselines.
Built in. No third-party service.
import { test, expect } from '@playwright/test';
test('homepage matches visual baseline', async ({ page }) => {
await page.goto('/');
// Wait for all images and fonts to load
await page.waitForLoadState('networkidle');
// Full-page screenshot comparison
await expect(page).toHaveScreenshot('homepage-full.png', {
fullPage: true,
maxDiffPixelRatio: 0.01, // Allow 1% pixel difference
});
});
test('pricing cards render consistently', async ({ page }) => {
await page.goto('/pricing');
// Screenshot a specific component instead of the full page
const pricingSection = page.getByRole('region', { name: 'Pricing plans' });
await expect(pricingSection).toHaveScreenshot('pricing-cards.png', {
maxDiffPixels: 50, // Allow up to 50 pixels to differ
animations: 'disabled', // Freeze CSS animations
});
});
test('dark mode renders correctly', async ({ page }) => {
await page.goto('/');
// Emulate dark color scheme
await page.emulateMedia({ colorScheme: 'dark' });
await page.waitForLoadState('networkidle');
await expect(page).toHaveScreenshot('homepage-dark-mode.png', {
fullPage: true,
maxDiffPixelRatio: 0.01,
});
});
// To update baselines when changes are intentional:
// npx playwright test --update-snapshots

Font rendering differs between operating systems. Always generate baselines in the same environment where tests run. Generating baselines on your Mac and expecting them to match a Linux CI runner will produce nothing but false failures.
Animations wreck screenshot stability. animations: 'disabled' freezes all CSS animations at their end state before capture. Use it everywhere.
Dynamic content -- timestamps, randomized avatars, ad banners -- causes false positives. Mock those elements to return deterministic data, or mask them with the mask option during capture. Masking is quick. Mocking is more thorough. Pick one.
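Masking looks like this -- the listed locators are painted over before the diff runs, so their contents can't fail the comparison. The test id and avatar name are hypothetical:

```typescript
await expect(page).toHaveScreenshot('dashboard.png', {
  fullPage: true,
  animations: 'disabled',
  // Masked regions are covered with a solid overlay before comparing,
  // so changing timestamps or avatars never trigger a diff
  mask: [
    page.getByTestId('last-updated-timestamp'), // hypothetical test id
    page.getByRole('img', { name: 'User avatar' }),
  ],
});
```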
Intentional change? npx playwright test --update-snapshots accepts the new baseline.
Parallel Execution and CI Configuration
A slow suite is a skipped suite. Twenty minutes? Developers stop running it.
Playwright runs test files in parallel across worker processes. Each worker gets its own browser. Fully isolated. CI needs different settings -- fewer cores, so fewer workers, plus retries for infrastructure-level flakiness.
name: Playwright Tests
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
jobs:
test:
timeout-minutes: 30
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
shard: [1/4, 2/4, 3/4, 4/4]
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Install Playwright browsers
run: npx playwright install --with-deps
- name: Run Playwright tests
run: npx playwright test --shard=${{ matrix.shard }}
- name: Upload test results
uses: actions/upload-artifact@v4
if: ${{ !cancelled() }}
with:
name: playwright-report-${{ strategy.job-index }}
path: playwright-report/
retention-days: 7

--shard=1/4 splits the test set into four parts. GitHub Actions spins up four runners in parallel. Total CI time drops to roughly a quarter.
fail-fast: false matters. You want to see every failure, not just the first. And upload the HTML report as an artifact -- essential for debugging failures that don't reproduce locally.
Cache Playwright browser binaries with actions/cache. Browser downloads take time. Make the webServer start your app in production mode for CI. Dev servers with hot module replacement eat more memory than a production build.
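One way to switch the webServer block per environment, assuming hypothetical build and start scripts in package.json:

```typescript
// playwright.config.ts (fragment, sketch)
webServer: {
  // Production build on CI, hot-reloading dev server locally
  command: process.env.CI
    ? 'npm run build && npm run start'
    : 'npm run dev',
  url: 'http://localhost:3000',
  reuseExistingServer: !process.env.CI,
  timeout: 120_000, // production builds take longer to come up
},
```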
You don't need full cross-browser coverage on every PR. Run Chromium on pull requests. Run the full matrix nightly. Tag tests with @smoke or @regression and run subsets with npx playwright test --grep @smoke. A smoke suite covering critical paths finishes in under two minutes.
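A tag is just a string in the test title; --grep filters on it. The checkout flow here is hypothetical:

```typescript
import { test, expect } from '@playwright/test';

// The @smoke tag lives in the title, so --grep can match it
test('checkout completes with saved card @smoke', async ({ page }) => {
  await page.goto('/checkout'); // hypothetical route
  await page.getByRole('button', { name: 'Pay now' }).click();
  await expect(page.getByText('Order confirmed')).toBeVisible();
});

// Run only the smoke subset:
//   npx playwright test --grep @smoke
// Run everything except smoke:
//   npx playwright test --grep-invert @smoke
```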
Getting Started Right
The test suite your team actually runs is worth more than the comprehensive one they skip.