Why Validate HTML at All?
The argument against HTML validation is older than this blog: "the browser tolerates errors, my page renders fine, validation is pedantry." Each part is technically true. None of them is a good reason to skip validation.
Invalid HTML renders predictably only because every browser implements the same parser error-recovery rules from the HTML Living Standard. That recovery is deterministic but not free: when you nest a <div> inside a <p>, the parser silently closes the paragraph before the div, so selectors that expect the div to be a descendant of the paragraph never match and styles meant to be inherited from the paragraph never reach it. When you forget to close a <div> or a <span>, it absorbs whatever came after it. These bugs ship to production and surface as "weird CSS issue on Safari" tickets six months later.
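To make the recovery rule concrete, here is a minimal Python sketch that flags a </p> left stranded after a block-level start tag has already closed the paragraph. It uses the standard library's html.parser, which tokenizes markup but does not perform HTML5 error recovery itself. The CLOSES_P set (a subset of the spec's list) and the ParagraphNestingChecker class are illustrative assumptions, not a substitute for a real validator.

```python
from html.parser import HTMLParser

# Assumption: a subset of the start tags that implicitly close an open <p>;
# the list in the HTML parsing rules is longer.
CLOSES_P = {"div", "ul", "ol", "table", "section", "article", "pre", "form",
            "blockquote", "h1", "h2", "h3", "h4", "h5", "h6"}


class ParagraphNestingChecker(HTMLParser):
    """Reports </p> tags whose paragraph was already closed implicitly."""

    def __init__(self):
        super().__init__()
        self.p_open = False
        self.closed_by = None  # (tag, line) of the start tag that closed the <p>
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.p_open = True
            self.closed_by = None
        elif self.p_open and tag in CLOSES_P:
            # The parser closes the paragraph here, before the new element.
            self.p_open = False
            self.closed_by = (tag, self.getpos()[0])

    def handle_endtag(self, tag):
        if tag != "p":
            return
        if self.p_open:
            self.p_open = False
        elif self.closed_by is not None:
            closer, line = self.closed_by
            self.findings.append(
                f"line {self.getpos()[0]}: stray </p>; the paragraph was "
                f"already closed by <{closer}> on line {line}"
            )
            self.closed_by = None


def find_paragraph_breakers(html):
    checker = ParagraphNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.findings


if __name__ == "__main__":
    for finding in find_paragraph_breakers("<p>Intro<div>box</div></p>"):
        print(finding)
```

The sketch reports the stray end tag rather than the <div> itself, because omitting </p> before a block element is legal; it is the leftover </p> that signals the author expected nesting the parser did not honor.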
The deeper case is accessibility. Screen readers, voice control, and switch devices rely on the parsed DOM matching the markup author's intent. A duplicate id breaks aria-labelledby. A missing label for= means the input has no announced name. A button inside an anchor is a click-target ambiguity that NVDA and VoiceOver handle differently. Validation surfaces these before a real user with a real assistive technology hits them.
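The id-related failures above are mechanical enough to sketch in a few lines of Python. The example below, again built on the standard library's html.parser, counts ids and checks that every label for= and aria-labelledby reference resolves. The class and function names are hypothetical, and the check covers only these two attributes.

```python
from collections import Counter
from html.parser import HTMLParser


class IdReferenceChecker(HTMLParser):
    """Collects every id and every id reference found in the markup."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()
        self.references = []  # (attribute name, referenced id)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue  # valueless or empty attribute
            if name == "id":
                self.ids[value] += 1
            elif name == "for" and tag == "label":
                self.references.append((name, value))
            elif name == "aria-labelledby":
                # aria-labelledby takes a space-separated list of ids
                self.references.extend((name, ref) for ref in value.split())


def check_id_references(html):
    checker = IdReferenceChecker()
    checker.feed(html)
    checker.close()
    problems = [
        f'duplicate id "{value}" ({count} elements)'
        for value, count in checker.ids.items()
        if count > 1
    ]
    problems += [
        f'{attr} points at missing id "{ref}"'
        for attr, ref in checker.references
        if ref not in checker.ids
    ]
    return problems


if __name__ == "__main__":
    sample = (
        '<h2 id="title">A</h2><h2 id="title">B</h2>'
        '<section aria-labelledby="title heading"></section>'
        '<label for="email">Email</label>'
    )
    for problem in check_id_references(sample):
        print(problem)
```

A check like this runs on static markup only; ids generated at runtime still need a rendered-DOM tool.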
The W3C Nu Html Checker
The W3C's Nu Html Checker (often shortened to "Nu validator") is the closest thing to a reference implementation of HTML5. It is what powers validator.w3.org/nu/ and the green-check badges of the early 2010s. Despite the static-page reputation, it is alive and updated quarterly to track the HTML Living Standard.
Nu validates against the spec — not against a style guide and not against accessibility rules. It tells you whether the document is conformant HTML5: tag nesting is legal, required attributes are present, deprecated elements are flagged, character encoding matches the declared charset. It will not tell you that your contrast ratio is wrong or that your heading hierarchy skips levels.
In 2026 the most useful way to run Nu is locally as a JAR file (vnu.jar) or as a Docker image. Both bypass the rate limits on the hosted site and let you validate a build directory in seconds:
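```shell
# One-off validation of a built site
java -jar vnu.jar --skip-non-html dist/

# Or via the vnu-jar npm package, which exports the path to the bundled JAR
# (assumes vnu-jar is installed in the project)
java -jar "$(node -p "require('vnu-jar')")" --skip-non-html dist/

# Docker for CI
docker run --rm -v "$(pwd)":/data \
  ghcr.io/validator/validator:latest \
  vnu --skip-non-html /data/dist
```

The Five Errors You Will Actually See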
In ten years of running validation against production codebases, the same handful of errors account for nearly all real findings. Recognize them on sight and you can fix most validation failures without consulting the spec:
Two patterns explain ninety percent of these. The first is template loops emitting markup conditionally — an {% if %} that opens a tag in one branch and closes it in another, so the markup is balanced only when both branches fire. The second is component composition where a child component emits a wrapper element the parent did not anticipate. Both surface only at runtime, which is exactly why a static validator catches them before users do.
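The first pattern is easy to reproduce. The Python sketch below stands in for a template whose wrapper opens under one condition and closes under another, then runs a deliberately strict tag-balance check over each rendering. render_card and its two flags are hypothetical, and the checker treats omitted optional end tags (such as </li>) as unbalanced, which is a simplification a real validator would not make.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class BalanceChecker(HTMLParser):
    """Strict stack-based check: every start tag needs its own end tag."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def unbalanced(html):
    checker = BalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.stack]


def render_card(featured, has_footer):
    """Hypothetical template: the wrapper opens in one branch, closes in another."""
    parts = []
    if featured:
        parts.append('<div class="featured">')
    parts.append("<h2>Title</h2>")
    if has_footer:
        parts.append("</div>")
    return "".join(parts)


if __name__ == "__main__":
    for featured in (True, False):
        for has_footer in (True, False):
            print(featured, has_footer, unbalanced(render_card(featured, has_footer)))
```

Only the combinations where both flags agree produce balanced markup, which is why validating a single rendered page can miss the bug: the validator has to see output from each branch combination that occurs in production.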
Accessibility Validation — axe and WAVE
Spec conformance and accessibility are not the same thing. Valid HTML can still have a contrast ratio of 2:1, a heading order that jumps from h1 to h4, or a navigation landmark with no accessible name. Accessibility validators check the rendered DOM against WCAG 2.2 success criteria.
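The contrast ratio mentioned above is a defined calculation, not a judgment call. A minimal Python sketch of the WCAG 2.x formula follows; the function names are my own, and the input is assumed to be a six-digit hex color with no alpha channel.

```python
def _linearize(value):
    """Linearize one 8-bit sRGB channel as defined by WCAG 2.x."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color (assumed format)."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    for fg in ("#000000", "#767676", "#777777"):
        print(fg, round(contrast_ratio(fg, "#ffffff"), 2))
```

WCAG's AA threshold for normal-size text is 4.5:1, so a pair of grays that look nearly identical can land on opposite sides of the line. Automated tools apply this same arithmetic to the computed foreground and background colors of each text node.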
Two tools dominate. axe-core by Deque is the engine behind many accessibility testers (Lighthouse, the Chrome DevTools accessibility tab, BrowserStack, Sauce Labs). It is designed around a zero-false-positives policy, so a reported violation is very likely a real WCAG failure, at the cost of catching only the roughly 50% of issues that can be detected automatically. The remainder require human judgment.
WAVE by WebAIM takes the opposite approach: it overlays icons directly on the page so a sighted reviewer can quickly scan for missing alt text, broken landmarks, and contrast issues. It is excellent for design review and education but harder to automate.
The combination most teams settle on: axe in CI for the always-detectable failures, plus a WAVE overlay during design QA for the human-judgment cases. Add NVDA or VoiceOver passes on critical user flows once per release.
HTMLHint and Project Lint Rules
Below the spec lives style: which conventions does your codebase enforce? Tab indent or spaces, lowercase or PascalCase for custom elements, single or double quotes for attributes. None of these affects rendering, but consistency across a 200,000-line codebase is worth enforcing automatically.
HTMLHint is the de facto linter. Its rule set is small (around 25 rules), every one is opt-in, and its config file is one of the simplest in the lint ecosystem:
```json
// .htmlhintrc
{
  "tagname-lowercase": true,
  "attr-lowercase": true,
  "attr-value-double-quotes": true,
  "doctype-first": true,
  "tag-pair": true,
  "spec-char-escape": true,
  "id-unique": true,
  "src-not-empty": true,
  "attr-no-duplication": true,
  "alt-require": true,
  "title-require": true
}
```

For framework-specific concerns (Vue templates, Svelte components, JSX), the linter to reach for is the framework's own ESLint plugin (eslint-plugin-vue, eslint-plugin-jsx-a11y, eslint-plugin-svelte). These understand the template language and catch issues HTMLHint misses because the markup is generated, not static.
Wiring It Into CI
Validation that runs only when a developer remembers to run it might as well not exist. The win comes from making invalid HTML a build failure. A typical four-stage GitHub Actions job:
```yaml
# .github/workflows/html-validate.yml
name: HTML validation
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - name: Build site
        run: npm ci && npm run build
      # 1. W3C Nu validator on every built HTML file
      #    (assumes vnu-jar is a devDependency; it exports the path to vnu.jar)
      - name: W3C validation
        run: java -jar "$(node -p "require('vnu-jar')")" --skip-non-html dist/
      # 2. HTMLHint with project rules
      - name: HTMLHint
        run: npx htmlhint "dist/**/*.html"
      # 3. Accessibility audit with axe
      - name: Start preview server
        run: npx serve dist &
      - name: axe-core CLI
        run: npx @axe-core/cli http://localhost:3000 --exit
      # 4. Lighthouse budgets
      - name: Lighthouse CI
        run: npx @lhci/cli autorun
```

The four stages map to four classes of bug: spec conformance (Nu), code style (HTMLHint), accessibility (axe), and the catch-all bundle audit (Lighthouse). As written, the steps run sequentially inside one job and the first failing step stops the job; split the stages into separate jobs if you want them to run in parallel and fail independently. Either way, the total runtime stays under two minutes for a typical site.
For larger sites where running every check against every page is too slow, the pragmatic compromise is to validate on a sampled set: every page changed in the PR plus a fixed list of canonical pages (homepage, top three landing pages, account flow). This catches regressions without paying the full cost on every commit.
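One way to assemble that sampled set is sketched below in Python. It assumes a static site where source files under src/ map one-to-one onto files under dist/, which will not hold for every build; the function name, directory names, and canonical page list are all hypothetical.

```python
from pathlib import PurePosixPath


def pages_to_validate(changed_files, canonical_pages,
                      source_dir="src", build_dir="dist"):
    """Return built pages to check: changed pages first, then canonical ones.

    Assumes src/<path>.html is built to dist/<path>.html (hypothetical layout).
    """
    selected = []
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if path.endswith(".html") and parts[:1] == (source_dir,):
            selected.append(str(PurePosixPath(build_dir, *parts[1:])))
    selected.extend(str(PurePosixPath(build_dir, page)) for page in canonical_pages)
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(selected))


if __name__ == "__main__":
    changed = ["src/about.html", "README.md", "src/index.html", "src/css/site.css"]
    canonical = ["index.html", "pricing/index.html"]
    for page in pages_to_validate(changed, canonical):
        print(page)
```

In CI the changed list could come from something like git diff --name-only against the target branch, and the resulting paths can be passed straight to the validator commands as arguments.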
Semantic HTML Mistakes That Pass Validation
The last and most insidious class of issue is the markup that validates cleanly but uses the wrong element. The browser is happy, the W3C validator is happy, and your screen reader users are not.
The classic offender is <div onClick> in place of <button>. Both render. Only the button is keyboard-focusable, announces a role to assistive tech, and triggers on Space or Enter. A div with an onClick is invisible to anyone not using a mouse. This category is where lints like jsx-a11y/no-static-element-interactions earn their keep.
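For static markup, the same idea can be approximated in a few lines of Python. The sketch below flags any non-interactive element that carries an inline onclick attribute without both a role and a tabindex. It is a rough stand-in, not a port of the jsx-a11y rule: it sees only inline handlers in the HTML source, so listeners attached by a framework or by addEventListener are invisible to it, and the NATIVE_INTERACTIVE set is an assumption.

```python
from html.parser import HTMLParser

# Assumption: elements treated as natively interactive for this check.
NATIVE_INTERACTIVE = {"a", "button", "input", "select", "textarea", "summary"}


class ClickTargetChecker(HTMLParser):
    """Flags static elements with an inline click handler but no role/tabindex."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # html.parser lowercases attribute names
        if "onclick" not in attributes or tag in NATIVE_INTERACTIVE:
            return
        missing = [name for name in ("role", "tabindex") if name not in attributes]
        if missing:
            line, _ = self.getpos()
            self.findings.append(
                f"line {line}: <{tag} onclick> lacks {' and '.join(missing)}"
            )


def find_fake_buttons(html):
    checker = ClickTargetChecker()
    checker.feed(html)
    checker.close()
    return checker.findings


if __name__ == "__main__":
    sample = (
        '<div onClick="open()">Open</div>'
        '<button onclick="open()">Open</button>'
        '<span role="button" tabindex="0" onclick="open()">Open</span>'
    )
    for finding in find_fake_buttons(sample):
        print(finding)
```

Passing this check does not make a div a good button; role and tabindex still leave Space and Enter handling to the author, which is the argument for the native element in the first place.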
Other common semantic mistakes: using <b> for emphasis (use <strong>), <i> for italicized prose (use <em>), nested <section>s without headings (each section needs an h1-h6 for landmark navigation), or wrapping an entire card in an anchor when only the title needs to be the link target.
The defensive pattern: every interactive element should be a native interactive element (button, a, input, select, summary, dialog) unless you have an explicit reason and an ARIA fallback strategy.
Tool Comparison
No single tool covers all four classes of issue. The realistic stack picks at least one tool of each type from the table and runs them in CI:
| Tool | Type | Focus | CI | Notes |
|---|---|---|---|---|
| W3C Nu Validator | Spec validator | HTML/SVG/MathML conformance | CLI via vnu.jar | Authoritative source of truth |
| HTMLHint | Linter | Style and quality rules | npm package | Fast, configurable, plugin-friendly |
| axe-core | Accessibility | WCAG 2.2 violations | @axe-core/cli | Industry standard for a11y |
| WAVE | Accessibility | Visual a11y overlay | Browser extension + API | Best for manual review |
| Pa11y | Accessibility | Automated WCAG audit | Headless Chrome | Easy CI integration |
| Lighthouse | Bundle audit | Perf + a11y + SEO + best practices | lhci CLI | Runs a subset of axe-core's a11y rules |
Practical starter stack: Nu validator + HTMLHint + axe-core. That covers spec, style, and accessibility for under thirty seconds per build. Add Lighthouse if you need performance budgets and Pa11y if you need to test a flow that requires authentication.