HTML to Table Online — Extract HTML Tables to Data

Paste HTML containing one or more <table> elements and extract clean rows. Copy as JSON, CSV, or TSV — sortable, filterable, 100% client-side.

What is an HTML to Table extractor?

An HTML to table extractor finds <table> elements in pasted HTML, pulls out their rows and columns, and lets you export the data as JSON, CSV, or TSV. It saves the trouble of writing a one-off scraper for tables on web pages, documentation, Wikipedia, or internal admin UIs.

Much of the useful tabular data on the web sits inside HTML <table> elements: pricing pages, comparison charts, sports results, financial data, API documentation. View the page source, copy the table fragment, paste it here, and the data is ready for code or a spreadsheet.

How to extract HTML tables — 4 steps

  1. Paste HTML. Drop in HTML containing one or more tables: a fragment, the entire page source, or markup copied from the DevTools Elements panel.
  2. Pick a table. If multiple tables are detected, switch between them with the tab strip.
  3. Sort and filter. Click any column header to sort. Type in the filter to narrow rows live.
  4. Export. Copy as JSON, CSV, or TSV. The export reflects the current sort and filter.

Sample input and output

<table>
  <thead>
    <tr><th>Country</th><th>Capital</th></tr>
  </thead>
  <tbody>
    <tr><td>Japan</td><td>Tokyo</td></tr>
    <tr><td>Brazil</td><td>Brasília</td></tr>
  </tbody>
</table>

renders to a 2-row table with columns Country, Capital. Copy as JSON returns [{"Country":"Japan","Capital":"Tokyo"}, …]; copy as CSV returns the comma-separated rows ready for Excel.
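As a rough illustration of the JSON shape described above, the following Python sketch pairs header names with row cells. The names headers, rows, and rows_to_records are hypothetical and are not part of the tool, which runs in the browser.

```python
import json

# Hypothetical inputs: header names and body rows taken from the sample table.
headers = ["Country", "Capital"]
rows = [["Japan", "Tokyo"], ["Brazil", "Brasília"]]


def rows_to_records(headers, rows):
    """Pair each row's cells with the header names, one dict per row."""
    return [dict(zip(headers, row)) for row in rows]


records = rows_to_records(headers, rows)

# ensure_ascii=False keeps "Brasília" readable instead of escaping it.
json_text = json.dumps(records, ensure_ascii=False)
print(json_text)
```

Note that zip stops at the shorter of the two lists, so a row with fewer cells than headers simply yields a record with fewer keys.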

Sortable Columns

Click any column header to toggle between ascending and descending order. Numeric ordering means sales figures and prices sort by value rather than character by character, so 95 comes before 310.
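To show what numeric ordering means in practice, here is a minimal Python sketch of a sort key that compares numeric cells by value and everything else as text. It is not the tool's own code. The helper names and the decision to treat commas as thousands separators are assumptions made for this example.

```python
import re

# Plain integers or decimals, optionally negative (after commas are removed).
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def sort_key(cell):
    """Numbers sort before text; numbers compare by value, text ignores case."""
    cleaned = cell.replace(",", "").strip()  # assumption: "," is a thousands separator
    if NUMBER.fullmatch(cleaned):
        return (0, float(cleaned), "")
    return (1, 0.0, cell.lower())


def sort_rows(rows, column, descending=False):
    """Return a new list of rows ordered by the given column index."""
    return sorted(rows, key=lambda row: sort_key(row[column]), reverse=descending)


rows = [["Widget", "1,200"], ["Gadget", "95"], ["Gizmo", "310"]]
ascending = sort_rows(rows, column=1)
print(ascending)  # Gadget (95), Gizmo (310), Widget (1,200)
```

A plain string sort would place "1,200" before "310" and "95", which is the behavior numeric ordering avoids. The sketch assumes every row has a cell at the chosen column index.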

Live Filter

The substring filter checks every cell. It is useful for pulling out a subset before exporting: for instance, typing land keeps only the rows in which some cell, such as a country name, contains "land".
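The filter can be pictured as the small Python sketch below. Case-insensitive matching is an assumption made here, since the page does not say whether the filter is case-sensitive, and filter_rows is a hypothetical name.

```python
def filter_rows(rows, query):
    """Keep rows where any cell contains the query as a substring."""
    needle = query.strip().lower()  # assumption: matching ignores case
    if not needle:
        return list(rows)  # an empty filter shows every row
    return [row for row in rows if any(needle in cell.lower() for cell in row)]


rows = [
    ["Iceland", "Reykjavík"],
    ["Japan", "Tokyo"],
    ["Thailand", "Bangkok"],
    ["Finland", "Helsinki"],
]
matches = filter_rows(rows, "land")
print(matches)  # the Iceland, Thailand, and Finland rows
```

Because every cell is checked, a row also matches when the text appears in a column other than the one you had in mind.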

JSON / CSV / TSV Export

Three export formats from a single paste. JSON for code, CSV for spreadsheets, TSV for clipboard-pasting into Excel without quoting hassles.
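The difference between CSV and TSV quoting can be seen in a short Python sketch built on the standard csv module. The function name to_delimited and the sample data are illustrative only. The tool's exact quoting rules are not documented here, so treat the output as an approximation.

```python
import csv
import io


def to_delimited(headers, rows, delimiter=","):
    """Serialize a header row plus body rows as delimiter-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


headers = ["Product", "Price"]
rows = [["Widget, large", "12.50"], ["Gadget", "8.00"]]

csv_text = to_delimited(headers, rows)                  # comma-separated
tsv_text = to_delimited(headers, rows, delimiter="\t")  # tab-separated
print(csv_text)
print(tsv_text)
```

In the CSV output the cell "Widget, large" has to be wrapped in quotes because it contains the delimiter. In the TSV output it passes through unquoted, which is why TSV tends to paste into spreadsheets with less friction.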

Common use cases

  • Scraping Wikipedia or reference tables without writing a Python script
  • Pulling pricing or feature comparison tables off competitor websites
  • Extracting sports schedules, league standings, or fantasy data
  • Converting documentation tables (e.g. API parameter lists) to JSON
  • Capturing financial data from broker or analytics dashboards
  • Saving the output of internal admin UIs that render data as HTML tables
  • Migrating legacy HTML reports into modern data pipelines
  • Quick one-off conversions when curl + jq is overkill for a single page

How the extraction works

The browser's built-in DOMParser with text/html mode parses your input and exposes the same DOM tree your browser sees. The tool finds every <table>, then for each one looks for headers in <thead> first, falling back to the first row if all cells are <th>. Body rows come from <tbody> if present, otherwise every <tr> not inside <thead>. Cell text is extracted via textContent and trimmed — formatting is intentionally stripped so the output is paste-clean. colspan and rowspan are not expanded; each <td> contributes one cell.
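The same extraction logic can be approximated outside the browser. The sketch below uses Python's standard html.parser instead of DOMParser, so it is a stand-in for the technique and not the tool's implementation. It assumes explicit closing tags and no nested tables (a real DOM parser repairs sloppy markup, this one does not). The class and function names are hypothetical, and the Column 1, Column 2 fallback is the one described in the FAQ.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect rows per <table>; each row is a list of (tag, text) cells."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []        # one {"head": [...], "body": [...]} per table
        self._in_thead = False
        self._row = None        # cells of the <tr> currently being read
        self._cell = None       # [tag, text fragments] of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append({"head": [], "body": []})
        elif tag == "thead":
            self._in_thead = True
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = [tag, []]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[1].append(data)  # nested markup is reduced to its text

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            kind, parts = self._cell
            self._row.append((kind, "".join(parts).strip()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            section = "head" if self._in_thead else "body"
            self.tables[-1][section].append(self._row)
            self._row = None
        elif tag == "thead":
            self._in_thead = False


def extract_tables(html):
    """Return one {"headers": [...], "rows": [...]} dict per table found."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    results = []
    for table in parser.tables:
        head, body = table["head"], table["body"]
        if head:  # 1. cells inside <thead>
            headers = [text for _, text in head[0]]
        elif body and body[0] and all(kind == "th" for kind, _ in body[0]):
            headers = [text for _, text in body[0]]  # 2. all-<th> first row
            body = body[1:]
        else:  # 3. generated names
            width = max((len(row) for row in body), default=0)
            headers = [f"Column {i + 1}" for i in range(width)]
        rows = [[text for _, text in row] for row in body]
        results.append({"headers": headers, "rows": rows})
    return results


sample = """
<table>
  <thead><tr><th>Country</th><th>Capital</th></tr></thead>
  <tbody>
    <tr><td>Japan</td><td>Tokyo</td></tr>
    <tr><td>Brazil</td><td>Bras&iacute;lia</td></tr>
  </tbody>
</table>
"""
print(extract_tables(sample))
```

As in the tool, spans are not expanded: each cell element contributes exactly one entry to its row.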

Frequently Asked Questions

Can it extract multiple tables?

Yes. The tool finds every <table> element in the pasted HTML and lists them in a tab strip — switch between tables with one click. Each table is extracted independently with its own headers, rows, and exports.

Does it preserve cell formatting?

No — only text content is extracted. Inline styles, spans, links, images, and bold/italic markers are stripped down to their text. If you need styling preserved, copy the table directly into a rich-text editor; this tool is for data extraction.

How are headers detected?

In order of preference: (1) cells inside <thead><tr>, (2) the first row if all cells are <th>, (3) Column 1, Column 2, … as fallback. This handles the most common semantic-HTML and legacy patterns. If your headers are misdetected, wrap them in <thead> for clarity.

What about colspan and rowspan?

colspan and rowspan are not expanded — each <td> contributes one cell. Spanned tables produce ragged rows where the spanned cells are simply missing. For pivot-style tables with spans, plan to clean up the export in Excel or a script.
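If you do clean up in a script, one possible approach is to repeat each spanned value across the columns and rows it covers. The Python sketch below assumes you have already parsed the table yourself into (text, colspan, rowspan) tuples. That input format and the name expand_spans are hypothetical, and the tool's export does not carry span information.

```python
def expand_spans(rows):
    """Turn rows of (text, colspan, rowspan) cells into a rectangular grid.

    Spanned values are repeated in every position they cover. Assumes the
    spans are consistent, as they are in a well-formed table.
    """
    grid = []
    pending = {}  # column index -> (text, rows still to fill) from a rowspan
    for cells in rows:
        out = []
        queue = list(cells)
        col = 0
        while queue or col in pending:
            if col in pending:
                # This column is covered by a cell that started in an earlier row.
                text, remaining = pending[col]
                out.append(text)
                if remaining > 1:
                    pending[col] = (text, remaining - 1)
                else:
                    del pending[col]
                col += 1
                continue
            text, colspan, rowspan = queue.pop(0)
            for _ in range(colspan):
                out.append(text)
                if rowspan > 1:
                    pending[col] = (text, rowspan - 1)
                col += 1
        grid.append(out)
    return grid


rows = [
    [("Lions", 1, 2), ("2023", 1, 1), ("71", 1, 1)],
    [("2024", 1, 1), ("65", 1, 1)],       # "Lions" spans down into this row
    [("Tigers", 1, 1), ("No data", 2, 1)],  # one cell spanning two columns
]
grid = expand_spans(rows)
print(grid)
```

Repeating the value is only one convention. Filling the extra positions with empty strings is equally valid and may suit spreadsheet work better.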

Can I paste a whole web page?

Yes — paste the entire HTML source. The tool searches for every <table> tag and ignores everything else, so navigation, headers, and body copy are filtered out automatically.

Why is my Wikipedia table missing rows?

Wikipedia uses rowspan and colspan heavily in many infoboxes and statistics tables. The extractor treats each <td> as one cell, so spanned cells produce ragged output in which values appear to be missing. For complex Wikipedia tables, copy the rendered table into Excel and export from there.

Is the HTML uploaded?

No. Parsing happens in the browser with DOMParser. Pasted HTML, including any text content, never leaves your device. You can verify this in the DevTools Network tab: no requests are made when you paste or click extract.

Can I export only the visible (sorted/filtered) rows?

Yes — Copy as JSON, CSV, or TSV all use the currently displayed rows after filtering and sorting. To export the original unsorted, unfiltered set, clear the filter and re-click any sorted column to remove the sort indicator.
