
When to Escape vs Encode — A Developer's Guide

Escape, encode, encrypt — three operations that look alike from the outside and do completely different jobs underneath. Mixing them up is how the same string ends up double-escaped, the same URL ends up unreachable, and the same XSS bug ships three times.

December 11, 2026 · 7 min read

The One-Sentence Definitions

Escaping makes a value safe inside another syntax by neutralizing characters that would otherwise be interpreted as syntax. Encoding translates a value into a different alphabet so it survives transport through a system that has restrictions on what it accepts. Encrypting hides the contents from anyone without the right key.

Escaping and encoding are both reversible and lose nothing. They are not security primitives by themselves — anyone with the inverse function can recover the original value. Encryption is the only one of the three that provides confidentiality, and only when paired with a properly managed key.

A very common production bug in this space is treating Base64 as if it were encryption. It is not. aGVsbG8= is one decoding step away from hello, and because no key is involved, anyone who copies the encoded string can read it.
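To make the point concrete, here is a minimal Python sketch that recovers the original value from the string quoted above using nothing but the standard library:

```python
import base64

# The Base64 string from the text; decoding needs no key or secret of any kind.
encoded = "aGVsbG8="
decoded = base64.b64decode(encoded).decode("utf-8")

print(decoded)  # hello
```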

HTML Escaping — XSS Prevention 101

When you take user input and inject it into HTML, the browser parser will treat any <, >, ", ', or & as markup, an attribute delimiter, or the start of an entity. A malicious user who controls the input can inject a <script> tag and run code in your origin. This is cross-site scripting (XSS), and it has ranked among the most common web vulnerabilities for two decades.

The fix is HTML escaping: replace each dangerous character with its HTML entity equivalent before injection. The five-character escape set (&, <, >, ", ') is enough inside a normal text node or a quoted attribute value:

# Input from a user
<script>alert('XSS')</script>

# After HTML escape
&lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;

# What the browser shows: literal text, not executable code
<script>alert('XSS')</script>

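Two important nuances. First, the context matters: the escape rules for "inside an attribute value" are stricter than "inside a text node" (an unquoted attribute can be broken out of with a space alone), and the escape rules for "inside a <script> block" are different again. Use your template engine's auto-escaping or a context-aware encoder (OWASP Java Encoder, for example) rather than rolling your own. Sanitizers such as DOMPurify (JS) and Bleach (Python) solve a related but different problem: they strip dangerous markup from HTML that you intend to render as HTML. Second, modern frameworks (React, Vue, Svelte, Angular) escape by default, so the bug is usually an opt-out such as dangerouslySetInnerHTML or v-html, not a missing escape call.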
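The context point can be illustrated with Python's html module. This is a sketch rather than a complete defense: the sample inputs are made up, and the last case shows a value that the five-character escape leaves untouched even though it would be dangerous in a URL attribute such as href.

```python
import html

user_input = "<script>alert('XSS')</script>"

# Text node or quoted attribute value: the five-character escape is enough.
escaped = html.escape(user_input, quote=True)

# quote=False leaves " and ' alone, which is only acceptable in a text node.
text_only = html.escape('say "hi"', quote=False)

# A URL-attribute payload contains none of the five characters,
# so HTML escaping returns it unchanged. That context needs URL validation instead.
href_value = html.escape("javascript:alert(1)")

print(escaped)     # &lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;
print(text_only)   # say "hi"
print(href_value)  # javascript:alert(1)
```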

URL Encoding — RFC 3986

URLs cannot contain arbitrary characters. RFC 3986 defines a small alphabet of "unreserved" characters that can appear literally (A-Z a-z 0-9 - _ . ~) and a longer list of "reserved" characters that have syntactic meaning (: / ? # [ ] @ ! $ & ' ( ) * + , ; =). Anything else, including spaces and most Unicode, must be percent-encoded: the character is converted to UTF-8 and each byte is written as %XX, where XX is the byte's hexadecimal value. A reserved character must be percent-encoded too whenever it is meant as data rather than as syntax.

Two functions in JavaScript handle this: encodeURI encodes only characters that would break URL syntax (it leaves reserved characters such as / ? # & alone), while encodeURIComponent encodes everything except letters, digits, and - _ . ! ~ * ' ( ). The rule of thumb: use encodeURIComponent for individual query-string values and path segments; use encodeURI for whole URLs that already contain syntax.

// A query string built right
const params = new URLSearchParams({
  q: 'hello world & friends',
  filter: 'a/b',
});
`/search?${params}` // "/search?q=hello+world+%26+friends&filter=a%2Fb"

// Wrong
`/search?q=${'hello world'}`  // "/search?q=hello world" — INVALID URL

// Wrong (whole URL passed to encodeURIComponent)
encodeURIComponent('https://api.com/path?q=1')
// "https%3A%2F%2Fapi.com%2Fpath%3Fq%3D1" — now broken everywhere
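The same component-versus-whole-URL distinction exists in other standard libraries. A minimal Python sketch using urllib.parse, with the same sample values as above:

```python
from urllib.parse import quote, urlencode

# One component: quote() leaves "/" alone by default, so pass safe=""
# when the value is a single path segment or query value.
default_quote = quote("a/b")
segment = quote("a/b", safe="")

# Whole query string: urlencode() encodes every key and value, then joins with "&".
query = urlencode({"q": "hello world & friends", "filter": "a/b"})
url = f"/search?{query}"

print(default_quote)  # a/b
print(segment)        # a%2Fb
print(url)            # /search?q=hello+world+%26+friends&filter=a%2Fb
```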

Base64 — Binary-Safe Transport

Base64 is an encoding (not encryption) that maps any byte sequence to an alphabet of 64 ASCII characters: A-Z, a-z, 0-9, +, /, with = as padding. The output is roughly 33% larger than the input. Its purpose is to let binary data ride through systems that only accept text — email bodies, JSON strings, HTTP headers.

Common cases: embedding a small image in a CSS data URI, sending a binary file inside a JSON API response, writing a JWT (which is three URL-safe Base64 segments separated by dots), or stuffing a credential into an HTTP Basic Auth header. None of these involve secrecy: Base64 is not protecting anything.

Two variants matter: standard Base64 (RFC 4648 Section 4, uses + and /) and URL-safe Base64 (RFC 4648 Section 5, uses - and _). The URL-safe variant exists because + and / have meaning in URL paths and query strings. JWTs use URL-safe Base64 with the = padding stripped. Most other contexts use standard.
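The difference between the two alphabets, and the size overhead, can be seen in a short Python sketch. The two input bytes are chosen purely so that the output lands on the variant-specific characters:

```python
import base64

# 0xFB 0xFF maps onto Base64 values 62 and 63, the only two that differ between variants.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

# Size overhead: every 3 input bytes become 4 output characters.
payload = bytes(300)
encoded_len = len(base64.b64encode(payload))

print(standard)     # +/8=
print(url_safe)     # -_8=
print(encoded_len)  # 400
```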

JSON String Escaping

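JSON has its own escape syntax for string contents: a backslash followed by a single character, or \uXXXX. Only three things must be escaped: the double quote (\"), the backslash (\\), and the control characters U+0000 through U+001F, for which \b, \f, \n, \r, and \t are shorthands and \uXXXX covers the rest. Escaping the forward slash as \/ is allowed but optional. \uXXXX can express any other character as well (as a surrogate pair outside the Basic Multilingual Plane), but the rest of Unicode may appear literally.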

Every language has a working JSON serializer (JSON.stringify, json.dumps, json.Marshal). Use it. The classic anti-pattern is building JSON by string concatenation, as in '{"name":"' + user + '"}', which fails the moment user contains a quote, a backslash, or a newline. It is the same bug class as SQL injection, with the same fix in spirit: let a library build the output instead of splicing strings.
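A minimal Python sketch of the anti-pattern next to the serializer. The user value is a made-up example containing a quote and a newline:

```python
import json

user = 'Ann "The Hacker" O\'Neil\n'

# Anti-pattern: concatenation produces text that is not valid JSON.
broken = '{"name":"' + user + '"}'
try:
    json.loads(broken)
    concat_ok = True
except json.JSONDecodeError:
    concat_ok = False

# Serializer: the quotes and the newline are escaped, and the value round-trips.
safe = json.dumps({"name": user})
round_trip = json.loads(safe)["name"]

print(concat_ok)           # False
print(round_trip == user)  # True
```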

Comparison Table

Five operations, five purposes. The thing they all share is "turning one string into another"; the things that distinguish them are reversibility, what they preserve, and what threat model they belong to:

| Operation | Purpose | Reversible? | Example | Information loss |
|---|---|---|---|---|
| Escape | Make a value safe inside another syntax | Yes | < → &lt; | None |
| Encode | Translate to a different alphabet for transport | Yes | space → %20 | None |
| Compress | Shrink size while preserving content | Yes (lossless only) | gzip, brotli | None (lossless); lossy codecs discard data for good |
| Hash | Map to fixed-length fingerprint | No | SHA-256 | All (one-way) |
| Encrypt | Hide content from anyone without the key | Yes (with key) | AES-256-GCM | None, but unreadable without the key |
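The "Reversible?" column can be checked directly. A minimal Python sketch using only the standard library; encryption is left out because the Python standard library has no AES implementation, and the sample data is made up for illustration:

```python
import base64
import hashlib
import zlib

data = b"hello world " * 20  # 240 bytes of repetitive sample input

# Encode: reversible, no key involved.
decoded = base64.b64decode(base64.b64encode(data))

# Compress (lossless): reversible, and smaller for repetitive input.
compressed = zlib.compress(data)
restored = zlib.decompress(compressed)

# Hash: fixed-length output whatever the input size, with no inverse function.
digest_small = hashlib.sha256(b"a").hexdigest()
digest_large = hashlib.sha256(data).hexdigest()

print(decoded == data, restored == data)     # True True
print(len(compressed) < len(data))           # True
print(len(digest_small), len(digest_large))  # 64 64
```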

The decision tree: ask "what system is this string about to enter, and what characters does that system reserve?" If you are about to insert into HTML, escape for HTML. If you are about to put it in a URL, encode for URL. If you need binary in a text channel, Base64. If you need confidentiality, encrypt — and Base64 is not encryption.

When NOT to Escape

Double-escaping is one of the most common bugs in this space. The string gets escaped once at the producer, then escaped again at a layer that assumes the input was raw. Result: the markup contains &amp;lt; and the user sees the literal text &lt; instead of <. A few common situations where you should not escape:

On a string rendered as text content through a React {} expression: React already escapes it. Escaping it yourself first turns < into &amp;lt; in the markup and shows the literal entity &lt; to users.
On data coming back from a parameterized SQL query: the driver returns plain values, not SQL fragments. Escaping them for SQL again is what creates double-escaping bugs.
Inside an HTTPS URL path you constructed with the URL constructor: new URL() handles encoding. Manually calling encodeURIComponent on the full URL will break it.
On JSON.stringify output: JSON.stringify already produces a valid JSON string with quotes escaped. Running another round of JSON escaping over its output is a bug.

The defensive principle: escape exactly once, at the boundary closest to the consumer. If you escape early and pass the escaped string through three more layers, every layer along the way risks getting it wrong.
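Double-escaping is easy to reproduce. In this minimal Python sketch, html.unescape stands in for the single round of entity decoding a browser performs when it renders text:

```python
import html

raw = "<b>"
once = html.escape(raw)    # what a correct pipeline emits
twice = html.escape(once)  # what a second, redundant escape emits

# Entities are decoded once on display, so the user sees the text of `once`.
shown_to_user = html.unescape(twice)

print(once)           # &lt;b&gt;
print(twice)          # &amp;lt;b&amp;gt;
print(shown_to_user)  # &lt;b&gt;
```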

Cross-Language Reference

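The same three operations (HTML escape, URL encode, Base64) in the four languages most teams use day-to-day. Most of these ship in the standard library or a near-standard package, so there is rarely a need to roll your own. The exceptions here are HTML escaping in JavaScript, which has no built-in function (hence the replace chain), and in Java, where the example relies on Apache Commons Text: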

// JavaScript / TypeScript
const html = String(input)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;')
  .replace(/'/g, '&#39;');

const url = encodeURIComponent('hello world & friends');
// "hello%20world%20%26%20friends"

// Buffer is Node.js only; in browsers, btoa('hello') gives the same result for ASCII input
const b64 = Buffer.from('hello').toString('base64'); // "aGVsbG8="

# Python
import html
import urllib.parse
import base64

html.escape("<script>alert('x')</script>")
# "&lt;script&gt;alert(&#x27;x&#x27;)&lt;/script&gt;"

urllib.parse.quote("hello world & friends")
# "hello%20world%20%26%20friends" (pass safe="" to also encode "/")

base64.b64encode(b"hello").decode()  # "aGVsbG8="

// Go
import (
  "html"
  "net/url"
  "encoding/base64"
)

html.EscapeString("<b>x</b>")               // "&lt;b&gt;x&lt;/b&gt;"
url.QueryEscape("hello world & friends")    // "hello+world+%26+friends"
base64.StdEncoding.EncodeToString([]byte("hello")) // "aGVsbG8="

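// Java (URLEncoder.encode with a Charset argument requires Java 10+)
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import org.apache.commons.text.StringEscapeUtils; // Apache Commons Text, not part of the JDK

StringEscapeUtils.escapeHtml4("<b>x</b>");                 // "&lt;b&gt;x&lt;/b&gt;"
URLEncoder.encode("hello world", StandardCharsets.UTF_8);  // "hello+world"
// Name the charset explicitly so the bytes do not depend on the platform default
Base64.getEncoder().encodeToString("hi".getBytes(StandardCharsets.UTF_8)); // "aGk="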

One detail to watch: in Java's URLEncoder.encode and Go's url.QueryEscape, a space encodes to + rather than %20. Both forms decode to a space in form-encoded query strings, but in a URL path + is a literal plus sign, so only %20 works there. For path segments use url.PathEscape (Go) or URLEncoder followed by a +-to-%20 replacement (Java).
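Python exposes the same two conventions as separate functions, which makes the difference easy to see. A minimal sketch using urllib.parse:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

path_form = quote("hello world")        # path convention: space becomes %20
query_form = quote_plus("hello world")  # form/query convention: space becomes +

# Decoding depends on the convention too: "+" is a literal plus sign
# under path rules and a space under form/query rules.
as_path = unquote("a+b")
as_query = unquote_plus("a+b")

print(path_form)   # hello%20world
print(query_form)  # hello+world
print(as_path)     # a+b
print(as_query)    # a b
```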
