Base64 is one of those bits of plumbing that quietly shows up in every part of the web — embedded images in CSS, JSON Web Tokens, email attachments, OAuth flows, data URLs in HTML. It looks like gibberish, behaves like text, and grows your payload by about a third. If you've ever wondered why it exists and when to reach for it, this is the long answer.
What Base64 actually is
Base64 is an encoding, not encryption. It takes a sequence of bytes (any bytes: image data, a UTF-8 string, raw binary) and represents them using only 64 ASCII characters: A-Z, a-z, 0-9, +, and /. The trailing = you sometimes see is padding.
The recipe is mechanical:
- Take the input bytes three at a time. Three bytes = 24 bits.
- Slice those 24 bits into four 6-bit chunks.
- Each 6-bit chunk has a value from 0 to 63 — look it up in the Base64 alphabet.
That's the whole thing. Three input bytes become four output characters. Always.
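Here is a minimal sketch of that recipe in Python, written from scratch for illustration only. The name encode_base64 is made up for this example, and it also handles a short final group with = padding so it can be checked against the standard library's base64.b64encode. In real code you would call the library directly.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_base64(data: bytes) -> str:
    """Encode bytes with the standard alphabet (illustrative, not optimized)."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Pack up to three bytes into a 24-bit integer, zero-filling on the right.
        bits = int.from_bytes(group.ljust(3, b"\x00"), "big")
        # Slice the 24 bits into four 6-bit values, most significant first.
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A short final group yields len(group) + 1 real characters; the rest is '='.
        keep = len(group) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

# Cross-check against the standard library on a few inputs.
for sample in (b"", b"A", b"Ca", b"Cat", bytes(range(256))):
    assert encode_base64(sample) == base64.b64encode(sample).decode("ascii")
```

The shifts 18, 12, 6, and 0 are the whole trick: each one moves a different 6-bit slice into the low bits, and the 0x3F mask keeps only those six.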
Why four characters per three bytes — and where the 33% comes from
Three bytes carry 24 bits of information. Four Base64 characters carry the same 24 bits, but each character costs a full byte to store (because ASCII text is one byte per character). So you've turned 3 bytes of input into 4 bytes of output. That's a 33% increase. The math doesn't care whether the input was an image, a string, or a binary blob: the ratio is constant.
If the input isn't a clean multiple of three, the encoder pads. Two leftover bytes become three Base64 characters plus one =. One leftover byte becomes two Base64 characters plus ==. The padding lets a decoder reconstruct the exact original byte count.
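The arithmetic can be checked directly. This is a small sketch, with encoded_length and padding_count as hypothetical helper names, that computes the padded output size and the number of = characters for an input of n bytes and compares both against Python's base64 module.

```python
import base64

def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

def padding_count(n: int) -> int:
    """Number of '=' characters appended for n input bytes."""
    return (3 - n % 3) % 3

# Compare the formulas with what the standard library actually emits.
for n in range(0, 10):
    encoded = base64.b64encode(bytes(n))  # bytes(n) is n zero bytes
    assert len(encoded) == encoded_length(n)
    assert encoded.count(b"=") == padding_count(n)
```

Because of the rounding up, very small inputs pay more than a third: a single byte still costs four characters. The overhead settles at 4/3 as the input grows.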
Why it exists at all
Base64 solves one specific problem: getting binary data through systems that only handle text safely. That's a real constraint in more places than you'd think.
- Email (MIME): SMTP was designed for 7-bit ASCII. Attachments are Base64-encoded so they survive every relay along the way.
- JSON: JSON has no native binary type. To embed an image, a certificate, or any raw bytes, you Base64 them and store the string.
- Data URLs: data:image/png;base64,... lets a browser render an inline image without a separate HTTP request.
- JSON Web Tokens: A JWT is three Base64URL-encoded segments separated by dots. The header, payload, and signature are each binary in spirit but travel as text.
- HTTP headers: Basic Auth credentials, X.509 fingerprints, and a long tail of other headers use Base64 because the header line is a text channel.
The common thread: you have a transport that's safe for printable ASCII but not for arbitrary bytes, and you need bytes to get through.
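As one concrete case from the list above, here is a sketch of carrying raw bytes through JSON. The field names "name" and "data" are invented for the example, and the payload is just the 8-byte PNG file signature rather than a full image.

```python
import base64
import json

# The 8-byte signature that starts every PNG file; stands in for real binary data.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# json.dumps cannot serialize bytes, so the bytes travel as a Base64 string.
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"name": "logo.png", "data": encoded})

# The consumer reverses the step and gets the exact original bytes back.
restored = base64.b64decode(json.loads(document)["data"])
assert restored == raw
```

The same pattern applies to the other text channels: encode at the boundary, ship the string, decode on the far side.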
Base64 vs Base64URL
Standard Base64 uses + and /. Both characters mean something in a URL: + is decoded as a space in form-encoded query strings, and / is a path separator. So the Base64 specification (RFC 4648) defines a URL-safe variant where:
- + becomes -
- / becomes _
- Padding = is often dropped (decoder infers length)
If you're using Base64 in a query string, a path segment, or a JWT, you almost certainly want Base64URL. Our Base64 Encoder & Decoder supports both — and the URL Encoder is where to go if you're escaping non-Base64 strings for URLs.
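In Python, the URL-safe alphabet is available as base64.urlsafe_b64encode and base64.urlsafe_b64decode. Those functions keep the padding, so the sketch below strips it on encode and restores it on decode; the wrapper names b64url_encode and b64url_decode are hypothetical.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """Base64URL without padding, the form JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Decode Base64URL, restoring any padding the encoder dropped."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

# These three bytes map to the 6-bit values 62, 63, 63, 62,
# which are exactly the characters that differ between the alphabets.
raw = bytes([0xFB, 0xFF, 0xFE])
assert base64.b64encode(raw) == b"+//+"
assert b64url_encode(raw) == "-__-"
assert b64url_decode("-__-") == raw
```

The expression -len(text) % 4 gives the number of = characters needed to bring the length back to a multiple of four, which is what the standard decoder expects.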
When Base64 is the wrong tool
Base64 is overused. Three patterns to push back on:
Storing binary in a database. Most databases have a binary type (Postgres bytea, MySQL BLOB). Storing the same data as Base64 costs about 33% more disk, makes equality checks and indexes operate on the encoded text instead of the raw bytes, and forces every client to decode. Use the binary column.
"Encrypting" anything. Base64 is an encoding, fully reversible without a key. It is not a hash, not a cipher, not a signature. If you need confidentiality, use real crypto (AES-GCM, age, libsodium). If you need integrity, use a MAC or a signature (HMAC, Ed25519); a bare hash only catches accidental corruption. If you need a token that's hard to guess, use a CSPRNG.
Inlining every image in HTML emails or web pages. Data URLs skip an HTTP request but get bigger by 33%, can't be cached separately, bloat the HTML, and have to be decoded again on every page load. They make sense for tiny icons (sub-1KB), maybe a few SVGs above the fold. For anything larger, serve a real file.
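On the "encrypting" point, a short sketch makes the difference visible. The credential string is a made-up placeholder; secrets.token_urlsafe is Python's standard-library helper for drawing random bytes from a CSPRNG and returning them as Base64URL text.

```python
import base64
import secrets

# Anyone holding the string can reverse it; there is no key involved.
encoded = base64.b64encode(b"user:hunter2").decode("ascii")
assert base64.b64decode(encoded) == b"user:hunter2"

# For a hard-to-guess token, the randomness has to come from a CSPRNG.
# Here Base64URL is only the text representation of 32 random bytes.
token = secrets.token_urlsafe(32)
```

In the second case Base64 is doing its actual job, making bytes printable. The unpredictability comes entirely from the random source.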
Common bugs
A few classes of bug that crop up in code that deals with Base64:
- UTF-8 round-trip errors. btoa("✓") throws in browsers because btoa expects each character to be in the Latin-1 range. btoa("résumé") does not throw, since é is inside that range, but it encodes the Latin-1 bytes rather than the UTF-8 bytes, so a decoder that expects UTF-8 gets mojibake. The fix is to encode the string as UTF-8 bytes first (TextEncoder) and then Base64 those bytes. On decode, do the inverse: Base64-decode, then TextDecoder.
- Padding mismatches. Some libraries emit padding (=, ==), others don't. JWTs strip padding by convention; some Base64URL encoders don't. A decoder that's strict about padding will reject input that another encoder produced. If you control both ends, pick one rule.
- Whitespace inside Base64. PEM-encoded keys wrap Base64 at 64 characters per line, and email MIME bodies wrap it at 76. Strict decoders reject the embedded newlines. Most permissive decoders strip whitespace silently. If you're writing your own, strip first.
- Treating Base64 as opaque. A leading eyJ is almost always the Base64 encoding of {", the start of a JSON object. That's a quick way to spot a JWT or a config blob without running a decoder.
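The same bug classes show up outside the browser. The sketch below uses Python rather than btoa, so it illustrates the ideas and not the browser API itself; decode_lenient and looks_like_json_object are hypothetical helper names.

```python
import base64

text = "résumé"

# Same string, different bytes, different Base64. Both ends must agree on UTF-8.
as_latin1 = base64.b64encode(text.encode("latin-1")).decode("ascii")
as_utf8 = base64.b64encode(text.encode("utf-8")).decode("ascii")
assert as_latin1 != as_utf8
assert base64.b64decode(as_utf8).decode("utf-8") == text

def decode_lenient(encoded: str) -> bytes:
    """Strip whitespace, restore missing padding, then decode strictly."""
    compact = "".join(encoded.split())
    compact += "=" * (-len(compact) % 4)
    # validate=True rejects any character outside the standard alphabet.
    return base64.b64decode(compact, validate=True)

def looks_like_json_object(encoded: str) -> bool:
    """Heuristic only: '{"' followed by a letter encodes to a string starting 'eyJ'."""
    return encoded.startswith("eyJ")

assert decode_lenient("Q2\nF0") == b"Cat"  # line-wrapped input
assert decode_lenient("QQ") == b"A"        # padding dropped by the encoder
assert looks_like_json_object(base64.b64encode(b'{"alg":"HS256"}').decode("ascii"))
```

decode_lenient assumes the standard alphabet. For Base64URL input you would swap in base64.urlsafe_b64decode, which does not take a validate argument.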
A worked example
Take the string Cat. Three bytes — exactly one group, no padding.
ASCII:    C        a        t
Decimal:  67       97       116
Binary:   01000011 01100001 01110100
Regroup:  010000 110110 000101 110100
Decimal:  16     54     5      52
Base64:   Q      2      F      0
Cat becomes Q2F0. Three bytes in, four characters out, no padding needed.
For one-byte input A:
Binary:   01000001 (pad with zeros to fill 12 bits)
Regroup:  010000 010000
Base64:   Q      Q      ==
Two characters, two = of padding. The total is still four: padded Base64 always emits a multiple of four characters.
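To reproduce the regrouping step by machine, here is a small sketch that returns the 6-bit values before the alphabet lookup. The function name sextets is made up for this example, and the list[int] annotation assumes Python 3.9 or later.

```python
import base64

def sextets(data: bytes) -> list[int]:
    """Return the 6-bit values Base64 derives from data, before alphabet lookup."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Zero-fill on the right so the bit string divides evenly into groups of six.
    bits += "0" * (-len(bits) % 6)
    return [int(bits[i:i + 6], 2) for i in range(0, len(bits), 6)]

# The two examples worked by hand above.
assert sextets(b"Cat") == [16, 54, 5, 52]
assert sextets(b"A") == [16, 16]

# The standard library agrees on the final characters.
assert base64.b64encode(b"Cat") == b"Q2F0"
assert base64.b64encode(b"A") == b"QQ=="
```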
When to reach for Base64
Use it when you need to put bytes into a text channel and you control the consumer. Use Base64URL specifically for anything that touches a URL, a query string, or a JWT. Don't use it as a security measure — it doesn't hide anything from anyone looking. And don't use it as a default for binary storage when you have a real binary column.
For everything else, the encoder is on this site — paste your input, get the output, move on with your day.