Text to Binary Converter
Convert text to binary and binary back to text — ASCII and UTF-8, multiple delimiters, all in your browser
Encoding
Delimiter
Bit Grouping
Type or paste text above to convert it to binary. Each character is encoded as a sequence of 0s and 1s based on the selected encoding.
You’re teaching a CS fundamentals class and need to show students how the letter A becomes 01000001 in binary, how Hello becomes 01001000 01100101 01101100 01101100 01101111, and how UTF-8 encodes é as two bytes 11000011 10101001. You need a bidirectional converter that shows the encoding process clearly.
Why This Converter (Not the Number Base Converter)
PureDevTools has a Number Base Converter for converting numbers between bases (binary, octal, hex). This tool converts text to binary and binary to text — supporting ASCII and UTF-8 encoding, multiple delimiters (space, comma, none), and 7-bit or 8-bit grouping. Use the number converter for numeric base conversion; use this tool for text-to-binary character encoding.
What Is Binary Encoding?
Binary encoding is the process of representing text as a sequence of 0s and 1s — the two states that underlie all digital computing. Every character stored or transmitted by a computer is ultimately represented as a series of binary digits (bits).
In binary encoding, each character maps to a numeric value (its code point), and that number is expressed in base-2 (binary) notation. For example, the letter A has the ASCII value 65, which in binary is 01000001.
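The mapping above can be sketched in a couple of lines of Python (the helper name is ours, not part of the tool):

```python
def char_to_binary(ch: str, bits: int = 8) -> str:
    """Return the code point of `ch` as a zero-padded base-2 string."""
    return format(ord(ch), f"0{bits}b")

print(char_to_binary("A"))  # 01000001
```

`ord` looks up the code point (65 for A) and the `b` format specifier renders it in base 2, padded with leading zeros to the chosen width.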
Understanding binary encoding is fundamental to:
- Low-level programming — reading memory dumps, debugging binary protocols, working with bitfields and flags
- Network engineering — inspecting packet headers, understanding IP addresses and subnet masks
- Embedded systems — manipulating hardware registers, configuring GPIO pins
- Cryptography and security — analyzing binary data, understanding cipher transformations
- Computer science education — teaching number systems, data representation, and character encoding
ASCII vs UTF-8 Encoding
This tool supports two encoding modes that determine how text characters are converted to bytes before the binary transformation.
ASCII (7-bit encoding)
ASCII (American Standard Code for Information Interchange) assigns values 0–127 to 128 characters: the 26 uppercase letters, 26 lowercase letters, digits 0–9, punctuation marks, and 33 control characters.
In ASCII mode, each character maps directly to a single byte (0–127). This is the simplest encoding and works correctly for standard English text.
A → 65 → 01000001
B → 66 → 01000010
! → 33 → 00100001
Limitation: ASCII cannot represent characters outside the Basic Latin block. Accented characters, CJK ideographs, emoji, and most non-English scripts are not supported. Characters with code points above 127 are handled by taking only the lowest byte.
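A minimal sketch of ASCII-mode encoding, including the lowest-byte behavior described above (function name is ours for illustration):

```python
def ascii_to_binary(text: str) -> str:
    """Encode each character as one 8-bit group; code points above 127
    are masked to their lowest byte, as described above."""
    return " ".join(format(ord(ch) & 0xFF, "08b") for ch in text)

print(ascii_to_binary("AB!"))  # 01000001 01000010 00100001
```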
UTF-8 (variable-length encoding)
UTF-8 is the dominant encoding for text on the web and in modern software. It is a variable-length encoding that can represent every code point in the Unicode standard (over 1.1 million characters).
- Characters in the ASCII range (0–127) encode as a single byte — UTF-8 is a superset of ASCII.
- Characters from code point 128 to 2047 encode as 2 bytes.
- Characters from 2048 to 65535 (the rest of the Basic Multilingual Plane, including most CJK characters) encode as 3 bytes.
- Characters above 65535 (supplementary planes, including most emoji) encode as 4 bytes.
A → 0x41 → 01000001
é → 0xC3 0xA9 → 11000011 10101001
中 → 0xE4 0xB8 0xAD → 11100100 10111000 10101101
🚀 → 0xF0 0x9F 0x9A 0x80 → 11110000 10011111 10011010 10000000
Use UTF-8 when working with any text that may contain non-ASCII characters. Use ASCII when you need a simple, one-byte-per-character mapping and your input is limited to standard English text.
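In Python, the standard library's UTF-8 encoder produces exactly the byte sequences shown above, so a UTF-8-mode conversion can be sketched as (helper name is ours):

```python
def utf8_to_binary(text: str) -> str:
    """Encode text as UTF-8, then render each byte as an 8-bit group."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))

print(utf8_to_binary("é"))  # 11000011 10101001
```

Note how the two-byte sequence for é begins with `110` (a 2-byte lead byte) followed by `10` (a continuation byte).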
Delimiter Options
The delimiter separates the binary groups in the output, making it easier to identify individual byte boundaries.
| Delimiter | Example output | Best used when |
|---|---|---|
| Space | 01000001 01000010 | Human-readable output, debugging |
| None | 0100000101000010 | Compact storage, streaming protocols |
| Dash | 01000001-01000010 | Alternative separator where spaces are not allowed |
| Comma | 01000001,01000010 | CSV output, array initialization |
When decoding binary back to text, the delimiter setting must match the format of the input. If your binary string uses spaces, select the Space delimiter. If there are no delimiters, select None and ensure the bit grouping matches the group size used during encoding.
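The delimiter logic can be sketched as follows — joining groups is a plain string join, and splitting with the None delimiter falls back to fixed-size chunks (these helpers are illustrative, not the tool's actual code):

```python
def join_groups(groups: list[str], delimiter: str) -> str:
    """Join binary groups with the chosen delimiter ('' for None)."""
    return delimiter.join(groups)

def split_groups(binary: str, delimiter: str, bits: int = 8) -> list[str]:
    """Split back into groups; with no delimiter, cut fixed-size chunks."""
    if delimiter:
        return binary.split(delimiter)
    return [binary[i:i + bits] for i in range(0, len(binary), bits)]

packed = join_groups(["01000001", "01000010"], "")  # 0100000101000010
print(split_groups(packed, "", bits=8))             # ['01000001', '01000010']
```

This is why the bit grouping must match when decoding undelimited input: the chunk size is the only clue to where one byte ends and the next begins.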
Bit Grouping (8-bit vs 7-bit)
The bit grouping controls how many binary digits are shown per byte.
8-bit grouping (default)
Each byte is displayed as exactly 8 binary digits (an octet). This is the standard representation in most contexts — a byte is 8 bits on virtually all modern systems.
H → 01001000
e → 01100101
7-bit grouping
Each byte is displayed as 7 binary digits. This is useful when working with 7-bit ASCII data (values 0–127), which is common in:
- Serial communication — early UART configurations using 7 data bits
- Teletext and ANSI/TIA standards — some legacy systems use 7-bit frames
- MIME quoted-printable — an encoding designed to keep message bodies within the 7-bit ASCII range for legacy mail transports
H → 1001000 (72 in 7 bits)
e → 1100101 (101 in 7 bits)
Note: characters with byte values above 127 (as in multibyte UTF-8 sequences) require more than 7 bits and will be displayed with additional digits when using 7-bit grouping.
How Binary to Text Decoding Works
When converting binary back to text:
- The binary string is split into groups using the selected delimiter (or split into fixed-size chunks for the “None” delimiter).
- Each group of binary digits is parsed as a base-2 integer to produce a byte value (0–255).
- The resulting byte sequence is decoded as UTF-8 or ASCII to produce the final text.
For UTF-8 decoding, the byte sequence must form valid UTF-8. Truncated multibyte sequences or invalid continuation bytes will result in a decoding error.
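The three decoding steps above can be sketched end to end; Python's `bytes.decode` raises on invalid UTF-8, mirroring the decoding error described (the function name and signature are ours):

```python
def binary_to_text(binary: str, delimiter: str = " ", bits: int = 8,
                   encoding: str = "utf-8") -> str:
    """Split, parse each group as base-2, then decode the byte sequence.
    Raises ValueError for non-binary groups and UnicodeDecodeError for
    invalid UTF-8 byte sequences."""
    if delimiter:
        groups = binary.split(delimiter)
    else:
        groups = [binary[i:i + bits] for i in range(0, len(binary), bits)]
    data = bytes(int(group, 2) for group in groups)
    return data.decode(encoding)

print(binary_to_text("01001000 01100101"))  # He
```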
Common Use Cases for Developers
Debugging binary protocols: When inspecting network packets, serial frames, or binary file formats, viewing data in binary reveals individual bit states that hex notation obscures.
Bitfield and flag visualization: Registers, permission masks, and status flags are often described at the bit level. Converting a byte value to binary makes it trivial to see which flags are set.
Teaching number systems: The binary representation helps learners understand the relationship between decimal, hexadecimal, and binary numbering systems.
Encoding verification: After encoding a string in UTF-8, inspecting the binary representation shows the actual byte structure — including the leading bits that identify single-byte, 2-byte, 3-byte, and 4-byte sequences.
CRC and checksum analysis: Some checksums operate at the bit level; binary representation makes bit patterns visible for manual verification.
Frequently Asked Questions
How many bits represent one character?
It depends on the encoding and the character. In ASCII mode, every character is one byte, shown as 8 bits (or 7 bits with 7-bit grouping). In UTF-8, ASCII characters (0–127) are 8 bits, while non-ASCII characters use 16, 24, or 32 bits (2, 3, or 4 bytes). For example, a common emoji like 🚀 requires 32 bits (4 bytes) in UTF-8.
Why are there leading zeros in the binary output?
Binary groups are padded with leading zeros to reach the selected bit width (8-bit or 7-bit). This padding makes it easy to identify byte boundaries and compare values. Without padding, a space character (binary 100000) would look like a 6-digit number instead of the expected 8-digit 00100000.
What is the difference between binary and hexadecimal?
Both are ways to represent the same byte values. Binary uses base 2 (only 0 and 1), while hexadecimal uses base 16 (0–9 and A–F). Each hexadecimal digit represents exactly 4 binary digits (a nibble), so 0xFF in hex is 11111111 in binary. Hex is more compact; binary shows every individual bit.
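The nibble correspondence is easy to check in Python — each hex digit expands to exactly four binary digits:

```python
# Each hexadecimal digit maps to one 4-bit nibble.
print(format(0xFF, "08b"))  # 11111111  (F -> 1111, F -> 1111)
print(format(0x4A, "08b"))  # 01001010  (4 -> 0100, A -> 1010)
```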
Can this tool handle emoji and special characters?
Yes, when using UTF-8 encoding. Emoji and other characters outside the Basic Multilingual Plane are encoded as 4-byte UTF-8 sequences and displayed as four 8-digit groups (32 bits total). ASCII mode handles only characters with code points 0–127.
Is my text sent to a server?
No. All conversion happens entirely in your browser. No text is transmitted to any server, and no data is stored or logged.