PureDevTools

Punycode Converter

Convert internationalized domain names between Unicode and Punycode (xn--)

All processing happens in your browser. No data is sent to any server.

Your company is registering a Japanese domain name, 例え.jp, and the registrar shows xn--r8jz45g.jp in the confirmation. Are those the same domain? Or you're investigating a phishing email where the link appears to go to apple.com but the URL bar shows xn--pple-43d.com. In both cases you need to decode and compare Punycode to see which domain you are actually dealing with.

What Is Punycode?

Punycode is an encoding system defined in RFC 3492 that represents Unicode characters using the limited ASCII character set allowed in domain names. It is the foundation of Internationalized Domain Names (IDN) — the system that allows domain names to contain non-ASCII characters like Chinese, Arabic, or accented Latin characters.

The Domain Name System (DNS) only supports ASCII labels — letters a–z, digits 0–9, and hyphens. To support domain names in other scripts, the IDN standard (RFC 5891) uses Punycode to encode Unicode labels into ASCII-Compatible Encoding (ACE), prefixed with xn--:

Unicode:  münchen.de
Punycode: xn--mnchen-3ya.de

Unicode:  中文.com
Punycode: xn--fiq228c.com

Each label (the part between dots) is encoded independently. If a label is already pure ASCII, it remains unchanged.
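The per-label rule can be sketched in Python with the standard library's punycode codec. The names to_ace and to_unicode are illustrative and not part of this tool, and the sketch deliberately skips the normalization and validity checks that a full IDNA implementation performs:

```python
ACE_PREFIX = "xn--"

def to_ace(domain: str) -> str:
    """Encode each non-ASCII label separately; ASCII labels pass through."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append(ACE_PREFIX + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

def to_unicode(domain: str) -> str:
    """Decode each xn-- label separately; other labels pass through."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith(ACE_PREFIX):
            labels.append(label[len(ACE_PREFIX):].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)

print(to_ace("münchen.de"))           # xn--mnchen-3ya.de
print(to_unicode("xn--fiq228c.com"))  # 中文.com
```

Because only the labels that need it are touched, the .de and .com parts come out unchanged.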

How Punycode Encoding Works

Punycode uses an algorithm called Bootstring. It keeps the output compact by leaving ASCII characters readable and encoding only the non-ASCII characters, which suits the many IDN labels that mix the two:

  1. Copy the ASCII characters: All ASCII characters in the label are copied to the output first, in their original order.
  2. Add a delimiter: If the label contained any ASCII characters, a hyphen - follows them. When decoding, the last hyphen in the label is the delimiter.
  3. Encode the non-ASCII characters: The code points and insertion positions of the non-ASCII characters are turned into a sequence of integers, each written as a variable-length base-36 number (a–z, 0–9) with an adaptive bias.

The algorithm is deterministic: the same Unicode input always produces the same Punycode output, and decoding always recovers the exact original Unicode string.
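For readers who want to see the steps in code, here is a compact Python sketch of the encoder. The constants are the Punycode parameters from RFC 3492, and the function names are illustrative. It omits the overflow checks the RFC describes and is meant for study, not as a replacement for a tested library:

```python
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128
DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"

def adapt(delta: int, num_points: int, first_time: bool) -> int:
    """Recompute the bias after each encoded character."""
    delta = delta // DAMP if first_time else delta // 2
    delta += delta // num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)

def punycode_encode(label: str) -> str:
    # Step 1: copy the ASCII characters in their original order.
    basic = [c for c in label if ord(c) < 128]
    out = list(basic)
    handled = len(basic)
    # Step 2: add the delimiter only if there was an ASCII part.
    if basic:
        out.append("-")
    # Step 3: encode the non-ASCII characters in code point order.
    n, delta, bias = INITIAL_N, 0, INITIAL_BIAS
    while handled < len(label):
        m = min(ord(c) for c in label if ord(c) >= n)
        delta += (m - n) * (handled + 1)
        n = m
        for c in label:
            cp = ord(c)
            if cp < n:
                delta += 1
            elif cp == n:
                # Write delta as a variable-length base-36 integer.
                q, k = delta, BASE
                while True:
                    t = min(max(k - bias, TMIN), TMAX)
                    if q < t:
                        break
                    out.append(DIGITS[t + (q - t) % (BASE - t)])
                    q = (q - t) // (BASE - t)
                    k += BASE
                out.append(DIGITS[q])
                bias = adapt(delta, handled + 1, handled == len(basic))
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return "".join(out)

print(punycode_encode("münchen"))  # mnchen-3ya
print(punycode_encode("中文"))      # fiq228c
```

The result is the label without the xn-- prefix, which the IDN layer adds. Python's built-in punycode codec produces the same strings and is the better choice outside of study.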

IDN and the Homograph Attack Problem

Punycode’s existence enables IDN homograph attacks — one of the most subtle phishing techniques. Many Unicode characters look identical or nearly identical to ASCII characters:

Unicode       Looks like   Codepoint
Cyrillic а    Latin a      U+0430
Cyrillic е    Latin e      U+0435
Cyrillic о    Latin o      U+043E
Cyrillic р    Latin p      U+0440

An attacker can register a domain using Cyrillic characters that renders identically to apple.com in most fonts but resolves to a completely different xn-- domain.

Modern browsers mitigate this by displaying the Punycode form (xn--...) in the address bar when a label mixes scripts or uses a suspicious combination of characters. This tool helps you verify: paste a suspicious domain to see its true Punycode representation, or decode an xn-- domain to see which Unicode characters it actually contains.
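A rough version of such a check can be sketched in Python. It takes the first word of each letter's Unicode character name as a stand-in for its script, which is an assumption made for brevity and not how browsers decide. The heuristic also flags legitimate mixes such as Japanese kanji with kana, so treat a hit as a reason to look closer, not as proof of an attack. The function names are illustrative:

```python
import unicodedata

def decode_label(label: str) -> str:
    """Return the Unicode form of an xn-- label; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label

def scripts_in_label(label: str) -> set:
    """Approximate the scripts used by the letters in one label."""
    scripts = set()
    for ch in decode_label(label):
        if ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def has_mixed_script_label(domain: str) -> bool:
    """True if any single label combines letters from more than one script."""
    return any(len(scripts_in_label(label)) > 1 for label in domain.split("."))

print(sorted(scripts_in_label("xn--pple-43d")))    # ['CYRILLIC', 'LATIN']
print(has_mixed_script_label("xn--pple-43d.com"))  # True
print(has_mixed_script_label("apple.com"))         # False
```

Note that a name written entirely in Cyrillic look-alikes passes this test, so it only catches the mixed-script case.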

Where Punycode Appears

Domain registration: When you register an IDN through a domain registrar, the system stores the Punycode form. WHOIS lookups, DNS records, SSL certificates, and HTTP headers all use the Punycode form internally.

SSL/TLS certificates: Certificate authorities issue certificates for the Punycode form of IDN domains. If your certificate is for the Punycode version of your domain, your server must be configured to serve that exact domain.

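Email addresses: The domain part of an internationalized email address can be written in its xn-- form so that it passes through mail servers that do not support the SMTPUTF8 extension (EAI, RFC 6531). Servers that do support SMTPUTF8 can carry the Unicode form directly. The local part (before the @) has no Punycode equivalent.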

DNS configuration: When setting up DNS records (A, AAAA, CNAME, MX) for IDN domains, you must use the Punycode form in your zone file.
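As a small illustration, the following Python sketch builds a zone-file line from a Unicode name. The helper name zone_record, the TTL, and the documentation address 192.0.2.1 are placeholders chosen for the example. The built-in idna codec used here follows the older IDNA 2003 rules, so check the result against what your registrar reports:

```python
def zone_record(unicode_name: str, rtype: str, value: str, ttl: int = 3600) -> str:
    """Format one zone-file record using the ASCII (xn--) form of the name."""
    ace_name = unicode_name.encode("idna").decode("ascii")
    # The trailing dot marks the name as fully qualified.
    return f"{ace_name}. {ttl} IN {rtype} {value}"

print(zone_record("münchen.de", "A", "192.0.2.1"))
# xn--mnchen-3ya.de. 3600 IN A 192.0.2.1
```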

Web crawlers and SEO: Search engines index both forms, but the canonical URL in Google Search Console will show the Punycode form. Understanding the mapping is essential for international SEO.

IDNA 2003 vs IDNA 2008

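Two versions of the IDN standard exist, and they treat several characters differently. For example, IDNA 2003 maps the German ß to ss before encoding, while IDNA 2008 keeps ß as a distinct character, so the same typed name can end up as two different ASCII names.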

Most modern browsers and libraries use IDNA 2008, but some older systems still use IDNA 2003. This can cause subtle incompatibilities — a domain registered under IDNA 2008 rules may not resolve correctly on IDNA 2003 clients.
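The difference is easy to reproduce in Python, whose built-in idna codec implements the IDNA 2003 rules. The second function below is only a stand-in for IDNA 2008 behaviour: it encodes the label as typed without any mapping or validation, which is an assumption made to show the contrast and not a full IDNA 2008 implementation. Both function names are illustrative:

```python
def ace_idna2003(domain: str) -> str:
    """ASCII form under IDNA 2003 rules (Python's built-in idna codec)."""
    return domain.encode("idna").decode("ascii")

def ace_without_mapping(domain: str) -> str:
    """ASCII form when each label is Punycode-encoded exactly as typed."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

print(ace_idna2003("straße.de"))         # strasse.de
print(ace_without_mapping("straße.de"))  # xn--strae-oqa.de
```

The two results are different DNS names, which is why a client using one rule set can end up at a different domain than a client using the other.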

Common IDN Examples

Unicode Domain    Punycode                   Language
münchen.de        xn--mnchen-3ya.de          German
中文.com          xn--fiq228c.com            Chinese
россия.рф         xn--h1alffa9f.xn--p1ai     Russian
भारत.भारत         xn--h2brj9c.xn--h2brj9c    Hindi
日本語.jp         xn--wgv71a119e.jp          Japanese

Privacy

All conversion happens in your browser. No domain names or text are sent to any server.
