URL Encoding Complete Guide: Percent-Encoding, Reserved Characters & Best Practices
Comprehensive Guide | 18 min read | December 15, 2024

Rajesh Kumar
Senior Web Standards Architect

1. What is URL Encoding?

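URL encoding, also known as percent-encoding, is a mechanism for representing arbitrary data inside a URI. Any byte that may not appear literally in a given URI component is replaced with "%" followed by the two hexadecimal digits of that byte's value. A space (byte 0x20), for example, becomes %20.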

Official Definition (RFC 3986, Section 2.1)

"A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component."

Reserved characters: 18
Encoding pattern: %XX
Current standard: RFC 3986
ASCII characters: 128

2. Why URL Encoding is Essential

URLs may contain only a limited subset of ASCII characters. Without encoding, spaces and special characters can break the URL's structure, open security holes, or corrupt data in transit.

โŒ Without Encoding: /search?q=hello world & co

Breaks at space and &

โœ… With Encoding: /search?q=hello%20world%20%26%20co

Safely transmitted

3. Reserved Characters in URLs

Reserved characters have special meaning in URIs. When they need to be used as data (not as delimiters), they must be percent-encoded.

Character   Purpose              Encoded Form
:           Scheme separator     %3A
/           Path separator       %2F
?           Query delimiter      %3F
#           Fragment delimiter   %23
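
To illustrate the difference between a delimiter and data, here is a minimal JavaScript sketch. The file name, the example.com host, and the /files/ path are assumptions made up for this example. It shows that once the reserved characters are percent-encoded, the URL parser no longer treats them as path, query, or fragment delimiters.

```javascript
// Hypothetical value that contains reserved characters as data.
const fileName = "reports/2024?final#v2";

// encodeURIComponent() escapes reserved characters, so they can no longer
// be mistaken for path, query, or fragment delimiters.
const segment = encodeURIComponent(fileName);

// example.com and the /files/ path are placeholders for illustration.
const url = new URL(`https://example.com/files/${segment}`);

console.log(segment);      // reports%2F2024%3Ffinal%23v2
console.log(url.pathname); // /files/reports%2F2024%3Ffinal%23v2
console.log(url.search === "" && url.hash === ""); // true: no query or fragment was created
```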

4. Unreserved Characters

These characters never need encoding, in any URI component.

  • Alphabetic (A-Z, a-z)
  • Digits (0-9)
  • Hyphen (-)
  • Period (.)
  • Underscore (_)
  • Tilde (~)

Safe characters: A-Z a-z 0-9 - . _ ~
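
The unreserved set is small enough to check with a single regular expression. The sketch below uses a hypothetical helper, isUnreserved, which is not part of any library, and confirms that the built-in encodeURIComponent() leaves these characters untouched.

```javascript
// Matches exactly one unreserved character from the list above.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: true if the single character never needs encoding.
function isUnreserved(ch) {
  return UNRESERVED.test(ch);
}

const sample = "AZaz09-._~";
console.log([...sample].every(isUnreserved));       // true
console.log(encodeURIComponent(sample) === sample); // true: nothing was changed
console.log(isUnreserved(" "), isUnreserved("&"));  // false false
```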

5. Percent-Encoding

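// Encoding formula (applied to each byte)

encoded_byte = '%' + byte.toString(16).toUpperCase().padStart(2, '0')

Step 1: Get the byte value(s): the ASCII code, or the UTF-8 bytes for a non-ASCII character
Step 2: Convert each byte to hexadecimal
Step 3: Pad to two digits with a leading zero
Step 4: Prefix each byte with %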
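
The four steps can be sketched in a few lines of JavaScript. The function name percentEncodeAll is hypothetical, and unlike the built-in functions it encodes every byte, unreserved characters included, so it is meant only to show the mechanics, not to replace encodeURIComponent().

```javascript
// Percent-encode every byte of a string, unreserved characters included.
// percentEncodeAll is a hypothetical name used only for this illustration.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // Step 1: UTF-8 byte values
  return Array.from(bytes, (byte) =>
    "%" +                                       // Step 4: add the % prefix
    byte.toString(16)                           // Step 2: convert to hex
      .toUpperCase()
      .padStart(2, "0")                         // Step 3: pad to two digits
  ).join("");
}

console.log(percentEncodeAll(" "));      // %20
console.log(percentEncodeAll("\n"));     // %0A (padding matters here)
console.log(percentEncodeAll("\u00e9")); // %C3%A9 (é is two UTF-8 bytes)
```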

6. RFC Standards

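RFC 1738 (1994)
  • US-ASCII only
  • Space = %20; the "+" form comes from HTML form encoding (application/x-www-form-urlencoded), not from this RFC
  • Older URL standard, updated by RFC 3986
RFC 3986 (2005)
  • Non-ASCII text is percent-encoded from its UTF-8 bytes
  • Space = %20
  • Current standard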
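
The two space conventions seen in practice, %20 and +, can be compared directly in JavaScript. This is a small sketch using only built-in APIs; the parameter name q is an arbitrary choice for the example.

```javascript
const value = "hello world & co";

// Percent-encoding as in RFC 3986: a space becomes %20.
const percentEncoded = encodeURIComponent(value);

// Form style (application/x-www-form-urlencoded): a space becomes "+".
const formEncoded = new URLSearchParams({ q: value }).toString();

console.log(percentEncoded); // hello%20world%20%26%20co
console.log(formEncoded);    // q=hello+world+%26+co

// Only a form-aware decoder turns "+" back into a space.
console.log(new URLSearchParams(formEncoded).get("q")); // hello world & co
console.log(decodeURIComponent("hello+world"));         // hello+world
```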

8. Security Implications of URL Encoding

โš ๏ธ Warning

โš ๏ธ Common Security Issues

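  • Double Encoding Attacks: %2527 decodes to %27, and a second decoding pass turns that into ', so a filter that inspects the value after only one pass misses the quote
  • SQL Injection: %27 decodes to ', which can break out of a string literal in a SQL query built by concatenation
  • XSS Attacks: %3Cscript%3E decodes to <script>
  • Open Redirects: %2F%2Fevil.com decodes to //evil.com, a protocol-relative URL that points at another host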
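
The double-encoding bypass is easiest to see in code. The filter below, naiveFilterAccepts, is a deliberately weak hypothetical check written for this illustration, not a recommended pattern: it decodes once and looks for a quote, which is exactly the step a double-encoded payload slips past.

```javascript
// Hypothetical naive filter: rejects input only if a quote is visible
// after a single decoding pass.
function naiveFilterAccepts(rawValue) {
  return !decodeURIComponent(rawValue).includes("'");
}

const single = "%27";   // ' encoded once
const double = "%2527"; // ' encoded twice (%25 is the encoding of %)

console.log(naiveFilterAccepts(single)); // false: the quote is caught
console.log(naiveFilterAccepts(double)); // true: the filter sees only "%27"

// If a later layer decodes again, the quote reappears.
const afterSecondPass = decodeURIComponent(decodeURIComponent(double));
console.log(afterSecondPass); // '
```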

✅ Mitigation Strategies

  • Validate and normalize URLs after decoding
  • Use allow-lists for domains and paths
  • Canonicalize input (decode once, then validate)
  • Never trust user input
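
One way to combine these strategies for a redirect parameter is sketched below. The allow-list contents, the base URL, the function name isAllowedRedirect, and the policy of rejecting any value that still contains escapes after one decoding pass are all assumptions made for this example; a real application may need a looser or stricter policy.

```javascript
// Hypothetical allow-list and base URL, used only for this illustration.
const ALLOWED_HOSTS = new Set(["example.com"]);
const BASE = "https://example.com";

// rawTarget is the still-encoded value of a redirect parameter.
function isAllowedRedirect(rawTarget) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawTarget); // decode exactly once
  } catch {
    return false; // malformed percent-encoding
  }
  // Assumed strict policy: leftover escapes suggest double encoding.
  if (/%[0-9A-Fa-f]{2}/.test(decoded)) return false;

  let url;
  try {
    url = new URL(decoded, BASE); // normalize relative and absolute forms
  } catch {
    return false;
  }
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname);
}

console.log(isAllowedRedirect("%2Fdashboard"));       // true
console.log(isAllowedRedirect("%2F%2Fevil.com"));     // false
console.log(isAllowedRedirect("%252F%252Fevil.com")); // false
```

Validation happens on the parsed URL rather than on the raw string, so //evil.com is judged by the host it actually resolves to.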

9. Common Encoding Mistakes & Fixes

โŒ Mistake 1: Double Encoding Input: "hello world" โ†’ encodeURI() โ†’ "hello%20world"
encodeURI() again โ†’ "hello%2520world" โ† WRONG!
โœ… Fix: Only encode once
โŒ Mistake 2: Using + for spaces "hello world" โ†’ "hello+world" โ† Deprecated!
โœ… Fix: Use %20
โŒ Mistake 3: Not encoding query params url = "https://api.com/search?q=" + userInput
// userInput = "hello&evil=param"
// URL breaks
โœ… Fix: Use encodeURIComponent()
โŒ Mistake 4: Encoding full URL encodeURIComponent("https://example.com")
โ†’ "https%3A%2F%2Fexample.com" โ† WRONG!
โœ… Fix: Encode only query/path
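
As a hedged sketch of the fix for Mistake 3, the built-in URL API can take over the encoding of query values so that no string concatenation is needed. The endpoint is the one used in the example above; the last line reproduces Mistake 1 for comparison.

```javascript
// Input taken from Mistake 3 above.
const userInput = "hello&evil=param";

// Let the URL API encode the query value instead of concatenating strings.
const url = new URL("https://api.com/search");
url.searchParams.set("q", userInput);

console.log(url.toString());
// https://api.com/search?q=hello%26evil%3Dparam

// The server sees one parameter whose value is the original input.
console.log([...url.searchParams.keys()]);            // [ 'q' ]
console.log(url.searchParams.get("q") === userInput); // true

// Mistake 1 reproduced: encoding an already encoded string.
console.log(encodeURI(encodeURI("hello world"))); // hello%2520world
```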

10. Programming Examples

// JavaScript

// encodeURIComponent() is built in; wrapping it in a function of the
// same name would shadow the built-in and recurse forever.
const encoded = encodeURIComponent("hello world & co");
// "hello%20world%20%26%20co"
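
JavaScript also ships encodeURI(), and the difference between the two functions is a frequent source of confusion. The following sketch compares them on the same input; the /login path and the next parameter are hypothetical names chosen for the example.

```javascript
const fullUrl = "https://example.com/my docs?q=a b";

// encodeURI() keeps delimiters such as : / ? & = intact.
console.log(encodeURI(fullUrl));
// https://example.com/my%20docs?q=a%20b

// encodeURIComponent() escapes them, which suits a single value...
const asValue = encodeURIComponent(fullUrl);
console.log(asValue);
// https%3A%2F%2Fexample.com%2Fmy%20docs%3Fq%3Da%20b

// ...for example when one URL is passed as a parameter of another.
// The /login path and "next" parameter are hypothetical.
console.log(`https://example.com/login?next=${asValue}`);

// Decoding restores the original string.
console.log(decodeURIComponent(asValue) === fullUrl); // true
```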

// Python

from urllib.parse import quote, unquote

encoded = quote("hello world & co")
# "hello%20world%20%26%20co"
decoded = unquote(encoded)
# "hello world & co"

// Java

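import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// The Charset overload requires Java 10 or later
String encoded = URLEncoder.encode("hello world & co", StandardCharsets.UTF_8);
// "hello+world+%26+co" (Note: + for spaces, i.e. form encoding)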

// PHP

$encoded = urlencode("hello world & co");
// "hello+world+%26+co"

$encoded = rawurlencode("hello world & co");
// "hello%20world%20%26%20co" (RFC 3986 compliant)

// C# / .NET

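using System.Web;

string encoded = HttpUtility.UrlEncode("hello world & co");
// "hello+world+%26+co"

// For %20 instead of +, use Uri.EscapeDataString (in the System namespace)
string rfcEncoded = Uri.EscapeDataString("hello world & co");
// "hello%20world%20%26%20co"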

// Ruby

require 'cgi'

encoded = CGI.escape("hello world & co")
# "hello+world+%26+co"

11. Production Best Practices

✅ DO
  • Always encode user input before placing it in a URL
  • Use UTF-8 for non-ASCII characters
  • Normalize URLs before validation
  • Use built-in library functions
  • Decode exactly once, at the point where the value is consumed
  • Use encodeURIComponent() for query params
❌ DON'T
  • Don't encode already encoded strings
  • Don't write custom encoding logic
  • Don't mix space conventions (%20 and +) within one component
  • Don't validate input before it has been decoded
  • Don't trust encoded input
  • Don't percent-encode the scheme and delimiters of a full URL

12. Tools & Reference Tables

Character   Decimal   Hex   Encoded   Safe unencoded?
space       32        20    %20       No
!           33        21    %21       No (reserved in RFC 3986)
#           35        23    %23       No
&           38        26    %26       No
=           61        3D    %3D       No
_           95        5F    _         Yes


Rajesh is a web standards expert with 14+ years of experience at W3C and IETF. He has contributed to RFC 3986 and teaches URL encoding at industry conferences worldwide.
