1. What is URL Encoding?
URL encoding, also known as percent-encoding, is a mechanism for encoding information in a URI. It replaces characters that may not appear literally, such as spaces, delimiters used as data, and non-ASCII characters, with a "%" followed by two hexadecimal digits for each byte.
Official Definition (RFC 3986)
"A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component."
2. Why URL Encoding is Essential
URLs may contain only a limited set of ASCII characters. Without encoding, special characters can break the URL's structure, open security vulnerabilities, or corrupt data in transit.
/search?q=hello world & co
Breaks at space and &
/search?q=hello%20world%20%26%20co
Safely transmitted
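To see the breakage concretely, here is a minimal sketch that parses both query strings with Python's standard urllib.parse module. The variable names are illustrative only, and other parsers may handle the malformed input differently.

```python
from urllib.parse import parse_qs

# Raw query string from the first example: the "&" is read as a
# parameter separator, so the value of q is cut short.
broken = parse_qs("q=hello world & co")

# Percent-encoded query string from the second example.
intact = parse_qs("q=hello%20world%20%26%20co")

print(broken.get("q"))
print(intact.get("q"))  # ['hello world & co']
```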
3. Reserved Characters in URLs
Reserved characters have special meaning in URIs. When they need to be used as data (not as delimiters), they must be percent-encoded.
| Character | Purpose | Encoded Form |
|---|---|---|
| : | Scheme separator | %3A |
| / | Path separator | %2F |
| ? | Query delimiter | %3F |
| # | Fragment | %23 |
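As a rough illustration of the table, the sketch below encodes a made-up value that contains all four reserved characters. It assumes Python's urllib.parse.quote, whose safe parameter lists the characters to leave untouched.

```python
from urllib.parse import quote

value = "a/b?c#d:e"  # hypothetical data containing reserved characters

# safe="" treats every reserved character as data and encodes it.
as_data = quote(value, safe="")

# The default safe="/" keeps slashes, which suits whole paths.
as_path = quote(value)

print(as_data)  # a%2Fb%3Fc%23d%3Ae
print(as_path)  # a/b%3Fc%23d%3Ae
```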
4. Unreserved Characters
These characters can be used without encoding.
- Alphabetic (A-Z, a-z)
- Digits (0-9)
- Hyphen (-)
- Period (.)
- Underscore (_)
- Tilde (~)
A-Z a-z 0-9 - . _ ~
Safe characters
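One way to check whether a string consists only of these characters is sketched below. The helper name needs_encoding is hypothetical and not part of any library.

```python
import string

# The unreserved set listed above: letters, digits, and - . _ ~
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def needs_encoding(text: str) -> bool:
    """Return True if any character falls outside the unreserved set."""
    return any(ch not in UNRESERVED for ch in text)


print(needs_encoding("AZaz09-._~"))   # False
print(needs_encoding("hello world"))  # True
```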
5. Percent-Encoding
// Encoding formula (applied to each UTF-8 byte, zero-padded to two hex digits)
encoded_byte = '%' + byte.toString(16).toUpperCase().padStart(2, '0')
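The formula can be sketched end to end in Python as follows. This is an illustrative implementation, not a replacement for library functions: the name percent_encode is hypothetical, and the sketch assumes the input is encoded as UTF-8 and that only unreserved characters are left as-is.

```python
import string

UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Two uppercase hex digits, zero-padded (e.g. newline -> %0A).
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("hello world & co"))  # hello%20world%20%26%20co
print(percent_encode("é"))                 # %C3%A9
```

The zero padding matters for bytes below 0x10, and multi-byte characters such as "é" produce one "%XX" triplet per byte.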
6. RFC Standards
Older convention (form-style encoding):
- ASCII only
- Space = +
- Old standard
Current standard (RFC 3986):
- UTF-8 support
- Space = %20
- Modern standard
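The two space conventions can be compared directly. The sketch below assumes Python's urllib.parse, where quote and unquote use %20 while quote_plus and unquote_plus use the form-style +.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "hello world"

print(quote(text))       # hello%20world
print(quote_plus(text))  # hello+world

# Decoding must match the convention used for encoding:
print(unquote("hello+world"))       # hello+world (the + is kept literally)
print(unquote_plus("hello+world"))  # hello world
```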
8. Security Implications of URL Encoding
Common Security Issues
- Double Encoding Attacks: %2527 decodes to %27 and, if decoded a second time, to ', slipping past filters that check only the first decode
- SQL Injection: %27 = ' can break SQL queries
- XSS Attacks: %3Cscript%3E = <script>
- Open Redirects: %2F%2Fevil.com = //evil.com
Mitigation Strategies
- Validate and normalize URLs after decoding
- Use allow-lists for domains and paths
- Canonicalize input (decode once, then validate)
- Never trust user input
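The "decode once, then validate" strategy might look like the sketch below for redirect targets. The function name is_safe_redirect and the ALLOWED_HOSTS set are hypothetical, and the check is deliberately conservative; it does not cover every known trick (backslash variants, for example).

```python
from urllib.parse import unquote, urlsplit

# Hypothetical allow-list; a real application would load its own.
ALLOWED_HOSTS = {"example.com"}


def is_safe_redirect(raw_target: str) -> bool:
    """Decode once, then validate the decoded value against the allow-list."""
    target = unquote(raw_target)
    # If a second decode still changes the value, the input was encoded
    # more than once; this sketch rejects it rather than decoding again.
    if unquote(target) != target:
        return False
    parts = urlsplit(target)
    if parts.scheme or parts.netloc:
        # Absolute or scheme-relative URL: the host must be allow-listed.
        return parts.scheme in {"http", "https"} and parts.hostname in ALLOWED_HOSTS
    # Otherwise accept only same-site absolute paths.
    return target.startswith("/")


print(is_safe_redirect("%2F%2Fevil.com"))            # False
print(is_safe_redirect("%252F%252Fevil.com"))        # False
print(is_safe_redirect("https://example.com/home"))  # True
```

Rejecting doubly encoded input outright is a design choice for this sketch; it may also turn away legitimate values that contain an encoded percent sign followed by hex digits.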
9. Common Encoding Mistakes & Fixes
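Double encoding:
Input: "hello world" → encodeURI() → "hello%20world"
encodeURI() again → "hello%2520world" (WRONG: the % itself was encoded)
Using + for spaces outside form data:
"hello world" → "hello+world" (only form-encoded query strings read + as a space)
Unencoded user input:
url = "https://api.com/search?q=" + userInput
// userInput = "hello&evil=param"
// URL breaks: "evil=param" becomes a second query parameter
Encoding an entire URL:
encodeURIComponent("https://example.com")
→ "https%3A%2F%2Fexample.com" (WRONG: the scheme and slashes no longer act as delimiters)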
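Two of these mistakes, unencoded user input and double encoding, can be reproduced with a short Python sketch. It assumes the standard urllib.parse module; the URL and input value are the illustrative ones used above.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

user_input = "hello&evil=param"

# Mistake: plain concatenation lets the input inject a second parameter.
unsafe_url = "https://api.com/search?q=" + user_input
# Fix: let urlencode() escape the value.
safe_url = "https://api.com/search?" + urlencode({"q": user_input})

print(parse_qs(urlsplit(unsafe_url).query))  # {'q': ['hello'], 'evil': ['param']}
print(safe_url)  # https://api.com/search?q=hello%26evil%3Dparam
print(parse_qs(urlsplit(safe_url).query))    # {'q': ['hello&evil=param']}

# Mistake: encoding twice turns "%" into "%25".
once = quote("hello world")
twice = quote(once)
print(once, twice)  # hello%20world hello%2520world
```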
10. Programming Examples
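// JavaScript
// The wrapper needs its own name: naming it encodeURIComponent would
// shadow the built-in and make the function call itself.
const encodeParam = (str) => {
  return encodeURIComponent(str); // Built-in
};
// Example
encodeParam("hello world & co"); // "hello%20world%20%26%20co"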
# Python
from urllib.parse import quote, unquote
encoded = quote("hello world & co")
# "hello%20world%20%26%20co"
decoded = unquote(encoded)
# "hello world & co"
// Java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
String encoded = URLEncoder.encode("hello world & co", StandardCharsets.UTF_8);
// "hello+world+%26+co" (Note: + for spaces)
// PHP
$encoded = urlencode("hello world & co");
// "hello+world+%26+co"
$encoded = rawurlencode("hello world & co");
// "hello%20world%20%26%20co" (RFC 3986 compliant)
// C# / .NET
using System.Web;
string encoded = HttpUtility.UrlEncode("hello world & co");
// "hello+world+%26+co"
# Ruby
require 'cgi'
encoded = CGI.escape("hello world & co")
# "hello+world+%26+co"
11. Production Best Practices
- Always encode user input before using in URLs
- Use UTF-8 encoding for non-ASCII characters
- Normalize URLs before validation
- Use built-in library functions
- Decode exactly once, at the point where the value is validated or displayed
- Use encodeURIComponent() for query params
- Don't encode already encoded strings
- Don't use custom encoding logic
- Don't mix encoding standards
- Don't forget to decode before validation
- Don't trust encoded input
- Don't encode the scheme or delimiters of a full URL (for example "https://")
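Several of these practices can be combined when assembling a URL from parts. The sketch below is one possible approach using Python's urllib.parse; the helper name build_url, the host, and the sample values are hypothetical.

```python
from urllib.parse import quote, urlencode, urlunsplit


def build_url(host: str, segments: list[str], params: dict[str, str]) -> str:
    """Encode each path segment and query value separately, never the whole URL."""
    # safe="" so that a "/" inside a segment is encoded as data.
    path = "/" + "/".join(quote(seg, safe="") for seg in segments)
    # quote_via=quote yields %20 for spaces instead of the form-style "+".
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", host, path, query, ""))


url = build_url("example.com", ["files", "a/b c.txt"], {"q": "café & co"})
print(url)  # https://example.com/files/a%2Fb%20c.txt?q=caf%C3%A9%20%26%20co
```

Encoding each component on its own keeps the scheme and delimiters intact, and relying on library functions avoids both custom encoding logic and accidental double encoding.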
12. Tools & Reference Tables
| Character | ASCII | Hex | Encoded | Safe? |
|---|---|---|---|---|
| space | 32 | 20 | %20 | No |
| ! | 33 | 21 | %21 | No (reserved sub-delimiter) |
| # | 35 | 23 | %23 | No |
| & | 38 | 26 | %26 | No |
| = | 61 | 3D | %3D | No |
| _ | 95 | 5F | _ | Yes |
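The ASCII, hex, and encoded columns of this table can be regenerated for any ASCII character with a few lines of Python. The helper name table_row is hypothetical, and the sketch assumes urllib.parse.quote with safe="" so that only unreserved characters stay unencoded.

```python
from urllib.parse import quote


def table_row(ch: str) -> tuple[int, str, str]:
    """Return (ASCII code, hex code, percent-encoded form) for one ASCII character."""
    code = ord(ch)
    return code, f"{code:02X}", quote(ch, safe="")


for ch in " !#&=_":
    print(repr(ch), *table_row(ch))
```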
