Understanding URL and HTML Encoding: Why It Happens, How to Spot It, and What to Do With It

If you’ve ever seen a URL filled with strange symbols like %20, or noticed your payloads getting altered in weird ways during testing, you’ve already met encoding — specifically URL encoding and HTML encoding.

To exploit vulnerabilities effectively, you need to recognize when encoding is in play, understand why it’s used, and know how to manipulate or decode it.

Let’s break it all down.

What Is Encoding?

Encoding is the process of converting data into a different format so it can be safely transmitted or rendered.

It’s not encryption. It doesn’t hide or protect information — it just makes it compatible with certain systems.

Part 1: URL Encoding

Why It’s Done

URLs can only contain a limited set of characters. Certain characters have special meanings:

/ = path separator
? = start of query string
& = separates parameters
= = assigns values
# = anchor (fragment)

Other characters (like spaces or quotes) can break the URL or confuse the server, so they must be encoded.

How It Works

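URL encoding (also called percent-encoding) replaces unsafe characters with % followed by the two-digit hex value of each byte. For ASCII characters, that is simply the ASCII code in hex: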

Character	Encoded
space	`%20` (or `+` in form-encoded query strings and request bodies)
`"`	`%22`
`'`	`%27`
`<`	`%3C`
`>`	`%3E`
`/`	`%2F`

Example:

Original:

http://example.com/search?query=hello world

Encoded:

http://example.com/search?query=hello%20world
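As a minimal sketch of the same transformation, Python's standard `urllib.parse` module can produce both the `%20` and the `+` forms. The variable names here are illustrative only; the URL is the example one from above.

```python
from urllib.parse import quote, quote_plus

raw_value = "hello world"

# quote() percent-encodes unsafe characters; a space becomes %20.
# safe="" means even "/" is encoded, which is what you want inside a value.
percent_encoded = quote(raw_value, safe="")

# quote_plus() follows form-encoding rules; a space becomes +.
form_encoded = quote_plus(raw_value)

print(f"http://example.com/search?query={percent_encoded}")
print(f"http://example.com/search?query={form_encoded}")

# Reserved characters are encoded too when they appear inside a value.
reserved_encoded = quote("a/b?c=d&e", safe="")
print(reserved_encoded)  # a%2Fb%3Fc%3Dd%26e
```

Which form a server expects depends on where the value sits: `+` for a space is only meaningful in form-encoded data, so `%20` is the safer default in paths.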

Spotting It During a Pentest

You’ll often see encoded values in:

URLs
Query parameters
Form fields
API requests
Burp Suite (Repeater and Proxy intercept)

Example:

username=admin%27+OR+1%3D1--

This is an encoded SQL injection payload:

admin' OR 1=1--
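If you want to decode a captured parameter like this outside Burp, a short Python sketch along these lines should work. It assumes the value is form-encoded (so `+` means a space); the variable names are made up for illustration.

```python
from urllib.parse import parse_qs, unquote, unquote_plus

captured = "username=admin%27+OR+1%3D1--"

# Split the parameter name from its still-encoded value.
name, _, encoded_value = captured.partition("=")

# unquote_plus() decodes %XX sequences and turns + into a space.
decoded_value = unquote_plus(encoded_value)
print(decoded_value)  # admin' OR 1=1--

# unquote() decodes %XX sequences but leaves + untouched.
decoded_keep_plus = unquote(encoded_value)
print(decoded_keep_plus)  # admin'+OR+1=1--

# parse_qs() decodes a whole query string into a dict of lists.
params = parse_qs(captured)
print(params["username"][0])
```

The difference between `unquote` and `unquote_plus` matters when a payload contains a literal `+`, so it is worth checking both when the decoded result looks odd.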

What To Do With It

Decode it to understand what the app is doing
(using Burp, Python, or URL decoding tools)
Encode payloads before sending them manually or via tools
(this avoids breaking syntax and can bypass filters)
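Going the other direction, here is a hedged sketch of encoding a payload before placing it in a request. The parameter name `username` comes from the example above; everything else is illustrative, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

payload = "admin' OR 1=1--"

# urlencode() builds a form-encoded string (spaces become +).
form_body = urlencode({"username": payload})
print(form_body)  # username=admin%27+OR+1%3D1--

# Passing quote_via=quote produces strict percent-encoding (spaces become %20).
percent_body = urlencode({"username": payload}, quote_via=quote)
print(percent_body)  # username=admin%27%20OR%201%3D1--
```

Encoding the value this way keeps characters such as `=` and `'` from being misread as part of the request syntax.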

Part 2: HTML Encoding

Why It’s Done

When a web app reflects user input back into the page, unencoded special characters can break the page or even inject scripts.

To prevent this, the app HTML-encodes characters like <, >, and " so the browser displays them as text instead of interpreting them as HTML or JavaScript.

How It Works

Character	Encoded
`<`	`&lt;`
`>`	`&gt;`
`"`	`&quot;`
`'`	`&#39;` (or `&#x27;`)
`&`	`&amp;`

Example:

<!-- Raw input (dangerous): -->
<p>Welcome, <script>alert(1)</script></p>

<!-- HTML-encoded input (safe): -->
<p>Welcome, &lt;script&gt;alert(1)&lt;/script&gt;</p>
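To see the encoding step itself, the sketch below uses `html.escape` from Python's standard library. It only imitates what a server-side template might do; the variable names are assumptions, and real frameworks usually apply this escaping for you.

```python
import html

user_input = "<script>alert(1)</script>"

# Unsafe: the input is dropped into the page as-is.
unsafe_page = f"<p>Welcome, {user_input}</p>"

# Safe: special characters are replaced with HTML entities first.
safe_page = f"<p>Welcome, {html.escape(user_input)}</p>"

print(unsafe_page)
print(safe_page)

# By default html.escape() also encodes both quote characters.
escaped_quotes = html.escape("\"'&")
print(escaped_quotes)  # &quot;&#x27;&amp;
```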

Spotting It During a Pentest

You’ll see this mostly in reflected input:

<p>Hello, &lt;b&gt;admin&lt;/b&gt;!</p>

This means the app is encoding its output, most likely to prevent Cross-Site Scripting (XSS). But sometimes it misses key spots, or it applies HTML encoding in JavaScript contexts where that alone is not enough.

What To Do With It

Decode encoded output to spot reflections, misconfigurations, or filtering
Encode your payloads intentionally to:
- Bypass naive filters
- Inject into HTML or JavaScript contexts
Use tools like:
- Burp Suite (Decoder tab)
- Python scripts
- Online encoding/decoding tools
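For the "Python scripts" option, a minimal sketch might look like this. It checks whether a submitted value comes back raw or HTML-encoded in a response body. The response fragment is the example from above, hard-coded here as an assumption rather than fetched from a live target.

```python
import html

# Hypothetical response body and the value that was submitted.
response_fragment = "<p>Hello, &lt;b&gt;admin&lt;/b&gt;!</p>"
submitted = "<b>admin</b>"

# Reflected without encoding: a strong hint to look for XSS.
reflected_raw = submitted in response_fragment

# Reflected with HTML encoding: the app encodes output in this spot.
reflected_encoded = html.escape(submitted) in response_fragment

# html.unescape() turns entities back into the original characters.
decoded_fragment = html.unescape(response_fragment)

print(reflected_raw, reflected_encoded)
print(decoded_fragment)  # <p>Hello, <b>admin</b>!</p>
```

Running the same check against every place a value is reflected helps find the spots where encoding was missed.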

Pentesting Relevance: Why Encoding Matters

Vulnerability	Encoding Role
XSS	Encode/decode payloads to test different injection contexts
SQL Injection	Encode special characters to evade filters
Command Injection	Encode spaces, pipes, etc.
Filter Bypasses	Encode characters to slip through sanitization
API Testing	Some APIs encode responses — decode to reveal info

Quick Tools and Tips

Tools:

curl --data-urlencode
urlencode, urldecode (command-line helpers available on some Linux systems; they may need to be installed separately)
Burp Suite Decoder
Firefox Dev Tools (watch live request data)

Test Ideas:

Try sending XSS payloads like:

<script>alert(1)</script>
URL-encoded version: %3Cscript%3Ealert(1)%3C%2Fscript%3E
HTML-encoded version: &lt;script&gt;alert(1)&lt;/script&gt;

Then watch how the app reflects or processes it.
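A small sketch like the following can generate the URL-encoded and HTML-encoded variants of a payload so you can send each one in turn. The dictionary keys are arbitrary labels, and parentheses are deliberately left unencoded in the URL variant, which is a choice made here rather than a requirement.

```python
import html
from urllib.parse import quote

payload = "<script>alert(1)</script>"

variants = {
    "raw": payload,
    # safe="()" keeps parentheses literal; "/" is still encoded as %2F.
    "url": quote(payload, safe="()"),
    "html": html.escape(payload),
}

for label, value in variants.items():
    print(f"{label}: {value}")
```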

Final Thoughts

Encoding isn’t just a web developer’s safety net — it’s also a tool for us as pentesters.

When you see encoded characters, don’t ignore them. Decode them to understand what’s happening under the hood. Encode your payloads smartly to sneak through filters or bypass poor sanitization.

The more fluent you become with spotting and handling encoding, the more control you’ll have over input, context, and ultimately — the target.
