hacktricks/pentesting-web/cache-deception/cache-poisoning-via-url-discrepancies.md

# Cache Poisoning via URL discrepancies

{% hint style="success" %}
Learn & practice AWS Hacking:<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">[**HackTricks Training AWS Red Team Expert (ARTE)**](https://training.hacktricks.xyz/courses/arte)<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">\
Learn & practice GCP Hacking: <img src="../../.gitbook/assets/grte.png" alt="" data-size="line">[**HackTricks Training GCP Red Team Expert (GRTE)**<img src="../../.gitbook/assets/grte.png" alt="" data-size="line">](https://training.hacktricks.xyz/courses/grte)

<details>

<summary>Support HackTricks</summary>

* Check the [**subscription plans**](https://github.com/sponsors/carlospolop)!
* **Join the** 💬 [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** us on **Twitter** 🐦 [**@hacktricks\_live**](https://twitter.com/hacktricks\_live)**.**
* **Share hacking tricks by submitting PRs to the** [**HackTricks**](https://github.com/carlospolop/hacktricks) and [**HackTricks Cloud**](https://github.com/carlospolop/hacktricks-cloud) github repos.

</details>
{% endhint %}

This is a summary of the techniques proposed in the post [https://portswigger.net/research/gotta-cache-em-all](https://portswigger.net/research/gotta-cache-em-all) in order to perform cache poisoning attacks **abusing discrepancies between cache proxies and web servers.**

{% hint style="info" %}
The goal of this attack is to **make the cache server think that a static resource is being loaded** so it caches it while the cache server stores as cache key part of the path but the web server responds resolving another path. The web server will resolve the real path which will be loading a dynamic page (which might store sensitive information about the user, a malicious payload like XSS or redirecting to lo load a JS file from the attackers website for example).
{% endhint %}

## Delimiters

**URL delimiters** vary by framework and server, impacting how requests are routed and responses are handled. Some common origin delimiters are:

* **Semicolon**: Used in Spring for matrix variables (e.g. `/hello;var=a/world;var1=b;var2=c` → `/hello/world`).
* **Dot**: Specifies response format in Ruby on Rails (e.g. `/MyAccount.css` → `/MyAccount`)
* **Null Byte**: Truncates paths in OpenLiteSpeed (e.g. `/MyAccount%00aaa` → `/MyAccount`).
* **Newline Byte**: Separates URL components in Nginx (e.g. `/users/MyAccount%0aaaa` → `/account/MyAccount`).

Other specific delimiters might be found following this process:

* **Step 1**: Identify non-cacheable requests and use them to monitor how URLs with potential delimiters are handled.
* **Step 2**: Append random suffixes to paths and compare the server's response to determine if a character functions as a delimiter.
* **Step 3**: Introduce potential delimiters before the random suffix to see if the response changes, indicating delimiter usage.

## Normalization & Encodings

* **Purpose**: URL parsers in both cache and origin servers normalize URLs to extract paths for endpoint mapping and cache keys.
* **Process**: Identifies path delimiters, extracts and normalizes the path by decoding characters and removing dot-segments.

### **Encodings**

Different HTTP servers and proxies like Nginx, Node, and CloudFront decode delimiters differently, leading to inconsistencies across CDNs and origin servers that could be exploited. For example, if the web server perform this transformation `/myAccount%3Fparam` → `/myAccount?param` but the cache server keeps as key the path `/myAccount%3Fparam`, there is an inconsistency.&#x20;

A way to check for these inconsistencies is to send requests URL encoding different chars after loading the path without any encoding and check if the encoded path response came from the cached response.

### Dot segment

The path normalization where dots are involved is also very interesting for cache poisoning attacks. For example, `/static/../home/index` or `/aaa..\home/index`, some cache servers will cache these paths with themselves ad the keys while other might resolve the path and use `/home/index` as the cache key.\
Just like before, sending these kind of requests and checking if the response was gathered from the cache helps to identify if the response to `/home/index` is the response sent when those paths are requested.

## Static Resources

Several cache servers will always cache a response if it's identified as static. This might be because:

* **The extension**: Cloudflare will always cache files with the following extensions: 7z, csv, gif, midi, png, tif, zip, avi, doc, gz, mkv, ppt, tiff, zst, avif, docx, ico, mp3, pptx, ttf, apk, dmg, iso, mp4, ps, webm, bin, ejs, jar, ogg, rar, webp, bmp, eot, jpg, otf, svg, woff, bz2, eps, jpeg, pdf, svgz, woff2, class, exe, js, pict, swf, xls, css, flac, mid, pls, tar, xlsx
  * It's possible to force a cache storing a dynamic response by using a delimiter and a static extension like a request to `/home$image.png` will cache `/home$image.png` and the origin server will respond with `/home`
* **Well-known static directories**: The following directories contains static files and therefore their response should be cached: /static, /assets, /wp-content, /media, /templates, /public, /shared
  * It's possible to force a cache storing a dynamic response by using a delimiter, a static directory and dots like: `/home/..%2fstatic/something` will cache  `/static/something` and the response will be`/home`
  * **Static dirs + dots**: A request to `/static/..%2Fhome` or to `/static/..%5Chome` might be cached as is but the response might be `/home`
* **Static files:** Some specific files are always cached like `/robots.txt`, `/favicon.ico`, and `/index.html`. Which can be abused like `/home/..%2Frobots.txt` where the cace might store `/robots.txt` and the origin server respond to `/home`.

{% hint style="success" %}
Learn & practice AWS Hacking:<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">[**HackTricks Training AWS Red Team Expert (ARTE)**](https://training.hacktricks.xyz/courses/arte)<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">\
Learn & practice GCP Hacking: <img src="../../.gitbook/assets/grte.png" alt="" data-size="line">[**HackTricks Training GCP Red Team Expert (GRTE)**<img src="../../.gitbook/assets/grte.png" alt="" data-size="line">](https://training.hacktricks.xyz/courses/grte)

<details>

<summary>Support HackTricks</summary>

* Check the [**subscription plans**](https://github.com/sponsors/carlospolop)!
* **Join the** 💬 [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** us on **Twitter** 🐦 [**@hacktricks\_live**](https://twitter.com/hacktricks\_live)**.**
* **Share hacking tricks by submitting PRs to the** [**HackTricks**](https://github.com/carlospolop/hacktricks) and [**HackTricks Cloud**](https://github.com/carlospolop/hacktricks-cloud) github repos.

</details>
{% endhint %}
Recreating repository history 2024-12-12 10:39:29 +00:00			`# Cache Poisoning via URL discrepancies`

			`{% hint style="success" %}`
			`Learn & practice AWS Hacking:<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">[HackTricks Training AWS Red Team Expert (ARTE)](https://training.hacktricks.xyz/courses/arte)<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">\`
			`Learn & practice GCP Hacking: <img src="../../.gitbook/assets/grte.png" alt="" data-size="line">[HackTricks Training GCP Red Team Expert (GRTE)<img src="../../.gitbook/assets/grte.png" alt="" data-size="line">](https://training.hacktricks.xyz/courses/grte)`

			`<details>`

			`<summary>Support HackTricks</summary>`

			`* Check the [subscription plans](https://github.com/sponsors/carlospolop)!`
			`* Join the 💬 [Discord group](https://discord.gg/hRep4RUj7f) or the [telegram group](https://t.me/peass) or follow us on Twitter 🐦 [@hacktricks\_live](https://twitter.com/hacktricks\_live).`
			`* Share hacking tricks by submitting PRs to the [HackTricks](https://github.com/carlospolop/hacktricks) and [HackTricks Cloud](https://github.com/carlospolop/hacktricks-cloud) github repos.`

			`</details>`
			`{% endhint %}`

			`This is a summary of the techniques proposed in the post [https://portswigger.net/research/gotta-cache-em-all](https://portswigger.net/research/gotta-cache-em-all) in order to perform cache poisoning attacks abusing discrepancies between cache proxies and web servers.`

			`{% hint style="info" %}`
			`The goal of this attack is to make the cache server think that a static resource is being loaded so it caches it while the cache server stores as cache key part of the path but the web server responds resolving another path. The web server will resolve the real path which will be loading a dynamic page (which might store sensitive information about the user, a malicious payload like XSS or redirecting to lo load a JS file from the attackers website for example).`
			`{% endhint %}`

			`## Delimiters`

			`URL delimiters vary by framework and server, impacting how requests are routed and responses are handled. Some common origin delimiters are:`

			* Semicolon: Used in Spring for matrix variables (e.g. `/hello;var=a/world;var1=b;var2=c` → `/hello/world`).
			* Dot: Specifies response format in Ruby on Rails (e.g. `/MyAccount.css` → `/MyAccount`)
			* Null Byte: Truncates paths in OpenLiteSpeed (e.g. `/MyAccount%00aaa` → `/MyAccount`).
			* Newline Byte: Separates URL components in Nginx (e.g. `/users/MyAccount%0aaaa` → `/account/MyAccount`).

			`Other specific delimiters might be found following this process:`

			`* Step 1: Identify non-cacheable requests and use them to monitor how URLs with potential delimiters are handled.`
			`* Step 2: Append random suffixes to paths and compare the server's response to determine if a character functions as a delimiter.`
			`* Step 3: Introduce potential delimiters before the random suffix to see if the response changes, indicating delimiter usage.`

			`## Normalization & Encodings`

			`* Purpose: URL parsers in both cache and origin servers normalize URLs to extract paths for endpoint mapping and cache keys.`
			`* Process: Identifies path delimiters, extracts and normalizes the path by decoding characters and removing dot-segments.`

			`### Encodings`

			Different HTTP servers and proxies like Nginx, Node, and CloudFront decode delimiters differently, leading to inconsistencies across CDNs and origin servers that could be exploited. For example, if the web server perform this transformation `/myAccount%3Fparam` → `/myAccount?param` but the cache server keeps as key the path `/myAccount%3Fparam`, there is an inconsistency.

			`A way to check for these inconsistencies is to send requests URL encoding different chars after loading the path without any encoding and check if the encoded path response came from the cached response.`

			`### Dot segment`

			The path normalization where dots are involved is also very interesting for cache poisoning attacks. For example, `/static/../home/index` or `/aaa..\home/index`, some cache servers will cache these paths with themselves ad the keys while other might resolve the path and use `/home/index` as the cache key.\
			Just like before, sending these kind of requests and checking if the response was gathered from the cache helps to identify if the response to `/home/index` is the response sent when those paths are requested.

			`## Static Resources`

			`Several cache servers will always cache a response if it's identified as static. This might be because:`

			`* The extension: Cloudflare will always cache files with the following extensions: 7z, csv, gif, midi, png, tif, zip, avi, doc, gz, mkv, ppt, tiff, zst, avif, docx, ico, mp3, pptx, ttf, apk, dmg, iso, mp4, ps, webm, bin, ejs, jar, ogg, rar, webp, bmp, eot, jpg, otf, svg, woff, bz2, eps, jpeg, pdf, svgz, woff2, class, exe, js, pict, swf, xls, css, flac, mid, pls, tar, xlsx`
			* It's possible to force a cache storing a dynamic response by using a delimiter and a static extension like a request to `/home$image.png` will cache `/home$image.png` and the origin server will respond with `/home`
			`* Well-known static directories: The following directories contains static files and therefore their response should be cached: /static, /assets, /wp-content, /media, /templates, /public, /shared`
			* It's possible to force a cache storing a dynamic response by using a delimiter, a static directory and dots like: `/home/..%2fstatic/something` will cache `/static/something` and the response will be`/home`
			* Static dirs + dots: A request to `/static/..%2Fhome` or to `/static/..%5Chome` might be cached as is but the response might be `/home`
			* Static files: Some specific files are always cached like `/robots.txt`, `/favicon.ico`, and `/index.html`. Which can be abused like `/home/..%2Frobots.txt` where the cace might store `/robots.txt` and the origin server respond to `/home`.

			`{% hint style="success" %}`
			`Learn & practice AWS Hacking:<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">[HackTricks Training AWS Red Team Expert (ARTE)](https://training.hacktricks.xyz/courses/arte)<img src="../../.gitbook/assets/arte.png" alt="" data-size="line">\`
			`Learn & practice GCP Hacking: <img src="../../.gitbook/assets/grte.png" alt="" data-size="line">[HackTricks Training GCP Red Team Expert (GRTE)<img src="../../.gitbook/assets/grte.png" alt="" data-size="line">](https://training.hacktricks.xyz/courses/grte)`

			`<details>`

			`<summary>Support HackTricks</summary>`

			`* Check the [subscription plans](https://github.com/sponsors/carlospolop)!`
			`* Join the 💬 [Discord group](https://discord.gg/hRep4RUj7f) or the [telegram group](https://t.me/peass) or follow us on Twitter 🐦 [@hacktricks\_live](https://twitter.com/hacktricks\_live).`
			`* Share hacking tricks by submitting PRs to the [HackTricks](https://github.com/carlospolop/hacktricks) and [HackTricks Cloud](https://github.com/carlospolop/hacktricks-cloud) github repos.`

			`</details>`
			`{% endhint %}`