Learn & practice AWS Hacking:<imgsrc="/.gitbook/assets/arte.png"alt=""data-size="line">[**HackTricks Training AWS Red Team Expert (ARTE)**](https://training.hacktricks.xyz/courses/arte)<imgsrc="/.gitbook/assets/arte.png"alt=""data-size="line">\
Learn & practice GCP Hacking: <imgsrc="/.gitbook/assets/grte.png"alt=""data-size="line">[**HackTricks Training GCP Red Team Expert (GRTE)**<imgsrc="/.gitbook/assets/grte.png"alt=""data-size="line">](https://training.hacktricks.xyz/courses/grte)
* Check the [**subscription plans**](https://github.com/sponsors/carlospolop)!
* **Join the** 💬 [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** us on **Twitter** 🐦 [**@hacktricks\_live**](https://twitter.com/hacktricks\_live)**.**
* **Share hacking tricks by submitting PRs to the** [**HackTricks**](https://github.com/carlospolop/hacktricks) and [**HackTricks Cloud**](https://github.com/carlospolop/hacktricks-cloud) github repos.
The previous program has **9 program headers**, then, the **segment mapping** indicates in which program header (from 00 to 08) **each section is located**.
### PHDR - Program HeaDeR
Contains the program header tables and metadata itself.
### INTERP
Indicates the path of the loader to use to load the binary into memory.
### LOAD
These headers are used to indicate **how to load a binary into memory.**\
Each **LOAD** header indicates a region of **memory** (size, permissions and alignment) and indicates the bytes of the ELF **binary to copy in there**.
For example, the second one has a size of 0x1190, should be located at 0x1fc48 with permissions read and write and will be filled with 0x528 from the offset 0xfc48 (it doesn't fill all the reserved space). This memory will contain the sections `.init_array .fini_array .dynamic .got .data .bss`.
### DYNAMIC
This header helps to link programs to their library dependencies and apply relocations. Check the **`.dynamic`** section.
### NOTE
This stores vendor metadata information about the binary.
### GNU\_EH\_FRAME
Defines the location of the stack unwind tables, used by debuggers and C++ exception handling-runtime functions.
### GNU\_STACK
Contains the configuration of the stack execution prevention defense. If enabled, the binary won't be able to execute code from the stack.
### GNU\_RELRO
Indicates the RELRO (Relocation Read-Only) configuration of the binary. This protection will mark as read-only certain sections of the memory (like the `GOT` or the `init` and `fini` tables) after the program has loaded and before it begins running.
In the previous example it's copying 0x3b8 bytes to 0x1fc48 as read-only affecting the sections `.init_array .fini_array .dynamic .got .data .bss`.
Note that RELRO can be partial or full, the partial version do not protect the section **`.plt.got`**, which is used for **lazy binding** and needs this memory space to have **write permissions** to write the address of the libraries the first time their location is searched.
### TLS
Defines a table of TLS entries, which stores info about thread-local variables.
## Section Headers
Section headers gives a more detailed view of the ELF binary
It also indicates the location, offset, permissions but also the **type of data** it section has.
### Meta Sections
* **String table**: It contains all the strings needed by the ELF file (but not the ones actually used by the program). For example it contains sections names like `.text` or `.data`. And if `.text` is at offset 45 in the strings table it will use the number **45** in the **name** field.
* In order to find where the string table is, the ELF contains a pointer to the string table.
* **Symbol table**: It contains info about the symbols like the name (offset in the strings table), address, size and more metadata about the symbol.
### Main Sections
* **`.text`**: The instruction of the program to run.
* **`.data`**: Global variables with a defined value in the program.
* **`.bss`**: Global variables left uninitialized (or init to zero). Variables here are automatically intialized to zero therefore preventing useless zeroes to being added to the binary.
* **`.rodata`**: Constant global variables (read-only section).
* **`.tdata`** and **`.tbss`**: Like the .data and .bss when thread-local variables are used (`__thread_local` in C++ or `__thread` in C).
* **`.dynamic`**: See below.
## Symbols
Symbols is a named location in the program which could be a function, a global data object, thread-local variables...
```
readelf -s lnstat
Symbol table '.dynsym' contains 49 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000001088 0 SECTION LOCAL DEFAULT 12 .init
2: 0000000000020000 0 SECTION LOCAL DEFAULT 23 .data
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtok@GLIBC_2.17 (2)
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND s[...]@GLIBC_2.17 (2)
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strlen@GLIBC_2.17 (2)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fputs@GLIBC_2.17 (2)
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND exit@GLIBC_2.17 (2)
8: 0000000000000000 0 FUNC GLOBAL DEFAULT UND _[...]@GLIBC_2.34 (3)
9: 0000000000000000 0 FUNC GLOBAL DEFAULT UND perror@GLIBC_2.17 (2)
10: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterT[...]
11: 0000000000000000 0 FUNC WEAK DEFAULT UND _[...]@GLIBC_2.17 (2)
12: 0000000000000000 0 FUNC GLOBAL DEFAULT UND putc@GLIBC_2.17 (2)
[...]
```
Each symbol entry contains:
* **Name**
* **Binding attributes** (weak, local or global): A local symbol can only be accessed by the program itself while the global symbol are shared outside the program. A weak object is for example a function that can be overridden by a different one.
* **Type**: NOTYPE (no type specified), OBJECT (global data var), FUNC (function), SECTION (section), FILE (source-code file for debuggers), TLS (thread-local variable), GNU\_IFUNC (indirect function for relocation)
* **Section** index where it's located
* **Value** (address sin memory)
* **Size**
## Dynamic Section
```
readelf -d lnstat
Dynamic section at offset 0xfc58 contains 28 entries:
The NEEDED directory indicates that the program **needs to load the mentioned library** in order to continue. The NEEDED directory completes once the shared **library is fully operational and ready** for use.
## Relocations
The loader also must relocate dependencies after having loaded them. These relocations are indicated in the relocation table in formats REL or RELA and the number of relocations is given in the dynamic sections RELSZ or RELASZ.
```
readelf -r lnstat
Relocation section '.rela.dyn' at offset 0xaa0 contains 23 entries:
If the **program is loaded in a place different** from the preferred address (usually 0x400000) because the address is already used or because of **ASLR** or any other reason, a static relocation **corrects pointers** that had values expecting the binary to be loaded in the preferred address.
For example any section of type `R_AARCH64_RELATIV` should have modified the address at the relocation bias plus the addend value.
### Dynamic Relocations and GOT
The relocation could also reference an external symbol (like a function from a dependency). Like the function malloc from libC. Then, the loader when loading libC in an address checking where the malloc function is loaded, it will write this address in the GOT (Global Offset Table) table (indicated in the relocation table) where the address of malloc should be specified.
### Procedure Linkage Table
The PLT section allows to perform lazy binding, which means that the resolution of the location of a function will be performed the first time it's accessed.
So when a program calls to malloc, it actually calls the corresponding location of `malloc` in the PLT (`malloc@plt`). The first time it's called it resolves the address of `malloc` and stores it so next time `malloc` is called, that address is used instead of the PLT code.
## Program Initialization
After the program has been loaded it's time for it to run. However, the first code that is run i**sn't always the `main`** function. This is because for example in C++ if a **global variable is an object of a class**, this object must be **initialized****before** main runs, like in:
```cpp
#include <stdio.h>
// g++ autoinit.cpp -o autoinit
class AutoInit {
public:
AutoInit() {
printf("Hello AutoInit!\n");
}
~AutoInit() {
printf("Goodbye AutoInit!\n");
}
};
AutoInit autoInit;
int main() {
printf("Main\n");
return 0;
}
```
Note that these global variables are located in `.data` or `.bss` but in the lists `__CTOR_LIST__` and `__DTOR_LIST__` the objects to initialize and destruct are stored in order to keep track of them.
From C code it's possible to obtain the same result using the GNU extensions :
```c
__attributte__((constructor)) //Add a constructor to execute before
__attributte__((destructor)) //Add to the destructor list
```
From a compiler perspective, to execute these actions before and after the `main` function is executed, it's possible to create a `init` function and a `fini` function which would be referenced in the dynamic section as **`INIT`** and **`FIN`**. and are placed in the `init` and `fini` sections of the ELF.
The other option, as mentioned, is to reference the lists **`__CTOR_LIST__`** and **`__DTOR_LIST__`** in the **`INIT_ARRAY`** and **`FINI_ARRAY`** entries in the dynamic section and the length of these are indicated by **`INIT_ARRAYSZ`** and **`FINI_ARRAYSZ`**. Each entry is a function pointer that will be called without arguments.
Moreover, it's also possible to have a **`PREINIT_ARRAY`** with **pointers** that will be executed **before** the **`INIT_ARRAY`** pointers.
### Initialization Order
1. The program is loaded into memory, static global variables are initialized in **`.data`** and unitialized ones zeroed in **`.bss`**.
2. All **dependencies** for the program or libraries are **initialized** and the the **dynamic linking** is executed.
3.**`PREINIT_ARRAY`** functions are executed.
4.**`INIT_ARRAY`** functions are executed.
5. If there is a **`INIT`** entry it's called.
6. If a library, dlopen ends here, if a program, it's time to call the **real entry point** (`main` function).
## Thread-Local Storage (TLS)
They are defined using the keyword **`__thread_local`** in C++ or the GNU extension **`__thread`**.
Each thread will maintain a unique location for this variable so only the thread can access its variable.
When this is used the sections **`.tdata`** and **`.tbss`** are used in the ELF. Which are like `.data` (initialized) and `.bss` (not initialized) but for TLS.
Each variable will hace an entry in the TLS header specifying the size and the TLS offset, which is the offset it will use in the thread's local data area.
The `__TLS_MODULE_BASE` is a symbol used to refer to the base address of the thread local storage and points to the area in memory that contains all the thread-local data of a module.
Learn & practice AWS Hacking:<imgsrc="/.gitbook/assets/arte.png"alt=""data-size="line">[**HackTricks Training AWS Red Team Expert (ARTE)**](https://training.hacktricks.xyz/courses/arte)<imgsrc="/.gitbook/assets/arte.png"alt=""data-size="line">\
Learn & practice GCP Hacking: <imgsrc="/.gitbook/assets/grte.png"alt=""data-size="line">[**HackTricks Training GCP Red Team Expert (GRTE)**<imgsrc="/.gitbook/assets/grte.png"alt=""data-size="line">](https://training.hacktricks.xyz/courses/grte)
* Check the [**subscription plans**](https://github.com/sponsors/carlospolop)!
* **Join the** 💬 [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** us on **Twitter** 🐦 [**@hacktricks\_live**](https://twitter.com/hacktricks\_live)**.**
* **Share hacking tricks by submitting PRs to the** [**HackTricks**](https://github.com/carlospolop/hacktricks) and [**HackTricks Cloud**](https://github.com/carlospolop/hacktricks-cloud) github repos.