Add heapblock and dlmalloc for managing memory

heapblock is a simple `sbrk` style implementation, also useful as an
"endless" decompression buffer. dlmalloc is used on top as a malloc
implementation.

This also changes how the Python side manages its heap. We still use a
python-side malloc implementation (since this is faster), and we put the
Python heap at the m1n1 heap + 128MB, without allocating it.
Hopefully this should never step on anything m1n1 needs, and avoids
having to manage freeing across Python script calls.
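
As a rough illustration of the layout described above, the following Python sketch computes where the host-side heap would sit relative to the m1n1 heap. The function names and the example base address are made up for illustration; only the two 128MB figures come from this commit.

```python
MIB = 1024 * 1024

# Gap left untouched for m1n1's own heap (figure taken from the commit message).
M1N1_HEAP_RESERVE = 128 * MIB
# Size of the heap managed from the Python side (figure taken from the proxyutils change).
PYTHON_HEAP_SIZE = 128 * MIB


def python_heap_range(m1n1_heap_base):
    """Return (start, end) of the Python-side heap for a given m1n1 heap base."""
    start = m1n1_heap_base + M1N1_HEAP_RESERVE
    return start, start + PYTHON_HEAP_SIZE


def overlaps_python_heap(m1n1_heap_base, m1n1_heap_used):
    """True if m1n1 has grown its heap far enough to reach the Python-side heap."""
    start, _ = python_heap_range(m1n1_heap_base)
    return m1n1_heap_base + m1n1_heap_used > start


# Hypothetical heap base, chosen only to make the arithmetic easy to follow.
base = 0x810000000
start, end = python_heap_range(base)
print("Python heap: 0x%x..0x%x" % (start, end))
print("clash at 64 MiB used:", overlaps_python_heap(base, 64 * MIB))
print("clash at 129 MiB used:", overlaps_python_heap(base, 129 * MIB))
```

Nothing on the target side enforces this split; the sketch only makes the assumption explicit that m1n1 stays below 128MB of heap use.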

Signed-off-by: Hector Martin <marcan@marcan.st>
Hector Martin 2021-01-29 15:19:34 +09:00
parent f3d0a58f42
commit 986c6730e9
16 changed files with 6595 additions and 23 deletions


@ -0,0 +1,121 @@
Creative Commons Legal Code
CC0 1.0 Universal
CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
HEREUNDER.
Statement of Purpose
The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator
and subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").
Certain owners wish to permanently relinquish those rights to a Work for
the purpose of contributing to a commons of creative, cultural and
scientific works ("Commons") that the public can reliably and without fear
of later claims of infringement build upon, modify, incorporate in other
works, reuse and redistribute as freely as possible in any form whatsoever
and for any purposes, including without limitation commercial purposes.
These owners may contribute to the Commons to promote the ideal of a free
culture and the further production of creative, cultural and scientific
works, or to gain reputation or greater distribution for their Work in
part through the use and efforts of others.
For these and/or other purposes and motivations, and without any
expectation of additional consideration or compensation, the person
associating CC0 with a Work (the "Affirmer"), to the extent that he or she
is an owner of Copyright and Related Rights in the Work, voluntarily
elects to apply CC0 to the Work and publicly distribute the Work under its
terms, with knowledge of his or her Copyright and Related Rights in the
Work and the meaning and intended legal effect of CC0 on those rights.
1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not
limited to, the following:
i. the right to reproduce, adapt, distribute, perform, display,
communicate, and translate a Work;
ii. moral rights retained by the original author(s) and/or performer(s);
iii. publicity and privacy rights pertaining to a person's image or
likeness depicted in a Work;
iv. rights protecting against unfair competition in regards to a Work,
subject to the limitations in paragraph 4(a), below;
v. rights protecting the extraction, dissemination, use and reuse of data
in a Work;
vi. database rights (such as those arising under Directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, and under any national implementation
thereof, including any amended or successor version of such
directive); and
vii. other similar, equivalent or corresponding rights throughout the
world based on applicable law or treaty, and any national
implementations thereof.
2. Waiver. To the greatest extent permitted by, but not in contravention
of, applicable law, Affirmer hereby overtly, fully, permanently,
irrevocably and unconditionally waives, abandons, and surrenders all of
Affirmer's Copyright and Related Rights and associated claims and causes
of action, whether now known or unknown (including existing as well as
future claims and causes of action), in the Work (i) in all territories
worldwide, (ii) for the maximum duration provided by applicable law or
treaty (including future time extensions), (iii) in any current or future
medium and for any number of copies, and (iv) for any purpose whatsoever,
including without limitation commercial, advertising or promotional
purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
member of the public at large and to the detriment of Affirmer's heirs and
successors, fully intending that such Waiver shall not be subject to
revocation, rescission, cancellation, termination, or any other legal or
equitable action to disrupt the quiet enjoyment of the Work by the public
as contemplated by Affirmer's express Statement of Purpose.
3. Public License Fallback. Should any part of the Waiver for any reason
be judged legally invalid or ineffective under applicable law, then the
Waiver shall be preserved to the maximum extent permitted taking into
account Affirmer's express Statement of Purpose. In addition, to the
extent the Waiver is so judged Affirmer hereby grants to each affected
person a royalty-free, non transferable, non sublicensable, non exclusive,
irrevocable and unconditional license to exercise Affirmer's Copyright and
Related Rights in the Work (i) in all territories worldwide, (ii) for the
maximum duration provided by applicable law or treaty (including future
time extensions), (iii) in any current or future medium and for any number
of copies, and (iv) for any purpose whatsoever, including without
limitation commercial, advertising or promotional purposes (the
"License"). The License shall be deemed effective as of the date CC0 was
applied by Affirmer to the Work. Should any part of the License for any
reason be judged legally invalid or ineffective under applicable law, such
partial invalidity or ineffectiveness shall not invalidate the remainder
of the License, and in such case Affirmer hereby affirms that he or she
will not (i) exercise any of his or her remaining Copyright and Related
Rights in the Work or (ii) assert any associated claims and causes of
action with respect to the Work, in either case contrary to Affirmer's
express Statement of Purpose.
4. Limitations and Disclaimers.
a. No trademark or patent rights held by Affirmer are waived, abandoned,
surrendered, licensed or otherwise affected by this document.
b. Affirmer offers the Work as-is and makes no representations or
warranties of any kind concerning the Work, express, implied,
statutory or otherwise, including without limitation warranties of
title, merchantability, fitness for a particular purpose, non
infringement, or the absence of latent or other defects, accuracy, or
the present or absence of errors, whether or not discoverable, all to
the greatest extent permissible under applicable law.
c. Affirmer disclaims responsibility for clearing rights of other persons
that may apply to the Work or any use thereof, including without
limitation any person's Copyright and Related Rights in the Work.
Further, Affirmer disclaims responsibility for obtaining any necessary
consents, permissions or other rights required for any use of the
Work.
d. Affirmer understands and acknowledges that Creative Commons is not a
party to this document and has no duty or obligation with respect to
this CC0 or use of the Work.


@ -16,9 +16,12 @@ MINILZLIB_OBJECTS := $(patsubst %,minilzlib/%, \
 TINF_OBJECTS := $(patsubst %,tinf/%, \
 	adler32.o crc32.o tinfgzip.o tinflate.o tinfzlib.o)
+DLMALLOC_OBJECTS := dlmalloc/malloc.o
 OBJECTS := adt.o bootlogo_128.o bootlogo_256.o chickens.o exception.o exception_asm.o fb.o \
-	main.o memory.o memory_asm.o proxy.o smp.o start.o startup.o string.o uart.o uartproxy.o \
-	utils.o utils_asm.o vsprintf.o wdt.o $(MINILZLIB_OBJECTS) $(TINF_OBJECTS)
+	heapblock.o main.o memory.o memory_asm.o proxy.o smp.o start.o startup.o string.o uart.o \
+	uartproxy.o utils.o utils_asm.o vsprintf.o wdt.o $(MINILZLIB_OBJECTS) $(TINF_OBJECTS) \
+	$(DLMALLOC_OBJECTS)
 DTS := apple-j274.dts


@ -38,3 +38,10 @@ m1n1 embeds portions taken from
 [BSD](3rdparty_licenses/LICENSE.BSD-3.arm) licensed and copyright:
 * Copyright (c) 2013-2020, ARM Limited and Contributors. All rights reserved.
+m1n1 embeds [Doug Lea's malloc](ftp://gee.cs.oswego.edu/pub/misc/malloc.c) (dlmalloc), which is in
+the public domain ([CC0](3rdparty_licenses/LICENSE.CC0)).
+m1n1 embeds portions of [PDCLib](https://github.com/DevSolar/pdclib), which is in the public
+domain ([CC0](3rdparty_licenses/LICENSE.CC0)).


@ -5,6 +5,8 @@ _va_base = 0xFFFFFE0007004000;
 _stack_size = 0x20000;
+_max_payload_size = 64*1024*1024;
 /* We are actually relocatable */
 . = 0;
@ -106,9 +108,9 @@ SECTIONS {
 LONG(0x444C5950); /* segmname = "PYLD" */
 . += 12;
 QUAD(_end + _va_off); /* vmaddr */
-QUAD(64*1024*1024); /* vmsize */
+QUAD(_max_payload_size); /* vmsize */
 QUAD(_file_end - _base); /* fileoff */
-QUAD(64*1024*1024); /* filesize */
+QUAD(_max_payload_size); /* filesize */
 LONG(PROT_READ | PROT_WRITE); /* maxprot */
 LONG(PROT_READ | PROT_WRITE); /* initprot */
 LONG(0); /* nsects */
@ -169,6 +171,8 @@ SECTIONS {
 } :data
 _data_size = . - _data_start;
 _end = .;
+_payload_start = .;
+_payload_end = . + _max_payload_size;
 .symtab 0 : { *(.symtab) }
 .strtab 0 : { *(.strtab) }


@ -1,25 +1,19 @@
 #!/usr/bin/python
-import serial, os, struct, sys, time
-from proxy import *
-from tgtypes import *
-uartdev = os.environ.get("M1N1DEVICE", "/dev/ttyUSB0")
-usbuart = serial.Serial(uartdev, 115200)
-iface = UartInterface(usbuart, debug=False)
-proxy = M1N1Proxy(iface, debug=False)
-proxy.set_baud(1500000)
+from setup import *
 payload = open(sys.argv[1], "rb").read()
-base = proxy.get_base()
-ba_addr = proxy.get_bootargs()
-ba = iface.readstruct(ba_addr, BootArgs)
-new_base = base + ((ba.top_of_kernel_data + 0xffff) & ~0xffff) - ba.phys_base
+try:
+    # Try to use the m1n1 heap to avoid wasting 128MB RAM on every load
+    new_base = p.memalign(0x10000, len(payload))
+except:
+    # Fall back to proxy heap, which will be at the right place in old versions
+    new_base = u.memalign(0x10000, len(payload))
+# FIXME: this will currently still waste the whole m1n1 size including payload area (64+MB) on each
+# chainload. The best way to fix this is to support in-place chainloading, which has other
+# advantages.
 print("Loading %d bytes to 0x%x" % (len(payload), new_base))
@ -29,7 +23,7 @@ entry = new_base + 0x4800
 print("Jumping to 0x%x" % entry)
-proxy.reboot(entry, ba_addr)
+p.reboot(entry, u.ba_addr)
 iface.nop()
 print("Proxy is alive again")
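
The try/fallback allocation in the chainload script can be exercised without hardware using stand-in objects. The sketch below is only a model of that pattern: `RemoteError`, `BumpHeap`, `OldTarget` and `pick_load_address` are hypothetical names, and the addresses are made up. It is not the m1n1 proxy API.

```python
class RemoteError(Exception):
    """Hypothetical stand-in for the error raised when the target rejects a command."""


class BumpHeap:
    """Tiny aligned bump allocator used for both stand-in heaps below."""

    def __init__(self, base):
        self.next_free = base

    def memalign(self, align, size):
        # Round the next free address up to the requested power-of-two alignment.
        block = (self.next_free + align - 1) & ~(align - 1)
        self.next_free = block + size
        return block


class OldTarget:
    """Stand-in for a target build that predates the memalign proxy op."""

    def memalign(self, align, size):
        raise RemoteError("unknown command")


def pick_load_address(target, host_heap, size, align=0x10000):
    """Prefer the target-side heap; fall back to the host-managed heap."""
    try:
        return "target", target.memalign(align, size)
    except RemoteError:
        return "host", host_heap.memalign(align, size)


new_target = BumpHeap(0x810001234)   # made-up target heap position
host_heap = BumpHeap(0x818000000)    # made-up host-managed heap position

print(pick_load_address(new_target, host_heap, 0x2000))
print(pick_load_address(OldTarget(), host_heap, 0x2000))
```

The sketch catches one specific exception type so that unrelated failures still surface; the script itself uses a bare `except`.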


@ -308,6 +308,11 @@ class M1N1Proxy:
     P_SMP_CALL = 0x501
     P_SMP_CALL_SYNC = 0x502
+    P_HEAPBLOCK_ALLOC = 0x600
+    P_MALLOC = 0x601
+    P_MEMALIGN = 0x602
+    P_FREE = 0x603
     def __init__(self, iface, debug=False):
         self.debug = debug
         self.iface = iface
@ -517,6 +522,15 @@ class M1N1Proxy:
             raise ValueError("Too many arguments")
         return self.request(self.P_SMP_CALL_SYNC, cpu, addr, *args)
+    def heapblock_alloc(self, size):
+        return self.request(self.P_HEAPBLOCK_ALLOC, size)
+    def malloc(self, size):
+        return self.request(self.P_MALLOC, size)
+    def memalign(self, align, size):
+        return self.request(self.P_MEMALIGN, align, size)
+    def free(self, ptr):
+        self.request(self.P_FREE, ptr)
 if __name__ == "__main__":
     import serial
     uartdev = os.environ.get("M1N1DEVICE", "/dev/ttyUSB0")


@ -40,9 +40,22 @@ class ProxyUtils(object):
         self.ba_addr = p.get_bootargs()
         self.ba = self.iface.readstruct(self.ba_addr, BootArgs)
-        self._scratch = self.base + ((self.ba.top_of_kernel_data + 0xffff) & ~0xffff) - self.ba.phys_base
-        self.heap = malloc.Heap(self._scratch, self._scratch + 0x1000000)
+        # We allocate a 128MB heap, 128MB after the m1n1 heap, without telling it about it.
+        # This frees us from having to coordinate memory management or free stuff after a Python
+        # script runs, at the expense that if m1n1 ever uses more than 128MB of heap it will
+        # clash with Python (m1n1 will normally not use *any* heap when running proxy ops though,
+        # except when running very high-level operations like booting a kernel, so this should be
+        # OK).
+        self.heap_size = 128 * 1024 * 1024
+        try:
+            self.heap_base = p.heapblock_alloc(0)
+        except ProxyRemoteError:
+            # Compat with versions that don't have heapblock yet
+            self.heap_base = (self.base + ((self.ba.top_of_kernel_data + 0xffff) & ~0xffff) -
+                              self.ba.phys_base)
+        self.heap_base += 128 * 1024 * 1024 # We leave 128MB for m1n1 heap
+        self.heap = malloc.Heap(self.heap_base, self.heap_base + self.heap_size)
         self.malloc = self.heap.malloc
         self.memalign = self.heap.memalign

6282
src/dlmalloc/malloc.c Normal file

File diff suppressed because it is too large


@ -0,0 +1,33 @@
#include "../heapblock.h"
#include "../string.h"
#include "../utils.h"
#define HAVE_MORECORE 1
#define HAVE_MMAP 0
#define MORECORE sbrk
// This is optimal; dlmalloc copes with other users of sbrk/MORECORE gracefully, and heapblock
// guarantees contiguous returns if called consecutively.
#define MORECORE_CONTIGUOUS 1
#define MALLOC_ALIGNMENT 16
#define ABORT panic("dlmalloc: internal error\n")
#define NO_MALLINFO 1
#define NO_MALLOC_STATS 1
#define malloc_getpagesize 16384
#define LACKS_UNISTD_H 1
#define LACKS_FCNTL_H 1
#define LACKS_SYS_PARAM_H 1
#define LACKS_SYS_MMAN_H 1
#define LACKS_STRINGS_H 1
#define LACKS_STRING_H 1
#define LACKS_STDLIB_H 1
#define LACKS_SCHED_H 1
#define LACKS_TIME_H 1
#define MALLOC_FAILURE_ACTION panic("dlmalloc: out of memory\n");
static void *sbrk(intptr_t inc)
{
    if (inc < 0)
        return (void *)-1;
    return heapblock_alloc(inc);
}

49
src/heapblock.c Normal file

@ -0,0 +1,49 @@
/* SPDX-License-Identifier: MIT */
#include <assert.h>
#include "heapblock.h"
#include "types.h"
#include "utils.h"
#include "xnuboot.h"
/*
* This is a non-freeing allocator, used as a backend for malloc and for uncompressing data.
*
* Allocating 0 bytes is allowed, and guarantees "infinite" (until the end of RAM) space is
* available at the returned pointer as long as no other malloc/heapblock calls occur, which is
* useful as a buffer for unknown-length uncompressed data. A subsequent call with a size will then
* actually reserve the block.
*/
static void *heap_base;
void heapblock_init(void)
{
    void *top_of_kernel_data = (void *)cur_boot_args.top_of_kernel_data;
    void *payload_end = _payload_end;
    if (payload_end > top_of_kernel_data)
        heap_base = payload_end; // Chainloaded, we are last in RAM
    else
        heap_base = top_of_kernel_data; // Loaded by iBoot, there is data after us in RAM
    heapblock_alloc(0); // align base
    printf("Heap base: %p\n", heap_base);
}

void *heapblock_alloc(size_t size)
{
    return heapblock_alloc_aligned(size, 64);
}

void *heapblock_alloc_aligned(size_t size, size_t align)
{
    assert((align & (align - 1)) == 0);
    uintptr_t block = (((uintptr_t)heap_base) + align - 1) & ~(align - 1);
    heap_base = (void *)(block + size);
    return (void *)block;
}
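
The allocate-zero-then-reserve pattern described in the comment at the top of this file can be modelled in a few lines of Python. This is a sketch of the arithmetic only, using plain integers in place of pointers; the `HeapBlock` class name and the starting address are made up for illustration.

```python
class HeapBlock:
    """Integer model of a non-freeing, aligned bump allocator."""

    def __init__(self, base):
        self.base = base
        self.alloc(0)  # align the base once, mirroring heapblock_init()

    def alloc(self, size, align=64):
        # Same power-of-two check as the assert in heapblock_alloc_aligned().
        assert align & (align - 1) == 0, "alignment must be a power of two"
        block = (self.base + align - 1) & ~(align - 1)
        self.base = block + size
        return block


heap = HeapBlock(0x1001)        # deliberately unaligned, hypothetical start

# Step 1: ask for 0 bytes to get a pointer to open-ended scratch space.
scratch = heap.alloc(0)

# Step 2: pretend a decompressor wrote 0x123 bytes there, then reserve them.
reserved = heap.alloc(0x123)

# The reservation lands on the scratch address because nothing was allocated in between.
print(hex(scratch), hex(reserved), reserved == scratch)

# Later allocations start after the reserved block, rounded up to the alignment.
print(hex(heap.alloc(16)))
```

The second step only returns the same address if no other allocation happens between the two calls, which is the condition the comment in the C file spells out.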

13
src/heapblock.h Normal file

@ -0,0 +1,13 @@
/* SPDX-License-Identifier: MIT */
#ifndef HEAPBLOCK_H
#define HEAPBLOCK_H
#include "types.h"
void heapblock_init(void);
void *heapblock_alloc(size_t size);
void *heapblock_alloc_aligned(size_t size, size_t align);
#endif


@ -4,6 +4,7 @@
 #include "adt.h"
 #include "fb.h"
+#include "heapblock.h"
 #include "memory.h"
 #include "smp.h"
 #include "string.h"
@ -45,7 +46,9 @@ void m1n1_main(void)
     printf("Licensed under the MIT license\n\n");
     printf("Running in EL%d\n\n", mrs(CurrentEL) >> 2);
     mmu_init();
+    heapblock_init();
 #ifdef SHOW_LOGO
     fb_init();

14
src/malloc.h Normal file

@ -0,0 +1,14 @@
/* SPDX-License-Identifier: MIT */
#ifndef MALLOC_H
#define MALLOC_H
void *malloc(size_t);
void free(void *);
void *calloc(size_t, size_t);
void *realloc(void *, size_t);
void *realloc_in_place(void *, size_t);
void *memalign(size_t, size_t);
int posix_memalign(void **, size_t, size_t);
#endif


@ -1,4 +1,6 @@
 #include "proxy.h"
+#include "heapblock.h"
+#include "malloc.h"
 #include "memory.h"
 #include "minilzlib/minlzma.h"
 #include "smp.h"
@ -210,6 +212,19 @@ int proxy_process(ProxyRequest *request, ProxyReply *reply)
             reply->retval = smp_wait(request->args[0]);
             break;
+        case P_HEAPBLOCK_ALLOC:
+            reply->retval = (u64)heapblock_alloc(request->args[0]);
+            break;
+        case P_MALLOC:
+            reply->retval = (u64)malloc(request->args[0]);
+            break;
+        case P_MEMALIGN:
+            reply->retval = (u64)memalign(request->args[0], request->args[1]);
+            break;
+        case P_FREE:
+            free((void *)request->args[0]);
+            break;
         default:
             reply->status = S_BADCMD;
             break;


@ -62,6 +62,11 @@ typedef enum {
     P_SMP_CALL,
     P_SMP_CALL_SYNC,
+    P_HEAPBLOCK_ALLOC = 0x600, // Heap and memory management ops
+    P_MALLOC,
+    P_MEMALIGN,
+    P_FREE,
 } ProxyOp;
 #define S_OK 0


@ -231,6 +231,8 @@ static inline u8 mask8(u64 addr, u8 clear, u8 set)
 #define dc_civac(p) cacheop("dc civac", p)
 extern char _base[0];
+extern char _payload_start[];
+extern char _payload_end[];
 /*
  * These functions are guaranteed to copy by reading from src and writing to dst