bevy/crates/bevy_ptr
Adam 9bda913e36
Remove redundent information and optimize dynamic allocations in Table (#12929)
# Objective

- fix #12853
- Make `Table::allocate` faster

## Solution
The PR consists of multiple steps:

1) For the component data: create a new data-structure that's similar to
`BlobVec` but doesn't store `len` & `capacity` inside of it: "BlobArray"
(name suggestions welcome)
2) For the `Tick` data: create a new data-structure that's similar to
`ThinSlicePtr` but supports dynamic reallocation: "ThinArrayPtr" (name
suggestions welcome)
3) Create a new data-structure that's very similar to `Column` that
doesn't store `len` & `capacity` inside of it: "ThinColumn"
4) Adjust the `Table` implementation to use `ThinColumn` instead of
`Column`

The result is that only one set of `len` & `capacity` is stored in
`Table`, in `Table::entities`

### Notes Regarding Performance
Apart from shaving off some excess memory in `Table`, the changes have
also brought noteworthy performance improvements:
The previous implementation relied on `Vec::reserve` &
`BlobVec::reserve`, but that redundantly repeated the same if statement
(`capacity` == `len`) for every column. Now that check can be made once at
the `Table` level because the capacity and length of all the columns are
synchronized, saving N branches per allocation. The result is a respectable
performance improvement on every `Table::reserve` (and subsequently
`Table::allocate`) call.
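
The single capacity check described above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: `ThinArray` and `MiniTable` are hypothetical stand-ins for `ThinArrayPtr`/`ThinColumn` and `Table`, the column types are arbitrary, and zero-sized types are not handled.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, realloc, Layout};
use std::ptr::NonNull;

/// Hypothetical "thin" column: only a pointer, with no `len` or `capacity`.
struct ThinArray<T> {
    ptr: NonNull<T>,
}

impl<T> ThinArray<T> {
    fn empty() -> Self {
        Self { ptr: NonNull::dangling() }
    }

    /// # Safety
    /// `old_cap` must be the currently allocated capacity and `new_cap > old_cap`.
    unsafe fn grow(&mut self, old_cap: usize, new_cap: usize) {
        assert!(std::mem::size_of::<T>() != 0, "this sketch does not handle ZSTs");
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
        let raw = if old_cap == 0 {
            unsafe { alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(old_cap).expect("capacity overflow");
            unsafe { realloc(self.ptr.as_ptr().cast::<u8>(), old_layout, new_layout.size()) }
        };
        self.ptr = NonNull::new(raw.cast::<T>()).unwrap_or_else(|| handle_alloc_error(new_layout));
    }

    /// # Safety
    /// `index` must be below the allocated capacity and the slot must be empty.
    unsafe fn write(&mut self, index: usize, value: T) {
        unsafe { self.ptr.as_ptr().add(index).write(value) }
    }

    /// # Safety
    /// `index` must refer to an initialized slot.
    unsafe fn get(&self, index: usize) -> &T {
        unsafe { &*self.ptr.as_ptr().add(index) }
    }

    /// # Safety
    /// `len` and `cap` must describe this array, and it must not be used afterwards.
    unsafe fn free(&mut self, len: usize, cap: usize) {
        unsafe {
            std::ptr::drop_in_place(std::ptr::slice_from_raw_parts_mut(self.ptr.as_ptr(), len));
            if cap != 0 {
                let layout = Layout::array::<T>(cap).expect("capacity overflow");
                dealloc(self.ptr.as_ptr().cast::<u8>(), layout);
            }
        }
    }
}

/// Hypothetical table: one `len`/`capacity` pair shared by every column.
struct MiniTable {
    len: usize,
    capacity: usize,
    positions: ThinArray<[f32; 2]>,
    healths: ThinArray<u32>,
}

impl MiniTable {
    fn new() -> Self {
        Self { len: 0, capacity: 0, positions: ThinArray::empty(), healths: ThinArray::empty() }
    }

    fn push(&mut self, position: [f32; 2], health: u32) -> usize {
        // One branch for the whole table instead of one per column.
        if self.len == self.capacity {
            let new_cap = (self.capacity * 2).max(4);
            // SAFETY: `self.capacity` is the capacity of both columns, and `new_cap` is larger.
            unsafe {
                self.positions.grow(self.capacity, new_cap);
                self.healths.grow(self.capacity, new_cap);
            }
            self.capacity = new_cap;
        }
        let row = self.len;
        // SAFETY: `row < capacity` after the check above, and the slot is uninitialized.
        unsafe {
            self.positions.write(row, position);
            self.healths.write(row, health);
        }
        self.len += 1;
        row
    }

    fn health(&self, row: usize) -> Option<u32> {
        // SAFETY: rows below `len` are initialized.
        (row < self.len).then(|| unsafe { *self.healths.get(row) })
    }
}

impl Drop for MiniTable {
    fn drop(&mut self) {
        // SAFETY: `len` and `capacity` describe both columns; they are not used again.
        unsafe {
            self.positions.free(self.len, self.capacity);
            self.healths.free(self.len, self.capacity);
        }
    }
}
```

Because the columns cannot check their own bounds, every column method is `unsafe` and the table becomes the only place where `len` and `capacity` are tracked.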

I'm hesitant to give exact numbers because I don't have a lot of
experience in profiling and benchmarking, but these are the results I
got so far:

*`add_remove_big/table` benchmark after the implementation:*


![after_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/b667da29-1212-4020-8bb0-ec0f15bb5f8a)

*`add_remove_big/table` benchmark in main branch (measured in comparison
to the implementation):*


![main_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/41abb92f-3112-4e01-b935-99696eb2fe58)

*`add_remove_very_big/table` benchmark after the implementation:*


![after_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/f268a155-295b-4f55-ab02-f8a9dcc64fc2)

*`add_remove_very_big/table` benchmark in main branch (measured in
comparison to the implementation):*


![main_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/78b4e3a6-b255-47c9-baee-1a24c25b9aea)

cc @james7132 to verify

---

## Changelog

- New data-structure that's similar to `BlobVec` but doesn't store `len`
& `capacity` inside of it: `BlobArray`
- New data-structure that's similar to `ThinSlicePtr` but supports
dynamic allocation: `ThinArrayPtr`
- New data-structure that's very similar to `Column` that doesn't store
`len` & `capacity` inside of it: `ThinColumn`
- Adjust the `Table` implementation to use `ThinColumn` instead of
`Column`
- New benchmark: `add_remove_very_big` to benchmark the performance of
spawning a lot of entities with a lot of components (15) each

## Migration Guide

`Table` now uses `ThinColumn` instead of `Column`. That means that
methods that previously returned `Column` will now return `ThinColumn`
instead.

`ThinColumn` has a much more limited and low-level API, but you can
still achieve the same things in `ThinColumn` as you did in `Column`.
For example, instead of calling `Column::get_added_tick`, you'd call
`ThinColumn::get_added_ticks_slice` and index it to get the specific
added tick.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-09-16 22:52:05 +00:00
src Remove redundent information and optimize dynamic allocations in Table (#12929) 2024-09-16 22:52:05 +00:00
Cargo.toml Generate links to definition in source code pages on docs.rs and dev-docs.bevyengine.org (#12965) 2024-07-29 23:10:16 +00:00
README.md Add README.md to all crates (#13184) 2024-05-02 18:56:00 +00:00

Bevy Pointer


Pointers in computer programming are objects that store a memory address. They're a fundamental building block for constructing more complex data structures.

They're also the definitive source of memory safety bugs: you can dereference an invalid (null) pointer, access a pointer after the underlying memory has been freed, and even ignore type safety and misread or mutate the underlying memory improperly.

Rust is a programming language that heavily relies on its types to enforce correctness, and by proxy, memory safety. As a result, Rust has an entire zoo of types for working with pointers, and a graph of safe and unsafe conversions that make working with them safer.

bevy_ptr is a crate that attempts to bridge the gap between the full-blown unsafety of *mut () and the safe &'a T, allowing users to choose which invariants to uphold for their pointer, with the intent to enable building progressively safer abstractions.

How to Build a Borrow (From Scratch)

Correctly and safely converting a pointer into a valid borrow is at the core of all unsafe code in Rust. Looking at the documentation for [(*const T)::as_ref], a pointer must satisfy all of the following conditions:

  • The pointer must be properly aligned.
  • The pointer cannot be null, even for zero sized types.
  • The pointer must be within bounds of a valid allocated object (on the stack or the heap).
  • The pointer must point to an initialized instance of T.
  • The newly assigned lifetime should be valid for the value that the pointer is targeting.
  • The code must enforce Rust's aliasing rules. Only one mutable borrow or arbitrarily many read-only borrows may exist to a value at any given moment in time, and converting from &T to &mut T is never allowed.

Note that these rules aren't final and are still in flux as the Rust Project hashes out exactly what the pointer aliasing rules are, but the expectation is that the final set of constraints is going to be a superset of this list, not a subset.
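
To make the checklist concrete, here is a minimal sketch of a helper that verifies the two conditions that can be checked at runtime (non-null and aligned) and leaves the rest to the caller. The name `try_borrow` is hypothetical and is not part of bevy_ptr or the standard library.

```rust
use std::mem::align_of;

/// Turns a raw pointer into a shared borrow after checking what can be
/// checked at runtime: that it is non-null and properly aligned.
///
/// # Safety
/// The caller must still guarantee everything that cannot be checked here:
/// the pointer is in bounds of a live allocation, points to an initialized
/// `T`, the chosen lifetime `'a` does not outlive the value, and no mutable
/// borrow of the value exists while the returned borrow is alive.
unsafe fn try_borrow<'a, T>(ptr: *const T) -> Option<&'a T> {
    if ptr.is_null() || (ptr as usize) % align_of::<T>() != 0 {
        return None;
    }
    // SAFETY: null and alignment were checked above; the remaining
    // conditions are the caller's responsibility.
    Some(unsafe { &*ptr })
}
```

Only two of the six conditions are visible in code; the other four live in the safety comment, which is why the function itself has to stay `unsafe`.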

This list is already non-trivial to satisfy in isolation. Thankfully, the Rust core/standard library provides a progressive list of pointer types that help build these safety guarantees...

Standard Pointers

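| Pointer Type | Lifetime'ed | Mutable | Strongly Typed | Aligned | Not Null | Forbids Aliasing | Forbids Arithmetic |
|---|---|---|---|---|---|---|---|
| Box<T> | Owned | Yes | Yes | Yes | Yes | Yes | Yes |
| &'a mut T | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| &'a T | Yes | No | Yes | Yes | Yes | No | Yes |
| &'a UnsafeCell<T> | Yes | Maybe | Yes | Yes | Yes | Yes | Yes |
| NonNull<T> | No | Yes | Yes | No | Yes | No | No |
| *const T | No | No | Yes | No | No | No | No |
| *mut T | No | Yes | Yes | No | No | No | No |
| *const () | No | No | No | No | No | No | No |
| *mut () | No | Yes | No | No | No | No | No |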

&T, &mut T, and Box<T> are by far the most common pointer types that Rust developers will see. They're the only ones in this list that are entirely usable without the use of unsafe.

&UnsafeCell<T> is the first step away from safety. UnsafeCell is the only way to get a mutable borrow from an immutable one, so it's the base primitive for all interior mutability in the language: Cell<T>, RefCell<T>, Mutex<T>, RwLock<T>, etc. are all built on top of UnsafeCell<T>. To safely convert &UnsafeCell<T> into a &T or &mut T, the caller must guarantee that all simultaneous accesses follow Rust's aliasing rules.
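
As an illustration of how interior mutability is layered on UnsafeCell, the following is a minimal sketch of a Cell-like wrapper. `MiniCell` is a hypothetical name; the real `std::cell::Cell` is more general.

```rust
use std::cell::UnsafeCell;

/// Hypothetical, minimal `Cell`-like type built directly on `UnsafeCell`.
struct MiniCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MiniCell<T> {
    fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    /// Copies the value out through a shared borrow.
    fn get(&self) -> T {
        // SAFETY: `MiniCell` is `!Sync` (inherited from `UnsafeCell`), so no
        // other thread can touch the value, and no reference to the inner
        // value is ever handed out, so this read cannot overlap a `&mut T`.
        unsafe { *self.value.get() }
    }

    /// Overwrites the value through a shared borrow.
    fn set(&self, new: T) {
        // SAFETY: same reasoning as `get`; the write is the only access
        // happening at this moment.
        unsafe { *self.value.get() = new };
    }
}
```

The aliasing rules are upheld by construction here: values are only copied in and out, so no borrow of the inner value ever escapes.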

NonNull<T> takes quite a step down from the aforementioned types. In addition to allowing aliasing, it's the first pointer type on this list to drop both lifetimes and the alignment guarantees of borrows. Its only guarantee is that the pointer is not null; whether it points to a valid instance of type T is left to the code that uses it (NonNull::dangling(), for instance, is non-null but points at nothing). If you've ever worked with C++, NonNull<T> is loosely comparable to a C++ reference (T&) in that it can never be null.
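
A short sketch of what NonNull<T> does and does not check, using only standard-library calls. The helper names `wrap`, `increment`, and `dangling_is_non_null` are hypothetical.

```rust
use std::ptr::NonNull;

/// `NonNull::new` performs exactly one check: it returns `None` for null.
fn wrap(ptr: *mut i32) -> Option<NonNull<i32>> {
    NonNull::new(ptr)
}

/// # Safety
/// `ptr` must point to a live, aligned, initialized `i32` that is not
/// borrowed elsewhere. `NonNull` itself guarantees none of this.
unsafe fn increment(ptr: NonNull<i32>) {
    unsafe { *ptr.as_ptr() += 1 };
}

/// A dangling `NonNull` is still non-null, even though it must never be
/// dereferenced.
fn dangling_is_non_null() -> bool {
    !NonNull::<i32>::dangling().as_ptr().is_null()
}
```

Dereferencing still goes through `as_ptr` (or `as_ref`/`as_mut`) inside `unsafe`, since everything beyond the null check is the caller's responsibility.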

*const T and *mut T are what most developers with a background in C or C++ would consider pointers.

*const () and *mut () sit at the bottom of this list. They're the Rust equivalent to C's void*. Note that Rust doesn't formally have a concept of a type that holds an arbitrary untyped memory address; pointing at the unit type (or some other zero-sized type) just happens to be the convention. The only reasonable way to use them is to cast back to a typed pointer. They show up occasionally when dealing with FFI and on the rare occasion where dynamic dispatch is required but a trait is too constraining an interface to work with. A great example of this is the RawWaker API, where a singular trait (or set of traits) may be insufficient to capture all usage patterns. *mut () should only be used to carry the mutability of the target, as there is no way to mutate a value of unknown type.
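
The "erase, then cast back" pattern can be sketched as a data pointer paired with a function that knows the real type, loosely in the spirit of RawWaker's data-plus-vtable layout. `ErasedCall`, `describe_u32`, and `describe_via_erased` are hypothetical names used only for this illustration.

```rust
/// Hypothetical type-erased call: an untyped data pointer plus a function
/// pointer that knows which type the data really has.
struct ErasedCall {
    data: *const (),
    call: unsafe fn(*const ()) -> String,
}

/// # Safety
/// `data` must have been created from a `&u32` that is still alive.
unsafe fn describe_u32(data: *const ()) -> String {
    // The only useful thing to do with `*const ()` is to cast it back.
    let value = unsafe { &*data.cast::<u32>() };
    format!("u32: {value}")
}

fn describe_via_erased(value: &u32) -> String {
    let erased = ErasedCall {
        data: (value as *const u32).cast::<()>(),
        call: describe_u32,
    };
    // SAFETY: `erased.data` was created from `value` just above, and
    // `describe_u32` casts it back to the same type.
    unsafe { (erased.call)(erased.data) }
}
```

Nothing in the type system ties `data` to `call`; keeping the pair consistent is entirely up to the code that constructs it.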

Available in Nightly

| Pointer Type | Lifetime'ed | Mutable | Strongly Typed | Aligned | Not Null | Forbids Aliasing | Forbids Arithmetic |
|---|---|---|---|---|---|---|---|
| Unique<T> | Owned | Yes | Yes | Yes | Yes | Yes | Yes |
| Shared<T> | Owned* | Yes | Yes | Yes | Yes | No | Yes |
Unique<T> is currently available in core::ptr on nightly Rust builds. It's a pointer type that acts like it owns the value it points to. It can be thought of as a Box<T> that does not allocate on initialization or deallocate when it's dropped, and is in fact used to implement common types like Box<T>, Vec<T>, etc.

Shared<T> is currently available in core::ptr on nightly Rust builds. It's the pointer that backs both Rc<T> and Arc<T>. Its semantics allow multiple instances to collectively own the data it points to and, as a result, forbid getting a mutable borrow.

bevy_ptr does not support these types right now, but may support polyfills for these pointer types if the need arises.

Available in bevy_ptr

| Pointer Type | Lifetime'ed | Mutable | Strongly Typed | Aligned | Not Null | Forbids Aliasing | Forbids Arithmetic |
|---|---|---|---|---|---|---|---|
| ConstNonNull<T> | No | No | Yes | No | Yes | No | Yes |
| ThinSlicePtr<'a, T> | Yes | No | Yes | Yes | Yes | Yes | Yes |
| OwningPtr<'a> | Yes | Yes | No | Maybe | Yes | Yes | No |
| Ptr<'a> | Yes | No | No | Maybe | Yes | No | No |
| PtrMut<'a> | Yes | Yes | No | Maybe | Yes | Yes | No |

ConstNonNull<T> is like NonNull<T> but disallows safe conversions into types that allow mutable access to the value it points to. It's the *const T to NonNull<T>'s *mut T.

ThinSlicePtr<'a, T> is a &'a [T] without the slice length. This makes it smaller on the stack, but it also makes local bounds checking impossible, so accessing elements in the slice is unsafe. In debug builds, the length is included and will be checked.
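
The idea can be sketched with a simplified stand-in. `ThinSlice` below is a hypothetical type written for illustration and is not the crate's actual definition of ThinSlicePtr; it only assumes the behavior described above (no length in release builds, a checked length in debug builds).

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical stand-in for a length-less slice reference.
struct ThinSlice<'a, T> {
    ptr: NonNull<T>,
    // The length only exists in debug builds, where it is used for checks.
    #[cfg(debug_assertions)]
    len: usize,
    _marker: PhantomData<&'a [T]>,
}

impl<'a, T> From<&'a [T]> for ThinSlice<'a, T> {
    fn from(slice: &'a [T]) -> Self {
        Self {
            // A slice reference is never null, even when the slice is empty.
            ptr: NonNull::from(slice).cast::<T>(),
            #[cfg(debug_assertions)]
            len: slice.len(),
            _marker: PhantomData,
        }
    }
}

impl<'a, T> ThinSlice<'a, T> {
    /// # Safety
    /// `index` must be within the bounds of the slice this was created from.
    unsafe fn get(&self, index: usize) -> &'a T {
        #[cfg(debug_assertions)]
        debug_assert!(index < self.len, "index out of bounds");
        // SAFETY: the caller guarantees `index` is in bounds, and the
        // lifetime `'a` keeps the original slice borrowed.
        unsafe { &*self.ptr.as_ptr().add(index) }
    }
}
```

In a release build this struct is a single pointer wide, while `&[T]` is a pointer plus a length; that saving is the whole trade against the `unsafe` accessor.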

OwningPtr<'a>, Ptr<'a>, and PtrMut<'a> act like NonNull<()>, but attempt to restore many of the safety guarantees of Unique<T>, &T, and &mut T. They allow working with heterogeneous type-erased storage (e.g. ECS tables, typemaps) without the overhead of dynamic dispatch, in a manner that progressively translates back to safe borrows. These types also support optional alignment requirements at the type level, and will verify them on dereference in debug builds.
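
A rough sketch of the concept behind a lifetime-carrying, type-erased shared pointer follows. `ErasedRef` is a hypothetical type and does not reproduce the real Ptr<'a> API; it only illustrates erasing the type while keeping the lifetime, and checking alignment in debug builds when converting back.

```rust
use std::marker::PhantomData;
use std::mem::align_of;
use std::ptr::NonNull;

/// Hypothetical type-erased shared borrow: the type is forgotten, but the
/// lifetime `'a` still ties it to the value it was created from.
#[derive(Clone, Copy)]
struct ErasedRef<'a> {
    ptr: NonNull<u8>,
    _marker: PhantomData<&'a u8>,
}

impl<'a> ErasedRef<'a> {
    /// Erasing the type is safe: nothing can go wrong until it is restored.
    fn new<T>(value: &'a T) -> Self {
        Self { ptr: NonNull::from(value).cast::<u8>(), _marker: PhantomData }
    }

    /// # Safety
    /// `T` must be the type this `ErasedRef` was created from.
    unsafe fn deref<T>(self) -> &'a T {
        let typed = self.ptr.as_ptr().cast::<T>();
        // Debug-only sanity check; it catches some, but not all, wrong `T`s.
        debug_assert!((typed as usize) % align_of::<T>() == 0, "misaligned for T");
        // SAFETY: the caller guarantees `T` is the erased type; the value is
        // still borrowed for `'a`, so it is live and not mutably aliased.
        unsafe { &*typed }
    }
}
```

Values of different types can then sit side by side in one collection, for example `[ErasedRef::new(&a), ErasedRef::new(&b)]`, with the borrow checker still preventing the originals from being moved or mutated while the erased borrows are alive.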