bevy/crates/bevy_ecs/src/storage/blob_vec.rs

548 lines
21 KiB
Rust
Raw Normal View History

Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
use std::{
alloc::{handle_alloc_error, Layout},
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
cell::UnsafeCell,
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
num::NonZeroUsize,
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
ptr::NonNull,
};
use bevy_ptr::{OwningPtr, Ptr, PtrMut};
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// A flat, type-erased data storage type
///
/// Used to densely store homogeneous ECS data.
pub(super) struct BlobVec {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
item_layout: Layout,
capacity: usize,
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// Number of elements, not bytes
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
len: usize,
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// the `data` ptr's layout is always `array_layout(item_layout, capacity)`
data: NonNull<u8>,
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// the `swap_scratch` ptr's layout is always `item_layout`
swap_scratch: NonNull<u8>,
// None if the underlying type doesn't need to be dropped
drop: Option<unsafe fn(OwningPtr<'_>)>,
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
}
// We want to ignore the `drop` field in our `Debug` impl
impl std::fmt::Debug for BlobVec {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("BlobVec")
.field("item_layout", &self.item_layout)
.field("capacity", &self.capacity)
.field("len", &self.len)
.field("data", &self.data)
.field("swap_scratch", &self.swap_scratch)
.finish()
}
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
impl BlobVec {
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// # Safety
///
/// `drop` should be safe to call with an [`OwningPtr`] pointing to any item that's been pushed into this [`BlobVec`].
///
/// If `drop` is `None`, the items will be leaked. This should generally be set as None based on [`needs_drop`].
///
/// [`needs_drop`]: core::mem::needs_drop
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn new(
item_layout: Layout,
drop: Option<unsafe fn(OwningPtr<'_>)>,
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
capacity: usize,
) -> BlobVec {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
if item_layout.size() == 0 {
BlobVec {
swap_scratch: NonNull::dangling(),
data: NonNull::dangling(),
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
capacity: usize::MAX,
len: 0,
item_layout,
drop,
}
} else {
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
let swap_scratch = NonNull::new(std::alloc::alloc(item_layout))
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
.unwrap_or_else(|| std::alloc::handle_alloc_error(item_layout));
let mut blob_vec = BlobVec {
swap_scratch,
data: NonNull::dangling(),
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
capacity: 0,
len: 0,
item_layout,
drop,
};
blob_vec.reserve_exact(capacity);
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
blob_vec
}
}
#[inline]
pub fn len(&self) -> usize {
self.len
}
#[inline]
pub fn is_empty(&self) -> bool {
self.len == 0
}
#[inline]
pub fn capacity(&self) -> usize {
self.capacity
}
untyped APIs for components and resources (#4447) # Objective Even if bevy itself does not provide any builtin scripting or modding APIs, it should have the foundations for building them yourself. For that it should be enough to have APIs that are not tied to the actual rust types with generics, but rather accept `ComponentId`s and `bevy_ptr` ptrs. ## Solution Add the following APIs to bevy ```rust fn EntityRef::get_by_id(ComponentId) -> Option<Ptr<'w>>; fn EntityMut::get_by_id(ComponentId) -> Option<Ptr<'_>>; fn EntityMut::get_mut_by_id(ComponentId) -> Option<MutUntyped<'_>>; fn World::get_resource_by_id(ComponentId) -> Option<Ptr<'_>>; fn World::get_resource_mut_by_id(ComponentId) -> Option<MutUntyped<'_>>; // Safety: `value` must point to a valid value of the component unsafe fn World::insert_resource_by_id(ComponentId, value: OwningPtr); fn ComponentDescriptor::new_with_layout(..) -> Self; fn World::init_component_with_descriptor(ComponentDescriptor) -> ComponentId; ``` ~~This PR would definitely benefit from #3001 (lifetime'd pointers) to make sure that the lifetimes of the pointers are valid and the my-move pointer in `insert_resource_by_id` could be an `OwningPtr`, but that can be adapter later if/when #3001 is merged.~~ ### Not in this PR - inserting components on entities (this is very tied to types with bundles and the `BundleInserter`) - an untyped version of a query (needs good API design, has a large implementation complexity, can be done in a third-party crate) Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-05-30 15:32:47 +00:00
#[inline]
pub fn layout(&self) -> Layout {
self.item_layout
}
pub fn reserve_exact(&mut self, additional: usize) {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
let available_space = self.capacity - self.len;
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
if available_space < additional && self.item_layout.size() > 0 {
// SAFETY: `available_space < additional`, so `additional - available_space > 0`
let increment = unsafe { NonZeroUsize::new_unchecked(additional - available_space) };
// SAFETY: not called for ZSTs
unsafe { self.grow_exact(increment) };
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
}
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: must not be called for a ZST item layout
#[warn(unsafe_op_in_unsafe_fn)] // to allow unsafe blocks in unsafe fn
unsafe fn grow_exact(&mut self, increment: NonZeroUsize) {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
debug_assert!(self.item_layout.size() != 0);
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
let new_capacity = self.capacity + increment.get();
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
let new_layout =
array_layout(&self.item_layout, new_capacity).expect("array layout should be valid");
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
let new_data = if self.capacity == 0 {
// SAFETY:
// - layout has non-zero size as per safety requirement
unsafe { std::alloc::alloc(new_layout) }
} else {
// SAFETY:
// - ptr was be allocated via this allocator
// - the layout of the ptr was `array_layout(self.item_layout, self.capacity)`
// - `item_layout.size() > 0` and `new_capacity > 0`, so the layout size is non-zero
// - "new_size, when rounded up to the nearest multiple of layout.align(), must not overflow (i.e., the rounded value must be less than usize::MAX)",
// since the item size is always a multiple of its align, the rounding cannot happen
// here and the overflow is handled in `array_layout`
unsafe {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
std::alloc::realloc(
self.get_ptr_mut().as_ptr(),
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
array_layout(&self.item_layout, self.capacity)
.expect("array layout should be valid"),
new_layout.size(),
)
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
}
};
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
self.data = NonNull::new(new_data).unwrap_or_else(|| handle_alloc_error(new_layout));
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
self.capacity = new_capacity;
}
/// # Safety
/// - index must be in bounds
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// - the memory in the [`BlobVec`] starting at index `index`, of a size matching this [`BlobVec`]'s
/// `item_layout`, must have been previously allocated.
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
#[inline]
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn initialize_unchecked(&mut self, index: usize, value: OwningPtr<'_>) {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
debug_assert!(index < self.len());
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
let ptr = self.get_unchecked_mut(index);
std::ptr::copy_nonoverlapping::<u8>(value.as_ptr(), ptr.as_ptr(), self.item_layout.size());
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
/// - index must be in-bounds
/// - the memory in the [`BlobVec`] starting at index `index`, of a size matching this
/// [`BlobVec`]'s `item_layout`, must have been previously initialized with an item matching
/// this [`BlobVec`]'s `item_layout`
/// - the memory at `*value` must also be previously initialized with an item matching this
/// [`BlobVec`]'s `item_layout`
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn replace_unchecked(&mut self, index: usize, value: OwningPtr<'_>) {
debug_assert!(index < self.len());
// If `drop` panics, then when the collection is dropped during stack unwinding, the
// collection's `Drop` impl will call `drop` again for the old value (which is still stored
// in the collection), so we get a double drop. To prevent that, we set len to 0 until we're
// done.
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
let old_len = self.len;
let ptr = self.get_unchecked_mut(index).promote().as_ptr();
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
self.len = 0;
// Drop the old value, then write back, justifying the promotion
// If the drop impl for the old value panics then we run the drop impl for `value` too.
if let Some(drop) = self.drop {
struct OnDrop<F: FnMut()>(F);
impl<F: FnMut()> Drop for OnDrop<F> {
fn drop(&mut self) {
(self.0)();
}
}
let value = value.as_ptr();
let on_unwind = OnDrop(|| (drop)(OwningPtr::new(NonNull::new_unchecked(value))));
(drop)(OwningPtr::new(NonNull::new_unchecked(ptr)));
core::mem::forget(on_unwind);
}
std::ptr::copy_nonoverlapping::<u8>(value.as_ptr(), ptr, self.item_layout.size());
self.len = old_len;
}
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// Pushes a value to the [`BlobVec`].
///
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
/// # Safety
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// `value` must be valid to add to this [`BlobVec`]
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
#[inline]
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn push(&mut self, value: OwningPtr<'_>) {
self.reserve_exact(1);
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
let index = self.len;
self.len += 1;
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
self.initialize_unchecked(index, value);
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
/// `len` must be <= `capacity`. if length is decreased, "out of bounds" items must be dropped.
/// Newly added items must be immediately populated with valid values and length must be
/// increased. For better unwind safety, call [`BlobVec::set_len`] _after_ populating a new
/// value.
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
pub unsafe fn set_len(&mut self, len: usize) {
debug_assert!(len <= self.capacity());
self.len = len;
}
/// Performs a "swap remove" at the given `index`, which removes the item at `index` and moves
/// the last item in the [`BlobVec`] to `index` (if `index` is not the last item). It is the
/// caller's responsibility to drop the returned pointer, if that is desirable.
///
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
/// # Safety
/// It is the caller's responsibility to ensure that `index` is < `self.len()`
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
#[inline]
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
#[must_use = "The returned pointer should be used to dropped the removed element"]
pub unsafe fn swap_remove_and_forget_unchecked(&mut self, index: usize) -> OwningPtr<'_> {
// FIXME: This should probably just use `core::ptr::swap` and return an `OwningPtr`
// into the underlying `BlobVec` allocation, and remove swap_scratch
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
debug_assert!(index < self.len());
let last = self.len - 1;
let swap_scratch = self.swap_scratch.as_ptr();
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
std::ptr::copy_nonoverlapping::<u8>(
self.get_unchecked_mut(index).as_ptr(),
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
swap_scratch,
self.item_layout.size(),
);
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
std::ptr::copy::<u8>(
self.get_unchecked_mut(last).as_ptr(),
self.get_unchecked_mut(index).as_ptr(),
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
self.item_layout.size(),
);
self.len -= 1;
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
OwningPtr::new(self.swap_scratch)
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
Directly copy moved Table components to the target location (#5056) # Objective Speed up entity moves between tables by reducing the number of copies conducted. Currently three separate copies are conducted: `src[index] -> swap scratch`, `src[last] -> src[index]`, and `swap scratch -> dst[target]`. The first and last copies can be merged by directly using the copy `src[index] -> dst[target]`, which can save quite some time if the component(s) in question are large. ## Solution This PR does the following: - Adds `BlobVec::swap_remove_unchecked(usize, PtrMut<'_>)`, which is identical to `swap_remove_and_forget_unchecked`, but skips the `swap_scratch` and directly copies the component into the provided `PtrMut<'_>`. - Build `Column::initialize_from_unchecked(&mut Column, usize, usize)` on top of it, which uses the above to directly initialize a row from another column. - Update most of the table move APIs to use `initialize_from_unchecked` instead of a combination of `swap_remove_and_forget_unchecked` and `initialize`. This is an alternative, though orthogonal, approach to achieve the same performance gains as seen in #4853. This (hopefully) shouldn't run into the same Miri limitations that said PR currently does. After this PR, `swap_remove_and_forget_unchecked` is still in use for Resources and swap_scratch likely still should be removed, so #4853 still has use, even if this PR is merged. ## Performance TODO: Microbenchmark This PR shows similar improvements to commands that add or remove table components that result in a table move. When tested on `many_cubes sphere`, some of the more command heavy systems saw notable improvements. In particular, `prepare_uniform_components<T>`, this saw a reduction in time from 1.35ms to 1.13ms (a 16.3% improvement) on my local machine, a similar if not slightly better gain than what #4853 showed [here](https://github.com/bevyengine/bevy/pull/4853#issuecomment-1159346106). ![image](https://user-images.githubusercontent.com/3137680/174570088-1c4c6fd7-3215-478c-9eb7-8bd9fe486b32.png) The command heavy `Extract` stage also saw a smaller overall improvement: ![image](https://user-images.githubusercontent.com/3137680/174572261-8a48f004-ab9f-4cb2-b304-a882b6d78065.png) --- ## Changelog Added: `BlobVec::swap_remove_unchecked`. Added: `Column::initialize_from_unchecked`.
2022-06-27 16:52:26 +00:00
/// Removes the value at `index` and copies the value stored into `ptr`.
/// Does not do any bounds checking on `index`.
///
/// # Safety
/// It is the caller's responsibility to ensure that `index` is < `self.len()`
/// and that `self[index]` has been properly initialized.
#[inline]
pub unsafe fn swap_remove_unchecked(&mut self, index: usize, ptr: PtrMut<'_>) {
debug_assert!(index < self.len());
let last = self.get_unchecked_mut(self.len - 1).as_ptr();
let target = self.get_unchecked_mut(index).as_ptr();
// Copy the item at the index into the provided ptr
std::ptr::copy_nonoverlapping::<u8>(target, ptr.as_ptr(), self.item_layout.size());
// Recompress the storage by moving the previous last element into the
// now-free row overwriting the previous data. The removed row may be the last
// one so a non-overlapping copy must not be used here.
std::ptr::copy::<u8>(last, target, self.item_layout.size());
// Invalidate the data stored in the last row, as it has been moved
self.len -= 1;
}
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
/// # Safety
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// It is the caller's responsibility to ensure that `index` is < self.len()
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
#[inline]
pub unsafe fn swap_remove_and_drop_unchecked(&mut self, index: usize) {
debug_assert!(index < self.len());
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
let drop = self.drop;
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
let value = self.swap_remove_and_forget_unchecked(index);
if let Some(drop) = drop {
(drop)(value);
}
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
/// It is the caller's responsibility to ensure that `index` is < self.len()
#[inline]
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn get_unchecked(&self, index: usize) -> Ptr<'_> {
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
debug_assert!(index < self.len());
self.get_ptr().byte_add(index * self.item_layout.size())
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
/// It is the caller's responsibility to ensure that `index` is < self.len()
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
#[inline]
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
pub unsafe fn get_unchecked_mut(&mut self, index: usize) -> PtrMut<'_> {
debug_assert!(index < self.len());
let layout_size = self.item_layout.size();
self.get_ptr_mut().byte_add(index * layout_size)
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
}
/// Gets a [`Ptr`] to the start of the vec
#[inline]
pub fn get_ptr(&self) -> Ptr<'_> {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: the inner data will remain valid for as long as 'self.
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
unsafe { Ptr::new(self.data) }
}
/// Gets a [`PtrMut`] to the start of the vec
#[inline]
pub fn get_ptr_mut(&mut self) -> PtrMut<'_> {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: the inner data will remain valid for as long as 'self.
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
unsafe { PtrMut::new(self.data) }
}
/// Get a reference to the entire [`BlobVec`] as if it were an array with elements of type `T`
///
/// # Safety
/// The type `T` must be the type of the items in this [`BlobVec`].
pub unsafe fn get_slice<T>(&self) -> &[UnsafeCell<T>] {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: the inner data will remain valid for as long as 'self.
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
std::slice::from_raw_parts(self.data.as_ptr() as *const UnsafeCell<T>, self.len)
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
pub fn clear(&mut self) {
let len = self.len;
// We set len to 0 _before_ dropping elements for unwind safety. This ensures we don't
// accidentally drop elements twice in the event of a drop impl panicking.
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
self.len = 0;
if let Some(drop) = self.drop {
let layout_size = self.item_layout.size();
for i in 0..len {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: `i * layout_size` is inbounds for the allocation, and the item is left unreachable so it can be safely promoted to an `OwningPtr`
unsafe {
// NOTE: this doesn't use self.get_unchecked(i) because the debug_assert on index
// will panic here due to self.len being set to 0
let ptr = self.get_ptr_mut().byte_add(i * layout_size).promote();
(drop)(ptr);
}
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
}
}
}
impl Drop for BlobVec {
fn drop(&mut self) {
self.clear();
if self.item_layout.size() > 0 {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: the `swap_scratch` pointer is always allocated using `self.item_layout`
unsafe {
std::alloc::dealloc(self.swap_scratch.as_ptr(), self.item_layout);
}
}
let array_layout =
array_layout(&self.item_layout, self.capacity).expect("array layout should be valid");
if array_layout.size() > 0 {
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: data ptr layout is correct, swap_scratch ptr layout is correct
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
unsafe {
std::alloc::dealloc(self.get_ptr_mut().as_ptr(), array_layout);
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
}
}
}
/// From <https://doc.rust-lang.org/beta/src/core/alloc/layout.rs.html>
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
fn array_layout(layout: &Layout, n: usize) -> Option<Layout> {
let (array_layout, offset) = repeat_layout(layout, n)?;
debug_assert_eq!(layout.size(), offset);
Some(array_layout)
}
// TODO: replace with `Layout::repeat` if/when it stabilizes
/// From <https://doc.rust-lang.org/beta/src/core/alloc/layout.rs.html>
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
fn repeat_layout(layout: &Layout, n: usize) -> Option<(Layout, usize)> {
// This cannot overflow. Quoting from the invariant of Layout:
// > `size`, when rounded up to the nearest multiple of `align`,
// > must not overflow (i.e., the rounded value must be less than
// > `usize::MAX`)
let padded_size = layout.size() + padding_needed_for(layout, layout.align());
let alloc_size = padded_size.checked_mul(n)?;
// SAFETY: self.align is already known to be valid and alloc_size has been
// padded already.
unsafe {
Some((
Layout::from_size_align_unchecked(alloc_size, layout.align()),
padded_size,
))
}
}
/// From <https://doc.rust-lang.org/beta/src/core/alloc/layout.rs.html>
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
const fn padding_needed_for(layout: &Layout, align: usize) -> usize {
let len = layout.size();
// Rounded up value is:
// len_rounded_up = (len + align - 1) & !(align - 1);
// and then we return the padding difference: `len_rounded_up - len`.
//
// We use modular arithmetic throughout:
//
// 1. align is guaranteed to be > 0, so align - 1 is always
// valid.
//
// 2. `len + align - 1` can overflow by at most `align - 1`,
// so the &-mask with `!(align - 1)` will ensure that in the
// case of overflow, `len_rounded_up` will itself be 0.
// Thus the returned padding, when added to `len`, yields 0,
// which trivially satisfies the alignment `align`.
//
// (Of course, attempts to allocate blocks of memory whose
// size and padding overflow in the above manner should cause
// the allocator to yield an error anyway.)
let len_rounded_up = len.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1);
len_rounded_up.wrapping_sub(len)
}
#[cfg(test)]
mod tests {
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
use crate::ptr::OwningPtr;
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
use super::BlobVec;
use std::{alloc::Layout, cell::RefCell, rc::Rc};
// SAFETY: The pointer points to a valid value of type `T` and it is safe to drop this value.
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
unsafe fn drop_ptr<T>(x: OwningPtr<'_>) {
x.drop_as::<T>();
}
/// # Safety
///
/// `blob_vec` must have a layout that matches `Layout::new::<T>()`
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
unsafe fn push<T>(blob_vec: &mut BlobVec, value: T) {
OwningPtr::make(value, |ptr| {
blob_vec.push(ptr);
});
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
///
/// `blob_vec` must have a layout that matches `Layout::new::<T>()`
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
unsafe fn swap_remove<T>(blob_vec: &mut BlobVec, index: usize) -> T {
assert!(index < blob_vec.len());
let value = blob_vec.swap_remove_and_forget_unchecked(index);
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
value.read::<T>()
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
/// # Safety
///
/// `blob_vec` must have a layout that matches `Layout::new::<T>()`, it most store a valid `T`
/// value at the given `index`
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
unsafe fn get_mut<T>(blob_vec: &mut BlobVec, index: usize) -> &mut T {
assert!(index < blob_vec.len());
Use lifetimed, type erased pointers in bevy_ecs (#3001) # Objective `bevy_ecs` has large amounts of unsafe code which is hard to get right and makes it difficult to audit for soundness. ## Solution Introduce lifetimed, type-erased pointers: `Ptr<'a>` `PtrMut<'a>` `OwningPtr<'a>'` and `ThinSlicePtr<'a, T>` which are newtypes around a raw pointer with a lifetime and conceptually representing strong invariants about the pointee and validity of the pointer. The process of converting bevy_ecs to use these has already caught multiple cases of unsound behavior. ## Changelog TL;DR for release notes: `bevy_ecs` now uses lifetimed, type-erased pointers internally, significantly improving safety and legibility without sacrificing performance. This should have approximately no end user impact, unless you were meddling with the (unfortunately public) internals of `bevy_ecs`. - `Fetch`, `FilterFetch` and `ReadOnlyFetch` trait no longer have a `'state` lifetime - this was unneeded - `ReadOnly/Fetch` associated types on `WorldQuery` are now on a new `WorldQueryGats<'world>` trait - was required to work around lack of Generic Associated Types (we wish to express `type Fetch<'a>: Fetch<'a>`) - `derive(WorldQuery)` no longer requires `'w` lifetime on struct - this was unneeded, and improves the end user experience - `EntityMut::get_unchecked_mut` returns `&'_ mut T` not `&'w mut T` - allows easier use of unsafe API with less footguns, and can be worked around via lifetime transmutery as a user - `Bundle::from_components` now takes a `ctx` parameter to pass to the `FnMut` closure - required because closure return types can't borrow from captures - `Fetch::init` takes `&'world World`, `Fetch::set_archetype` takes `&'world Archetype` and `&'world Tables`, `Fetch::set_table` takes `&'world Table` - allows types implementing `Fetch` to store borrows into world - `WorldQuery` trait now has a `shrink` fn to shorten the lifetime in `Fetch::<'a>::Item` - this works around lack of subtyping of assoc types, rust doesnt allow you to turn `<T as Fetch<'static>>::Item'` into `<T as Fetch<'a>>::Item'` - `QueryCombinationsIter` requires this - Most types implementing `Fetch` now have a lifetime `'w` - allows the fetches to store borrows of world data instead of using raw pointers ## Migration guide - `EntityMut::get_unchecked_mut` returns a more restricted lifetime, there is no general way to migrate this as it depends on your code - `Bundle::from_components` implementations must pass the `ctx` arg to `func` - `Bundle::from_components` callers have to use a fn arg instead of closure captures for borrowing from world - Remove lifetime args on `derive(WorldQuery)` structs as it is nonsensical - `<Q as WorldQuery>::ReadOnly/Fetch` should be changed to either `RO/QueryFetch<'world>` or `<Q as WorldQueryGats<'world>>::ReadOnly/Fetch` - `<F as Fetch<'w, 's>>` should be changed to `<F as Fetch<'w>>` - Change the fn sigs of `Fetch::init/set_archetype/set_table` to match respective trait fn sigs - Implement the required `fn shrink` on any `WorldQuery` implementations - Move assoc types `Fetch` and `ReadOnlyFetch` on `WorldQuery` impls to `WorldQueryGats` impls - Pass an appropriate `'world` lifetime to whatever fetch struct you are for some reason using ### Type inference regression in some cases rustc may give spurrious errors when attempting to infer the `F` parameter on a query/querystate this can be fixed by manually specifying the type, i.e. `QueryState::new::<_, ()>(world)`. The error is rather confusing: ```rust= error[E0271]: type mismatch resolving `<() as Fetch<'_>>::Item == bool` --> crates/bevy_pbr/src/render/light.rs:1413:30 | 1413 | main_view_query: QueryState::new(world), | ^^^^^^^^^^^^^^^ expected `bool`, found `()` | = note: required because of the requirements on the impl of `for<'x> FilterFetch<'x>` for `<() as WorldQueryGats<'x>>::Fetch` note: required by a bound in `bevy_ecs::query::QueryState::<Q, F>::new` --> crates/bevy_ecs/src/query/state.rs:49:32 | 49 | for<'x> QueryFetch<'x, F>: FilterFetch<'x>, | ^^^^^^^^^^^^^^^ required by this bound in `bevy_ecs::query::QueryState::<Q, F>::new` ``` --- Made with help from @BoxyUwU and @alice-i-cecile Co-authored-by: Boxy <supbscripter@gmail.com>
2022-04-27 23:44:06 +00:00
blob_vec.get_unchecked_mut(index).deref_mut::<T>()
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}
#[test]
fn resize_test() {
let item_layout = Layout::new::<usize>();
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: `drop` fn is `None`, usize doesn't need dropping
let mut blob_vec = unsafe { BlobVec::new(item_layout, None, 64) };
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: `i` is a usize, i.e. the type corresponding to `item_layout`
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
unsafe {
for i in 0..1_000 {
push(&mut blob_vec, i as usize);
}
}
assert_eq!(blob_vec.len(), 1_000);
assert_eq!(blob_vec.capacity(), 1_000);
}
#[derive(Debug, Eq, PartialEq, Clone)]
struct Foo {
a: u8,
b: String,
drop_counter: Rc<RefCell<usize>>,
}
impl Drop for Foo {
fn drop(&mut self) {
*self.drop_counter.borrow_mut() += 1;
}
}
#[test]
fn blob_vec() {
let drop_counter = Rc::new(RefCell::new(0));
{
let item_layout = Layout::new::<Foo>();
let drop = drop_ptr::<Foo>;
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: drop is able to drop a value of its `item_layout`
let mut blob_vec = unsafe { BlobVec::new(item_layout, Some(drop), 2) };
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
assert_eq!(blob_vec.capacity(), 2);
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: the following code only deals with values of type `Foo`, which satisfies the safety requirement of `push`, `get_mut` and `swap_remove` that the
// values have a layout compatible to the blob vec's `item_layout`.
// Every index is in range.
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
unsafe {
let foo1 = Foo {
a: 42,
b: "abc".to_string(),
drop_counter: drop_counter.clone(),
};
push(&mut blob_vec, foo1.clone());
assert_eq!(blob_vec.len(), 1);
assert_eq!(get_mut::<Foo>(&mut blob_vec, 0), &foo1);
let mut foo2 = Foo {
a: 7,
b: "xyz".to_string(),
drop_counter: drop_counter.clone(),
};
push::<Foo>(&mut blob_vec, foo2.clone());
assert_eq!(blob_vec.len(), 2);
assert_eq!(blob_vec.capacity(), 2);
assert_eq!(get_mut::<Foo>(&mut blob_vec, 0), &foo1);
assert_eq!(get_mut::<Foo>(&mut blob_vec, 1), &foo2);
get_mut::<Foo>(&mut blob_vec, 1).a += 1;
assert_eq!(get_mut::<Foo>(&mut blob_vec, 1).a, 8);
let foo3 = Foo {
a: 16,
b: "123".to_string(),
drop_counter: drop_counter.clone(),
};
push(&mut blob_vec, foo3.clone());
assert_eq!(blob_vec.len(), 3);
assert_eq!(blob_vec.capacity(), 3);
let last_index = blob_vec.len() - 1;
let value = swap_remove::<Foo>(&mut blob_vec, last_index);
assert_eq!(foo3, value);
assert_eq!(blob_vec.len(), 2);
assert_eq!(blob_vec.capacity(), 3);
let value = swap_remove::<Foo>(&mut blob_vec, 0);
assert_eq!(foo1, value);
assert_eq!(blob_vec.len(), 1);
assert_eq!(blob_vec.capacity(), 3);
foo2.a = 8;
assert_eq!(get_mut::<Foo>(&mut blob_vec, 0), &foo2);
}
}
assert_eq!(*drop_counter.borrow(), 6);
}
#[test]
fn blob_vec_drop_empty_capacity() {
let item_layout = Layout::new::<Foo>();
let drop = drop_ptr::<Foo>;
add more `SAFETY` comments and lint for missing ones in `bevy_ecs` (#4835) # Objective `SAFETY` comments are meant to be placed before `unsafe` blocks and should contain the reasoning of why in this case the usage of unsafe is okay. This is useful when reading the code because it makes it clear which assumptions are required for safety, and makes it easier to spot possible unsoundness holes. It also forces the code writer to think of something to write and maybe look at the safety contracts of any called unsafe methods again to double-check their correct usage. There's a clippy lint called `undocumented_unsafe_blocks` which warns when using a block without such a comment. ## Solution - since clippy expects `SAFETY` instead of `SAFE`, rename those - add `SAFETY` comments in more places - for the last remaining 3 places, add an `#[allow()]` and `// TODO` since I wasn't comfortable enough with the code to justify their safety - add ` #![warn(clippy::undocumented_unsafe_blocks)]` to `bevy_ecs` ### Note for reviewers The first commit only renames `SAFETY` to `SAFE` so it doesn't need a thorough review. https://github.com/bevyengine/bevy/pull/4835/files/cb042a416ecbe5e7d74797449969e064d8a5f13c..55cef2d6fa3aa634667a60f6d5abc16f43f16298 is the diff for all other changes. ### Safety comments where I'm not too familiar with the code https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/entity/mod.rs#L540-L546 https://github.com/bevyengine/bevy/blob/774012ece50e4add4fcc8324ec48bbecf5546c3c/crates/bevy_ecs/src/world/entity_ref.rs#L249-L252 ### Locations left undocumented with a `TODO` comment https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/schedule/executor_parallel.rs#L196-L199 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L287-L289 https://github.com/bevyengine/bevy/blob/5dde944a3051426ac69fdedc5699f7da97a7e147/crates/bevy_ecs/src/world/entity_ref.rs#L413-L415 Co-authored-by: Jakob Hellermann <hellermann@sipgate.de>
2022-07-04 14:44:24 +00:00
// SAFETY: drop is able to drop a value of its `item_layout`
let _ = unsafe { BlobVec::new(item_layout, Some(drop), 0) };
}
Bevy ECS V2 (#1525) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * **Archetypal ECS**: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * **Sparse Set ECS**: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * **Tables** (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * **Sparse Sets** * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor::new::<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(|c| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(|(a, mut b)| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); *b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(|world: &mut World, a: &mut A| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements * Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)
2021-03-05 07:54:35 +00:00
}