bevy/benches
Periwink ec4cf024f8
Add a ComponentIndex and update QueryState creation/update to use it (#13460)
# Objective

To implement relations we will need to add a `ComponentIndex`, which is
a map from a Component to the list of archetypes that contain this
component.
One of the reasons is that with fragmenting relations the number of
archetypes will explode, so it will become inefficient to create and
update the query caches by iterating through the list of all archetypes.

In this PR, we introduce the `ComponentIndex`, and we update the
`QueryState` to make use of it:
- if a query has at least 1 required component (i.e. something other
than `()`, `Entity` or `Option<>`, etc.): for each of the required
components we find the list of archetypes that contain it (using the
ComponentIndex). Then, we select the smallest list among these. This
gives a small subset of archetypes to iterate through compared with
iterating through all new archetypes
- if it doesn't, then we keep using the current approach of iterating
through all new archetypes


# Implementation
- This breaks query iteration order, in the sense that we are not
guaranteed anymore to return results in the order in which the
archetypes were created. I think this should be fine because this wasn't
an explicit bevy guarantee so users should not be relying on this. I
updated a bunch of unit tests that were failing because of this.

- I had an issue with the borrow checker because iterating the list of
potential archetypes requires access to `&state.component_access`, which
was conflicting with the calls to
```
  if state.new_archetype_internal(archetype) {
      state.update_archetype_component_access(archetype, access);
  }
```
which need a mutable access to the state.

The solution I chose was to introduce a `QueryStateView` which is a
temporary view into the `QueryState` which enables a "split-borrows"
kind of approach. It is described in detail in this blog post:
https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/

# Test

The unit tests pass.

Benchmark results:
```
❯ critcmp main pr
group                                  main                                   pr
-----                                  ----                                   --
iter_fragmented/base                   1.00   342.2±25.45ns        ? ?/sec    1.02   347.5±16.24ns        ? ?/sec
iter_fragmented/foreach                1.04   165.4±11.29ns        ? ?/sec    1.00    159.5±4.27ns        ? ?/sec
iter_fragmented/foreach_wide           1.03      3.3±0.04µs        ? ?/sec    1.00      3.2±0.06µs        ? ?/sec
iter_fragmented/wide                   1.03      3.1±0.06µs        ? ?/sec    1.00      3.0±0.08µs        ? ?/sec
iter_fragmented_sparse/base            1.00      6.5±0.14ns        ? ?/sec    1.02      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach         1.00      6.3±0.08ns        ? ?/sec    1.04      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach_wide    1.00     43.8±0.15ns        ? ?/sec    1.02     44.6±0.53ns        ? ?/sec
iter_fragmented_sparse/wide            1.00     29.8±0.44ns        ? ?/sec    1.00     29.8±0.26ns        ? ?/sec
iter_simple/base                       1.00      8.2±0.10µs        ? ?/sec    1.00      8.2±0.09µs        ? ?/sec
iter_simple/foreach                    1.00      3.8±0.02µs        ? ?/sec    1.02      3.9±0.03µs        ? ?/sec
iter_simple/foreach_sparse_set         1.00     19.0±0.26µs        ? ?/sec    1.01     19.3±0.16µs        ? ?/sec
iter_simple/foreach_wide               1.00     17.8±0.24µs        ? ?/sec    1.00     17.9±0.31µs        ? ?/sec
iter_simple/foreach_wide_sparse_set    1.06     95.6±6.23µs        ? ?/sec    1.00     90.6±0.59µs        ? ?/sec
iter_simple/sparse_set                 1.00     19.3±1.63µs        ? ?/sec    1.01     19.5±0.29µs        ? ?/sec
iter_simple/system                     1.00      8.1±0.10µs        ? ?/sec    1.00      8.1±0.09µs        ? ?/sec
iter_simple/wide                       1.05     37.7±2.53µs        ? ?/sec    1.00     35.8±0.57µs        ? ?/sec
iter_simple/wide_sparse_set            1.00     95.7±1.62µs        ? ?/sec    1.00     95.9±0.76µs        ? ?/sec
par_iter_simple/with_0_fragment        1.04     35.0±2.51µs        ? ?/sec    1.00     33.7±0.49µs        ? ?/sec
par_iter_simple/with_1000_fragment     1.00     50.4±2.52µs        ? ?/sec    1.01     51.0±3.84µs        ? ?/sec
par_iter_simple/with_100_fragment      1.02     40.3±2.23µs        ? ?/sec    1.00     39.5±1.32µs        ? ?/sec
par_iter_simple/with_10_fragment       1.14     38.8±7.79µs        ? ?/sec    1.00     34.0±0.78µs        ? ?/sec
```
2024-08-06 00:57:15 +00:00
..
benches Add a ComponentIndex and update QueryState creation/update to use it (#13460) 2024-08-06 00:57:15 +00:00
Cargo.toml Minimal Bubbling Observers (#13991) 2024-07-15 13:39:41 +00:00
README.md Add README to benches (#11508) 2024-01-24 17:11:28 +00:00

Bevy Benchmarks

This is a crate with a collection of benchmarks for Bevy, separate from the rest of the Bevy crates.

Running the benchmarks

  1. Setup everything you need for Bevy with the setup guide.

  2. Move into the benches directory (where this README is located).

    bevy $ cd benches
    
  3. Run the benchmarks with cargo (This will take a while)

    bevy/benches $ cargo bench
    

    If you'd like to only compile the benchmarks (without running them), you can do that like this:

    bevy/benches $ cargo bench --no-run
    

Criterion

Bevy's benchmarks use Criterion. If you want to learn more about using Criterion for comparing performance against a baseline or generating detailed reports, you can read the Criterion.rs documentation.