bevy/benches/benches/bevy_render
Periwink ec4cf024f8
Add a ComponentIndex and update QueryState creation/update to use it (#13460)
# Objective

To implement relations we will need to add a `ComponentIndex`, which is
a map from a Component to the list of archetypes that contain this
component.
One of the reasons is that with fragmenting relations the number of
archetypes will explode, so it will become inefficient to create and
update the query caches by iterating through the list of all archetypes.

In this PR, we introduce the `ComponentIndex`, and we update the
`QueryState` to make use of it:
- if a query has at least 1 required component (i.e. something other
than `()`, `Entity` or `Option<>`, etc.): for each of the required
components we find the list of archetypes that contain it (using the
ComponentIndex). Then, we select the smallest list among these. This
gives a small subset of archetypes to iterate through compared with
iterating through all new archetypes
- if it doesn't, then we keep using the current approach of iterating
through all new archetypes


# Implementation
- This breaks query iteration order, in the sense that we are not
guaranteed anymore to return results in the order in which the
archetypes were created. I think this should be fine because this wasn't
an explicit bevy guarantee so users should not be relying on this. I
updated a bunch of unit tests that were failing because of this.

- I had an issue with the borrow checker because iterating the list of
potential archetypes requires access to `&state.component_access`, which
was conflicting with the calls to
```
  if state.new_archetype_internal(archetype) {
      state.update_archetype_component_access(archetype, access);
  }
```
which need a mutable access to the state.

The solution I chose was to introduce a `QueryStateView` which is a
temporary view into the `QueryState` which enables a "split-borrows"
kind of approach. It is described in detail in this blog post:
https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/

# Test

The unit tests pass.

Benchmark results:
```
❯ critcmp main pr
group                                  main                                   pr
-----                                  ----                                   --
iter_fragmented/base                   1.00   342.2±25.45ns        ? ?/sec    1.02   347.5±16.24ns        ? ?/sec
iter_fragmented/foreach                1.04   165.4±11.29ns        ? ?/sec    1.00    159.5±4.27ns        ? ?/sec
iter_fragmented/foreach_wide           1.03      3.3±0.04µs        ? ?/sec    1.00      3.2±0.06µs        ? ?/sec
iter_fragmented/wide                   1.03      3.1±0.06µs        ? ?/sec    1.00      3.0±0.08µs        ? ?/sec
iter_fragmented_sparse/base            1.00      6.5±0.14ns        ? ?/sec    1.02      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach         1.00      6.3±0.08ns        ? ?/sec    1.04      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach_wide    1.00     43.8±0.15ns        ? ?/sec    1.02     44.6±0.53ns        ? ?/sec
iter_fragmented_sparse/wide            1.00     29.8±0.44ns        ? ?/sec    1.00     29.8±0.26ns        ? ?/sec
iter_simple/base                       1.00      8.2±0.10µs        ? ?/sec    1.00      8.2±0.09µs        ? ?/sec
iter_simple/foreach                    1.00      3.8±0.02µs        ? ?/sec    1.02      3.9±0.03µs        ? ?/sec
iter_simple/foreach_sparse_set         1.00     19.0±0.26µs        ? ?/sec    1.01     19.3±0.16µs        ? ?/sec
iter_simple/foreach_wide               1.00     17.8±0.24µs        ? ?/sec    1.00     17.9±0.31µs        ? ?/sec
iter_simple/foreach_wide_sparse_set    1.06     95.6±6.23µs        ? ?/sec    1.00     90.6±0.59µs        ? ?/sec
iter_simple/sparse_set                 1.00     19.3±1.63µs        ? ?/sec    1.01     19.5±0.29µs        ? ?/sec
iter_simple/system                     1.00      8.1±0.10µs        ? ?/sec    1.00      8.1±0.09µs        ? ?/sec
iter_simple/wide                       1.05     37.7±2.53µs        ? ?/sec    1.00     35.8±0.57µs        ? ?/sec
iter_simple/wide_sparse_set            1.00     95.7±1.62µs        ? ?/sec    1.00     95.9±0.76µs        ? ?/sec
par_iter_simple/with_0_fragment        1.04     35.0±2.51µs        ? ?/sec    1.00     33.7±0.49µs        ? ?/sec
par_iter_simple/with_1000_fragment     1.00     50.4±2.52µs        ? ?/sec    1.01     51.0±3.84µs        ? ?/sec
par_iter_simple/with_100_fragment      1.02     40.3±2.23µs        ? ?/sec    1.00     39.5±1.32µs        ? ?/sec
par_iter_simple/with_10_fragment       1.14     38.8±7.79µs        ? ?/sec    1.00     34.0±0.78µs        ? ?/sec
```
2024-08-06 00:57:15 +00:00
..
render_layers.rs #12502 Remove limit on RenderLayers. (#13317) 2024-05-16 16:15:47 +00:00
torus.rs Add a ComponentIndex and update QueryState creation/update to use it (#13460) 2024-08-06 00:57:15 +00:00