bevy/benches/benches/bevy_render at bfb2ff16e8cd8c09df7ecd6fe694b34f44a585d1 - Mirrors/bevy

mirror of https://github.com/bevyengine/bevy synced 2024-12-24 20:13:07 +00:00

History

Periwink ec4cf024f8 Add a ComponentIndex and update QueryState creation/update to use it (#13460 ) # Objective To implement relations we will need to add a `ComponentIndex`, which is a map from a Component to the list of archetypes that contain this component. One of the reasons is that with fragmenting relations the number of archetypes will explode, so it will become inefficient to create and update the query caches by iterating through the list of all archetypes. In this PR, we introduce the `ComponentIndex`, and we update the `QueryState` to make use of it: - if a query has at least 1 required component (i.e. something other than `()`, `Entity` or `Option<>`, etc.): for each of the required components we find the list of archetypes that contain it (using the ComponentIndex). Then, we select the smallest list among these. This gives a small subset of archetypes to iterate through compared with iterating through all new archetypes - if it doesn't, then we keep using the current approach of iterating through all new archetypes # Implementation - This breaks query iteration order, in the sense that we are not guaranteed anymore to return results in the order in which the archetypes were created. I think this should be fine because this wasn't an explicit bevy guarantee so users should not be relying on this. I updated a bunch of unit tests that were failing because of this. - I had an issue with the borrow checker because iterating the list of potential archetypes requires access to `&state.component_access`, which was conflicting with the calls to ``` if state.new_archetype_internal(archetype) { state.update_archetype_component_access(archetype, access); } ``` which need a mutable access to the state. The solution I chose was to introduce a `QueryStateView` which is a temporary view into the `QueryState` which enables a "split-borrows" kind of approach. It is described in detail in this blog post: https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/ # Test The unit tests pass. Benchmark results: ``` ❯ critcmp main pr group main pr ----- ---- -- iter_fragmented/base 1.00 342.2±25.45ns ? ?/sec 1.02 347.5±16.24ns ? ?/sec iter_fragmented/foreach 1.04 165.4±11.29ns ? ?/sec 1.00 159.5±4.27ns ? ?/sec iter_fragmented/foreach_wide 1.03 3.3±0.04µs ? ?/sec 1.00 3.2±0.06µs ? ?/sec iter_fragmented/wide 1.03 3.1±0.06µs ? ?/sec 1.00 3.0±0.08µs ? ?/sec iter_fragmented_sparse/base 1.00 6.5±0.14ns ? ?/sec 1.02 6.6±0.08ns ? ?/sec iter_fragmented_sparse/foreach 1.00 6.3±0.08ns ? ?/sec 1.04 6.6±0.08ns ? ?/sec iter_fragmented_sparse/foreach_wide 1.00 43.8±0.15ns ? ?/sec 1.02 44.6±0.53ns ? ?/sec iter_fragmented_sparse/wide 1.00 29.8±0.44ns ? ?/sec 1.00 29.8±0.26ns ? ?/sec iter_simple/base 1.00 8.2±0.10µs ? ?/sec 1.00 8.2±0.09µs ? ?/sec iter_simple/foreach 1.00 3.8±0.02µs ? ?/sec 1.02 3.9±0.03µs ? ?/sec iter_simple/foreach_sparse_set 1.00 19.0±0.26µs ? ?/sec 1.01 19.3±0.16µs ? ?/sec iter_simple/foreach_wide 1.00 17.8±0.24µs ? ?/sec 1.00 17.9±0.31µs ? ?/sec iter_simple/foreach_wide_sparse_set 1.06 95.6±6.23µs ? ?/sec 1.00 90.6±0.59µs ? ?/sec iter_simple/sparse_set 1.00 19.3±1.63µs ? ?/sec 1.01 19.5±0.29µs ? ?/sec iter_simple/system 1.00 8.1±0.10µs ? ?/sec 1.00 8.1±0.09µs ? ?/sec iter_simple/wide 1.05 37.7±2.53µs ? ?/sec 1.00 35.8±0.57µs ? ?/sec iter_simple/wide_sparse_set 1.00 95.7±1.62µs ? ?/sec 1.00 95.9±0.76µs ? ?/sec par_iter_simple/with_0_fragment 1.04 35.0±2.51µs ? ?/sec 1.00 33.7±0.49µs ? ?/sec par_iter_simple/with_1000_fragment 1.00 50.4±2.52µs ? ?/sec 1.01 51.0±3.84µs ? ?/sec par_iter_simple/with_100_fragment 1.02 40.3±2.23µs ? ?/sec 1.00 39.5±1.32µs ? ?/sec par_iter_simple/with_10_fragment 1.14 38.8±7.79µs ? ?/sec 1.00 34.0±0.78µs ? ?/sec ```	2024-08-06 00:57:15 +00:00
..
render_layers.rs	#12502 Remove limit on RenderLayers. (#13317 )	2024-05-16 16:15:47 +00:00
torus.rs	Add a ComponentIndex and update QueryState creation/update to use it (#13460 )	2024-08-06 00:57:15 +00:00

Add a ComponentIndex and update QueryState creation/update to use it (#13460 )

# Objective

To implement relations we will need to add a `ComponentIndex`, which is
a map from a Component to the list of archetypes that contain this
component.
One of the reasons is that with fragmenting relations the number of
archetypes will explode, so it will become inefficient to create and
update the query caches by iterating through the list of all archetypes.

In this PR, we introduce the `ComponentIndex`, and we update the
`QueryState` to make use of it:
- if a query has at least 1 required component (i.e. something other
than `()`, `Entity` or `Option<>`, etc.): for each of the required
components we find the list of archetypes that contain it (using the
ComponentIndex). Then, we select the smallest list among these. This
gives a small subset of archetypes to iterate through compared with
iterating through all new archetypes
- if it doesn't, then we keep using the current approach of iterating
through all new archetypes


# Implementation
- This breaks query iteration order, in the sense that we are not
guaranteed anymore to return results in the order in which the
archetypes were created. I think this should be fine because this wasn't
an explicit bevy guarantee so users should not be relying on this. I
updated a bunch of unit tests that were failing because of this.

- I had an issue with the borrow checker because iterating the list of
potential archetypes requires access to `&state.component_access`, which
was conflicting with the calls to
```
  if state.new_archetype_internal(archetype) {
      state.update_archetype_component_access(archetype, access);
  }
```
which need a mutable access to the state.

The solution I chose was to introduce a `QueryStateView` which is a
temporary view into the `QueryState` which enables a "split-borrows"
kind of approach. It is described in detail in this blog post:
https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/

# Test

The unit tests pass.

Benchmark results:
```
❯ critcmp main pr
group                                  main                                   pr
-----                                  ----                                   --
iter_fragmented/base                   1.00   342.2±25.45ns        ? ?/sec    1.02   347.5±16.24ns        ? ?/sec
iter_fragmented/foreach                1.04   165.4±11.29ns        ? ?/sec    1.00    159.5±4.27ns        ? ?/sec
iter_fragmented/foreach_wide           1.03      3.3±0.04µs        ? ?/sec    1.00      3.2±0.06µs        ? ?/sec
iter_fragmented/wide                   1.03      3.1±0.06µs        ? ?/sec    1.00      3.0±0.08µs        ? ?/sec
iter_fragmented_sparse/base            1.00      6.5±0.14ns        ? ?/sec    1.02      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach         1.00      6.3±0.08ns        ? ?/sec    1.04      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach_wide    1.00     43.8±0.15ns        ? ?/sec    1.02     44.6±0.53ns        ? ?/sec
iter_fragmented_sparse/wide            1.00     29.8±0.44ns        ? ?/sec    1.00     29.8±0.26ns        ? ?/sec
iter_simple/base                       1.00      8.2±0.10µs        ? ?/sec    1.00      8.2±0.09µs        ? ?/sec
iter_simple/foreach                    1.00      3.8±0.02µs        ? ?/sec    1.02      3.9±0.03µs        ? ?/sec
iter_simple/foreach_sparse_set         1.00     19.0±0.26µs        ? ?/sec    1.01     19.3±0.16µs        ? ?/sec
iter_simple/foreach_wide               1.00     17.8±0.24µs        ? ?/sec    1.00     17.9±0.31µs        ? ?/sec
iter_simple/foreach_wide_sparse_set    1.06     95.6±6.23µs        ? ?/sec    1.00     90.6±0.59µs        ? ?/sec
iter_simple/sparse_set                 1.00     19.3±1.63µs        ? ?/sec    1.01     19.5±0.29µs        ? ?/sec
iter_simple/system                     1.00      8.1±0.10µs        ? ?/sec    1.00      8.1±0.09µs        ? ?/sec
iter_simple/wide                       1.05     37.7±2.53µs        ? ?/sec    1.00     35.8±0.57µs        ? ?/sec
iter_simple/wide_sparse_set            1.00     95.7±1.62µs        ? ?/sec    1.00     95.9±0.76µs        ? ?/sec
par_iter_simple/with_0_fragment        1.04     35.0±2.51µs        ? ?/sec    1.00     33.7±0.49µs        ? ?/sec
par_iter_simple/with_1000_fragment     1.00     50.4±2.52µs        ? ?/sec    1.01     51.0±3.84µs        ? ?/sec
par_iter_simple/with_100_fragment      1.02     40.3±2.23µs        ? ?/sec    1.00     39.5±1.32µs        ? ?/sec
par_iter_simple/with_10_fragment       1.14     38.8±7.79µs        ? ?/sec    1.00     34.0±0.78µs        ? ?/sec
```

2024-08-06 00:57:15 +00:00

render_layers.rs

#12502 Remove limit on RenderLayers. (#13317 )

2024-05-16 16:15:47 +00:00

torus.rs

Add a ComponentIndex and update QueryState creation/update to use it (#13460 )

2024-08-06 00:57:15 +00:00