bevy/crates/bevy_ecs/src/query/state.rs
Alice Cecile 6121e5f933 Reliable change detection (#1471)
# Problem Definition

The current change tracking (via flags for both components and resources) fails to detect changes made by systems that are scheduled to run earlier in the frame than they are.

This issue is discussed at length in [#68](https://github.com/bevyengine/bevy/issues/68) and [#54](https://github.com/bevyengine/bevy/issues/54).

This is very much a draft PR, and contributions are welcome and needed.

# Criteria
1. Each change is detected at least once, no matter the ordering.
2. Each change is detected at most once, no matter the ordering.
3. Changes should be detected the same frame that they are made.
4. Competitive ergonomics. Ideally does not require opting-in.
5. Low CPU overhead of computation.
6. Memory efficient. This must not increase over time, except where the number of entities / resources does.
7. Changes should not be lost for systems that don't run.
8. A frame needs to act as a pure function. Given the same set of entities / components it needs to produce the same end state without side-effects.

**Exact** change-tracking proposals satisfy criteria 1 and 2.
**Conservative** change-tracking proposals satisfy criteria 1 but not 2.
**Flaky** change tracking proposals satisfy criteria 2 but not 1.

# Code Base Navigation

There are three types of flags: 
- `Added`: A piece of data was added to an entity / `Resources`.
- `Mutated`: A piece of data was able to be modified, because its `DerefMut` was accessed
- `Changed`: The bitwise OR of `Added` and `Changed`

The special behavior of `ChangedRes`, with respect to the scheduler is being removed in [#1313](https://github.com/bevyengine/bevy/pull/1313) and does not need to be reproduced.

`ChangedRes` and friends can be found in "bevy_ecs/core/resources/resource_query.rs".

The `Flags` trait for Components can be found in "bevy_ecs/core/query.rs".

`ComponentFlags` are stored in "bevy_ecs/core/archetypes.rs", defined on line 446.

# Proposals

**Proposal 5 was selected for implementation.**

## Proposal 0: No Change Detection

The baseline, where computations are performed on everything regardless of whether it changed.

**Type:** Conservative

**Pros:**
- already implemented
- will never miss events
- no overhead

**Cons:**
- tons of repeated work
- doesn't allow users to avoid repeating work (or monitoring for other changes)

## Proposal 1: Earlier-This-Tick Change Detection

The current approach as of Bevy 0.4. Flags are set, and then flushed at the end of each frame.

**Type:** Flaky

**Pros:**
- already implemented
- simple to understand
- low memory overhead (2 bits per component)
- low time overhead (clear every flag once per frame)

**Cons:**
- misses systems based on ordering
- systems that don't run every frame miss changes
- duplicates detection when looping
- can lead to unresolvable circular dependencies

## Proposal 2: Two-Tick Change Detection

Flags persist for two frames, using a double-buffer system identical to that used in events.

A change is observed if it is found in either the current frame's list of changes or the previous frame's.

**Type:** Conservative

**Pros:**
- easy to understand
- easy to implement
- low memory overhead (4 bits per component)
- low time overhead (bit mask and shift every flag once per frame)

**Cons:**
- can result in a great deal of duplicated work
- systems that don't run every frame miss changes
- duplicates detection when looping

## Proposal 3: Last-Tick Change Detection

Flags persist for two frames, using a double-buffer system identical to that used in events.

A change is observed if it is found in the previous frame's list of changes.

**Type:** Exact

**Pros:**
- exact
- easy to understand
- easy to implement
- low memory overhead (4 bits per component)
- low time overhead (bit mask and shift every flag once per frame)

**Cons:**
- change detection is always delayed, possibly causing painful chained delays
- systems that don't run every frame miss changes
- duplicates detection when looping

## Proposal 4: Flag-Doubling Change Detection

Combine Proposal 2 and Proposal 3. Differentiate between `JustChanged` (current behavior) and `Changed` (Proposal 3).

Pack this data into the flags according to [this implementation proposal](https://github.com/bevyengine/bevy/issues/68#issuecomment-769174804).

**Type:** Flaky + Exact

**Pros:**
- allows users to acc
- easy to implement
- low memory overhead (4 bits per component)
- low time overhead (bit mask and shift every flag once per frame)

**Cons:**
- users must specify the type of change detection required
- still quite fragile to system ordering effects when using the flaky `JustChanged` form
- cannot get immediate + exact results
- systems that don't run every frame miss changes
- duplicates detection when looping

## [SELECTED] Proposal 5: Generation-Counter Change Detection

A global counter is increased after each system is run. Each component saves the time of last mutation, and each system saves the time of last execution. Mutation is detected when the component's counter is greater than the system's counter. Discussed [here](https://github.com/bevyengine/bevy/issues/68#issuecomment-769174804). How to handle addition detection is unsolved; the current proposal is to use the highest bit of the counter as in proposal 1.

**Type:** Exact (for mutations), flaky (for additions)

**Pros:**
- low time overhead (set component counter on access, set system counter after execution)
- robust to systems that don't run every frame
- robust to systems that loop

**Cons:**
- moderately complex implementation
- must be modified as systems are inserted dynamically
- medium memory overhead (4 bytes per component + system)
- unsolved addition detection

## Proposal 6: System-Data Change Detection

For each system, track which system's changes it has seen. This approach is only worth fully designing and implementing if Proposal 5 fails in some way.  

**Type:** Exact

**Pros:**
- exact
- conceptually simple

**Cons:**
- requires storing data on each system
- implementation is complex
- must be modified as systems are inserted dynamically

## Proposal 7: Total-Order Change Detection

Discussed [here](https://github.com/bevyengine/bevy/issues/68#issuecomment-754326523). This proposal is somewhat complicated by the new scheduler, but I believe it should still be conceptually feasible. This approach is only worth fully designing and implementing if Proposal 5 fails in some way.  

**Type:** Exact

**Pros:**
- exact
- efficient data storage relative to other exact proposals

**Cons:**
- requires access to the scheduler
- complex implementation and difficulty grokking
- must be modified as systems are inserted dynamically

# Tests

- We will need to verify properties 1, 2, 3, 7 and 8. Priority: 1 > 2 = 3 > 8 > 7
- Ideally we can use identical user-facing syntax for all proposals, allowing us to re-use the same syntax for each.
- When writing tests, we need to carefully specify order using explicit dependencies.
- These tests will need to be duplicated for both components and resources.
- We need to be sure to handle cases where ambiguous system orders exist.

`changing_system` is always the system that makes the changes, and `detecting_system` always detects the changes.

The component / resource changed will be simple boolean wrapper structs.

## Basic Added / Mutated / Changed

2 x 3 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs before `detecting_system`
- verify at the end of tick 2

## At Least Once

2 x 3 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs after `detecting_system`
- verify at the end of tick 2

## At Most Once

2 x 3 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs once before `detecting_system`
- increment a counter based on the number of changes detected
- verify at the end of tick 2

## Fast Detection
2 x 3 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs before `detecting_system`
- verify at the end of tick 1

## Ambiguous System Ordering Robustness
2 x 3 x 2 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs [before/after] `detecting_system` in tick 1
- `changing_system` runs [after/before] `detecting_system` in tick 2

## System Pausing
2 x 3 design:
- Resources vs. Components
- Added vs. Changed vs. Mutated
- `changing_system` runs in tick 1, then is disabled by run criteria
- `detecting_system` is disabled by run criteria until it is run once during tick 3
- verify at the end of tick 3

## Addition Causes Mutation

2 design:
- Resources vs. Components
- `adding_system_1` adds a component / resource
- `adding system_2` adds the same component / resource
- verify the `Mutated` flag at the end of the tick
- verify the `Added` flag at the end of the tick

First check tests for: https://github.com/bevyengine/bevy/issues/333
Second check tests for: https://github.com/bevyengine/bevy/issues/1443

## Changes Made By Commands

- `adding_system` runs in Update in tick 1, and sends a command to add a component 
- `detecting_system` runs in Update in tick 1 and 2, after `adding_system`
- We can't detect the changes in tick 1, since they haven't been processed yet
- If we were to track these changes as being emitted by `adding_system`, we can't detect the changes in tick 2 either, since `detecting_system` has already run once after `adding_system` :( 

# Benchmarks

See: [general advice](https://github.com/bevyengine/bevy/blob/master/docs/profiling.md), [Criterion crate](https://github.com/bheisler/criterion.rs)

There are several critical parameters to vary: 
1. entity count (1 to 10^9)
2. fraction of entities that are changed (0% to 100%)
3. cost to perform work on changed entities, i.e. workload (1 ns to 1s)

1 and 2 should be varied between benchmark runs. 3 can be added on computationally.

We want to measure:
- memory cost
- run time

We should collect these measurements across several frames (100?) to reduce bootup effects and accurately measure the mean, variance and drift.

Entity-component change detection is much more important to benchmark than resource change detection, due to the orders of magnitude higher number of pieces of data.

No change detection at all should be included in benchmarks as a second control for cases where missing changes is unacceptable.

## Graphs
1. y: performance, x: log_10(entity count), color: proposal, facet: performance metric. Set cost to perform work to 0. 
2. y: run time, x: cost to perform work, color: proposal, facet: fraction changed. Set number of entities to 10^6
3. y: memory, x: frames, color: proposal

# Conclusions
1. Is the theoretical categorization of the proposals correct according to our tests?
2. How does the performance of the proposals compare without any load?
3. How does the performance of the proposals compare with realistic loads?
4. At what workload does more exact change tracking become worth the (presumably) higher overhead?
5. When does adding change-detection to save on work become worthwhile?
6. Is there enough divergence in performance between the best solutions in each class to ship more than one change-tracking solution?

# Implementation Plan

1. Write a test suite.
2. Verify that tests fail for existing approach.
3. Write a benchmark suite.
4. Get performance numbers for existing approach.
5. Implement, test and benchmark various solutions using a Git branch per proposal.
6. Create a draft PR with all solutions and present results to team.
7. Select a solution and replace existing change detection.

Co-authored-by: Brice DAVIER <bricedavier@gmail.com>
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2021-03-19 17:53:26 +00:00

468 lines
17 KiB
Rust

use crate::{
archetype::{Archetype, ArchetypeComponentId, ArchetypeGeneration, ArchetypeId},
component::ComponentId,
entity::Entity,
query::{
Access, Fetch, FetchState, FilterFetch, FilteredAccess, QueryIter, ReadOnlyFetch,
WorldQuery,
},
storage::TableId,
world::{World, WorldId},
};
use bevy_tasks::TaskPool;
use fixedbitset::FixedBitSet;
use thiserror::Error;
pub struct QueryState<Q: WorldQuery, F: WorldQuery = ()>
where
F::Fetch: FilterFetch,
{
world_id: WorldId,
pub(crate) archetype_generation: ArchetypeGeneration,
pub(crate) matched_tables: FixedBitSet,
pub(crate) matched_archetypes: FixedBitSet,
pub(crate) archetype_component_access: Access<ArchetypeComponentId>,
pub(crate) component_access: FilteredAccess<ComponentId>,
// NOTE: we maintain both a TableId bitset and a vec because iterating the vec is faster
pub(crate) matched_table_ids: Vec<TableId>,
// NOTE: we maintain both a ArchetypeId bitset and a vec because iterating the vec is faster
pub(crate) matched_archetype_ids: Vec<ArchetypeId>,
pub(crate) fetch_state: Q::State,
pub(crate) filter_state: F::State,
}
impl<Q: WorldQuery, F: WorldQuery> QueryState<Q, F>
where
F::Fetch: FilterFetch,
{
pub fn new(world: &mut World) -> Self {
let fetch_state = <Q::State as FetchState>::init(world);
let filter_state = <F::State as FetchState>::init(world);
let mut component_access = Default::default();
fetch_state.update_component_access(&mut component_access);
filter_state.update_component_access(&mut component_access);
let mut state = Self {
world_id: world.id(),
archetype_generation: ArchetypeGeneration::new(usize::MAX),
matched_table_ids: Vec::new(),
matched_archetype_ids: Vec::new(),
fetch_state,
filter_state,
component_access,
matched_tables: Default::default(),
matched_archetypes: Default::default(),
archetype_component_access: Default::default(),
};
state.validate_world_and_update_archetypes(world);
state
}
pub fn validate_world_and_update_archetypes(&mut self, world: &World) {
if world.id() != self.world_id {
panic!("Attempted to use {} with a mismatched World. QueryStates can only be used with the World they were created from.",
std::any::type_name::<Self>());
}
let archetypes = world.archetypes();
let old_generation = self.archetype_generation;
let archetype_index_range = if old_generation == archetypes.generation() {
0..0
} else {
self.archetype_generation = archetypes.generation();
if old_generation.value() == usize::MAX {
0..archetypes.len()
} else {
old_generation.value()..archetypes.len()
}
};
for archetype_index in archetype_index_range {
self.new_archetype(&archetypes[ArchetypeId::new(archetype_index)]);
}
}
pub fn new_archetype(&mut self, archetype: &Archetype) {
let table_index = archetype.table_id().index();
if self.fetch_state.matches_archetype(archetype)
&& self.filter_state.matches_archetype(archetype)
{
self.fetch_state
.update_archetype_component_access(archetype, &mut self.archetype_component_access);
self.filter_state
.update_archetype_component_access(archetype, &mut self.archetype_component_access);
self.matched_archetypes.grow(archetype.id().index() + 1);
self.matched_archetypes.set(archetype.id().index(), true);
self.matched_archetype_ids.push(archetype.id());
if !self.matched_tables.contains(table_index) {
self.matched_tables.grow(table_index + 1);
self.matched_tables.set(table_index, true);
self.matched_table_ids.push(archetype.table_id());
}
}
}
#[inline]
pub fn get<'w>(
&mut self,
world: &'w World,
entity: Entity,
) -> Result<<Q::Fetch as Fetch<'w>>::Item, QueryEntityError>
where
Q::Fetch: ReadOnlyFetch,
{
// SAFE: query is read only
unsafe { self.get_unchecked(world, entity) }
}
#[inline]
pub fn get_mut<'w>(
&mut self,
world: &'w mut World,
entity: Entity,
) -> Result<<Q::Fetch as Fetch<'w>>::Item, QueryEntityError> {
// SAFE: query has unique world access
unsafe { self.get_unchecked(world, entity) }
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
#[inline]
pub unsafe fn get_unchecked<'w>(
&mut self,
world: &'w World,
entity: Entity,
) -> Result<<Q::Fetch as Fetch<'w>>::Item, QueryEntityError> {
self.validate_world_and_update_archetypes(world);
self.get_unchecked_manual(
world,
entity,
world.last_change_tick(),
world.read_change_tick(),
)
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
pub unsafe fn get_unchecked_manual<'w>(
&self,
world: &'w World,
entity: Entity,
last_change_tick: u32,
change_tick: u32,
) -> Result<<Q::Fetch as Fetch<'w>>::Item, QueryEntityError> {
let location = world
.entities
.get(entity)
.ok_or(QueryEntityError::NoSuchEntity)?;
if !self
.matched_archetypes
.contains(location.archetype_id.index())
{
return Err(QueryEntityError::QueryDoesNotMatch);
}
let archetype = &world.archetypes[location.archetype_id];
let mut fetch =
<Q::Fetch as Fetch>::init(world, &self.fetch_state, last_change_tick, change_tick);
let mut filter =
<F::Fetch as Fetch>::init(world, &self.filter_state, last_change_tick, change_tick);
fetch.set_archetype(&self.fetch_state, archetype, &world.storages().tables);
filter.set_archetype(&self.filter_state, archetype, &world.storages().tables);
if filter.archetype_filter_fetch(location.index) {
Ok(fetch.archetype_fetch(location.index))
} else {
Err(QueryEntityError::QueryDoesNotMatch)
}
}
#[inline]
pub fn iter<'w, 's>(&'s mut self, world: &'w World) -> QueryIter<'w, 's, Q, F>
where
Q::Fetch: ReadOnlyFetch,
{
// SAFE: query is read only
unsafe { self.iter_unchecked(world) }
}
#[inline]
pub fn iter_mut<'w, 's>(&'s mut self, world: &'w mut World) -> QueryIter<'w, 's, Q, F> {
// SAFE: query has unique world access
unsafe { self.iter_unchecked(world) }
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
#[inline]
pub unsafe fn iter_unchecked<'w, 's>(
&'s mut self,
world: &'w World,
) -> QueryIter<'w, 's, Q, F> {
self.validate_world_and_update_archetypes(world);
self.iter_unchecked_manual(world, world.last_change_tick(), world.read_change_tick())
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
/// This does not validate that `world.id()` matches `self.world_id`. Calling this on a `world`
/// with a mismatched WorldId is unsafe.
#[inline]
pub(crate) unsafe fn iter_unchecked_manual<'w, 's>(
&'s self,
world: &'w World,
last_change_tick: u32,
change_tick: u32,
) -> QueryIter<'w, 's, Q, F> {
QueryIter::new(world, self, last_change_tick, change_tick)
}
#[inline]
pub fn for_each<'w>(
&mut self,
world: &'w World,
func: impl FnMut(<Q::Fetch as Fetch<'w>>::Item),
) where
Q::Fetch: ReadOnlyFetch,
{
// SAFE: query is read only
unsafe {
self.for_each_unchecked(world, func);
}
}
#[inline]
pub fn for_each_mut<'w>(
&mut self,
world: &'w mut World,
func: impl FnMut(<Q::Fetch as Fetch<'w>>::Item),
) {
// SAFE: query has unique world access
unsafe {
self.for_each_unchecked(world, func);
}
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
#[inline]
pub unsafe fn for_each_unchecked<'w>(
&mut self,
world: &'w World,
func: impl FnMut(<Q::Fetch as Fetch<'w>>::Item),
) {
self.validate_world_and_update_archetypes(world);
self.for_each_unchecked_manual(
world,
func,
world.last_change_tick(),
world.read_change_tick(),
);
}
#[inline]
pub fn par_for_each<'w>(
&mut self,
world: &'w World,
task_pool: &TaskPool,
batch_size: usize,
func: impl Fn(<Q::Fetch as Fetch<'w>>::Item) + Send + Sync + Clone,
) where
Q::Fetch: ReadOnlyFetch,
{
// SAFE: query is read only
unsafe {
self.par_for_each_unchecked(world, task_pool, batch_size, func);
}
}
#[inline]
pub fn par_for_each_mut<'w>(
&mut self,
world: &'w mut World,
task_pool: &TaskPool,
batch_size: usize,
func: impl Fn(<Q::Fetch as Fetch<'w>>::Item) + Send + Sync + Clone,
) {
// SAFE: query has unique world access
unsafe {
self.par_for_each_unchecked(world, task_pool, batch_size, func);
}
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
#[inline]
pub unsafe fn par_for_each_unchecked<'w>(
&mut self,
world: &'w World,
task_pool: &TaskPool,
batch_size: usize,
func: impl Fn(<Q::Fetch as Fetch<'w>>::Item) + Send + Sync + Clone,
) {
self.validate_world_and_update_archetypes(world);
self.par_for_each_unchecked_manual(
world,
task_pool,
batch_size,
func,
world.last_change_tick(),
world.read_change_tick(),
);
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
/// This does not validate that `world.id()` matches `self.world_id`. Calling this on a `world`
/// with a mismatched WorldId is unsafe.
pub(crate) unsafe fn for_each_unchecked_manual<'w, 's>(
&'s self,
world: &'w World,
mut func: impl FnMut(<Q::Fetch as Fetch<'w>>::Item),
last_change_tick: u32,
change_tick: u32,
) {
let mut fetch =
<Q::Fetch as Fetch>::init(world, &self.fetch_state, last_change_tick, change_tick);
let mut filter =
<F::Fetch as Fetch>::init(world, &self.filter_state, last_change_tick, change_tick);
if fetch.is_dense() && filter.is_dense() {
let tables = &world.storages().tables;
for table_id in self.matched_table_ids.iter() {
let table = &tables[*table_id];
fetch.set_table(&self.fetch_state, table);
filter.set_table(&self.filter_state, table);
for table_index in 0..table.len() {
if !filter.table_filter_fetch(table_index) {
continue;
}
let item = fetch.table_fetch(table_index);
func(item);
}
}
} else {
let archetypes = &world.archetypes;
let tables = &world.storages().tables;
for archetype_id in self.matched_archetype_ids.iter() {
let archetype = &archetypes[*archetype_id];
fetch.set_archetype(&self.fetch_state, archetype, tables);
filter.set_archetype(&self.filter_state, archetype, tables);
for archetype_index in 0..archetype.len() {
if !filter.archetype_filter_fetch(archetype_index) {
continue;
}
func(fetch.archetype_fetch(archetype_index));
}
}
}
}
/// # Safety
/// This does not check for mutable query correctness. To be safe, make sure mutable queries
/// have unique access to the components they query.
/// This does not validate that `world.id()` matches `self.world_id`. Calling this on a `world`
/// with a mismatched WorldId is unsafe.
pub unsafe fn par_for_each_unchecked_manual<'w, 's>(
&'s self,
world: &'w World,
task_pool: &TaskPool,
batch_size: usize,
func: impl Fn(<Q::Fetch as Fetch<'w>>::Item) + Send + Sync + Clone,
last_change_tick: u32,
change_tick: u32,
) {
task_pool.scope(|scope| {
let fetch =
<Q::Fetch as Fetch>::init(world, &self.fetch_state, last_change_tick, change_tick);
let filter =
<F::Fetch as Fetch>::init(world, &self.filter_state, last_change_tick, change_tick);
if fetch.is_dense() && filter.is_dense() {
let tables = &world.storages().tables;
for table_id in self.matched_table_ids.iter() {
let table = &tables[*table_id];
let mut offset = 0;
while offset < table.len() {
let func = func.clone();
scope.spawn(async move {
let mut fetch = <Q::Fetch as Fetch>::init(
world,
&self.fetch_state,
last_change_tick,
change_tick,
);
let mut filter = <F::Fetch as Fetch>::init(
world,
&self.filter_state,
last_change_tick,
change_tick,
);
let tables = &world.storages().tables;
let table = &tables[*table_id];
fetch.set_table(&self.fetch_state, table);
filter.set_table(&self.filter_state, table);
let len = batch_size.min(table.len() - offset);
for table_index in offset..offset + len {
if !filter.table_filter_fetch(table_index) {
continue;
}
let item = fetch.table_fetch(table_index);
func(item);
}
});
offset += batch_size;
}
}
} else {
let archetypes = &world.archetypes;
for archetype_id in self.matched_archetype_ids.iter() {
let mut offset = 0;
let archetype = &archetypes[*archetype_id];
while offset < archetype.len() {
let func = func.clone();
scope.spawn(async move {
let mut fetch = <Q::Fetch as Fetch>::init(
world,
&self.fetch_state,
last_change_tick,
change_tick,
);
let mut filter = <F::Fetch as Fetch>::init(
world,
&self.filter_state,
last_change_tick,
change_tick,
);
let tables = &world.storages().tables;
let archetype = &world.archetypes[*archetype_id];
fetch.set_archetype(&self.fetch_state, archetype, tables);
filter.set_archetype(&self.filter_state, archetype, tables);
for archetype_index in 0..archetype.len() {
if !filter.archetype_filter_fetch(archetype_index) {
continue;
}
func(fetch.archetype_fetch(archetype_index));
}
});
offset += batch_size;
}
}
}
});
}
}
/// An error that occurs when retrieving a specific [Entity]'s query result.
#[derive(Error, Debug)]
pub enum QueryEntityError {
#[error("The given entity does not have the requested component.")]
QueryDoesNotMatch,
#[error("The requested entity does not exist.")]
NoSuchEntity,
}