mirror of
https://github.com/bevyengine/bevy
synced 2025-02-17 06:28:34 +00:00
# Objective

Fixes #3184. Fixes #6640. Fixes #4798.

Using `Query::par_for_each(_mut)` currently requires a `batch_size` parameter, which controls how large archetypes and tables are split into smaller chunks to run in parallel. Tuning this value is difficult, as the performance characteristics depend entirely on the state of the `World` it's being run on. Typically, users pick a flat constant and tune it by hand until it performs well in some benchmarks. However, this is both error-prone and risks overfitting the tuning to that benchmark.

This PR proposes a naive automatic batch-size computation based on the current state of the `World`.

## Background

`Query::par_for_each(_mut)` schedules a new Task for every archetype or table that it matches. Archetypes/tables larger than the batch size are chunked into smaller tasks. Assuming every entity matched by the query has an identical workload, the worst case scenario is a batch size equal to the size of the largest matched archetype or table, since that archetype or table then runs as a single task. Conversely, a batch size of `max {archetype, table} size / thread count * COUNT_PER_THREAD` is likely the sweet spot where the overhead of scheduling tasks is minimized, at least without grouping small archetypes/tables together.

There is also likely a strict minimum batch size below which the overhead of scheduling these tasks is heavier than running the entire thing single-threaded.

## Solution

- [x] Remove the `batch_size` from `Query(State)::par_for_each` and friends.
- [x] Add a check to compute `batch_size = max {archetype/table} size / thread count * COUNT_PER_THREAD`
- [x] ~~Panic if thread count is 0.~~ Defer to `for_each` if the thread count is 1 or less.
- [x] Early return if there is no matched table/archetype.
- [x] Add an override option for users who have queries that strongly violate the initial assumption that all iterated entities have an equal workload.
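The batch-size rule and the special cases listed under Solution can be sketched in plain Rust without any Bevy types. This is a minimal illustration, not the code in this PR: `BatchPlan`, `plan_batches`, `batches_per_thread`, and `min_batch_size` are hypothetical names, and the sketch assumes the formula is read as the largest matched size divided by `thread count * COUNT_PER_THREAD`, rounded up.

```rust
/// Hypothetical outcome of planning a parallel query (illustrative only).
#[derive(Debug, PartialEq)]
enum BatchPlan {
    /// No table/archetype matched, so there is nothing to schedule.
    Nothing,
    /// One thread or fewer: fall back to a plain sequential `for_each`.
    Sequential,
    /// Split every matched table/archetype into chunks of `batch_size`.
    Parallel { batch_size: usize },
}

/// Sketch of the automatic batch-size computation.
///
/// `matched_sizes` holds the entity count of each matched table/archetype.
/// `batches_per_thread` plays the role of `COUNT_PER_THREAD` (assumption).
/// `min_batch_size` is an assumed lower bound below which scheduling
/// overhead is expected to dominate.
fn plan_batches(
    matched_sizes: &[usize],
    thread_count: usize,
    batches_per_thread: usize,
    min_batch_size: usize,
) -> BatchPlan {
    // Early return if there is no matched table/archetype (or all are empty).
    let max_size = match matched_sizes.iter().copied().max() {
        Some(size) if size > 0 => size,
        _ => return BatchPlan::Nothing,
    };

    // Defer to sequential iteration if the thread count is 1 or less.
    if thread_count <= 1 {
        return BatchPlan::Sequential;
    }

    // Total number of batches the largest table/archetype is split into.
    let batches = thread_count * batches_per_thread.max(1);

    // Ceiling division so the largest table never needs more than `batches` tasks.
    let batch_size = (max_size + batches - 1) / batches;

    BatchPlan::Parallel {
        batch_size: batch_size.max(min_batch_size).max(1),
    }
}

fn main() {
    // Two matched tables with 128 and 40 entities, 4 threads, 1 batch per thread.
    println!("{:?}", plan_batches(&[128, 40], 4, 1, 1));
    // A single available thread falls back to sequential iteration.
    println!("{:?}", plan_batches(&[128], 1, 1, 1));
    // No matched tables at all.
    println!("{:?}", plan_batches(&[], 4, 1, 1));
}
```

Returning an enum keeps the two special cases (nothing matched, single thread) separate from the computed batch size, mirroring the checklist items above. A caller with uneven per-entity workloads would bypass this computation with the override option.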
---

## Changelog

Changed: `Query::par_for_each(_mut)` has been changed to `Query::par_iter(_mut)` and will now automatically try to produce a batch size for callers based on the current `World` state.

## Migration Guide

The `batch_size` parameter for `Query(State)::par_for_each(_mut)` has been removed. These calls will automatically compute a batch size for you. Remove these parameters from all calls to these functions.

Before:

```rust
fn parallel_system(query: Query<&MyComponent>) {
    query.par_for_each(32, |comp| {
        ...
    });
}
```

After:

```rust
fn parallel_system(query: Query<&MyComponent>) {
    query.par_iter().for_each(|comp| {
        ...
    });
}
```

Co-authored-by: Arnav Choubey <56453634+x-52@users.noreply.github.com>
Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: François <mockersf@gmail.com>
Co-authored-by: Corey Farwell <coreyf@rwell.org>
Co-authored-by: Aevyrie <aevyrie@gmail.com>
76 lines
2.6 KiB
Rust
//! Illustrates parallel queries with `ParallelIterator`.

use bevy::ecs::query::BatchingStrategy;
use bevy::prelude::*;
use rand::random;

#[derive(Component, Deref)]
struct Velocity(Vec2);

fn spawn_system(mut commands: Commands, asset_server: Res<AssetServer>) {
    commands.spawn(Camera2dBundle::default());
    let texture = asset_server.load("branding/icon.png");
    for _ in 0..128 {
        commands.spawn((
            SpriteBundle {
                texture: texture.clone(),
                transform: Transform::from_scale(Vec3::splat(0.1)),
                ..default()
            },
            Velocity(20.0 * Vec2::new(random::<f32>() - 0.5, random::<f32>() - 0.5)),
        ));
    }
}

// Move sprites according to their velocity
fn move_system(mut sprites: Query<(&mut Transform, &Velocity)>) {
    // Compute the new location of each sprite in parallel on the
    // ComputeTaskPool
    //
    // This example is only for demonstrative purposes. Using a
    // ParallelIterator for an inexpensive operation like addition on only 128
    // elements will not typically be faster than just using a normal Iterator.
    // See the ParallelIterator documentation for more information on when
    // to use or not use ParallelIterator over a normal Iterator.
    sprites
        .par_iter_mut()
        .for_each_mut(|(mut transform, velocity)| {
            transform.translation += velocity.extend(0.0);
        });
}

// Bounce sprites outside the window
fn bounce_system(windows: Query<&Window>, mut sprites: Query<(&Transform, &mut Velocity)>) {
    let window = windows.single();
    let width = window.width();
    let height = window.height();
    let left = width / -2.0;
    let right = width / 2.0;
    let bottom = height / -2.0;
    let top = height / 2.0;
    // The default batch size can also be overridden.
    // In this case a batch size of 32 is chosen to limit the overhead of
    // ParallelIterator, since negating a vector is very inexpensive.
    sprites
        .par_iter_mut()
        .batching_strategy(BatchingStrategy::fixed(32))
        .for_each_mut(|(transform, mut v)| {
            if !(left < transform.translation.x
                && transform.translation.x < right
                && bottom < transform.translation.y
                && transform.translation.y < top)
            {
                // For simplicity, just reverse the velocity; don't use realistic bounces
                v.0 = -v.0;
            }
        });
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_startup_system(spawn_system)
        .add_system(move_system)
        .add_system(bounce_system)
        .run();
}