bevy/examples/stress_tests/bevymark.rs
Robert Swain b6ead2be95
Use EntityHashMap<Entity, T> for render world entity storage for better performance (#9903)
# Objective

- Improve rendering performance, particularly by avoiding the large
system commands costs of using the ECS in the way that the render world
does.

## Solution

- Define `EntityHasher` that calculates a hash from the
`Entity.to_bits()` by `i | (i.wrapping_mul(0x517cc1b727220a95) << 32)`.
`0x517cc1b727220a95` is approximately `u64::MAX / π`, a constant that
works well for hashing. Thanks to @SkiFire13
for the suggestion and to @nicopap for alternative suggestions and
discussion. This approach comes from `rustc-hash` (a.k.a. `FxHasher`)
with some tweaks for the case of hashing an `Entity`. `FxHasher` and
`SeaHasher` were also tested but were significantly slower.
- Define `EntityHashMap` type that uses the `EntityHasher`
- Use `EntityHashMap<Entity, T>` for render world entity storage,
including:
  - `RenderMaterialInstances` - contains the `AssetId<M>` of the material
    associated with the entity. Also for 2D.
  - `RenderMeshInstances` - contains mesh transforms, flags and properties
    about mesh entities. Also for 2D.
  - `SkinIndices` and `MorphIndices` - contain the skin and morph index
    for an entity, respectively
  - `ExtractedSprites`
  - `ExtractedUiNodes`
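
The hashing scheme described in the list above can be sketched with only the standard library. This is a minimal illustration, not the Bevy implementation: `SketchEntityHasher`, `SketchEntityHashMap` and `hash_bits` are hypothetical names, and a plain `u64` stands in for the value returned by `Entity.to_bits()`.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Multiplier quoted in the description above.
const MULTIPLIER: u64 = 0x517c_c1b7_2722_0a95;

/// Hypothetical stand-in for the hasher described above (not the Bevy type).
#[derive(Default)]
struct SketchEntityHasher {
    hash: u64,
}

impl Hasher for SketchEntityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        // This sketch only supports keys that hash themselves via `write_u64`.
        panic!("only `write_u64` is supported by this sketch");
    }

    fn write_u64(&mut self, i: u64) {
        // The low 32 bits of `i` are kept as-is; the upper 32 bits are mixed
        // with the low bits of the wrapping product.
        self.hash = i | (i.wrapping_mul(MULTIPLIER) << 32);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Hash map keyed by raw entity bits (assumption: `u64` keys instead of `Entity`).
type SketchEntityHashMap<V> = HashMap<u64, V, BuildHasherDefault<SketchEntityHasher>>;

/// Convenience helper that hashes a single `u64` with the sketch hasher.
fn hash_bits(bits: u64) -> u64 {
    let mut hasher = SketchEntityHasher::default();
    hasher.write_u64(bits);
    hasher.finish()
}
```

A map built with `SketchEntityHashMap::<T>::default()` behaves like any other `HashMap`; only the hash function differs. Because the shifted product has zeros in its low 32 bits, the low 32 bits of the hash always equal the low 32 bits of the input.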

## Benchmarks

All benchmarks have been conducted on an M1 Max connected to AC power.
The tests are run for 1500 frames. The 1000th frame is captured for
comparison to check for visual regressions. There were none.

### 2D Meshes

`bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d`

#### `--ordered-z`

This test spawns the 2D meshes with z incrementing back to front, which
is the ideal allocation order: it matches the sorted render order, so
lookups have a high cache hit rate.

<img width="1112" alt="Screenshot 2023-09-27 at 07 50 45"
src="https://github.com/bevyengine/bevy/assets/302146/e140bc98-7091-4a3b-8ae1-ab75d16d2ccb">

-39.1% median frame time.
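
The percentages in this section read as the relative change in median frame time against `main`. A minimal sketch of that arithmetic follows, assuming per-frame times in milliseconds have already been collected for both runs; the function names are hypothetical and this is not the tool used to produce the numbers above.

```rust
/// Median of a non-empty slice of frame times (milliseconds).
fn median(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        // Even count: average the two middle samples.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Relative change of the candidate's median versus the baseline's, in percent.
/// A negative value means the candidate has a lower (better) median frame time.
fn relative_change_percent(baseline: &[f64], candidate: &[f64]) -> f64 {
    let before = median(baseline);
    let after = median(candidate);
    (after - before) / before * 100.0
}
```

With made-up samples, a baseline median of 8 ms and a candidate median of 6 ms give a change of -25%.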

#### Random

This test spawns the 2D meshes with random z. This not only makes the
batching and transparent 2D pass lookups get a lot of cache misses, it
also currently means that the meshes are almost certain to not be
batchable.

<img width="1108" alt="Screenshot 2023-09-27 at 07 51 28"
src="https://github.com/bevyengine/bevy/assets/302146/29c2e813-645a-43ce-982a-55df4bf7d8c4">

-7.2% median frame time.

### 3D Meshes

`many_cubes --benchmark`

<img width="1112" alt="Screenshot 2023-09-27 at 07 51 57"
src="https://github.com/bevyengine/bevy/assets/302146/1a729673-3254-4e2a-9072-55e27c69f0fc">

-7.7% median frame time.

### Sprites

**NOTE: On `main` sprites are using `SparseSet<Entity, T>`!**

`bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite`

#### `--ordered-z`

This test spawns the sprites with z incrementing back to front, which is
the ideal allocation order: it matches the sorted render order, so
lookups have a high cache hit rate.

<img width="1116" alt="Screenshot 2023-09-27 at 07 52 31"
src="https://github.com/bevyengine/bevy/assets/302146/bc8eab90-e375-4d31-b5cd-f55f6f59ab67">

+13.0% median frame time.

#### Random

This test spawns the sprites with random z. This makes the batching and
transparent 2D pass lookups get a lot of cache misses.

<img width="1109" alt="Screenshot 2023-09-27 at 07 53 01"
src="https://github.com/bevyengine/bevy/assets/302146/22073f5d-99a7-49b0-9584-d3ac3eac3033">

+0.6% median frame time.

### UI

**NOTE: On `main` UI is using `SparseSet<Entity, T>`!**

`many_buttons`

<img width="1111" alt="Screenshot 2023-09-27 at 07 53 26"
src="https://github.com/bevyengine/bevy/assets/302146/66afd56d-cbe4-49e7-8b64-2f28f6043d85">

+15.1% median frame time.

## Alternatives

- Cart originally suggested trying out `SparseSet<Entity, T>` and indeed
that is slightly faster under ideal conditions. However,
`PassHashMap<Entity, T>` has better worst case performance when data is
randomly distributed, rather than in sorted render order, and it avoids
the worst case memory usage of `SparseSet`'s dense `Vec<usize>`, which
maps from the `Entity` index to the sparse index into `Vec<T>`. This
dense `Vec` has to be as large as the largest `Entity` index used with the
`SparseSet`.
- I also tested `PassHashMap<u32, T>`, intending to use `Entity.index()`
as the key, but this proved to sometimes be slower and mostly no
different.
- The only outstanding approach that has not been implemented and tested
is to _not_ clear the render world of its entities each frame. That has
its own problems, though they could perhaps be solved.
  - Performance-wise, if the entities and their component data were not
    cleared, then they would incur table moves only on spawn; thereafter
    their component data would simply be overwritten.
    Ideally we would have a neat way of either updating data in-place via
    `&mut T` queries, or inserting components if not present. This would
    likely be quite cumbersome to have to remember to do everywhere, but
    perhaps it only needs to be done in the more performance-sensitive
    systems.
  - The main problem to solve however is that we want to maintain a
    mapping between main world entities and render world entities, be able
    to run the render app and world in parallel with the main app and world
    for pipelined rendering, and at the same time be able to spawn entities
    in the render world in such a way that those `Entity` ids do not collide
    with those spawned in the main world. This is potentially quite
    solvable, but could well be a lot of ECS work to do it in a way that
    makes sense.
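
The memory concern with the sparse set alternative can be illustrated with a small std-only sketch. `SketchSparseSet` is a hypothetical, simplified structure and not Bevy's `SparseSet`; it only shows why the index table must grow to the largest entity index that is ever inserted.

```rust
/// Minimal sparse set keyed by an entity index (hypothetical, simplified).
struct SketchSparseSet<T> {
    /// Indexed directly by entity index; holds the slot of the value in `values`.
    indices: Vec<Option<usize>>,
    /// Densely packed values.
    values: Vec<T>,
}

impl<T> SketchSparseSet<T> {
    fn new() -> Self {
        Self {
            indices: Vec::new(),
            values: Vec::new(),
        }
    }

    fn insert(&mut self, entity_index: usize, value: T) {
        // The index table has to cover every index up to `entity_index`.
        if entity_index >= self.indices.len() {
            self.indices.resize(entity_index + 1, None);
        }
        match self.indices[entity_index] {
            Some(slot) => self.values[slot] = value,
            None => {
                self.indices[entity_index] = Some(self.values.len());
                self.values.push(value);
            }
        }
    }

    fn get(&self, entity_index: usize) -> Option<&T> {
        let slot = (*self.indices.get(entity_index)?)?;
        self.values.get(slot)
    }

    /// Number of entries in the index table, regardless of how many values are stored.
    fn index_table_len(&self) -> usize {
        self.indices.len()
    }
}
```

Inserting a single value at entity index 1,000,000 leaves one stored value but an index table of 1,000,001 entries, whereas a hash map holding the same single key only needs storage proportional to the number of entries.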

---

## Changelog

- Changed: Component data for entities to be drawn is no longer stored
on entities in the render world. Instead, data is stored in an
`EntityHashMap<Entity, T>` in various resources. This brings significant
performance benefits due to the way the render app clears entities every
frame. Resources of most interest are `RenderMeshInstances` and
`RenderMaterialInstances`, and their 2D counterparts.

## Migration Guide

Previously the render app extracted mesh entities and their component
data from the main world and stored them as entities and components in
the render world. Now they are extracted into essentially
`EntityHashMap<Entity, T>` where `T` are structs containing an
appropriate group of data. This means that while extract set systems
will continue to run extract queries against the main world they will
store their data in hash maps. Also, systems in later sets will either
need to look up entities in the available resources such as
`RenderMeshInstances`, or maintain their own `EntityHashMap<Entity, T>`
for their own data.

Before:
```rust
fn queue_custom(
    material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<InstanceMaterialData>>,
) {
    ...
    for (entity, mesh_transforms, mesh_handle) in &material_meshes {
        ...
    }
}
```

After:
```rust
fn queue_custom(
    render_mesh_instances: Res<RenderMeshInstances>,
    instance_entities: Query<Entity, With<InstanceMaterialData>>,
) {
    ...
    for entity in &instance_entities {
        let Some(mesh_instance) = render_mesh_instances.get(&entity) else { continue; };
        // The mesh handle in `AssetId<Mesh>` form, and the `MeshTransforms` can now
        // be found in `mesh_instance` which is a `RenderMeshInstance`
        ...
    }
}
```
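
For systems that maintain their own map for their own data, the pattern might look like the following std-only sketch. Everything here is hypothetical: `Entity` is a stand-in newtype rather than Bevy's type, a plain `HashMap` replaces `EntityHashMap`, the functions take slices instead of queries, and clearing the map at the start of each extraction is an assumption made to mirror the per-frame clearing described above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an entity id (not Bevy's `Entity`).
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Entity(u64);

/// Hypothetical per-entity data that a custom pipeline wants to keep.
struct InstanceData {
    color: [f32; 4],
}

/// Plays the role of a render-world resource wrapping the map.
#[derive(Default)]
struct ExtractedInstances(HashMap<Entity, InstanceData>);

/// Extract step: rebuild the map from this frame's main-world data.
fn extract_instances(extracted: &mut ExtractedInstances, main_world: &[(Entity, [f32; 4])]) {
    // Assumption: stale entries from the previous frame are dropped first.
    extracted.0.clear();
    for &(entity, color) in main_world {
        extracted.0.insert(entity, InstanceData { color });
    }
}

/// Later step: look entities up in the map and skip those with no data.
fn queue_instances(extracted: &ExtractedInstances, visible: &[Entity]) -> usize {
    let mut queued = 0;
    for entity in visible {
        let Some(data) = extracted.0.get(entity) else {
            continue;
        };
        // A real system would build draw calls from `data` here.
        let _alpha = data.color[3];
        queued += 1;
    }
    queued
}
```

The lookup with `let ... else { continue; }` mirrors the "After" example: entities without an entry in the map are simply skipped.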

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
2023-09-27 08:28:28 +00:00

577 lines
17 KiB
Rust

//! This example provides a 2D benchmark.
//!
//! Usage: spawn more entities by clicking on the screen.

use std::str::FromStr;

use argh::FromArgs;
use bevy::{
    diagnostic::{DiagnosticsStore, FrameTimeDiagnosticsPlugin, LogDiagnosticsPlugin},
    prelude::*,
    render::render_resource::{Extent3d, TextureDimension, TextureFormat},
    sprite::{MaterialMesh2dBundle, Mesh2dHandle},
    window::{PresentMode, WindowResolution},
};
use rand::{rngs::StdRng, seq::SliceRandom, Rng, SeedableRng};

const BIRDS_PER_SECOND: u32 = 10000;
const GRAVITY: f32 = -9.8 * 100.0;
const MAX_VELOCITY: f32 = 750.;
const BIRD_SCALE: f32 = 0.15;
const BIRD_TEXTURE_SIZE: usize = 256;
const HALF_BIRD_SIZE: f32 = BIRD_TEXTURE_SIZE as f32 * BIRD_SCALE * 0.5;

#[derive(Resource)]
struct BevyCounter {
    pub count: usize,
    pub color: Color,
}

#[derive(Component)]
struct Bird {
    velocity: Vec3,
}

#[derive(FromArgs, Resource)]
/// `bevymark` sprite / 2D mesh stress test
struct Args {
    /// whether to use sprite or mesh2d
    #[argh(option, default = "Mode::Sprite")]
    mode: Mode,

    /// whether to step animations by a fixed amount such that each frame is the same across runs.
    /// If spawning waves, all are spawned up-front to immediately start rendering at the heaviest
    /// load.
    #[argh(switch)]
    benchmark: bool,

    /// how many birds to spawn per wave.
    #[argh(option, default = "0")]
    per_wave: usize,

    /// the number of waves to spawn.
    #[argh(option, default = "0")]
    waves: usize,

    /// whether to vary the material data in each instance.
    #[argh(switch)]
    vary_per_instance: bool,

    /// the number of different textures from which to randomly select the material color. 0 means no textures.
    #[argh(option, default = "1")]
    material_texture_count: usize,

    /// generate z values in increasing order rather than randomly
    #[argh(switch)]
    ordered_z: bool,
}

#[derive(Default, Clone)]
enum Mode {
    #[default]
    Sprite,
    Mesh2d,
}

impl FromStr for Mode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sprite" => Ok(Self::Sprite),
            "mesh2d" => Ok(Self::Mesh2d),
            _ => Err(format!(
                "Unknown mode: '{s}', valid modes: 'sprite', 'mesh2d'"
            )),
        }
    }
}

const FIXED_TIMESTEP: f32 = 0.2;

fn main() {
    let args: Args = argh::from_env();

    App::new()
        .add_plugins((
            DefaultPlugins.set(WindowPlugin {
                primary_window: Some(Window {
                    title: "BevyMark".into(),
                    resolution: WindowResolution::new(1920.0, 1080.0)
                        .with_scale_factor_override(1.0),
                    present_mode: PresentMode::AutoNoVsync,
                    ..default()
                }),
                ..default()
            }),
            FrameTimeDiagnosticsPlugin,
            LogDiagnosticsPlugin::default(),
        ))
        .insert_resource(args)
        .insert_resource(BevyCounter {
            count: 0,
            color: Color::WHITE,
        })
        .add_systems(Startup, setup)
        .add_systems(FixedUpdate, scheduled_spawner)
        .add_systems(
            Update,
            (
                mouse_handler,
                movement_system,
                collision_system,
                counter_system,
            ),
        )
        .insert_resource(FixedTime::new_from_secs(FIXED_TIMESTEP))
        .run();
}

#[derive(Resource)]
struct BirdScheduled {
    waves: usize,
    per_wave: usize,
}

fn scheduled_spawner(
    mut commands: Commands,
    args: Res<Args>,
    windows: Query<&Window>,
    mut scheduled: ResMut<BirdScheduled>,
    mut counter: ResMut<BevyCounter>,
    bird_resources: ResMut<BirdResources>,
) {
    let window = windows.single();

    if scheduled.waves > 0 {
        let bird_resources = bird_resources.into_inner();
        spawn_birds(
            &mut commands,
            args.into_inner(),
            &window.resolution,
            &mut counter,
            scheduled.per_wave,
            bird_resources,
            None,
            scheduled.waves - 1,
        );

        scheduled.waves -= 1;
    }
}

#[derive(Resource)]
struct BirdResources {
    textures: Vec<Handle<Image>>,
    materials: Vec<Handle<ColorMaterial>>,
    quad: Mesh2dHandle,
    color_rng: StdRng,
    material_rng: StdRng,
    velocity_rng: StdRng,
    transform_rng: StdRng,
}

#[derive(Component)]
struct StatsText;

#[allow(clippy::too_many_arguments)]
fn setup(
    mut commands: Commands,
    args: Res<Args>,
    asset_server: Res<AssetServer>,
    mut meshes: ResMut<Assets<Mesh>>,
    material_assets: ResMut<Assets<ColorMaterial>>,
    images: ResMut<Assets<Image>>,
    windows: Query<&Window>,
    counter: ResMut<BevyCounter>,
) {
    warn!(include_str!("warning_string.txt"));

    let args = args.into_inner();
    let images = images.into_inner();

    let mut textures = Vec::with_capacity(args.material_texture_count.max(1));
    if matches!(args.mode, Mode::Sprite) || args.material_texture_count > 0 {
        textures.push(asset_server.load("branding/icon.png"));
    }
    init_textures(&mut textures, args, images);

    let material_assets = material_assets.into_inner();
    let materials = init_materials(args, &textures, material_assets);

    let mut bird_resources = BirdResources {
        textures,
        materials,
        quad: meshes
            .add(Mesh::from(shape::Quad::new(Vec2::splat(
                BIRD_TEXTURE_SIZE as f32,
            ))))
            .into(),
        color_rng: StdRng::seed_from_u64(42),
        material_rng: StdRng::seed_from_u64(42),
        velocity_rng: StdRng::seed_from_u64(42),
        transform_rng: StdRng::seed_from_u64(42),
    };

    let text_section = move |color, value: &str| {
        TextSection::new(
            value,
            TextStyle {
                font_size: 40.0,
                color,
                ..default()
            },
        )
    };

    commands.spawn(Camera2dBundle::default());
    commands
        .spawn(NodeBundle {
            style: Style {
                position_type: PositionType::Absolute,
                padding: UiRect::all(Val::Px(5.0)),
                ..default()
            },
            z_index: ZIndex::Global(i32::MAX),
            background_color: Color::BLACK.with_a(0.75).into(),
            ..default()
        })
        .with_children(|c| {
            c.spawn((
                TextBundle::from_sections([
                    text_section(Color::GREEN, "Bird Count: "),
                    text_section(Color::CYAN, ""),
                    text_section(Color::GREEN, "\nFPS (raw): "),
                    text_section(Color::CYAN, ""),
                    text_section(Color::GREEN, "\nFPS (SMA): "),
                    text_section(Color::CYAN, ""),
                    text_section(Color::GREEN, "\nFPS (EMA): "),
                    text_section(Color::CYAN, ""),
                ]),
                StatsText,
            ));
        });

    let mut scheduled = BirdScheduled {
        per_wave: args.per_wave,
        waves: args.waves,
    };

    if args.benchmark {
        let counter = counter.into_inner();
        for wave in (0..scheduled.waves).rev() {
            spawn_birds(
                &mut commands,
                args,
                &windows.single().resolution,
                counter,
                scheduled.per_wave,
                &mut bird_resources,
                Some(wave),
                wave,
            );
        }
        scheduled.waves = 0;
    }
    commands.insert_resource(bird_resources);
    commands.insert_resource(scheduled);
}

#[allow(clippy::too_many_arguments)]
fn mouse_handler(
    mut commands: Commands,
    args: Res<Args>,
    time: Res<Time>,
    mouse_button_input: Res<Input<MouseButton>>,
    windows: Query<&Window>,
    bird_resources: ResMut<BirdResources>,
    mut counter: ResMut<BevyCounter>,
    mut rng: Local<Option<StdRng>>,
    mut wave: Local<usize>,
) {
    if rng.is_none() {
        *rng = Some(StdRng::seed_from_u64(42));
    }
    let rng = rng.as_mut().unwrap();
    let window = windows.single();

    if mouse_button_input.just_released(MouseButton::Left) {
        counter.color = Color::rgb_linear(rng.gen(), rng.gen(), rng.gen());
    }

    if mouse_button_input.pressed(MouseButton::Left) {
        let spawn_count = (BIRDS_PER_SECOND as f64 * time.delta_seconds_f64()) as usize;
        spawn_birds(
            &mut commands,
            args.into_inner(),
            &window.resolution,
            &mut counter,
            spawn_count,
            bird_resources.into_inner(),
            None,
            *wave,
        );
        *wave += 1;
    }
}

fn bird_velocity_transform(
    half_extents: Vec2,
    mut translation: Vec3,
    velocity_rng: &mut StdRng,
    waves: Option<usize>,
    dt: f32,
) -> (Transform, Vec3) {
    let mut velocity = Vec3::new(MAX_VELOCITY * (velocity_rng.gen::<f32>() - 0.5), 0., 0.);

    if let Some(waves) = waves {
        // Step the movement and handle collisions as if the wave had been spawned at fixed time intervals
        // and with dt-spaced frames of simulation
        for _ in 0..(waves * (FIXED_TIMESTEP / dt).round() as usize) {
            step_movement(&mut translation, &mut velocity, dt);
            handle_collision(half_extents, &translation, &mut velocity);
        }
    }
    (
        Transform::from_translation(translation).with_scale(Vec3::splat(BIRD_SCALE)),
        velocity,
    )
}

const FIXED_DELTA_TIME: f32 = 1.0 / 60.0;

#[allow(clippy::too_many_arguments)]
fn spawn_birds(
    commands: &mut Commands,
    args: &Args,
    primary_window_resolution: &WindowResolution,
    counter: &mut BevyCounter,
    spawn_count: usize,
    bird_resources: &mut BirdResources,
    waves_to_simulate: Option<usize>,
    wave: usize,
) {
    let bird_x = (primary_window_resolution.width() / -2.) + HALF_BIRD_SIZE;
    let bird_y = (primary_window_resolution.height() / 2.) - HALF_BIRD_SIZE;

    let half_extents = 0.5
        * Vec2::new(
            primary_window_resolution.width(),
            primary_window_resolution.height(),
        );

    let color = counter.color;
    let current_count = counter.count;

    match args.mode {
        Mode::Sprite => {
            let batch = (0..spawn_count)
                .map(|count| {
                    let bird_z = if args.ordered_z {
                        (current_count + count) as f32 * 0.00001
                    } else {
                        bird_resources.transform_rng.gen::<f32>()
                    };

                    let (transform, velocity) = bird_velocity_transform(
                        half_extents,
                        Vec3::new(bird_x, bird_y, bird_z),
                        &mut bird_resources.velocity_rng,
                        waves_to_simulate,
                        FIXED_DELTA_TIME,
                    );

                    let color = if args.vary_per_instance {
                        Color::rgb_linear(
                            bird_resources.color_rng.gen(),
                            bird_resources.color_rng.gen(),
                            bird_resources.color_rng.gen(),
                        )
                    } else {
                        color
                    };
                    (
                        SpriteBundle {
                            texture: bird_resources
                                .textures
                                .choose(&mut bird_resources.material_rng)
                                .unwrap()
                                .clone(),
                            transform,
                            sprite: Sprite { color, ..default() },
                            ..default()
                        },
                        Bird { velocity },
                    )
                })
                .collect::<Vec<_>>();
            commands.spawn_batch(batch);
        }
        Mode::Mesh2d => {
            let batch = (0..spawn_count)
                .map(|count| {
                    let bird_z = if args.ordered_z {
                        (current_count + count) as f32 * 0.00001
                    } else {
                        bird_resources.transform_rng.gen::<f32>()
                    };

                    let (transform, velocity) = bird_velocity_transform(
                        half_extents,
                        Vec3::new(bird_x, bird_y, bird_z),
                        &mut bird_resources.velocity_rng,
                        waves_to_simulate,
                        FIXED_DELTA_TIME,
                    );

                    let material =
                        if args.vary_per_instance || args.material_texture_count > args.waves {
                            bird_resources
                                .materials
                                .choose(&mut bird_resources.material_rng)
                                .unwrap()
                                .clone()
                        } else {
                            bird_resources.materials[wave % bird_resources.materials.len()].clone()
                        };
                    (
                        MaterialMesh2dBundle {
                            mesh: bird_resources.quad.clone(),
                            material,
                            transform,
                            ..default()
                        },
                        Bird { velocity },
                    )
                })
                .collect::<Vec<_>>();
            commands.spawn_batch(batch);
        }
    }

    counter.count += spawn_count;
    counter.color = Color::rgb_linear(
        bird_resources.color_rng.gen(),
        bird_resources.color_rng.gen(),
        bird_resources.color_rng.gen(),
    );
}

fn step_movement(translation: &mut Vec3, velocity: &mut Vec3, dt: f32) {
    translation.x += velocity.x * dt;
    translation.y += velocity.y * dt;
    velocity.y += GRAVITY * dt;
}

fn movement_system(
    args: Res<Args>,
    time: Res<Time>,
    mut bird_query: Query<(&mut Bird, &mut Transform)>,
) {
    let dt = if args.benchmark {
        FIXED_DELTA_TIME
    } else {
        time.delta_seconds()
    };
    for (mut bird, mut transform) in &mut bird_query {
        step_movement(&mut transform.translation, &mut bird.velocity, dt);
    }
}

fn handle_collision(half_extents: Vec2, translation: &Vec3, velocity: &mut Vec3) {
    if (velocity.x > 0. && translation.x + HALF_BIRD_SIZE > half_extents.x)
        || (velocity.x <= 0. && translation.x - HALF_BIRD_SIZE < -(half_extents.x))
    {
        velocity.x = -velocity.x;
    }
    let velocity_y = velocity.y;
    if velocity_y < 0. && translation.y - HALF_BIRD_SIZE < -half_extents.y {
        velocity.y = -velocity_y;
    }
    if translation.y + HALF_BIRD_SIZE > half_extents.y && velocity_y > 0.0 {
        velocity.y = 0.0;
    }
}

fn collision_system(windows: Query<&Window>, mut bird_query: Query<(&mut Bird, &Transform)>) {
    let window = windows.single();
    let half_extents = 0.5 * Vec2::new(window.width(), window.height());

    for (mut bird, transform) in &mut bird_query {
        handle_collision(half_extents, &transform.translation, &mut bird.velocity);
    }
}

fn counter_system(
    diagnostics: Res<DiagnosticsStore>,
    counter: Res<BevyCounter>,
    mut query: Query<&mut Text, With<StatsText>>,
) {
    let mut text = query.single_mut();

    if counter.is_changed() {
        text.sections[1].value = counter.count.to_string();
    }

    if let Some(fps) = diagnostics.get(FrameTimeDiagnosticsPlugin::FPS) {
        if let Some(raw) = fps.value() {
            text.sections[3].value = format!("{raw:.2}");
        }
        if let Some(sma) = fps.average() {
            text.sections[5].value = format!("{sma:.2}");
        }
        if let Some(ema) = fps.smoothed() {
            text.sections[7].value = format!("{ema:.2}");
        }
    };
}

fn init_textures(textures: &mut Vec<Handle<Image>>, args: &Args, images: &mut Assets<Image>) {
    let mut color_rng = StdRng::seed_from_u64(42);
    while textures.len() < args.material_texture_count {
        let pixel = [color_rng.gen(), color_rng.gen(), color_rng.gen(), 255];
        textures.push(images.add(Image::new_fill(
            Extent3d {
                width: BIRD_TEXTURE_SIZE as u32,
                height: BIRD_TEXTURE_SIZE as u32,
                depth_or_array_layers: 1,
            },
            TextureDimension::D2,
            &pixel,
            TextureFormat::Rgba8UnormSrgb,
        )));
    }
}

fn init_materials(
    args: &Args,
    textures: &[Handle<Image>],
    assets: &mut Assets<ColorMaterial>,
) -> Vec<Handle<ColorMaterial>> {
    let capacity = if args.vary_per_instance {
        args.per_wave * args.waves
    } else {
        args.material_texture_count.max(args.waves)
    }
    .max(1);

    let mut materials = Vec::with_capacity(capacity);
    materials.push(assets.add(ColorMaterial {
        color: Color::WHITE,
        texture: textures.get(0).cloned(),
    }));

    let mut color_rng = StdRng::seed_from_u64(42);
    let mut texture_rng = StdRng::seed_from_u64(42);
    materials.extend(
        std::iter::repeat_with(|| {
            assets.add(ColorMaterial {
                color: Color::rgb_u8(color_rng.gen(), color_rng.gen(), color_rng.gen()),
                texture: textures.choose(&mut texture_rng).cloned(),
            })
        })
        .take(capacity - materials.len()),
    );

    materials
}