Use EntityHashMap<Entity, T> for render world entity storage for better performance (#9903)

# Objective

- Improve rendering performance, particularly by avoiding the large
system commands costs of using the ECS in the way that the render world
does.

## Solution

- Define `EntityHasher`, which calculates a hash from
`Entity.to_bits()` as `i | (i.wrapping_mul(0x517cc1b727220a95) << 32)`.
`0x517cc1b727220a95` is roughly `u64::MAX / N` for an N close to π, a
constant that works well for hashing. Thanks to @SkiFire13
for the suggestion and to @nicopap for alternative suggestions and
discussion. This approach comes from `rustc-hash` (a.k.a. `FxHasher`)
with some tweaks for the case of hashing an `Entity`. `FxHasher` and
`SeaHasher` were also tested but were significantly slower.
- Define an `EntityHashMap` type that uses the `EntityHasher`.
- Use `EntityHashMap<Entity, T>` for render world entity storage,
including:
  - `RenderMaterialInstances` - contains the `AssetId<M>` of the material
    associated with the entity. Also for 2D.
  - `RenderMeshInstances` - contains mesh transforms, flags and properties
    about mesh entities. Also for 2D.
  - `SkinIndices` and `MorphIndices` - contains the skin and morph index
    for an entity, respectively
  - `ExtractedSprites`
  - `ExtractedUiNodes`
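
The hashing scheme described above can be sketched with the standard library alone. This is a minimal illustration, not the code added by this commit: the `entity_bits` and `hash_entity_bits` helpers are hypothetical, and the bit layout (generation in the high 32 bits, index in the low 32 bits) is an assumption about what `Entity.to_bits()` returns.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Sketch of a hasher specialised for a single `u64` key.
#[derive(Default)]
pub struct EntityHasher {
    hash: u64,
}

impl Hasher for EntityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        // This sketch only handles keys that hash themselves via `write_u64`.
        panic!("EntityHasher sketch only supports write_u64");
    }

    #[inline]
    fn write_u64(&mut self, i: u64) {
        // The low 32 bits of `i` pass through unchanged; the low 32 bits of
        // the wrapping product are OR-ed into the upper half.
        self.hash = i | (i.wrapping_mul(0x517c_c1b7_2722_0a95) << 32);
    }

    #[inline]
    fn finish(&self) -> u64 {
        self.hash
    }
}

/// A `HashMap` that uses the sketch hasher above.
pub type EntityHashMap<K, V> = HashMap<K, V, BuildHasherDefault<EntityHasher>>;

/// Hypothetical helper (assumed layout): generation high, index low.
pub fn entity_bits(generation: u32, index: u32) -> u64 {
    ((generation as u64) << 32) | index as u64
}

/// Hypothetical helper that runs one `u64` through the hasher.
pub fn hash_entity_bits(bits: u64) -> u64 {
    let mut hasher = EntityHasher::default();
    hasher.write_u64(bits);
    hasher.finish()
}
```

In this sketch a map keyed by the raw bits can be created with `EntityHashMap::<u64, T>::default()`, since `u64`'s `Hash` implementation calls `write_u64` exactly once.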

## Benchmarks

All benchmarks have been conducted on an M1 Max connected to AC power.
The tests are run for 1500 frames. The 1000th frame is captured for
comparison to check for visual regressions. There were none.

### 2D Meshes

`bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d`

#### `--ordered-z`

This test spawns the 2D meshes with z incrementing back to front, which
is the ideal allocation order: it matches the sorted render order, so
lookups have a high cache hit rate.

<img width="1112" alt="Screenshot 2023-09-27 at 07 50 45"
src="https://github.com/bevyengine/bevy/assets/302146/e140bc98-7091-4a3b-8ae1-ab75d16d2ccb">

-39.1% median frame time.

#### Random

This test spawns the 2D meshes with random z. This not only makes the
batching and transparent 2D pass lookups get a lot of cache misses, it
also currently means that the meshes are almost certain to not be
batchable.

<img width="1108" alt="Screenshot 2023-09-27 at 07 51 28"
src="https://github.com/bevyengine/bevy/assets/302146/29c2e813-645a-43ce-982a-55df4bf7d8c4">

-7.2% median frame time.

### 3D Meshes

`many_cubes --benchmark`

<img width="1112" alt="Screenshot 2023-09-27 at 07 51 57"
src="https://github.com/bevyengine/bevy/assets/302146/1a729673-3254-4e2a-9072-55e27c69f0fc">

-7.7% median frame time.

### Sprites

**NOTE: On `main` sprites are using `SparseSet<Entity, T>`!**

`bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite`

#### `--ordered-z`

This test spawns the sprites with z incrementing back to front, which is
the ideal allocation order: it matches the sorted render order, so
lookups have a high cache hit rate.

<img width="1116" alt="Screenshot 2023-09-27 at 07 52 31"
src="https://github.com/bevyengine/bevy/assets/302146/bc8eab90-e375-4d31-b5cd-f55f6f59ab67">

+13.0% median frame time.

#### Random

This test spawns the sprites with random z. This makes the batching and
transparent 2D pass lookups get a lot of cache misses.

<img width="1109" alt="Screenshot 2023-09-27 at 07 53 01"
src="https://github.com/bevyengine/bevy/assets/302146/22073f5d-99a7-49b0-9584-d3ac3eac3033">

+0.6% median frame time.

### UI

**NOTE: On `main` UI is using `SparseSet<Entity, T>`!**

`many_buttons`

<img width="1111" alt="Screenshot 2023-09-27 at 07 53 26"
src="https://github.com/bevyengine/bevy/assets/302146/66afd56d-cbe4-49e7-8b64-2f28f6043d85">

+15.1% median frame time.

## Alternatives

- Cart originally suggested trying out `SparseSet<Entity, T>` and indeed
that is slightly faster under ideal conditions. However,
`PassHashMap<Entity, T>` has better worst-case performance when data is
randomly distributed, rather than in sorted render order, and does not
have the worst-case memory usage of `SparseSet`'s dense `Vec<usize>`,
which maps from the `Entity` index to the sparse index into `Vec<T>`. This
dense `Vec` has to be as large as the largest `Entity` index used with the
`SparseSet`.
- I also tested `PassHashMap<u32, T>`, intending to use `Entity.index()`
as the key, but this proved to sometimes be slower and mostly no
different.
- The only outstanding approach that has not been implemented and tested
is to _not_ clear the render world of its entities each frame. That has
its own problems, though they could perhaps be solved.
  - Performance-wise, if the entities and their component data were not
    cleared, then they would incur table moves on spawn but should not
    thereafter; instead, just their component data would be overwritten.
    Ideally we would have a neat way of either updating data in-place via
    `&mut T` queries, or inserting components if not present. This would
    likely be quite cumbersome to have to remember to do everywhere, but
    perhaps it only needs to be done in the more performance-sensitive
    systems.
  - The main problem to solve, however, is that we want to maintain a
    mapping between main world entities and render world entities, be able
    to run the render app and world in parallel with the main app and world
    for pipelined rendering, and at the same time be able to spawn entities
    in the render world in such a way that those `Entity` ids do not collide
    with those spawned in the main world. This is potentially quite
    solvable, but could well be a lot of ECS work to do it in a way that
    makes sense.
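
The memory concern raised in the first alternative can be illustrated with a toy sparse set. This is a hypothetical `TinySparseSet`, not Bevy's `SparseSet`: it only assumes the general shape described above, where an index-addressed lookup `Vec` points into a packed `Vec<T>`.

```rust
/// Hypothetical minimal sparse set keyed by an entity index.
pub struct TinySparseSet<T> {
    /// Addressed directly by entity index; holds the slot of the value in `values`.
    lookup: Vec<Option<usize>>,
    /// Packed storage for the values that were actually inserted.
    values: Vec<T>,
}

impl<T> TinySparseSet<T> {
    pub fn new() -> Self {
        Self {
            lookup: Vec::new(),
            values: Vec::new(),
        }
    }

    pub fn insert(&mut self, index: u32, value: T) {
        let index = index as usize;
        // The lookup vector must grow to cover the largest index ever used.
        if index >= self.lookup.len() {
            self.lookup.resize(index + 1, None);
        }
        match self.lookup[index] {
            Some(slot) => self.values[slot] = value,
            None => {
                self.lookup[index] = Some(self.values.len());
                self.values.push(value);
            }
        }
    }

    pub fn get(&self, index: u32) -> Option<&T> {
        let slot = (*self.lookup.get(index as usize)?)?;
        self.values.get(slot)
    }

    /// Number of lookup entries currently allocated.
    pub fn lookup_len(&self) -> usize {
        self.lookup.len()
    }
}
```

Inserting a single value at index 1,000,000 forces the lookup vector to hold 1,000,001 entries, whereas a hash map would store only the one entry. The direct indexing is also what makes the sparse set fast when indices are dense and accessed in order.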

---

## Changelog

- Changed: Component data for entities to be drawn is no longer stored
on entities in the render world. Instead, data is stored in an
`EntityHashMap<Entity, T>` in various resources. This brings significant
performance benefits due to the way the render app clears entities every
frame. Resources of most interest are `RenderMeshInstances` and
`RenderMaterialInstances`, and their 2D counterparts.

## Migration Guide

Previously the render app extracted mesh entities and their component
data from the main world and stored them as entities and components in
the render world. Now they are extracted into what are essentially
`EntityHashMap<Entity, T>` resources, where each `T` is a struct
containing an appropriate group of data. This means that while systems in
the extract set will continue to run extract queries against the main
world, they will store their data in hash maps. Systems in later sets
will either need to look up entities in the available resources, such as
`RenderMeshInstances`, or maintain their own `EntityHashMap<Entity, T>`
for their own data.

Before:
```rust
fn queue_custom(
    material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<InstanceMaterialData>>,
) {
    ...
    for (entity, mesh_transforms, mesh_handle) in &material_meshes {
        ...
    }
}
```

After:
```rust
fn queue_custom(
    render_mesh_instances: Res<RenderMeshInstances>,
    instance_entities: Query<Entity, With<InstanceMaterialData>>,
) {
    ...
    for entity in &instance_entities {
        let Some(mesh_instance) = render_mesh_instances.get(&entity) else { continue; };
        // The mesh handle in `AssetId<Mesh>` form, and the `MeshTransforms` can now
        // be found in `mesh_instance` which is a `RenderMeshInstance`
        ...
    }
}
```
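
For systems that need to maintain their own per-entity data, the pattern might look roughly like the following. This is a standard-library sketch under stated assumptions: `Entity`, `MyInstanceData`, `MyInstances` and both functions are hypothetical stand-ins, and in Bevy the map would be an `EntityHashMap<Entity, T>` held in a `Resource` and filled by a system in `ExtractSchedule`.

```rust
use std::collections::HashMap;

/// Stand-in for Bevy's `Entity` (assumption: any `Copy + Eq + Hash` id works here).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Entity(pub u64);

/// Hypothetical per-entity data that a custom render feature might need.
#[derive(Debug, PartialEq)]
pub struct MyInstanceData {
    pub tint: [f32; 4],
}

/// Hypothetical resource; plays the role of an `EntityHashMap<Entity, MyInstanceData>`.
#[derive(Default)]
pub struct MyInstances(pub HashMap<Entity, MyInstanceData>);

/// Extract step: clear last frame's entries, then re-insert the visible ones.
pub fn extract_my_instances(
    instances: &mut MyInstances,
    main_world: &[(Entity, bool, [f32; 4])],
) {
    instances.0.clear();
    for &(entity, visible, tint) in main_world {
        if visible {
            instances.0.insert(entity, MyInstanceData { tint });
        }
    }
}

/// Later step: look each entity up and skip those that have no entry.
pub fn queue_my_instances(instances: &MyInstances, visible_entities: &[Entity]) -> Vec<Entity> {
    let mut queued = Vec::new();
    for entity in visible_entities {
        let Some(_data) = instances.0.get(entity) else {
            continue;
        };
        queued.push(*entity);
    }
    queued
}
```

The clear-then-insert shape mirrors what the extract systems in this commit do with their own maps; skipping missing entries with `continue` matches the lookup style in the "After" example.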

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
Robert Swain 2023-09-27 10:28:28 +02:00 committed by GitHub
parent 35d3213071
commit b6ead2be95
17 changed files with 585 additions and 325 deletions


@ -44,7 +44,7 @@ use crate::{
storage::{SparseSetIndex, TableId, TableRow},
};
use serde::{Deserialize, Serialize};
use std::{convert::TryFrom, fmt, mem, sync::atomic::Ordering};
use std::{convert::TryFrom, fmt, hash::Hash, mem, sync::atomic::Ordering};
#[cfg(target_has_atomic = "64")]
use std::sync::atomic::AtomicI64 as AtomicIdCursor;
@ -115,12 +115,19 @@ type IdCursor = isize;
/// [`EntityCommands`]: crate::system::EntityCommands
/// [`Query::get`]: crate::system::Query::get
/// [`World`]: crate::world::World
#[derive(Clone, Copy, Eq, Hash, Ord, PartialEq, PartialOrd)]
#[derive(Clone, Copy, Eq, Ord, PartialEq, PartialOrd)]
pub struct Entity {
generation: u32,
index: u32,
}
impl Hash for Entity {
#[inline]
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.to_bits().hash(state);
}
}
pub(crate) enum AllocAtWithoutReplacement {
Exists(EntityLocation),
DidNotExist,


@ -1,6 +1,6 @@
use crate::{
render, AlphaMode, DrawMesh, DrawPrepass, EnvironmentMapLight, MeshPipeline, MeshPipelineKey,
MeshTransforms, PrepassPipelinePlugin, PrepassPlugin, ScreenSpaceAmbientOcclusionSettings,
PrepassPipelinePlugin, PrepassPlugin, RenderMeshInstances, ScreenSpaceAmbientOcclusionSettings,
SetMeshBindGroup, SetMeshViewBindGroup, Shadow,
};
use bevy_app::{App, Plugin};
@ -14,10 +14,7 @@ use bevy_core_pipeline::{
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{
prelude::*,
system::{
lifetimeless::{Read, SRes},
SystemParamItem,
},
system::{lifetimeless::SRes, SystemParamItem},
};
use bevy_render::{
mesh::{Mesh, MeshVertexBufferLayout},
@ -37,7 +34,7 @@ use bevy_render::{
view::{ExtractedView, Msaa, ViewVisibility, VisibleEntities},
Extract, ExtractSchedule, Render, RenderApp, RenderSet,
};
use bevy_utils::{tracing::error, HashMap, HashSet};
use bevy_utils::{tracing::error, EntityHashMap, HashMap, HashSet};
use std::hash::Hash;
use std::marker::PhantomData;
@ -190,6 +187,7 @@ where
.add_render_command::<AlphaMask3d, DrawMaterial<M>>()
.init_resource::<ExtractedMaterials<M>>()
.init_resource::<RenderMaterials<M>>()
.init_resource::<RenderMaterialInstances<M>>()
.init_resource::<SpecializedMeshPipelines<MaterialPipeline<M>>>()
.add_systems(
ExtractSchedule,
@ -226,26 +224,6 @@ where
}
}
fn extract_material_meshes<M: Material>(
mut commands: Commands,
mut previous_len: Local<usize>,
query: Extract<Query<(Entity, &ViewVisibility, &Handle<M>)>>,
) {
let mut values = Vec::with_capacity(*previous_len);
for (entity, view_visibility, material) in &query {
if view_visibility.get() {
// NOTE: MaterialBindGroupId is inserted here to avoid a table move. Upcoming changes
// to use SparseSet for render world entity storage will do this automatically.
values.push((
entity,
(material.clone_weak(), MaterialBindGroupId::default()),
));
}
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}
/// A key uniquely identifying a specialized [`MaterialPipeline`].
pub struct MaterialPipelineKey<M: Material> {
pub mesh_key: MeshPipelineKey,
@ -368,24 +346,53 @@ type DrawMaterial<M> = (
/// Sets the bind group for a given [`Material`] at the configured `I` index.
pub struct SetMaterialBindGroup<M: Material, const I: usize>(PhantomData<M>);
impl<P: PhaseItem, M: Material, const I: usize> RenderCommand<P> for SetMaterialBindGroup<M, I> {
type Param = SRes<RenderMaterials<M>>;
type Param = (SRes<RenderMaterials<M>>, SRes<RenderMaterialInstances<M>>);
type ViewWorldQuery = ();
type ItemWorldQuery = Read<Handle<M>>;
type ItemWorldQuery = ();
#[inline]
fn render<'w>(
_item: &P,
item: &P,
_view: (),
material_handle: &'_ Handle<M>,
materials: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(materials, material_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let material = materials.into_inner().get(&material_handle.id()).unwrap();
let materials = materials.into_inner();
let material_instances = material_instances.into_inner();
let Some(material_asset_id) = material_instances.get(&item.entity()) else {
return RenderCommandResult::Failure;
};
let Some(material) = materials.get(material_asset_id) else {
return RenderCommandResult::Failure;
};
pass.set_bind_group(I, &material.bind_group, &[]);
RenderCommandResult::Success
}
}
#[derive(Resource, Deref, DerefMut)]
pub struct RenderMaterialInstances<M: Material>(EntityHashMap<Entity, AssetId<M>>);
impl<M: Material> Default for RenderMaterialInstances<M> {
fn default() -> Self {
Self(Default::default())
}
}
fn extract_material_meshes<M: Material>(
mut material_instances: ResMut<RenderMaterialInstances<M>>,
query: Extract<Query<(Entity, &ViewVisibility, &Handle<M>)>>,
) {
material_instances.clear();
for (entity, view_visibility, handle) in &query {
if view_visibility.get() {
material_instances.insert(entity, handle.id());
}
}
}
const fn alpha_mode_pipeline_key(alpha_mode: AlphaMode) -> MeshPipelineKey {
match alpha_mode {
// Premultiplied and Add share the same pipeline key
@ -424,12 +431,8 @@ pub fn queue_material_meshes<M: Material>(
msaa: Res<Msaa>,
render_meshes: Res<RenderAssets<Mesh>>,
render_materials: Res<RenderMaterials<M>>,
mut material_meshes: Query<(
&Handle<M>,
&mut MaterialBindGroupId,
&Handle<Mesh>,
&MeshTransforms,
)>,
mut render_mesh_instances: ResMut<RenderMeshInstances>,
render_material_instances: Res<RenderMaterialInstances<M>>,
images: Res<RenderAssets<Image>>,
mut views: Query<(
&ExtractedView,
@ -493,15 +496,16 @@ pub fn queue_material_meshes<M: Material>(
}
let rangefinder = view.rangefinder3d();
for visible_entity in &visible_entities.entities {
let Ok((material_handle, mut material_bind_group_id, mesh_handle, mesh_transforms)) =
material_meshes.get_mut(*visible_entity)
else {
let Some(material_asset_id) = render_material_instances.get(visible_entity) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_handle) else {
let Some(mesh_instance) = render_mesh_instances.get_mut(visible_entity) else {
continue;
};
let Some(material) = render_materials.get(&material_handle.id()) else {
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let mut mesh_key = view_key;
@ -530,9 +534,10 @@ pub fn queue_material_meshes<M: Material>(
}
};
*material_bind_group_id = material.get_bind_group_id();
mesh_instance.material_bind_group_id = material.get_bind_group_id();
let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation)
let distance = rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation)
+ material.properties.depth_bias;
match material.properties.alpha_mode {
AlphaMode::Opaque => {


@ -47,7 +47,8 @@ use bevy_utils::tracing::error;
use crate::{
prepare_materials, setup_morph_and_skinning_defs, AlphaMode, DrawMesh, Material,
MaterialPipeline, MaterialPipelineKey, MeshLayouts, MeshPipeline, MeshPipelineKey,
MeshTransforms, RenderMaterials, SetMaterialBindGroup, SetMeshBindGroup,
RenderMaterialInstances, RenderMaterials, RenderMeshInstances, SetMaterialBindGroup,
SetMeshBindGroup,
};
use std::{hash::Hash, marker::PhantomData};
@ -758,8 +759,9 @@ pub fn queue_prepass_material_meshes<M: Material>(
pipeline_cache: Res<PipelineCache>,
msaa: Res<Msaa>,
render_meshes: Res<RenderAssets<Mesh>>,
render_mesh_instances: Res<RenderMeshInstances>,
render_materials: Res<RenderMaterials<M>>,
material_meshes: Query<(&Handle<M>, &Handle<Mesh>, &MeshTransforms)>,
render_material_instances: Res<RenderMaterialInstances<M>>,
mut views: Query<(
&ExtractedView,
&VisibleEntities,
@ -804,16 +806,16 @@ pub fn queue_prepass_material_meshes<M: Material>(
let rangefinder = view.rangefinder3d();
for visible_entity in &visible_entities.entities {
let Ok((material_handle, mesh_handle, mesh_transforms)) =
material_meshes.get(*visible_entity)
else {
let Some(material_asset_id) = render_material_instances.get(visible_entity) else {
continue;
};
let (Some(material), Some(mesh)) = (
render_materials.get(&material_handle.id()),
render_meshes.get(mesh_handle),
) else {
let Some(mesh_instance) = render_mesh_instances.get(visible_entity) else {
continue;
};
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
@ -849,7 +851,8 @@ pub fn queue_prepass_material_meshes<M: Material>(
}
};
let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation)
let distance = rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation)
+ material.properties.depth_bias;
match alpha_mode {
AlphaMode::Opaque => {


@ -3,10 +3,9 @@ use crate::{
CascadeShadowConfig, Cascades, CascadesVisibleEntities, Clusters, CubemapVisibleEntities,
DirectionalLight, DirectionalLightShadowMap, DrawPrepass, EnvironmentMapLight,
GlobalVisiblePointLights, Material, MaterialPipelineKey, MeshPipeline, MeshPipelineKey,
NotShadowCaster, PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterials, SpotLight,
VisiblePointLights,
PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterialInstances, RenderMaterials,
RenderMeshInstances, SpotLight, VisiblePointLights,
};
use bevy_asset::Handle;
use bevy_core_pipeline::core_3d::Transparent3d;
use bevy_ecs::prelude::*;
use bevy_math::{Mat4, UVec3, UVec4, Vec2, Vec3, Vec3Swizzles, Vec4, Vec4Swizzles};
@ -1553,9 +1552,10 @@ pub fn prepare_clusters(
pub fn queue_shadows<M: Material>(
shadow_draw_functions: Res<DrawFunctions<Shadow>>,
prepass_pipeline: Res<PrepassPipeline<M>>,
casting_meshes: Query<(&Handle<Mesh>, &Handle<M>), Without<NotShadowCaster>>,
render_meshes: Res<RenderAssets<Mesh>>,
render_mesh_instances: Res<RenderMeshInstances>,
render_materials: Res<RenderMaterials<M>>,
render_material_instances: Res<RenderMaterialInstances<M>>,
mut pipelines: ResMut<SpecializedMeshPipelines<PrepassPipeline<M>>>,
pipeline_cache: Res<PipelineCache>,
view_lights: Query<(Entity, &ViewLightEntities)>,
@ -1598,15 +1598,22 @@ pub fn queue_shadows<M: Material>(
// NOTE: Lights with shadow mapping disabled will have no visible entities
// so no meshes will be queued
for entity in visible_entities.iter().copied() {
let Ok((mesh_handle, material_handle)) = casting_meshes.get(entity) else {
let Some(mesh_instance) = render_mesh_instances.get(&entity) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_handle) else {
if !mesh_instance.shadow_caster {
continue;
}
let Some(material_asset_id) = render_material_instances.get(&entity) else {
continue;
};
let Some(material) = render_materials.get(&material_handle.id()) else {
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
let mut mesh_key =
MeshPipelineKey::from_primitive_topology(mesh.primitive_topology)
| MeshPipelineKey::DEPTH_PREPASS;


@ -5,7 +5,7 @@ use crate::{
ViewClusterBindings, ViewFogUniformOffset, ViewLightsUniformOffset, ViewShadowBindings,
CLUSTERED_FORWARD_STORAGE_BUFFER_COUNT, MAX_CASCADES_PER_LIGHT, MAX_DIRECTIONAL_LIGHTS,
};
use bevy_app::Plugin;
use bevy_app::{Plugin, PostUpdate};
use bevy_asset::{load_internal_asset, AssetId, Handle};
use bevy_core_pipeline::{
core_3d::{AlphaMask3d, Opaque3d, Transparent3d},
@ -14,6 +14,7 @@ use bevy_core_pipeline::{
get_lut_bind_group_layout_entries, get_lut_bindings, Tonemapping, TonemappingLuts,
},
};
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{
prelude::*,
query::{QueryItem, ROQueryItem},
@ -21,7 +22,10 @@ use bevy_ecs::{
};
use bevy_math::{Affine3, Vec2, Vec4};
use bevy_render::{
batching::{batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData},
batching::{
batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData,
NoAutomaticBatching,
},
globals::{GlobalsBuffer, GlobalsUniform},
mesh::{
GpuBufferInfo, InnerMeshVertexBufferLayout, Mesh, MeshVertexBufferLayout,
@ -40,14 +44,18 @@ use bevy_render::{
Extract, ExtractSchedule, Render, RenderApp, RenderSet,
};
use bevy_transform::components::GlobalTransform;
use bevy_utils::{tracing::error, HashMap, Hashed};
use bevy_utils::{tracing::error, EntityHashMap, HashMap, Hashed};
use crate::render::{
morph::{extract_morphs, prepare_morphs, MorphIndex, MorphUniform},
skin::{extract_skins, prepare_skins, SkinIndex, SkinUniform},
morph::{
extract_morphs, no_automatic_morph_batching, prepare_morphs, MorphIndices, MorphUniform,
},
skin::{extract_skins, no_automatic_skin_batching, prepare_skins, SkinUniform},
MeshLayouts,
};
use super::skin::SkinIndices;
#[derive(Default)]
pub struct MeshRenderPlugin;
@ -102,11 +110,19 @@ impl Plugin for MeshRenderPlugin {
load_internal_asset!(app, SKINNING_HANDLE, "skinning.wgsl", Shader::from_wgsl);
load_internal_asset!(app, MORPH_HANDLE, "morph.wgsl", Shader::from_wgsl);
app.add_systems(
PostUpdate,
(no_automatic_skin_batching, no_automatic_morph_batching),
);
if let Ok(render_app) = app.get_sub_app_mut(RenderApp) {
render_app
.init_resource::<RenderMeshInstances>()
.init_resource::<MeshBindGroups>()
.init_resource::<SkinUniform>()
.init_resource::<SkinIndices>()
.init_resource::<MorphUniform>()
.init_resource::<MorphIndices>()
.add_systems(
ExtractSchedule,
(extract_meshes, extract_skins, extract_morphs),
@ -212,10 +228,24 @@ bitflags::bitflags! {
}
}
pub struct RenderMeshInstance {
pub transforms: MeshTransforms,
pub mesh_asset_id: AssetId<Mesh>,
pub material_bind_group_id: MaterialBindGroupId,
pub shadow_caster: bool,
pub automatic_batching: bool,
}
#[derive(Default, Resource, Deref, DerefMut)]
pub struct RenderMeshInstances(EntityHashMap<Entity, RenderMeshInstance>);
#[derive(Component)]
pub struct Mesh3d;
pub fn extract_meshes(
mut commands: Commands,
mut prev_caster_commands_len: Local<usize>,
mut prev_not_caster_commands_len: Local<usize>,
mut previous_len: Local<usize>,
mut render_mesh_instances: ResMut<RenderMeshInstances>,
meshes_query: Extract<
Query<(
Entity,
@ -225,15 +255,25 @@ pub fn extract_meshes(
&Handle<Mesh>,
Option<With<NotShadowReceiver>>,
Option<With<NotShadowCaster>>,
Has<NoAutomaticBatching>,
)>,
>,
) {
let mut caster_commands = Vec::with_capacity(*prev_caster_commands_len);
let mut not_caster_commands = Vec::with_capacity(*prev_not_caster_commands_len);
render_mesh_instances.clear();
let mut entities = Vec::with_capacity(*previous_len);
let visible_meshes = meshes_query.iter().filter(|(_, vis, ..)| vis.get());
for (entity, _, transform, previous_transform, handle, not_receiver, not_caster) in
visible_meshes
for (
entity,
_,
transform,
previous_transform,
handle,
not_receiver,
not_caster,
no_automatic_batching,
) in visible_meshes
{
let transform = transform.affine();
let previous_transform = previous_transform.map(|t| t.0).unwrap_or(transform);
@ -250,16 +290,22 @@ pub fn extract_meshes(
previous_transform: (&previous_transform).into(),
flags: flags.bits(),
};
if not_caster.is_some() {
not_caster_commands.push((entity, (handle.clone_weak(), transforms, NotShadowCaster)));
} else {
caster_commands.push((entity, (handle.clone_weak(), transforms)));
}
// FIXME: Remove this - it is just a workaround to enable rendering to work as
// render commands require an entity to exist at the moment.
entities.push((entity, Mesh3d));
render_mesh_instances.insert(
entity,
RenderMeshInstance {
mesh_asset_id: handle.id(),
transforms,
shadow_caster: not_caster.is_none(),
material_bind_group_id: MaterialBindGroupId::default(),
automatic_batching: !no_automatic_batching,
},
);
}
*prev_caster_commands_len = caster_commands.len();
*prev_not_caster_commands_len = not_caster_commands.len();
commands.insert_or_spawn_batch(caster_commands);
commands.insert_or_spawn_batch(not_caster_commands);
*previous_len = entities.len();
commands.insert_or_spawn_batch(entities);
}
#[derive(Resource, Clone)]
@ -545,22 +591,26 @@ impl MeshPipeline {
}
impl GetBatchData for MeshPipeline {
type Query = (
Option<&'static MaterialBindGroupId>,
&'static Handle<Mesh>,
&'static MeshTransforms,
);
type CompareData = (Option<MaterialBindGroupId>, AssetId<Mesh>);
type Param = SRes<RenderMeshInstances>;
type Query = Entity;
type QueryFilter = With<Mesh3d>;
type CompareData = (MaterialBindGroupId, AssetId<Mesh>);
type BufferData = MeshUniform;
fn get_buffer_data(&(.., mesh_transforms): &QueryItem<Self::Query>) -> Self::BufferData {
mesh_transforms.into()
}
fn get_compare_data(
&(material_bind_group_id, mesh_handle, ..): &QueryItem<Self::Query>,
) -> Self::CompareData {
(material_bind_group_id.copied(), mesh_handle.id())
fn get_batch_data(
mesh_instances: &SystemParamItem<Self::Param>,
entity: &QueryItem<Self::Query>,
) -> (Self::BufferData, Option<Self::CompareData>) {
let mesh_instance = mesh_instances
.get(entity)
.expect("Failed to find render mesh instance");
(
(&mesh_instance.transforms).into(),
mesh_instance.automatic_batching.then_some((
mesh_instance.material_bind_group_id,
mesh_instance.mesh_asset_id,
)),
)
}
}
@ -932,12 +982,12 @@ impl MeshBindGroups {
/// Get the `BindGroup` for `GpuMesh` with given `handle_id`.
pub fn get(
&self,
handle_id: AssetId<Mesh>,
asset_id: AssetId<Mesh>,
is_skinned: bool,
morph: bool,
) -> Option<&BindGroup> {
match (is_skinned, morph) {
(_, true) => self.morph_targets.get(&handle_id),
(_, true) => self.morph_targets.get(&asset_id),
(true, false) => self.skinned.as_ref(),
(false, false) => self.model_only.as_ref(),
}
@ -1176,27 +1226,44 @@ impl<P: PhaseItem, const I: usize> RenderCommand<P> for SetMeshViewBindGroup<I>
pub struct SetMeshBindGroup<const I: usize>;
impl<P: PhaseItem, const I: usize> RenderCommand<P> for SetMeshBindGroup<I> {
type Param = SRes<MeshBindGroups>;
type ViewWorldQuery = ();
type ItemWorldQuery = (
Read<Handle<Mesh>>,
Option<Read<SkinIndex>>,
Option<Read<MorphIndex>>,
type Param = (
SRes<MeshBindGroups>,
SRes<RenderMeshInstances>,
SRes<SkinIndices>,
SRes<MorphIndices>,
);
type ViewWorldQuery = ();
type ItemWorldQuery = ();
#[inline]
fn render<'w>(
item: &P,
_view: (),
(mesh, skin_index, morph_index): ROQueryItem<Self::ItemWorldQuery>,
bind_groups: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(bind_groups, mesh_instances, skin_indices, morph_indices): SystemParamItem<
'w,
'_,
Self::Param,
>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let bind_groups = bind_groups.into_inner();
let mesh_instances = mesh_instances.into_inner();
let skin_indices = skin_indices.into_inner();
let morph_indices = morph_indices.into_inner();
let entity = &item.entity();
let Some(mesh) = mesh_instances.get(entity) else {
return RenderCommandResult::Success;
};
let skin_index = skin_indices.get(entity);
let morph_index = morph_indices.get(entity);
let is_skinned = skin_index.is_some();
let is_morphed = morph_index.is_some();
let Some(bind_group) = bind_groups.get(mesh.id(), is_skinned, is_morphed) else {
let Some(bind_group) = bind_groups.get(mesh.mesh_asset_id, is_skinned, is_morphed) else {
error!(
"The MeshBindGroups resource wasn't set in the render phase. \
It should be set by the queue_mesh_bind_group system.\n\
@ -1227,43 +1294,50 @@ impl<P: PhaseItem, const I: usize> RenderCommand<P> for SetMeshBindGroup<I> {
pub struct DrawMesh;
impl<P: PhaseItem> RenderCommand<P> for DrawMesh {
type Param = SRes<RenderAssets<Mesh>>;
type Param = (SRes<RenderAssets<Mesh>>, SRes<RenderMeshInstances>);
type ViewWorldQuery = ();
type ItemWorldQuery = Read<Handle<Mesh>>;
type ItemWorldQuery = ();
#[inline]
fn render<'w>(
item: &P,
_view: (),
mesh_handle: ROQueryItem<'_, Self::ItemWorldQuery>,
meshes: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(meshes, mesh_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
if let Some(gpu_mesh) = meshes.into_inner().get(mesh_handle) {
let batch_range = item.batch_range();
pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..));
#[cfg(all(feature = "webgl", target_arch = "wasm32"))]
pass.set_push_constants(
ShaderStages::VERTEX,
0,
&(batch_range.start as i32).to_le_bytes(),
);
match &gpu_mesh.buffer_info {
GpuBufferInfo::Indexed {
buffer,
index_format,
count,
} => {
pass.set_index_buffer(buffer.slice(..), 0, *index_format);
pass.draw_indexed(0..*count, 0, batch_range.clone());
}
GpuBufferInfo::NonIndexed => {
pass.draw(0..gpu_mesh.vertex_count, batch_range.clone());
}
let meshes = meshes.into_inner();
let mesh_instances = mesh_instances.into_inner();
let Some(mesh_instance) = mesh_instances.get(&item.entity()) else {
return RenderCommandResult::Failure;
};
let Some(gpu_mesh) = meshes.get(mesh_instance.mesh_asset_id) else {
return RenderCommandResult::Failure;
};
pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..));
let batch_range = item.batch_range();
#[cfg(all(feature = "webgl", target_arch = "wasm32"))]
pass.set_push_constants(
ShaderStages::VERTEX,
0,
&(batch_range.start as i32).to_le_bytes(),
);
match &gpu_mesh.buffer_info {
GpuBufferInfo::Indexed {
buffer,
index_format,
count,
} => {
pass.set_index_buffer(buffer.slice(..), 0, *index_format);
pass.draw_indexed(0..*count, 0, batch_range.clone());
}
GpuBufferInfo::NonIndexed => {
pass.draw(0..gpu_mesh.vertex_count, batch_range.clone());
}
RenderCommandResult::Success
} else {
RenderCommandResult::Failure
}
RenderCommandResult::Success
}
}


@ -1,5 +1,6 @@
use std::{iter, mem};
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::prelude::*;
use bevy_render::{
batching::NoAutomaticBatching,
@ -9,16 +10,22 @@ use bevy_render::{
view::ViewVisibility,
Extract,
};
use bevy_utils::EntityHashMap;
use bytemuck::Pod;
#[derive(Component)]
pub struct MorphIndex {
pub(super) index: u32,
}
#[derive(Default, Resource, Deref, DerefMut)]
pub struct MorphIndices(EntityHashMap<Entity, MorphIndex>);
#[derive(Resource)]
pub struct MorphUniform {
pub buffer: BufferVec<f32>,
}
impl Default for MorphUniform {
fn default() -> Self {
Self {
@ -43,6 +50,7 @@ pub fn prepare_morphs(
const fn can_align(step: usize, target: usize) -> bool {
step % target == 0 || target % step == 0
}
const WGPU_MIN_ALIGN: usize = 256;
/// Align a [`BufferVec`] to `N` bytes by padding the end with `T::default()` values.
@ -72,15 +80,13 @@ fn add_to_alignment<T: Pod + Default>(buffer: &mut BufferVec<T>) {
// Notes on implementation: see comment on top of the extract_skins system in skin module.
// This works similarly, but for `f32` instead of `Mat4`
pub fn extract_morphs(
mut commands: Commands,
mut previous_len: Local<usize>,
mut morph_indices: ResMut<MorphIndices>,
mut uniform: ResMut<MorphUniform>,
query: Extract<Query<(Entity, &ViewVisibility, &MeshMorphWeights)>>,
) {
morph_indices.clear();
uniform.buffer.clear();
let mut values = Vec::with_capacity(*previous_len);
for (entity, view_visibility, morph_weights) in &query {
if !view_visibility.get() {
continue;
@ -92,10 +98,17 @@ pub fn extract_morphs(
add_to_alignment::<f32>(&mut uniform.buffer);
let index = (start * mem::size_of::<f32>()) as u32;
// NOTE: Because morph targets require per-morph target texture bindings, they cannot
// currently be batched.
values.push((entity, (MorphIndex { index }, NoAutomaticBatching)));
morph_indices.insert(entity, MorphIndex { index });
}
}
// NOTE: Because morph targets require per-morph target texture bindings, they cannot
// currently be batched.
pub fn no_automatic_morph_batching(
mut commands: Commands,
query: Query<Entity, (With<MeshMorphWeights>, Without<NoAutomaticBatching>)>,
) {
for entity in &query {
commands.entity(entity).insert(NoAutomaticBatching);
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}


@ -1,4 +1,5 @@
use bevy_asset::Assets;
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::prelude::*;
use bevy_math::Mat4;
use bevy_render::{
@ -10,6 +11,7 @@ use bevy_render::{
Extract,
};
use bevy_transform::prelude::GlobalTransform;
use bevy_utils::EntityHashMap;
/// Maximum number of joints supported for skinned meshes.
pub const MAX_JOINTS: usize = 256;
@ -18,6 +20,7 @@ pub const MAX_JOINTS: usize = 256;
pub struct SkinIndex {
pub index: u32,
}
impl SkinIndex {
/// Index to be in address space based on [`SkinUniform`] size.
const fn new(start: usize) -> Self {
@ -27,11 +30,15 @@ impl SkinIndex {
}
}
#[derive(Default, Resource, Deref, DerefMut)]
pub struct SkinIndices(EntityHashMap<Entity, SkinIndex>);
// Notes on implementation: see comment on top of the `extract_skins` system.
#[derive(Resource)]
pub struct SkinUniform {
pub buffer: BufferVec<Mat4>,
}
impl Default for SkinUniform {
fn default() -> Self {
Self {
@ -81,16 +88,14 @@ pub fn prepare_skins(
// which normally only support fixed size arrays. You just have to make sure
// in the shader that you only read the values that are valid for that binding.
pub fn extract_skins(
mut commands: Commands,
mut previous_len: Local<usize>,
mut skin_indices: ResMut<SkinIndices>,
mut uniform: ResMut<SkinUniform>,
query: Extract<Query<(Entity, &ViewVisibility, &SkinnedMesh)>>,
inverse_bindposes: Extract<Res<Assets<SkinnedMeshInverseBindposes>>>,
joints: Extract<Query<&GlobalTransform>>,
) {
uniform.buffer.clear();
let mut values = Vec::with_capacity(*previous_len);
skin_indices.clear();
let mut last_start = 0;
// PERF: This can be expensive, can we move this to prepare?
@ -124,16 +129,23 @@ pub fn extract_skins(
while buffer.len() % 4 != 0 {
buffer.push(Mat4::ZERO);
}
// NOTE: The skinned joints uniform buffer has to be bound at a dynamic offset per
// entity and so cannot currently be batched.
values.push((entity, (SkinIndex::new(start), NoAutomaticBatching)));
skin_indices.insert(entity, SkinIndex::new(start));
}
// Pad out the buffer to ensure that there's enough space for bindings
while uniform.buffer.len() - last_start < MAX_JOINTS {
uniform.buffer.push(Mat4::ZERO);
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}
// NOTE: The skinned joints uniform buffer has to be bound at a dynamic offset per
// entity and so cannot currently be batched.
pub fn no_automatic_skin_batching(
mut commands: Commands,
query: Query<Entity, (With<SkinnedMesh>, Without<NoAutomaticBatching>)>,
) {
for entity in &query {
commands.entity(entity).insert(NoAutomaticBatching);
}
}

@ -1,13 +1,15 @@
use crate::{DrawMesh, MeshPipelineKey, SetMeshBindGroup, SetMeshViewBindGroup};
use crate::{MeshPipeline, MeshTransforms};
use crate::MeshPipeline;
use crate::{
DrawMesh, MeshPipelineKey, RenderMeshInstance, RenderMeshInstances, SetMeshBindGroup,
SetMeshViewBindGroup,
};
use bevy_app::Plugin;
use bevy_asset::{load_internal_asset, Handle};
use bevy_core_pipeline::core_3d::Opaque3d;
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{prelude::*, reflect::ReflectComponent};
use bevy_reflect::std_traits::ReflectDefault;
use bevy_reflect::Reflect;
use bevy_render::extract_component::{ExtractComponent, ExtractComponentPlugin};
use bevy_render::Render;
use bevy_render::{
extract_resource::{ExtractResource, ExtractResourcePlugin},
mesh::{Mesh, MeshVertexBufferLayout},
@ -20,7 +22,9 @@ use bevy_render::{
view::{ExtractedView, Msaa, VisibleEntities},
RenderApp, RenderSet,
};
use bevy_render::{Extract, ExtractSchedule, Render};
use bevy_utils::tracing::error;
use bevy_utils::EntityHashSet;
pub const WIREFRAME_SHADER_HANDLE: Handle<Shader> = Handle::weak_from_u128(192598014480025766);
@ -39,15 +43,14 @@ impl Plugin for WireframePlugin {
app.register_type::<Wireframe>()
.register_type::<WireframeConfig>()
.init_resource::<WireframeConfig>()
.add_plugins((
ExtractResourcePlugin::<WireframeConfig>::default(),
ExtractComponentPlugin::<Wireframe>::default(),
));
.add_plugins((ExtractResourcePlugin::<WireframeConfig>::default(),));
if let Ok(render_app) = app.get_sub_app_mut(RenderApp) {
render_app
.add_render_command::<Opaque3d, DrawWireframes>()
.init_resource::<SpecializedMeshPipelines<WireframePipeline>>()
.init_resource::<Wireframes>()
.add_systems(ExtractSchedule, extract_wireframes)
.add_systems(Render, queue_wireframes.in_set(RenderSet::QueueMeshes));
}
}
@ -60,7 +63,7 @@ impl Plugin for WireframePlugin {
}
/// Controls whether an entity should be rendered in wireframe mode if the [`WireframePlugin`] is enabled
#[derive(Component, Debug, Clone, Default, ExtractComponent, Reflect)]
#[derive(Component, Debug, Clone, Default, Reflect)]
#[reflect(Component, Default)]
pub struct Wireframe;
@ -71,6 +74,17 @@ pub struct WireframeConfig {
pub global: bool,
}
#[derive(Resource, Default, Deref, DerefMut)]
pub struct Wireframes(EntityHashSet<Entity>);
fn extract_wireframes(
mut wireframes: ResMut<Wireframes>,
query: Extract<Query<Entity, With<Wireframe>>>,
) {
wireframes.clear();
wireframes.extend(&query);
}
#[derive(Resource, Clone)]
pub struct WireframePipeline {
mesh_pipeline: MeshPipeline,
@ -110,15 +124,13 @@ impl SpecializedMeshPipeline for WireframePipeline {
fn queue_wireframes(
opaque_3d_draw_functions: Res<DrawFunctions<Opaque3d>>,
render_meshes: Res<RenderAssets<Mesh>>,
render_mesh_instances: Res<RenderMeshInstances>,
wireframes: Res<Wireframes>,
wireframe_config: Res<WireframeConfig>,
wireframe_pipeline: Res<WireframePipeline>,
mut pipelines: ResMut<SpecializedMeshPipelines<WireframePipeline>>,
pipeline_cache: Res<PipelineCache>,
msaa: Res<Msaa>,
mut material_meshes: ParamSet<(
Query<(Entity, &Handle<Mesh>, &MeshTransforms)>,
Query<(Entity, &Handle<Mesh>, &MeshTransforms), With<Wireframe>>,
)>,
mut views: Query<(&ExtractedView, &VisibleEntities, &mut RenderPhase<Opaque3d>)>,
) {
let draw_custom = opaque_3d_draw_functions.read().id::<DrawWireframes>();
@ -127,10 +139,10 @@ fn queue_wireframes(
let rangefinder = view.rangefinder3d();
let view_key = msaa_key | MeshPipelineKey::from_hdr(view.hdr);
let add_render_phase = |phase_item: (Entity, &Handle<Mesh>, &MeshTransforms)| {
let (entity, mesh_handle, mesh_transforms) = phase_item;
let add_render_phase = |phase_item: (Entity, &RenderMeshInstance)| {
let (entity, mesh_instance) = phase_item;
let Some(mesh) = render_meshes.get(mesh_handle) else {
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
return;
};
let mut key =
@ -151,25 +163,36 @@ fn queue_wireframes(
entity,
pipeline: pipeline_id,
draw_function: draw_custom,
distance: rangefinder.distance_translation(&mesh_transforms.transform.translation),
distance: rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation),
batch_range: 0..1,
dynamic_offset: None,
});
};
if wireframe_config.global {
let query = material_meshes.p0();
visible_entities
.entities
.iter()
.filter_map(|visible_entity| query.get(*visible_entity).ok())
.filter_map(|visible_entity| {
render_mesh_instances
.get(visible_entity)
.map(|mesh_instance| (*visible_entity, mesh_instance))
})
.for_each(add_render_phase);
} else {
let query = material_meshes.p1();
visible_entities
.entities
.iter()
.filter_map(|visible_entity| query.get(*visible_entity).ok())
.filter_map(|visible_entity| {
if wireframes.contains(visible_entity) {
render_mesh_instances
.get(visible_entity)
.map(|mesh_instance| (*visible_entity, mesh_instance))
} else {
None
}
})
.for_each(add_render_phase);
}
}
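
The new `queue_wireframes` body replaces two ECS queries with lookups into plain collections: a set of wireframe entities and a map of mesh instances. The sketch below illustrates that selection logic in isolation using only the standard library. `MeshInstance`, `select_wireframe_items`, and the use of `u64` in place of `Entity` are hypothetical stand-ins for illustration, not Bevy APIs.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for `RenderMeshInstance`; only one field is modelled.
#[derive(Debug, PartialEq)]
pub struct MeshInstance {
    pub mesh_id: u32,
}

/// Select the visible entities that should be drawn as wireframes.
///
/// Entities are modelled as raw `u64` bits (an assumption for this sketch).
/// With `global` set, every visible entity that has a mesh instance is kept;
/// otherwise only those that are also in `wireframes`.
pub fn select_wireframe_items<'a>(
    global: bool,
    visible: &[u64],
    wireframes: &HashSet<u64>,
    instances: &'a HashMap<u64, MeshInstance>,
) -> Vec<(u64, &'a MeshInstance)> {
    visible
        .iter()
        // Mirrors the `wireframe_config.global` branch in the diff.
        .filter(|entity| global || wireframes.contains(*entity))
        // Entities without an extracted mesh instance are skipped silently.
        .filter_map(|entity| instances.get(entity).map(|instance| (*entity, instance)))
        .collect()
}
```

Because both collections are cleared and refilled during extraction, no per-entity components need to be inserted through commands for this lookup to work.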

@ -1,8 +1,8 @@
use bevy_ecs::{
component::Component,
prelude::Res,
query::{Has, QueryItem, ReadOnlyWorldQuery},
system::{Query, ResMut},
query::{QueryItem, ReadOnlyWorldQuery},
system::{Query, ResMut, StaticSystemParam, SystemParam, SystemParamItem},
};
use bevy_utils::nonmax::NonMaxU32;
@ -56,7 +56,9 @@ impl<T: PartialEq> BatchMeta<T> {
/// A trait to support getting data used for batching draw commands via phase
/// items.
pub trait GetBatchData {
type Param: SystemParam + 'static;
type Query: ReadOnlyWorldQuery;
type QueryFilter: ReadOnlyWorldQuery;
/// Data used for comparison between phase items. If the pipeline id, draw
/// function id, per-instance data buffer dynamic offset and this data
/// matches, the draws can be batched.
@ -65,10 +67,13 @@ pub trait GetBatchData {
/// containing these data for all instances.
type BufferData: GpuArrayBufferable + Sync + Send + 'static;
/// Get the per-instance data to be inserted into the [`GpuArrayBuffer`].
fn get_buffer_data(query_item: &QueryItem<Self::Query>) -> Self::BufferData;
/// Get the data used for comparison when deciding whether draws can be
/// batched.
fn get_compare_data(query_item: &QueryItem<Self::Query>) -> Self::CompareData;
/// If the instance can be batched, also return the data used for
/// comparison when deciding whether draws can be batched, else return None
/// for the `CompareData`.
fn get_batch_data(
param: &SystemParamItem<Self::Param>,
query_item: &QueryItem<Self::Query>,
) -> (Self::BufferData, Option<Self::CompareData>);
}
/// Batch the items in a render phase. This means comparing metadata needed to draw each phase item
@ -76,24 +81,23 @@ pub trait GetBatchData {
pub fn batch_and_prepare_render_phase<I: CachedRenderPipelinePhaseItem, F: GetBatchData>(
gpu_array_buffer: ResMut<GpuArrayBuffer<F::BufferData>>,
mut views: Query<&mut RenderPhase<I>>,
query: Query<(Has<NoAutomaticBatching>, F::Query)>,
query: Query<F::Query, F::QueryFilter>,
param: StaticSystemParam<F::Param>,
) {
let gpu_array_buffer = gpu_array_buffer.into_inner();
let system_param_item = param.into_inner();
let mut process_item = |item: &mut I| {
let (no_auto_batching, batch_query_item) = query.get(item.entity()).ok()?;
let batch_query_item = query.get(item.entity()).ok()?;
let buffer_data = F::get_buffer_data(&batch_query_item);
let (buffer_data, compare_data) = F::get_batch_data(&system_param_item, &batch_query_item);
let buffer_index = gpu_array_buffer.push(buffer_data);
let index = buffer_index.index.get();
*item.batch_range_mut() = index..index + 1;
*item.dynamic_offset_mut() = buffer_index.dynamic_offset;
(!no_auto_batching).then(|| {
let compare_data = F::get_compare_data(&batch_query_item);
BatchMeta::new(item, compare_data)
})
compare_data.map(|compare_data| BatchMeta::new(item, compare_data))
};
for mut phase in &mut views {
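
The reworked `get_batch_data` returns `Option<CompareData>`, where `None` means the instance must not be batched. A minimal sketch of how such optional compare data can drive merging of adjacent draws follows. It is an illustration under simplifying assumptions: `Item` and `batch_adjacent` are hypothetical names, and the real `BatchMeta` also compares pipeline id, draw function id and dynamic offset, which are omitted here.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a phase item: only its instance range is modelled.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub batch_range: Range<u32>,
}

/// Merge adjacent instances whose compare data is `Some` and equal.
///
/// `compare[i]` plays the role of the `Option<CompareData>` returned for
/// instance `i`; `None` opts that instance out of batching entirely.
pub fn batch_adjacent<C: PartialEq>(compare: Vec<Option<C>>) -> Vec<Item> {
    let mut batches: Vec<Item> = Vec::new();
    let mut prev: Option<C> = None;
    for (index, data) in compare.into_iter().enumerate() {
        let index = index as u32;
        // Two `None`s never merge: unbatchable items always start a new draw.
        let can_merge = data.is_some() && data == prev;
        if can_merge {
            // `prev` is `Some`, so a previous batch is guaranteed to exist.
            if let Some(last) = batches.last_mut() {
                last.batch_range.end = index + 1;
            }
        } else {
            batches.push(Item {
                batch_range: index..index + 1,
            });
        }
        prev = data;
    }
    batches
}
```

Folding the `NoAutomaticBatching` check into the returned `Option` is what lets the system drop the `Has<NoAutomaticBatching>` query term shown in the removed lines.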

@ -7,11 +7,7 @@ use bevy_core_pipeline::{
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{
prelude::*,
query::ROQueryItem,
system::{
lifetimeless::{Read, SRes},
SystemParamItem,
},
system::{lifetimeless::SRes, SystemParamItem},
};
use bevy_log::error;
use bevy_render::{
@ -33,12 +29,12 @@ use bevy_render::{
Extract, ExtractSchedule, Render, RenderApp, RenderSet,
};
use bevy_transform::components::{GlobalTransform, Transform};
use bevy_utils::{FloatOrd, HashMap, HashSet};
use bevy_utils::{EntityHashMap, FloatOrd, HashMap, HashSet};
use std::hash::Hash;
use std::marker::PhantomData;
use crate::{
DrawMesh2d, Mesh2dHandle, Mesh2dPipeline, Mesh2dPipelineKey, Mesh2dTransforms,
DrawMesh2d, Mesh2dHandle, Mesh2dPipeline, Mesh2dPipelineKey, RenderMesh2dInstances,
SetMesh2dBindGroup, SetMesh2dViewBindGroup,
};
@ -150,6 +146,7 @@ where
.add_render_command::<Transparent2d, DrawMaterial2d<M>>()
.init_resource::<ExtractedMaterials2d<M>>()
.init_resource::<RenderMaterials2d<M>>()
.init_resource::<RenderMaterial2dInstances<M>>()
.init_resource::<SpecializedMeshPipelines<Material2dPipeline<M>>>()
.add_systems(
ExtractSchedule,
@ -176,24 +173,25 @@ where
}
}
#[derive(Resource, Deref, DerefMut)]
pub struct RenderMaterial2dInstances<M: Material2d>(EntityHashMap<Entity, AssetId<M>>);
impl<M: Material2d> Default for RenderMaterial2dInstances<M> {
fn default() -> Self {
Self(Default::default())
}
}
fn extract_material_meshes_2d<M: Material2d>(
mut commands: Commands,
mut previous_len: Local<usize>,
mut material_instances: ResMut<RenderMaterial2dInstances<M>>,
query: Extract<Query<(Entity, &ViewVisibility, &Handle<M>)>>,
) {
let mut values = Vec::with_capacity(*previous_len);
for (entity, view_visibility, material) in &query {
material_instances.clear();
for (entity, view_visibility, handle) in &query {
if view_visibility.get() {
// NOTE: Material2dBindGroupId is inserted here to avoid a table move. Upcoming changes
// to use SparseSet for render world entity storage will do this automatically.
values.push((
entity,
(material.clone_weak(), Material2dBindGroupId::default()),
));
material_instances.insert(entity, handle.id());
}
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}
/// Render pipeline data for a given [`Material2d`]
@ -322,19 +320,29 @@ pub struct SetMaterial2dBindGroup<M: Material2d, const I: usize>(PhantomData<M>)
impl<P: PhaseItem, M: Material2d, const I: usize> RenderCommand<P>
for SetMaterial2dBindGroup<M, I>
{
type Param = SRes<RenderMaterials2d<M>>;
type Param = (
SRes<RenderMaterials2d<M>>,
SRes<RenderMaterial2dInstances<M>>,
);
type ViewWorldQuery = ();
type ItemWorldQuery = Read<Handle<M>>;
type ItemWorldQuery = ();
#[inline]
fn render<'w>(
_item: &P,
item: &P,
_view: (),
material2d_handle: ROQueryItem<'_, Self::ItemWorldQuery>,
materials: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(materials, material_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let material2d = materials.into_inner().get(&material2d_handle.id()).unwrap();
let materials = materials.into_inner();
let material_instances = material_instances.into_inner();
let Some(material_instance) = material_instances.get(&item.entity()) else {
return RenderCommandResult::Failure;
};
let Some(material2d) = materials.get(material_instance) else {
return RenderCommandResult::Failure;
};
pass.set_bind_group(I, &material2d.bind_group, &[]);
RenderCommandResult::Success
}
@ -364,12 +372,8 @@ pub fn queue_material2d_meshes<M: Material2d>(
msaa: Res<Msaa>,
render_meshes: Res<RenderAssets<Mesh>>,
render_materials: Res<RenderMaterials2d<M>>,
mut material2d_meshes: Query<(
&Handle<M>,
&mut Material2dBindGroupId,
&Mesh2dHandle,
&Mesh2dTransforms,
)>,
mut render_mesh_instances: ResMut<RenderMesh2dInstances>,
render_material_instances: Res<RenderMaterial2dInstances<M>>,
mut views: Query<(
&ExtractedView,
&VisibleEntities,
@ -380,7 +384,7 @@ pub fn queue_material2d_meshes<M: Material2d>(
) where
M::Data: PartialEq + Eq + Hash + Clone,
{
if material2d_meshes.is_empty() {
if render_material_instances.is_empty() {
return;
}
@ -400,19 +404,16 @@ pub fn queue_material2d_meshes<M: Material2d>(
}
}
for visible_entity in &visible_entities.entities {
let Ok((
material2d_handle,
mut material2d_bind_group_id,
mesh2d_handle,
mesh2d_uniform,
)) = material2d_meshes.get_mut(*visible_entity)
else {
let Some(material_asset_id) = render_material_instances.get(visible_entity) else {
continue;
};
let Some(material2d) = render_materials.get(&material2d_handle.id()) else {
let Some(mesh_instance) = render_mesh_instances.get_mut(visible_entity) else {
continue;
};
let Some(mesh) = render_meshes.get(&mesh2d_handle.0) else {
let Some(material2d) = render_materials.get(material_asset_id) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
let mesh_key =
@ -436,8 +437,9 @@ pub fn queue_material2d_meshes<M: Material2d>(
}
};
*material2d_bind_group_id = material2d.get_bind_group_id();
let mesh_z = mesh2d_uniform.transform.translation.z;
mesh_instance.material_bind_group_id = material2d.get_bind_group_id();
let mesh_z = mesh_instance.transforms.transform.translation.z;
transparent_phase.add(Transparent2d {
entity: *visible_entity,
draw_function: draw_transparent_pbr,
@ -580,7 +582,7 @@ pub fn prepare_materials_2d<M: Material2d>(
render_materials.remove(&removed);
}
for (handle, material) in std::mem::take(&mut extracted_assets.extracted) {
for (asset_id, material) in std::mem::take(&mut extracted_assets.extracted) {
match prepare_material2d(
&material,
&render_device,
@ -589,10 +591,10 @@ pub fn prepare_materials_2d<M: Material2d>(
&pipeline,
) {
Ok(prepared_asset) => {
render_materials.insert(handle, prepared_asset);
render_materials.insert(asset_id, prepared_asset);
}
Err(AsBindGroupError::RetryNextUpdate) => {
prepare_next_frame.assets.push((handle, material));
prepare_next_frame.assets.push((asset_id, material));
}
}
}

@ -2,6 +2,7 @@ use bevy_app::Plugin;
use bevy_asset::{load_internal_asset, AssetId, Handle};
use bevy_core_pipeline::core_2d::Transparent2d;
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{
prelude::*,
query::{QueryItem, ROQueryItem},
@ -10,7 +11,10 @@ use bevy_ecs::{
use bevy_math::{Affine3, Vec2, Vec4};
use bevy_reflect::Reflect;
use bevy_render::{
batching::{batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData},
batching::{
batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData,
NoAutomaticBatching,
},
globals::{GlobalsBuffer, GlobalsUniform},
mesh::{GpuBufferInfo, Mesh, MeshVertexBufferLayout},
render_asset::RenderAssets,
@ -26,6 +30,7 @@ use bevy_render::{
Extract, ExtractSchedule, Render, RenderApp, RenderSet,
};
use bevy_transform::components::GlobalTransform;
use bevy_utils::EntityHashMap;
use crate::Material2dBindGroupId;
@ -89,6 +94,7 @@ impl Plugin for Mesh2dRenderPlugin {
if let Ok(render_app) = app.get_sub_app_mut(RenderApp) {
render_app
.init_resource::<RenderMesh2dInstances>()
.init_resource::<SpecializedMeshPipelines<Mesh2dPipeline>>()
.add_systems(ExtractSchedule, extract_mesh2d)
.add_systems(
@ -178,29 +184,58 @@ bitflags::bitflags! {
}
}
pub struct RenderMesh2dInstance {
pub transforms: Mesh2dTransforms,
pub mesh_asset_id: AssetId<Mesh>,
pub material_bind_group_id: Material2dBindGroupId,
pub automatic_batching: bool,
}
#[derive(Default, Resource, Deref, DerefMut)]
pub struct RenderMesh2dInstances(EntityHashMap<Entity, RenderMesh2dInstance>);
#[derive(Component)]
pub struct Mesh2d;
pub fn extract_mesh2d(
mut commands: Commands,
mut previous_len: Local<usize>,
query: Extract<Query<(Entity, &ViewVisibility, &GlobalTransform, &Mesh2dHandle)>>,
mut render_mesh_instances: ResMut<RenderMesh2dInstances>,
query: Extract<
Query<(
Entity,
&ViewVisibility,
&GlobalTransform,
&Mesh2dHandle,
Has<NoAutomaticBatching>,
)>,
>,
) {
let mut values = Vec::with_capacity(*previous_len);
for (entity, view_visibility, transform, handle) in &query {
render_mesh_instances.clear();
let mut entities = Vec::with_capacity(*previous_len);
for (entity, view_visibility, transform, handle, no_automatic_batching) in &query {
if !view_visibility.get() {
continue;
}
values.push((
// FIXME: Remove this - it is just a workaround to enable rendering to work as
// render commands require an entity to exist at the moment.
entities.push((entity, Mesh2d));
render_mesh_instances.insert(
entity,
(
Mesh2dHandle(handle.0.clone_weak()),
Mesh2dTransforms {
RenderMesh2dInstance {
transforms: Mesh2dTransforms {
transform: (&transform.affine()).into(),
flags: MeshFlags::empty().bits(),
},
),
));
mesh_asset_id: handle.0.id(),
material_bind_group_id: Material2dBindGroupId::default(),
automatic_batching: !no_automatic_batching,
},
);
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
*previous_len = entities.len();
commands.insert_or_spawn_batch(entities);
}
#[derive(Resource, Clone)]
@ -325,22 +360,26 @@ impl Mesh2dPipeline {
}
impl GetBatchData for Mesh2dPipeline {
type Query = (
Option<&'static Material2dBindGroupId>,
&'static Mesh2dHandle,
&'static Mesh2dTransforms,
);
type CompareData = (Option<Material2dBindGroupId>, AssetId<Mesh>);
type Param = SRes<RenderMesh2dInstances>;
type Query = Entity;
type QueryFilter = With<Mesh2d>;
type CompareData = (Material2dBindGroupId, AssetId<Mesh>);
type BufferData = Mesh2dUniform;
fn get_buffer_data(&(.., mesh_transforms): &QueryItem<Self::Query>) -> Self::BufferData {
mesh_transforms.into()
}
fn get_compare_data(
&(material_bind_group_id, mesh_handle, ..): &QueryItem<Self::Query>,
) -> Self::CompareData {
(material_bind_group_id.copied(), mesh_handle.0.id())
fn get_batch_data(
mesh_instances: &SystemParamItem<Self::Param>,
entity: &QueryItem<Self::Query>,
) -> (Self::BufferData, Option<Self::CompareData>) {
let mesh_instance = mesh_instances
.get(entity)
.expect("Failed to find render mesh2d instance");
(
(&mesh_instance.transforms).into(),
mesh_instance.automatic_batching.then_some((
mesh_instance.material_bind_group_id,
mesh_instance.mesh_asset_id,
)),
)
}
}
@ -653,43 +692,52 @@ impl<P: PhaseItem, const I: usize> RenderCommand<P> for SetMesh2dBindGroup<I> {
pub struct DrawMesh2d;
impl<P: PhaseItem> RenderCommand<P> for DrawMesh2d {
type Param = SRes<RenderAssets<Mesh>>;
type Param = (SRes<RenderAssets<Mesh>>, SRes<RenderMesh2dInstances>);
type ViewWorldQuery = ();
type ItemWorldQuery = Read<Mesh2dHandle>;
type ItemWorldQuery = ();
#[inline]
fn render<'w>(
item: &P,
_view: (),
mesh_handle: ROQueryItem<'w, Self::ItemWorldQuery>,
meshes: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(meshes, render_mesh2d_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let meshes = meshes.into_inner();
let render_mesh2d_instances = render_mesh2d_instances.into_inner();
let Some(RenderMesh2dInstance { mesh_asset_id, .. }) =
render_mesh2d_instances.get(&item.entity())
else {
return RenderCommandResult::Failure;
};
let Some(gpu_mesh) = meshes.get(*mesh_asset_id) else {
return RenderCommandResult::Failure;
};
pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..));
let batch_range = item.batch_range();
if let Some(gpu_mesh) = meshes.into_inner().get(&mesh_handle.0) {
pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..));
#[cfg(all(feature = "webgl", target_arch = "wasm32"))]
pass.set_push_constants(
ShaderStages::VERTEX,
0,
&(batch_range.start as i32).to_le_bytes(),
);
match &gpu_mesh.buffer_info {
GpuBufferInfo::Indexed {
buffer,
index_format,
count,
} => {
pass.set_index_buffer(buffer.slice(..), 0, *index_format);
pass.draw_indexed(0..*count, 0, batch_range.clone());
}
GpuBufferInfo::NonIndexed => {
pass.draw(0..gpu_mesh.vertex_count, batch_range.clone());
}
#[cfg(all(feature = "webgl", target_arch = "wasm32"))]
pass.set_push_constants(
ShaderStages::VERTEX,
0,
&(batch_range.start as i32).to_le_bytes(),
);
match &gpu_mesh.buffer_info {
GpuBufferInfo::Indexed {
buffer,
index_format,
count,
} => {
pass.set_index_buffer(buffer.slice(..), 0, *index_format);
pass.draw_indexed(0..*count, 0, batch_range.clone());
}
GpuBufferInfo::NonIndexed => {
pass.draw(0..gpu_mesh.vertex_count, batch_range.clone());
}
RenderCommandResult::Success
} else {
RenderCommandResult::Failure
}
RenderCommandResult::Success
}
}

@ -11,7 +11,6 @@ use bevy_core_pipeline::{
};
use bevy_ecs::{
prelude::*,
storage::SparseSet,
system::{lifetimeless::*, SystemParamItem, SystemState},
};
use bevy_math::{Affine3A, Quat, Rect, Vec2, Vec4};
@ -34,7 +33,7 @@ use bevy_render::{
Extract,
};
use bevy_transform::components::GlobalTransform;
use bevy_utils::{FloatOrd, HashMap};
use bevy_utils::{EntityHashMap, FloatOrd, HashMap};
use bytemuck::{Pod, Zeroable};
use fixedbitset::FixedBitSet;
@ -330,7 +329,7 @@ pub struct ExtractedSprite {
#[derive(Resource, Default)]
pub struct ExtractedSprites {
pub sprites: SparseSet<Entity, ExtractedSprite>,
pub sprites: EntityHashMap<Entity, ExtractedSprite>,
}
#[derive(Resource, Default)]
@ -641,7 +640,7 @@ pub fn prepare_sprites(
// Compatible items share the same entity.
for item_index in 0..transparent_phase.items.len() {
let item = &transparent_phase.items[item_index];
let Some(extracted_sprite) = extracted_sprites.sprites.get(item.entity) else {
let Some(extracted_sprite) = extracted_sprites.sprites.get(&item.entity) else {
// If there is a phase item that is not a sprite, then we must start a new
// batch to draw the other phase item(s) and to respect draw order. This can be
// done by invalidating the batch_image_handle

@ -2,7 +2,6 @@ mod pipeline;
mod render_pass;
use bevy_core_pipeline::{core_2d::Camera2d, core_3d::Camera3d};
use bevy_ecs::storage::SparseSet;
use bevy_hierarchy::Parent;
use bevy_render::render_phase::PhaseItem;
use bevy_render::view::ViewVisibility;
@ -36,7 +35,7 @@ use bevy_sprite::{SpriteAssetEvents, TextureAtlas};
#[cfg(feature = "bevy_text")]
use bevy_text::{PositionedGlyph, Text, TextLayoutInfo};
use bevy_transform::components::GlobalTransform;
use bevy_utils::{FloatOrd, HashMap};
use bevy_utils::{EntityHashMap, FloatOrd, HashMap};
use bytemuck::{Pod, Zeroable};
use std::ops::Range;
@ -164,7 +163,7 @@ pub struct ExtractedUiNode {
#[derive(Resource, Default)]
pub struct ExtractedUiNodes {
pub uinodes: SparseSet<Entity, ExtractedUiNode>,
pub uinodes: EntityHashMap<Entity, ExtractedUiNode>,
}
pub fn extract_atlas_uinodes(
@ -733,7 +732,7 @@ pub fn prepare_uinodes(
for item_index in 0..ui_phase.items.len() {
let item = &mut ui_phase.items[item_index];
if let Some(extracted_uinode) = extracted_uinodes.uinodes.get(item.entity) {
if let Some(extracted_uinode) = extracted_uinodes.uinodes.get(&item.entity) {
let mut existing_batch = batches
.last_mut()
.filter(|_| batch_image_handle == extracted_uinode.image);

@ -250,6 +250,58 @@ impl<K: Hash + Eq + PartialEq + Clone, V> PreHashMapExt<K, V> for PreHashMap<K,
}
}
/// A [`BuildHasher`] that results in a [`EntityHasher`].
#[derive(Default)]
pub struct EntityHash;
impl BuildHasher for EntityHash {
type Hasher = EntityHasher;
fn build_hasher(&self) -> Self::Hasher {
EntityHasher::default()
}
}
/// A very fast hash that is only designed to work on generational indices
/// like `Entity`. It will panic if attempting to hash a type containing
/// non-u64 fields.
#[derive(Debug, Default)]
pub struct EntityHasher {
hash: u64,
}
// This value comes from rustc-hash (also known as FxHasher) which in turn got
// it from Firefox. It is something like `u64::MAX / N` for an N that gives a
// value close to π and works well for distributing bits for hashing when using
// with a wrapping multiplication.
const FRAC_U64MAX_PI: u64 = 0x517cc1b727220a95;
impl Hasher for EntityHasher {
fn write(&mut self, _bytes: &[u8]) {
panic!("can only hash u64 using EntityHasher");
}
#[inline]
fn write_u64(&mut self, i: u64) {
// Apparently hashbrown's hashmap uses the upper 7 bits for some SIMD
// optimisation that uses those bits for binning. This hash function
// was faster than i | (i << (64 - 7)) in the worst cases, and was
// faster than PassHasher for all cases tested.
self.hash = i | (i.wrapping_mul(FRAC_U64MAX_PI) << 32);
}
#[inline]
fn finish(&self) -> u64 {
self.hash
}
}
/// A [`HashMap`] pre-configured to use [`EntityHash`] hashing.
pub type EntityHashMap<K, V> = hashbrown::HashMap<K, V, EntityHash>;
/// A [`HashSet`] pre-configured to use [`EntityHash`] hashing.
pub type EntityHashSet<T> = hashbrown::HashSet<T, EntityHash>;
/// A type which calls a function when dropped.
/// This can be used to ensure that cleanup code is run even in case of a panic.
///
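
To make the hashing scheme above easier to experiment with outside Bevy, here is a standalone sketch built on the standard library `HashMap` rather than `hashbrown`. The hasher body and constant are copied from the diff. `entity_bits`, `hash_bits` and `EntityKeyMap` are hypothetical helpers added for illustration, and the bit layout in `entity_bits` (generation in the high 32 bits, index in the low 32 bits) is an assumption about `Entity::to_bits()`.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

/// Multiplier from the diff above (the rustc-hash / FxHasher constant).
const FRAC_U64MAX_PI: u64 = 0x517cc1b727220a95;

/// Hasher that accepts exactly one `u64` write, as in the diff.
#[derive(Debug, Default)]
pub struct EntityHasher {
    hash: u64,
}

impl Hasher for EntityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        panic!("can only hash u64 using EntityHasher");
    }

    #[inline]
    fn write_u64(&mut self, i: u64) {
        // Low 32 bits keep the input's low bits unchanged; the upper 32 bits
        // are OR-ed with the low half of the wrapping product.
        self.hash = i | (i.wrapping_mul(FRAC_U64MAX_PI) << 32);
    }

    #[inline]
    fn finish(&self) -> u64 {
        self.hash
    }
}

/// `BuildHasher` that produces a fresh `EntityHasher` per key.
#[derive(Default, Clone, Copy)]
pub struct EntityHash;

impl BuildHasher for EntityHash {
    type Hasher = EntityHasher;

    fn build_hasher(&self) -> Self::Hasher {
        EntityHasher::default()
    }
}

/// Hypothetical stand-in for `Entity::to_bits()` (layout is an assumption).
pub fn entity_bits(index: u32, generation: u32) -> u64 {
    ((generation as u64) << 32) | index as u64
}

/// Run a single `u64` through the hasher and return the result.
pub fn hash_bits(bits: u64) -> u64 {
    let mut hasher = EntityHasher::default();
    hasher.write_u64(bits);
    hasher.finish()
}

/// Map keyed by raw entity bits; `u64`'s `Hash` impl calls `write_u64`.
pub type EntityKeyMap<V> = HashMap<u64, V, EntityHash>;
```

Since the shifted product has no low 32 bits set, the low half of the hash always equals the low half of the input, which under the assumed layout is the entity index.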

@ -6,7 +6,9 @@ use bevy::{
query::QueryItem,
system::{lifetimeless::*, SystemParamItem},
},
pbr::{MeshPipeline, MeshPipelineKey, MeshTransforms, SetMeshBindGroup, SetMeshViewBindGroup},
pbr::{
MeshPipeline, MeshPipelineKey, RenderMeshInstances, SetMeshBindGroup, SetMeshViewBindGroup,
},
prelude::*,
render::{
extract_component::{ExtractComponent, ExtractComponentPlugin},
@ -113,7 +115,8 @@ fn queue_custom(
mut pipelines: ResMut<SpecializedMeshPipelines<CustomPipeline>>,
pipeline_cache: Res<PipelineCache>,
meshes: Res<RenderAssets<Mesh>>,
material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<InstanceMaterialData>>,
render_mesh_instances: Res<RenderMeshInstances>,
material_meshes: Query<Entity, With<InstanceMaterialData>>,
mut views: Query<(&ExtractedView, &mut RenderPhase<Transparent3d>)>,
) {
let draw_custom = transparent_3d_draw_functions.read().id::<DrawCustom>();
@ -123,23 +126,26 @@ fn queue_custom(
for (view, mut transparent_phase) in &mut views {
let view_key = msaa_key | MeshPipelineKey::from_hdr(view.hdr);
let rangefinder = view.rangefinder3d();
for (entity, mesh_transforms, mesh_handle) in &material_meshes {
if let Some(mesh) = meshes.get(mesh_handle) {
let key =
view_key | MeshPipelineKey::from_primitive_topology(mesh.primitive_topology);
let pipeline = pipelines
.specialize(&pipeline_cache, &custom_pipeline, key, &mesh.layout)
.unwrap();
transparent_phase.add(Transparent3d {
entity,
pipeline,
draw_function: draw_custom,
distance: rangefinder
.distance_translation(&mesh_transforms.transform.translation),
batch_range: 0..1,
dynamic_offset: None,
});
}
for entity in &material_meshes {
let Some(mesh_instance) = render_mesh_instances.get(&entity) else {
continue;
};
let Some(mesh) = meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
let key = view_key | MeshPipelineKey::from_primitive_topology(mesh.primitive_topology);
let pipeline = pipelines
.specialize(&pipeline_cache, &custom_pipeline, key, &mesh.layout)
.unwrap();
transparent_phase.add(Transparent3d {
entity,
pipeline,
draw_function: draw_custom,
distance: rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation),
batch_range: 0..1,
dynamic_offset: None,
});
}
}
}
@ -238,19 +244,22 @@ type DrawCustom = (
pub struct DrawMeshInstanced;
impl<P: PhaseItem> RenderCommand<P> for DrawMeshInstanced {
type Param = SRes<RenderAssets<Mesh>>;
type Param = (SRes<RenderAssets<Mesh>>, SRes<RenderMeshInstances>);
type ViewWorldQuery = ();
type ItemWorldQuery = (Read<Handle<Mesh>>, Read<InstanceBuffer>);
type ItemWorldQuery = Read<InstanceBuffer>;
#[inline]
fn render<'w>(
_item: &P,
item: &P,
_view: (),
(mesh_handle, instance_buffer): (&'w Handle<Mesh>, &'w InstanceBuffer),
meshes: SystemParamItem<'w, '_, Self::Param>,
instance_buffer: &'w InstanceBuffer,
(meshes, render_mesh_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let gpu_mesh = match meshes.into_inner().get(mesh_handle) {
let Some(mesh_instance) = render_mesh_instances.get(&item.entity()) else {
return RenderCommandResult::Failure;
};
let gpu_mesh = match meshes.into_inner().get(mesh_instance.mesh_asset_id) {
Some(gpu_mesh) => gpu_mesh,
None => return RenderCommandResult::Failure,
};

@ -97,7 +97,8 @@ fn main() {
DefaultPlugins.set(WindowPlugin {
primary_window: Some(Window {
title: "BevyMark".into(),
resolution: (800., 600.).into(),
resolution: WindowResolution::new(1920.0, 1080.0)
.with_scale_factor_override(1.0),
present_mode: PresentMode::AutoNoVsync,
..default()
}),

@ -16,7 +16,7 @@ use bevy::{
math::{DVec2, DVec3},
prelude::*,
render::render_resource::{Extent3d, TextureDimension, TextureFormat},
window::{PresentMode, WindowPlugin},
window::{PresentMode, WindowPlugin, WindowResolution},
};
use rand::{rngs::StdRng, seq::SliceRandom, Rng, SeedableRng};
@ -70,6 +70,8 @@ fn main() {
DefaultPlugins.set(WindowPlugin {
primary_window: Some(Window {
present_mode: PresentMode::AutoNoVsync,
resolution: WindowResolution::new(1920.0, 1080.0)
.with_scale_factor_override(1.0),
..default()
}),
..default()