mirror of
https://github.com/bevyengine/bevy
synced 2024-12-23 19:43:07 +00:00
5adf831b42
This patch adds the infrastructure necessary for Bevy to support *bindless resources*, by adding a new `#[bindless]` attribute to `AsBindGroup`.

Classically, only a single texture (or sampler, or buffer) can be attached to each shader binding. This means that switching materials requires breaking a batch and issuing a new draw call, even if the mesh is otherwise identical. This adds significant overhead not only in the driver but also in `wgpu`, as switching bind groups increases the amount of validation work that `wgpu` must do.

*Bindless resources* are the typical solution to this problem. Instead of switching bindings between each texture, the renderer supplies a large *array* of all textures in the scene up front, and the material contains an index into that array. This pattern is repeated for buffers and samplers as well. The renderer no longer needs to switch binding descriptor sets while drawing the scene.

Unfortunately, as things currently stand, this approach won't quite work for Bevy. Two aspects of `wgpu` conspire to make this ideal approach unacceptably slow:

1. In the DX12 backend, all binding arrays (bindless resources) must have a constant size declared in the shader, and all textures in an array must be bound to actual textures. Changing the size requires a recompile.

2. Changing even one texture incurs revalidation of all textures, a process that takes time that's linear in the total size of the binding array.

This means that declaring a large array of textures big enough to encompass the entire scene is presently unacceptably slow. For example, if you declare 4096 textures, then `wgpu` will have to revalidate all 4096 textures if even a single one changes. This process can take multiple frames.

To work around this problem, this PR groups bindless resources into small *slabs* and maintains a free list for each. The size of each slab for the bindless arrays associated with a material is specified via the `#[bindless(N)]` attribute.
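The slab-and-free-list idea can be illustrated with a minimal, standalone sketch. This is not Bevy's actual implementation: the names `SlabAllocator`, `Slab`, and `Binding` are hypothetical, and the sketch only tracks which slots are in use, without touching any GPU resources.

```rust
/// Hypothetical handle: which slab a resource lives in, and its slot
/// (array index) within that slab.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Binding {
    pub slab: usize,
    pub slot: usize,
}

/// One fixed-size slab. Only the free slots are tracked here.
struct Slab {
    /// Slots that are currently unused; popped on allocation.
    free_list: Vec<usize>,
}

/// Hypothetical allocator that hands out slots from small fixed-size slabs,
/// opening a new slab only when every existing slab is full.
pub struct SlabAllocator {
    slab_size: usize,
    slabs: Vec<Slab>,
}

impl SlabAllocator {
    /// `slab_size` plays the role of `N` in `#[bindless(N)]`.
    pub fn new(slab_size: usize) -> Self {
        assert!(slab_size > 0, "a slab must hold at least one slot");
        Self {
            slab_size,
            slabs: Vec::new(),
        }
    }

    /// Returns a free slot, reusing space in an existing slab if possible.
    pub fn allocate(&mut self) -> Binding {
        for (slab_index, slab) in self.slabs.iter_mut().enumerate() {
            if let Some(slot) = slab.free_list.pop() {
                return Binding {
                    slab: slab_index,
                    slot,
                };
            }
        }

        // Every slab is full: open a new one. The list is reversed so that
        // slots are handed out in ascending order (0, 1, 2, ...).
        let mut free_list: Vec<usize> = (0..self.slab_size).rev().collect();
        let slot = free_list.pop().expect("slab_size is nonzero");
        self.slabs.push(Slab { free_list });
        Binding {
            slab: self.slabs.len() - 1,
            slot,
        }
    }

    /// Returns a slot to its slab's free list so it can be reused.
    pub fn free(&mut self, binding: Binding) {
        self.slabs[binding.slab].free_list.push(binding.slot);
    }

    /// Number of slabs opened so far.
    pub fn slab_count(&self) -> usize {
        self.slabs.len()
    }
}
```

In this sketch, with a slab size of 16, the first 16 allocations land in slab 0, the 17th opens slab 1, and a freed slot is the next one handed out from its slab. Changing one resource only affects the slab that owns its slot, which is the motivation for keeping slabs small.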
For instance, consider the following material declaration:

```rust
#[derive(AsBindGroup)]
#[bindless(16)]
struct MyMaterial {
    #[buffer(0)]
    color: Vec4,
    #[texture(1)]
    #[sampler(2)]
    diffuse: Handle<Image>,
}
```

The `#[bindless(N)]` attribute specifies that, if bindless arrays are supported on the current platform, each resource becomes a binding array of N instances of that resource. So, for `MyMaterial` above, the `color` attribute is exposed to the shader as `binding_array<vec4<f32>, 16>`, the `diffuse` texture is exposed to the shader as `binding_array<texture_2d<f32>, 16>`, and the `diffuse` sampler is exposed to the shader as `binding_array<sampler, 16>`. Inside the material's vertex and fragment shaders, the applicable index is available via the `material_bind_group_slot` field of the `Mesh` structure. So, for instance, you can access the current color like so:

```wgsl
// `uniform` binding arrays are a non-sequitur, so `uniform` is automatically promoted
// to `storage` in bindless mode.
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 16>;

...

@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
    ...
}
```

Note that portable shader code can't guarantee that the current platform supports bindless textures. Indeed, bindless mode is only available in Vulkan and DX12. The `BINDLESS` shader definition lets you determine whether you're on a bindless platform. Thus a portable version of the shader above would look like:

```wgsl
#ifdef BINDLESS
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 16>;
#else // BINDLESS
@group(2) @binding(0) var<uniform> material_color: Color;
#endif // BINDLESS

...

@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
#ifdef BINDLESS
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
#else // BINDLESS
    let color = material_color;
#endif // BINDLESS
    ...
}
```

Importantly, this PR *doesn't* update `StandardMaterial` to be bindless. So, for example, `scene_viewer` will currently not run any faster. I intend to update `StandardMaterial` to use bindless mode in a follow-up patch.

A new example, `shaders/shader_material_bindless`, has been added to demonstrate how to use this new feature.

Here's a Tracy profile of `submit_graph_commands` of this patch and an additional patch (not submitted yet) that makes `StandardMaterial` use bindless. Red is those patches; yellow is `main`. The scene was Bistro Exterior with a hack that forces all textures to opaque. You can see a 1.47x mean speedup.

![Screenshot 2024-11-12 161713](https://github.com/user-attachments/assets/4334b362-42c8-4d64-9cfb-6835f019b95c)

## Migration Guide

* `RenderAssets::prepare_asset` now takes an `AssetId` parameter.
* Bin keys now have Bevy-specific material bind group indices instead of `wgpu` material bind group IDs, as part of the bindless change. Use the new `MaterialBindGroupAllocator` to map from bind group index to bind group ID.
385 lines
14 KiB
Rust
//! Demonstrates how to enqueue custom draw commands in a render phase.
//!
//! This example shows how to use the built-in
//! [`bevy_render::render_phase::BinnedRenderPhase`] functionality with a
//! custom [`RenderCommand`] to allow inserting arbitrary GPU drawing logic
//! into Bevy's pipeline. This is not the only way to add custom rendering code
//! into Bevy (render nodes are another, lower-level method), but it does allow
//! for better reuse of parts of Bevy's built-in mesh rendering logic.

use bevy::{
    core_pipeline::core_3d::{Opaque3d, Opaque3dBinKey, CORE_3D_DEPTH_FORMAT},
    ecs::{
        query::ROQueryItem,
        system::{lifetimeless::SRes, SystemParamItem},
    },
    prelude::*,
    render::{
        extract_component::{ExtractComponent, ExtractComponentPlugin},
        primitives::Aabb,
        render_phase::{
            AddRenderCommand, BinnedRenderPhaseType, DrawFunctions, PhaseItem, RenderCommand,
            RenderCommandResult, SetItemPipeline, TrackedRenderPass, ViewBinnedRenderPhases,
        },
        render_resource::{
            BufferUsages, ColorTargetState, ColorWrites, CompareFunction, DepthStencilState,
            FragmentState, IndexFormat, MultisampleState, PipelineCache, PrimitiveState,
            RawBufferVec, RenderPipelineDescriptor, SpecializedRenderPipeline,
            SpecializedRenderPipelines, TextureFormat, VertexAttribute, VertexBufferLayout,
            VertexFormat, VertexState, VertexStepMode,
        },
        renderer::{RenderDevice, RenderQueue},
        view::{self, ExtractedView, RenderVisibleEntities, VisibilitySystems},
        Render, RenderApp, RenderSet,
    },
};
use bytemuck::{Pod, Zeroable};

/// A marker component that represents an entity that is to be rendered using
/// our custom phase item.
///
/// Note the [`ExtractComponent`] trait implementation. This is necessary to
/// tell Bevy that this object should be pulled into the render world.
#[derive(Clone, Component, ExtractComponent)]
struct CustomRenderedEntity;

/// Holds a reference to our shader.
///
/// This is loaded at app creation time.
#[derive(Resource)]
struct CustomPhasePipeline {
    shader: Handle<Shader>,
}

/// A [`RenderCommand`] that binds the vertex and index buffers and issues the
/// draw command for our custom phase item.
struct DrawCustomPhaseItem;

impl<P> RenderCommand<P> for DrawCustomPhaseItem
where
    P: PhaseItem,
{
    type Param = SRes<CustomPhaseItemBuffers>;

    type ViewQuery = ();

    type ItemQuery = ();

    fn render<'w>(
        _: &P,
        _: ROQueryItem<'w, Self::ViewQuery>,
        _: Option<ROQueryItem<'w, Self::ItemQuery>>,
        custom_phase_item_buffers: SystemParamItem<'w, '_, Self::Param>,
        pass: &mut TrackedRenderPass<'w>,
    ) -> RenderCommandResult {
        // Borrow check workaround.
        let custom_phase_item_buffers = custom_phase_item_buffers.into_inner();

        // Tell the GPU where the vertices are.
        pass.set_vertex_buffer(
            0,
            custom_phase_item_buffers
                .vertices
                .buffer()
                .unwrap()
                .slice(..),
        );

        // Tell the GPU where the indices are.
        pass.set_index_buffer(
            custom_phase_item_buffers
                .indices
                .buffer()
                .unwrap()
                .slice(..),
            0,
            IndexFormat::Uint32,
        );

        // Draw one triangle (3 vertices).
        pass.draw_indexed(0..3, 0, 0..1);

        RenderCommandResult::Success
    }
}

/// The GPU vertex and index buffers for our custom phase item.
///
/// As the custom phase item is a single triangle, these are uploaded once and
/// then left alone.
#[derive(Resource)]
struct CustomPhaseItemBuffers {
    /// The vertices for the single triangle.
    ///
    /// This is a [`RawBufferVec`] because that's the simplest and fastest type
    /// of GPU buffer, and [`Vertex`] objects are simple.
    vertices: RawBufferVec<Vertex>,

    /// The indices of the single triangle.
    ///
    /// As above, this is a [`RawBufferVec`] because `u32` values have trivial
    /// size and alignment.
    indices: RawBufferVec<u32>,
}

/// The CPU-side structure that describes a single vertex of the triangle.
#[derive(Clone, Copy, Pod, Zeroable)]
#[repr(C)]
struct Vertex {
    /// The 3D position of the triangle vertex.
    position: Vec3,
    /// Padding.
    pad0: u32,
    /// The color of the triangle vertex.
    color: Vec3,
    /// Padding.
    pad1: u32,
}

impl Vertex {
    /// Creates a new vertex structure.
    const fn new(position: Vec3, color: Vec3) -> Vertex {
        Vertex {
            position,
            color,
            pad0: 0,
            pad1: 0,
        }
    }
}

/// The custom draw commands that Bevy executes for each entity we enqueue into
/// the render phase.
type DrawCustomPhaseItemCommands = (SetItemPipeline, DrawCustomPhaseItem);

/// A query filter that tells [`view::check_visibility`] about our custom
/// rendered entity.
type WithCustomRenderedEntity = With<CustomRenderedEntity>;

/// A single triangle's worth of vertices, for demonstration purposes.
static VERTICES: [Vertex; 3] = [
    Vertex::new(vec3(-0.866, -0.5, 0.5), vec3(1.0, 0.0, 0.0)),
    Vertex::new(vec3(0.866, -0.5, 0.5), vec3(0.0, 1.0, 0.0)),
    Vertex::new(vec3(0.0, 1.0, 0.5), vec3(0.0, 0.0, 1.0)),
];

/// The entry point.
fn main() {
    let mut app = App::new();
    app.add_plugins(DefaultPlugins)
        .add_plugins(ExtractComponentPlugin::<CustomRenderedEntity>::default())
        .add_systems(Startup, setup)
        // Make sure to tell Bevy to check our entity for visibility. Bevy won't
        // do this by default, for efficiency reasons.
        .add_systems(
            PostUpdate,
            view::check_visibility::<WithCustomRenderedEntity>
                .in_set(VisibilitySystems::CheckVisibility),
        );

    // We make sure to add these to the render app, not the main app.
    app.get_sub_app_mut(RenderApp)
        .unwrap()
        .init_resource::<CustomPhasePipeline>()
        .init_resource::<SpecializedRenderPipelines<CustomPhasePipeline>>()
        .add_render_command::<Opaque3d, DrawCustomPhaseItemCommands>()
        .add_systems(
            Render,
            prepare_custom_phase_item_buffers.in_set(RenderSet::Prepare),
        )
        .add_systems(Render, queue_custom_phase_item.in_set(RenderSet::Queue));

    app.run();
}

/// Spawns the objects in the scene.
fn setup(mut commands: Commands) {
    // Spawn a single entity that has custom rendering. It'll be extracted into
    // the render world via [`ExtractComponent`].
    commands.spawn((
        Visibility::default(),
        Transform::default(),
        // This `Aabb` is necessary for the visibility checks to work.
        Aabb {
            center: Vec3A::ZERO,
            half_extents: Vec3A::splat(0.5),
        },
        CustomRenderedEntity,
    ));

    // Spawn the camera.
    commands.spawn((
        Camera3d::default(),
        Transform::from_xyz(0.0, 0.0, 1.0).looking_at(Vec3::ZERO, Vec3::Y),
    ));
}

/// Creates the [`CustomPhaseItemBuffers`] resource.
///
/// This must be done in a render-world system rather than at app build time
/// because it needs the [`RenderDevice`] and [`RenderQueue`] to exist, and
/// they don't until [`App::run`] is called. `init_resource` does nothing if
/// the resource already exists, so the buffers are only created once.
fn prepare_custom_phase_item_buffers(mut commands: Commands) {
    commands.init_resource::<CustomPhaseItemBuffers>();
}

/// A render-world system that enqueues the entity with custom rendering into
/// the opaque render phases of each view.
fn queue_custom_phase_item(
    pipeline_cache: Res<PipelineCache>,
    custom_phase_pipeline: Res<CustomPhasePipeline>,
    mut opaque_render_phases: ResMut<ViewBinnedRenderPhases<Opaque3d>>,
    opaque_draw_functions: Res<DrawFunctions<Opaque3d>>,
    mut specialized_render_pipelines: ResMut<SpecializedRenderPipelines<CustomPhasePipeline>>,
    views: Query<(Entity, &RenderVisibleEntities, &Msaa), With<ExtractedView>>,
) {
    let draw_custom_phase_item = opaque_draw_functions
        .read()
        .id::<DrawCustomPhaseItemCommands>();

    // Render phases are per-view, so we need to iterate over all views so that
    // the entity appears in them. (In this example, we have only one view, but
    // it's good practice to loop over all views anyway.)
    for (view_entity, view_visible_entities, msaa) in views.iter() {
        let Some(opaque_phase) = opaque_render_phases.get_mut(&view_entity) else {
            continue;
        };

        // Find all the custom rendered entities that are visible from this
        // view.
        for &entity in view_visible_entities
            .get::<WithCustomRenderedEntity>()
            .iter()
        {
            // Ordinarily, the [`SpecializedRenderPipeline::Key`] would contain
            // some per-view settings, such as whether the view is HDR, but for
            // simplicity's sake we simply hard-code the view's characteristics,
            // with the exception of number of MSAA samples.
            let pipeline_id = specialized_render_pipelines.specialize(
                &pipeline_cache,
                &custom_phase_pipeline,
                *msaa,
            );

            // Add the custom render item. We use the
            // [`BinnedRenderPhaseType::NonMesh`] type to skip the special
            // handling that Bevy has for meshes (preprocessing, indirect
            // draws, etc.)
            //
            // The asset ID is arbitrary; we simply use [`AssetId::invalid`],
            // but you can use anything you like. Note that the asset ID need
            // not be the ID of a [`Mesh`].
            opaque_phase.add(
                Opaque3dBinKey {
                    draw_function: draw_custom_phase_item,
                    pipeline: pipeline_id,
                    asset_id: AssetId::<Mesh>::invalid().untyped(),
                    material_bind_group_index: None,
                    lightmap_image: None,
                },
                entity,
                BinnedRenderPhaseType::NonMesh,
            );
        }
    }
}

impl SpecializedRenderPipeline for CustomPhasePipeline {
    type Key = Msaa;

    fn specialize(&self, msaa: Self::Key) -> RenderPipelineDescriptor {
        RenderPipelineDescriptor {
            label: Some("custom render pipeline".into()),
            layout: vec![],
            push_constant_ranges: vec![],
            vertex: VertexState {
                shader: self.shader.clone(),
                shader_defs: vec![],
                entry_point: "vertex".into(),
                buffers: vec![VertexBufferLayout {
                    array_stride: size_of::<Vertex>() as u64,
                    step_mode: VertexStepMode::Vertex,
                    // This needs to match the layout of [`Vertex`].
                    attributes: vec![
                        VertexAttribute {
                            format: VertexFormat::Float32x3,
                            offset: 0,
                            shader_location: 0,
                        },
                        VertexAttribute {
                            format: VertexFormat::Float32x3,
                            offset: 16,
                            shader_location: 1,
                        },
                    ],
                }],
            },
            fragment: Some(FragmentState {
                shader: self.shader.clone(),
                shader_defs: vec![],
                entry_point: "fragment".into(),
                targets: vec![Some(ColorTargetState {
                    // Ordinarily, you'd want to check whether the view has the
                    // HDR format and substitute the appropriate texture format
                    // here, but we omit that for simplicity.
                    format: TextureFormat::bevy_default(),
                    blend: None,
                    write_mask: ColorWrites::ALL,
                })],
            }),
            primitive: PrimitiveState::default(),
            // Note that if your view has no depth buffer this will need to be
            // changed.
            depth_stencil: Some(DepthStencilState {
                format: CORE_3D_DEPTH_FORMAT,
                depth_write_enabled: false,
                depth_compare: CompareFunction::Always,
                stencil: default(),
                bias: default(),
            }),
            multisample: MultisampleState {
                count: msaa.samples(),
                mask: !0,
                alpha_to_coverage_enabled: false,
            },
            zero_initialize_workgroup_memory: false,
        }
    }
}

impl FromWorld for CustomPhaseItemBuffers {
    fn from_world(world: &mut World) -> Self {
        let render_device = world.resource::<RenderDevice>();
        let render_queue = world.resource::<RenderQueue>();

        // Create the vertex and index buffers.
        let mut vbo = RawBufferVec::new(BufferUsages::VERTEX);
        let mut ibo = RawBufferVec::new(BufferUsages::INDEX);

        for vertex in &VERTICES {
            vbo.push(*vertex);
        }
        for index in 0..3 {
            ibo.push(index);
        }

        // These two lines are required in order to trigger the upload to GPU.
        vbo.write_buffer(render_device, render_queue);
        ibo.write_buffer(render_device, render_queue);

        CustomPhaseItemBuffers {
            vertices: vbo,
            indices: ibo,
        }
    }
}

impl FromWorld for CustomPhasePipeline {
    fn from_world(world: &mut World) -> Self {
        // Load and compile the shader in the background.
        let asset_server = world.resource::<AssetServer>();

        CustomPhasePipeline {
            shader: asset_server.load("shaders/custom_phase_item.wgsl"),
        }
    }
}