bevy/examples/tools/scene_viewer/main.rs
Patrick Walton dda97880c4
Implement experimental GPU two-phase occlusion culling for the standard 3D mesh pipeline. (#17413)
*Occlusion culling* allows the GPU to skip the vertex and fragment
shading overhead for objects that can be quickly proved to be invisible
because they're behind other geometry. A depth prepass already
eliminates most fragment shading overhead for occluded objects, but the
vertex shading overhead, as well as the cost of testing and rejecting
fragments against the Z-buffer, is presently unavoidable for standard
meshes. We currently perform occlusion culling only for meshlets. But
other meshes, such as skinned meshes, can benefit from occlusion culling
too in order to avoid the transform and skinning overhead for unseen
meshes.

This commit adapts the same [*two-phase occlusion culling*] technique
that meshlets use to Bevy's standard 3D mesh pipeline when the new
`OcclusionCulling` component, as well as the `DepthPrepass` component,
are present on the camera. It has these steps:

1. *Early depth prepass*: We use the hierarchical Z-buffer from the
previous frame to cull meshes for the initial depth prepass, effectively
rendering only the meshes that were visible in the last frame.

2. *Early depth downsample*: We downsample the depth buffer to create
another hierarchical Z-buffer, this time with the current view
transform.

3. *Late depth prepass*: We use the new hierarchical Z-buffer to test
all meshes that weren't rendered in the early depth prepass. Any meshes
that pass this check are rendered.

4. *Late depth downsample*: Again, we downsample the depth buffer to
create a hierarchical Z-buffer in preparation for the early depth
prepass of the next frame. This step is done after all the rendering, in
order to account for custom phase items that might write to the depth
buffer.

Note that this patch has no effect on the per-mesh CPU overhead for
occluded objects, which remains high for a GPU-driven renderer due to
the lack of `cold-specialization` and retained bins. If
`cold-specialization` and retained bins weren't on the horizon, then a
more traditional approach like potentially visible sets (PVS) or low-res
CPU rendering would probably be more efficient than the GPU-driven
approach that this patch implements for most scenes. However, at this
point the amount of effort required to implement a PVS baking tool or a
low-res CPU renderer would probably be greater than landing
`cold-specialization` and retained bins, and the GPU driven approach is
the more modern one anyway. It does mean that the performance
improvements from occlusion culling as implemented in this patch *today*
are likely to be limited, because of the high CPU overhead for occluded
meshes.

Note also that this patch currently doesn't implement occlusion culling
for 2D objects or shadow maps. Those can be addressed in a follow-up.
Additionally, note that the techniques in this patch require compute
shaders, which excludes support for WebGL 2.

This PR is marked experimental because of known precision issues with
the downsampling approach when applied to non-power-of-two framebuffer
sizes (i.e. most of them). These precision issues can, in rare cases,
cause objects to be judged occluded that in fact are not. (I've never
seen this in practice, but I know it's possible; it tends to be likelier
to happen with small meshes.) As a follow-up to this patch, we desire to
switch to the [SPD-based hi-Z buffer shader from the Granite engine],
which doesn't suffer from these problems, at which point we should be
able to graduate this feature from experimental status. I opted not to
include that rewrite in this patch for two reasons: (1) @JMS55 is
planning on doing the rewrite to coincide with the new availability of
image atomic operations in Naga; (2) to reduce the scope of this patch.

A new example, `occlusion_culling`, has been added. It demonstrates
objects becoming quickly occluded and disoccluded by dynamic geometry
and shows the number of objects that are actually being rendered. Also,
a new `--occlusion-culling` switch has been added to `scene_viewer`, in
order to make it easy to test this patch with large scenes like Bistro.

[*two-phase occlusion culling*]:
https://medium.com/@mil_kru/two-pass-occlusion-culling-4100edcad501

[Aaltonen SIGGRAPH 2015]:

https://www.advances.realtimerendering.com/s2015/aaltonenhaar_siggraph2015_combined_final_footer_220dpi.pdf

[Some literature]:

https://gist.github.com/reduz/c5769d0e705d8ab7ac187d63be0099b5?permalink_comment_id=5040452#gistcomment-5040452

[SPD-based hi-Z buffer shader from the Granite engine]:
https://github.com/Themaister/Granite/blob/master/assets/shaders/post/hiz.comp

## Migration guide

* When enqueuing a custom mesh pipeline, work item buffers are now
created with
`bevy::render::batching::gpu_preprocessing::get_or_create_work_item_buffer`,
not `PreprocessWorkItemBuffers::new`. See the
`specialized_mesh_pipeline` example.

## Showcase

Occlusion culling example:
![Screenshot 2025-01-15
175051](https://github.com/user-attachments/assets/1544f301-68a3-45f8-84a6-7af3ad431258)

Bistro zoomed out, before occlusion culling:
![Screenshot 2025-01-16
185425](https://github.com/user-attachments/assets/5114bbdf-5dec-4de9-b17e-7aa77e7b61ed)

Bistro zoomed out, after occlusion culling:
![Screenshot 2025-01-16
184949](https://github.com/user-attachments/assets/9dd67713-656c-4276-9768-6d261ca94300)

In this scene, occlusion culling reduces the number of meshes Bevy has
to render from 1591 to 585.
2025-01-27 05:02:46 +00:00

217 lines
7.3 KiB
Rust

//! A simple glTF scene viewer made with Bevy.
//!
//! Just run `cargo run --release --example scene_viewer /path/to/model.gltf`,
//! replacing the path as appropriate.
//! In case of multiple scenes, you can select which to display by adapting the file path: `/path/to/model.gltf#Scene1`.
//! With no arguments it will load the `FlightHelmet` glTF model from the repository assets subdirectory.
//! Pass `--help` to see all the supported arguments.
//!
//! If you want to hot reload asset changes, enable the `file_watcher` cargo feature.
use argh::FromArgs;
use bevy::{
core_pipeline::prepass::{DeferredPrepass, DepthPrepass},
pbr::DefaultOpaqueRendererMethod,
prelude::*,
render::{
experimental::occlusion_culling::OcclusionCulling,
primitives::{Aabb, Sphere},
},
};
#[path = "../../helpers/camera_controller.rs"]
mod camera_controller;
#[cfg(feature = "animation")]
mod animation_plugin;
mod morph_viewer_plugin;
mod scene_viewer_plugin;
use camera_controller::{CameraController, CameraControllerPlugin};
use morph_viewer_plugin::MorphViewerPlugin;
use scene_viewer_plugin::{SceneHandle, SceneViewerPlugin};
/// A simple glTF scene viewer made with Bevy
#[derive(FromArgs, Resource)]
struct Args {
/// the path to the glTF scene
#[argh(
positional,
default = "\"assets/models/FlightHelmet/FlightHelmet.gltf\".to_string()"
)]
scene_path: String,
/// enable a depth prepass
#[argh(switch)]
depth_prepass: Option<bool>,
/// enable occlusion culling
#[argh(switch)]
occlusion_culling: Option<bool>,
/// enable deferred shading
#[argh(switch)]
deferred: Option<bool>,
}
fn main() {
#[cfg(not(target_arch = "wasm32"))]
let args: Args = argh::from_env();
#[cfg(target_arch = "wasm32")]
let args: Args = Args::from_args(&[], &[]).unwrap();
let deferred = args.deferred;
let mut app = App::new();
app.add_plugins((
DefaultPlugins
.set(WindowPlugin {
primary_window: Some(Window {
title: "bevy scene viewer".to_string(),
..default()
}),
..default()
})
.set(AssetPlugin {
file_path: std::env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string()),
..default()
}),
CameraControllerPlugin,
SceneViewerPlugin,
MorphViewerPlugin,
))
.insert_resource(args)
.add_systems(Startup, setup)
.add_systems(PreUpdate, setup_scene_after_load);
// If deferred shading was requested, turn it on.
if deferred == Some(true) {
app.insert_resource(DefaultOpaqueRendererMethod::deferred());
}
#[cfg(feature = "animation")]
app.add_plugins(animation_plugin::AnimationManipulationPlugin);
app.run();
}
fn parse_scene(scene_path: String) -> (String, usize) {
if scene_path.contains('#') {
let gltf_and_scene = scene_path.split('#').collect::<Vec<_>>();
if let Some((last, path)) = gltf_and_scene.split_last() {
if let Some(index) = last
.strip_prefix("Scene")
.and_then(|index| index.parse::<usize>().ok())
{
return (path.join("#"), index);
}
}
}
(scene_path, 0)
}
fn setup(mut commands: Commands, asset_server: Res<AssetServer>, args: Res<Args>) {
let scene_path = &args.scene_path;
info!("Loading {}", scene_path);
let (file_path, scene_index) = parse_scene((*scene_path).clone());
commands.insert_resource(SceneHandle::new(asset_server.load(file_path), scene_index));
}
fn setup_scene_after_load(
mut commands: Commands,
mut setup: Local<bool>,
mut scene_handle: ResMut<SceneHandle>,
asset_server: Res<AssetServer>,
args: Res<Args>,
meshes: Query<(&GlobalTransform, Option<&Aabb>), With<Mesh3d>>,
) {
if scene_handle.is_loaded && !*setup {
*setup = true;
// Find an approximate bounding box of the scene from its meshes
if meshes.iter().any(|(_, maybe_aabb)| maybe_aabb.is_none()) {
return;
}
let mut min = Vec3A::splat(f32::MAX);
let mut max = Vec3A::splat(f32::MIN);
for (transform, maybe_aabb) in &meshes {
let aabb = maybe_aabb.unwrap();
// If the Aabb had not been rotated, applying the non-uniform scale would produce the
// correct bounds. However, it could very well be rotated and so we first convert to
// a Sphere, and then back to an Aabb to find the conservative min and max points.
let sphere = Sphere {
center: Vec3A::from(transform.transform_point(Vec3::from(aabb.center))),
radius: transform.radius_vec3a(aabb.half_extents),
};
let aabb = Aabb::from(sphere);
min = min.min(aabb.min());
max = max.max(aabb.max());
}
let size = (max - min).length();
let aabb = Aabb::from_min_max(Vec3::from(min), Vec3::from(max));
info!("Spawning a controllable 3D perspective camera");
let mut projection = PerspectiveProjection::default();
projection.far = projection.far.max(size * 10.0);
let walk_speed = size * 3.0;
let camera_controller = CameraController {
walk_speed,
run_speed: 3.0 * walk_speed,
..default()
};
// Display the controls of the scene viewer
info!("{}", camera_controller);
info!("{}", *scene_handle);
let mut camera = commands.spawn((
Camera3d::default(),
Projection::from(projection),
Transform::from_translation(Vec3::from(aabb.center) + size * Vec3::new(0.5, 0.25, 0.5))
.looking_at(Vec3::from(aabb.center), Vec3::Y),
Camera {
is_active: false,
..default()
},
EnvironmentMapLight {
diffuse_map: asset_server
.load("assets/environment_maps/pisa_diffuse_rgb9e5_zstd.ktx2"),
specular_map: asset_server
.load("assets/environment_maps/pisa_specular_rgb9e5_zstd.ktx2"),
intensity: 150.0,
..default()
},
camera_controller,
));
// If occlusion culling was requested, include the relevant components.
// The Z-prepass is currently required.
if args.occlusion_culling == Some(true) {
camera.insert((DepthPrepass, OcclusionCulling));
}
// If the depth prepass was requested, include it.
if args.depth_prepass == Some(true) {
camera.insert(DepthPrepass);
}
// If deferred shading was requested, include the prepass.
if args.deferred == Some(true) {
camera
.insert(Msaa::Off)
.insert(DepthPrepass)
.insert(DeferredPrepass);
}
// Spawn a default light if the scene does not have one
if !scene_handle.has_light {
info!("Spawning a directional light");
commands.spawn((
DirectionalLight::default(),
Transform::from_xyz(1.0, 1.0, 0.0).looking_at(Vec3::ZERO, Vec3::Y),
));
scene_handle.has_light = true;
}
}
}