bevy/crates/bevy_pbr/src/light.rs

1435 lines
55 KiB
Rust
Raw Normal View History

use std::collections::HashSet;
use bevy_asset::Assets;
use bevy_ecs::prelude::*;
use bevy_math::{Mat4, UVec2, UVec3, Vec2, Vec3, Vec3A, Vec3Swizzles, Vec4, Vec4Swizzles};
use bevy_reflect::Reflect;
use bevy_render::{
camera::{Camera, CameraProjection, OrthographicProjection},
color::Color,
prelude::Image,
primitives::{Aabb, CubemapFrusta, Frustum, Plane, Sphere},
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
render_resource::BufferBindingType,
renderer::RenderDevice,
view::{ComputedVisibility, RenderLayers, Visibility, VisibleEntities},
};
use bevy_transform::components::GlobalTransform;
use bevy_utils::tracing::warn;
use bevy_window::Windows;
use crate::{
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
calculate_cluster_factors, CubeMapFace, CubemapVisibleEntities, ViewClusterBindings,
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
CLUSTERED_FORWARD_STORAGE_BUFFER_COUNT, CUBE_MAP_FACES, MAX_UNIFORM_BUFFER_POINT_LIGHTS,
POINT_LIGHT_NEAR_Z,
};
2019-12-02 04:03:04 +00:00
/// A light that emits light in all directions from a central point.
///
/// Real-world values for `intensity` (luminous power in lumens) based on the electrical power
/// consumption of the type of real-world light are:
///
/// | Luminous Power (lumen) (i.e. the intensity member) | Incandescent non-halogen (Watts) | Incandescent halogen (Watts) | Compact fluorescent (Watts) | LED (Watts |
/// |------|-----|----|--------|-------|
/// | 200 | 25 | | 3-5 | 3 |
/// | 450 | 40 | 29 | 9-11 | 5-8 |
/// | 800 | 60 | | 13-15 | 8-12 |
/// | 1100 | 75 | 53 | 18-20 | 10-16 |
/// | 1600 | 100 | 72 | 24-28 | 14-17 |
/// | 2400 | 150 | | 30-52 | 24-30 |
/// | 3100 | 200 | | 49-75 | 32 |
/// | 4000 | 300 | | 75-100 | 40.5 |
///
/// Source: [Wikipedia](https://en.wikipedia.org/wiki/Lumen_(unit)#Lighting)
#[derive(Component, Debug, Clone, Copy, Reflect)]
#[reflect(Component)]
pub struct PointLight {
2020-03-10 06:43:40 +00:00
pub color: Color,
pub intensity: f32,
pub range: f32,
pub radius: f32,
pub shadows_enabled: bool,
pub shadow_depth_bias: f32,
/// A bias applied along the direction of the fragment's surface normal. It is scaled to the
/// shadow map's texel size so that it can be small close to the camera and gets larger further
/// away.
pub shadow_normal_bias: f32,
2020-02-18 03:06:12 +00:00
}
impl Default for PointLight {
2020-02-18 03:06:12 +00:00
fn default() -> Self {
PointLight {
2020-03-10 06:43:40 +00:00
color: Color::rgb(1.0, 1.0, 1.0),
/// Luminous power in lumens
intensity: 800.0, // Roughly a 60W non-halogen incandescent bulb
range: 20.0,
radius: 0.0,
shadows_enabled: false,
shadow_depth_bias: Self::DEFAULT_SHADOW_DEPTH_BIAS,
shadow_normal_bias: Self::DEFAULT_SHADOW_NORMAL_BIAS,
2020-02-18 03:06:12 +00:00
}
}
2019-12-02 04:03:04 +00:00
}
impl PointLight {
pub const DEFAULT_SHADOW_DEPTH_BIAS: f32 = 0.02;
pub const DEFAULT_SHADOW_NORMAL_BIAS: f32 = 0.6;
2019-12-02 04:03:04 +00:00
}
#[derive(Clone, Debug)]
pub struct PointLightShadowMap {
pub size: usize,
}
impl Default for PointLightShadowMap {
fn default() -> Self {
Self { size: 1024 }
2019-12-02 04:03:04 +00:00
}
2020-01-11 10:11:27 +00:00
}
/// A Directional light.
///
/// Directional lights don't exist in reality but they are a good
/// approximation for light sources VERY far away, like the sun or
/// the moon.
///
/// Valid values for `illuminance` are:
///
/// | Illuminance (lux) | Surfaces illuminated by |
/// |-------------------|------------------------------------------------|
/// | 0.0001 | Moonless, overcast night sky (starlight) |
/// | 0.002 | Moonless clear night sky with airglow |
/// | 0.050.3 | Full moon on a clear night |
/// | 3.4 | Dark limit of civil twilight under a clear sky |
/// | 2050 | Public areas with dark surroundings |
/// | 50 | Family living room lights |
/// | 80 | Office building hallway/toilet lighting |
/// | 100 | Very dark overcast day |
/// | 150 | Train station platforms |
/// | 320500 | Office lighting |
/// | 400 | Sunrise or sunset on a clear day. |
/// | 1000 | Overcast day; typical TV studio lighting |
/// | 10,00025,000 | Full daylight (not direct sun) |
/// | 32,000100,000 | Direct sunlight |
///
/// Source: [Wikipedia](https://en.wikipedia.org/wiki/Lux)
#[derive(Component, Debug, Clone, Reflect)]
#[reflect(Component)]
pub struct DirectionalLight {
pub color: Color,
/// Illuminance in lux
pub illuminance: f32,
pub shadows_enabled: bool,
pub shadow_projection: OrthographicProjection,
pub shadow_depth_bias: f32,
/// A bias applied along the direction of the fragment's surface normal. It is scaled to the
/// shadow map's texel size so that it is automatically adjusted to the orthographic projection.
pub shadow_normal_bias: f32,
}
impl Default for DirectionalLight {
fn default() -> Self {
let size = 100.0;
DirectionalLight {
color: Color::rgb(1.0, 1.0, 1.0),
illuminance: 100000.0,
shadows_enabled: false,
shadow_projection: OrthographicProjection {
left: -size,
right: size,
bottom: -size,
top: size,
near: -size,
far: size,
..Default::default()
},
shadow_depth_bias: Self::DEFAULT_SHADOW_DEPTH_BIAS,
shadow_normal_bias: Self::DEFAULT_SHADOW_NORMAL_BIAS,
}
}
}
impl DirectionalLight {
pub const DEFAULT_SHADOW_DEPTH_BIAS: f32 = 0.02;
pub const DEFAULT_SHADOW_NORMAL_BIAS: f32 = 0.6;
}
#[derive(Clone, Debug)]
pub struct DirectionalLightShadowMap {
pub size: usize,
}
impl Default for DirectionalLightShadowMap {
fn default() -> Self {
#[cfg(feature = "webgl")]
return Self { size: 2048 };
#[cfg(not(feature = "webgl"))]
return Self { size: 4096 };
}
}
/// An ambient light, which lights the entire scene equally.
#[derive(Debug)]
pub struct AmbientLight {
pub color: Color,
/// A direct scale factor multiplied with `color` before being passed to the shader.
pub brightness: f32,
}
impl Default for AmbientLight {
fn default() -> Self {
Self {
color: Color::rgb(1.0, 1.0, 1.0),
brightness: 0.05,
}
}
}
/// Add this component to make a [`Mesh`](bevy_render::mesh::Mesh) not cast shadows.
#[derive(Component)]
pub struct NotShadowCaster;
/// Add this component to make a [`Mesh`](bevy_render::mesh::Mesh) not receive shadows.
#[derive(Component)]
pub struct NotShadowReceiver;
#[derive(Debug, Hash, PartialEq, Eq, Clone, SystemLabel)]
pub enum SimulationLightSystems {
AddClusters,
AssignLightsToClusters,
UpdateDirectionalLightFrusta,
UpdatePointLightFrusta,
CheckLightVisibility,
}
// Clustered-forward rendering notes
// The main initial reference material used was this rather accessible article:
// http://www.aortiz.me/2018/12/21/CG.html
// Some inspiration was taken from “Practical Clustered Shading” which is part 2 of:
// https://efficientshading.com/2015/01/01/real-time-many-light-management-and-shadows-with-clustered-shading/
// (Also note that Part 3 of the above shows how we could support the shadow mapping for many lights.)
// The z-slicing method mentioned in the aortiz article is originally from Tiago Sousas Siggraph 2016 talk about Doom 2016:
// http://advances.realtimerendering.com/s2016/Siggraph2016_idTech6.pdf
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
/// Configure the far z-plane mode used for the furthest depth slice for clustered forward
/// rendering
#[derive(Debug, Copy, Clone)]
pub enum ClusterFarZMode {
/// Use the camera far-plane to determine the z-depth of the furthest cluster layer
CameraFarPlane,
/// Calculate the required maximum z-depth based on currently visible lights.
/// Makes better use of available clusters, speeding up GPU lighting operations
/// at the expense of some CPU time and using more indices in the cluster light
/// index lists.
MaxLightRange,
/// Constant max z-depth
Constant(f32),
}
/// Configure the depth-slicing strategy for clustered forward rendering
#[derive(Debug, Copy, Clone)]
pub struct ClusterZConfig {
/// Far z plane of the first depth slice
pub first_slice_depth: f32,
/// Strategy for how to evaluate the far z plane of the furthest depth slice
pub far_z_mode: ClusterFarZMode,
}
impl Default for ClusterZConfig {
fn default() -> Self {
Self {
first_slice_depth: 5.0,
far_z_mode: ClusterFarZMode::MaxLightRange,
}
}
}
/// Configuration of the clustering strategy for clustered forward rendering
#[derive(Debug, Copy, Clone, Component)]
pub enum ClusterConfig {
/// Disable light cluster calculations for this view
None,
/// One single cluster. Optimal for low-light complexity scenes or scenes where
/// most lights affect the entire scene.
Single,
/// Explicit x, y and z counts (may yield non-square x/y clusters depending on the aspect ratio)
XYZ {
dimensions: UVec3,
z_config: ClusterZConfig,
/// Specify if clusters should automatically resize in x/y if there is a risk of exceeding
/// the available cluster-light index limit
dynamic_resizing: bool,
},
/// Fixed number of z slices, x and y calculated to give square clusters
/// with at most total clusters. For top-down games where lights will generally always be within a
/// short depth range, it may be useful to use this configuration with 1 or few z slices. This
/// would reduce the number of lights per cluster by distributing more clusters in screen space
/// x/y which matches how lights are distributed in the scene.
FixedZ {
total: u32,
z_slices: u32,
z_config: ClusterZConfig,
/// Specify if clusters should automatically resize in x/y if there is a risk of exceeding
/// the available cluster-light index limit
dynamic_resizing: bool,
},
}
impl Default for ClusterConfig {
fn default() -> Self {
// 24 depth slices, square clusters with at most 4096 total clusters
// use max light distance as clusters max Z-depth, first slice extends to 5.0
Self::FixedZ {
total: 4096,
z_slices: 24,
z_config: ClusterZConfig::default(),
dynamic_resizing: true,
}
}
}
impl ClusterConfig {
fn dimensions_for_screen_size(&self, screen_size: UVec2) -> UVec3 {
match &self {
ClusterConfig::None => UVec3::ZERO,
ClusterConfig::Single => UVec3::ONE,
ClusterConfig::XYZ { dimensions, .. } => *dimensions,
ClusterConfig::FixedZ {
total, z_slices, ..
} => {
let aspect_ratio = screen_size.x as f32 / screen_size.y as f32;
let mut z_slices = *z_slices;
if *total < z_slices {
warn!("ClusterConfig has more z-slices than total clusters!");
z_slices = *total;
}
let per_layer = *total as f32 / z_slices as f32;
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
let y = f32::sqrt(per_layer / aspect_ratio);
let mut x = (y * aspect_ratio) as u32;
let mut y = y as u32;
// check extremes
if x == 0 {
x = 1;
y = per_layer as u32;
}
if y == 0 {
x = per_layer as u32;
y = 1;
}
UVec3::new(x, y, z_slices)
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
}
}
}
fn first_slice_depth(&self) -> f32 {
match self {
ClusterConfig::None => 0.0,
ClusterConfig::Single => 0.0,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
ClusterConfig::XYZ { z_config, .. } | ClusterConfig::FixedZ { z_config, .. } => {
z_config.first_slice_depth
}
}
}
fn far_z_mode(&self) -> ClusterFarZMode {
match self {
ClusterConfig::None => ClusterFarZMode::Constant(0.0),
ClusterConfig::Single => ClusterFarZMode::MaxLightRange,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
ClusterConfig::XYZ { z_config, .. } | ClusterConfig::FixedZ { z_config, .. } => {
z_config.far_z_mode
}
}
}
fn dynamic_resizing(&self) -> bool {
match self {
ClusterConfig::None | ClusterConfig::Single => false,
ClusterConfig::XYZ {
dynamic_resizing, ..
}
| ClusterConfig::FixedZ {
dynamic_resizing, ..
} => *dynamic_resizing,
}
}
}
#[derive(Component, Debug, Default)]
pub struct Clusters {
/// Tile size
pub(crate) tile_size: UVec2,
/// Number of clusters in x / y / z in the view frustum
pub(crate) dimensions: UVec3,
/// Distance to the far plane of the first depth slice. The first depth slice is special
/// and explicitly-configured to avoid having unnecessarily many slices close to the camera.
pub(crate) near: f32,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
pub(crate) far: f32,
pub(crate) lights: Vec<VisiblePointLights>,
}
impl Clusters {
fn update(&mut self, screen_size: UVec2, requested_dimensions: UVec3) {
debug_assert!(
requested_dimensions.x > 0 && requested_dimensions.y > 0 && requested_dimensions.z > 0
);
let tile_size = (screen_size.as_vec2() / requested_dimensions.xy().as_vec2())
.ceil()
.as_uvec2()
.max(UVec2::ONE);
self.tile_size = tile_size;
self.dimensions = (screen_size.as_vec2() / tile_size.as_vec2())
.ceil()
.as_uvec2()
.extend(requested_dimensions.z)
.max(UVec3::ONE);
// NOTE: Maximum 4096 clusters due to uniform buffer size constraints
debug_assert!(self.dimensions.x * self.dimensions.y * self.dimensions.z <= 4096);
}
}
fn clip_to_view(inverse_projection: Mat4, clip: Vec4) -> Vec4 {
let view = inverse_projection * clip;
view / view.w
}
pub fn add_clusters(
mut commands: Commands,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
cameras: Query<(Entity, Option<&ClusterConfig>), (With<Camera>, Without<Clusters>)>,
) {
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
for (entity, config) in cameras.iter() {
let config = config.copied().unwrap_or_default();
// actual settings here don't matter - they will be overwritten in assign_lights_to_clusters
commands
.entity(entity)
.insert_bundle((Clusters::default(), config));
}
}
#[derive(Clone, Component, Debug, Default)]
pub struct VisiblePointLights {
pub(crate) entities: Vec<Entity>,
}
impl VisiblePointLights {
#[inline]
pub fn iter(&self) -> impl DoubleEndedIterator<Item = &Entity> {
self.entities.iter()
}
#[inline]
pub fn len(&self) -> usize {
self.entities.len()
}
#[inline]
pub fn is_empty(&self) -> bool {
self.entities.is_empty()
}
}
// NOTE: Keep in sync with bevy_pbr/src/render/pbr.wgsl
fn view_z_to_z_slice(
cluster_factors: Vec2,
z_slices: u32,
view_z: f32,
is_orthographic: bool,
) -> u32 {
let z_slice = if is_orthographic {
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
// NOTE: view_z is correct in the orthographic case
((view_z - cluster_factors.x) * cluster_factors.y).floor() as u32
} else {
// NOTE: had to use -view_z to make it positive else log(negative) is nan
((-view_z).ln() * cluster_factors.x - cluster_factors.y + 1.0) as u32
};
// NOTE: We use min as we may limit the far z plane used for clustering to be closeer than
// the furthest thing being drawn. This means that we need to limit to the maximum cluster.
z_slice.min(z_slices - 1)
}
// NOTE: Keep in sync as the inverse of view_z_to_z_slice above
fn z_slice_to_view_z(
near: f32,
far: f32,
z_slices: u32,
z_slice: u32,
is_orthographic: bool,
) -> f32 {
if is_orthographic {
return -near - (far - near) * z_slice as f32 / z_slices as f32;
}
// Perspective
if z_slice == 0 {
0.0
} else {
-near * (far / near).powf((z_slice - 1) as f32 / (z_slices - 1) as f32)
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
}
}
fn ndc_position_to_cluster(
cluster_dimensions: UVec3,
cluster_factors: Vec2,
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
is_orthographic: bool,
ndc_p: Vec3,
view_z: f32,
) -> UVec3 {
let cluster_dimensions_f32 = cluster_dimensions.as_vec3();
let frag_coord =
(ndc_p.xy() * Vec2::new(0.5, -0.5) + Vec2::splat(0.5)).clamp(Vec2::ZERO, Vec2::ONE);
let xy = (frag_coord * cluster_dimensions_f32.xy()).floor();
let z_slice = view_z_to_z_slice(
cluster_factors,
cluster_dimensions.z,
view_z,
is_orthographic,
);
xy.as_uvec2()
.extend(z_slice)
.clamp(UVec3::ZERO, cluster_dimensions - UVec3::ONE)
}
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
// Calculate bounds for the light using a view space aabb.
// Returns a (Vec3, Vec3) containing min and max with
// x and y in normalized device coordinates with range [-1, 1]
// z in view space, with range [-inf, -f32::MIN_POSITIVE]
fn cluster_space_light_aabb(
inverse_view_transform: Mat4,
projection_matrix: Mat4,
light_sphere: &Sphere,
) -> (Vec3, Vec3) {
let light_aabb_view = Aabb {
center: Vec3A::from(inverse_view_transform * light_sphere.center.extend(1.0)),
half_extents: Vec3A::splat(light_sphere.radius),
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
};
let (mut light_aabb_view_min, mut light_aabb_view_max) =
(light_aabb_view.min(), light_aabb_view.max());
// Constrain view z to be negative - i.e. in front of the camera
// When view z is >= 0.0 and we're using a perspective projection, bad things happen.
// At view z == 0.0, ndc x,y are mathematically undefined. At view z > 0.0, i.e. behind the camera,
// the perspective projection flips the directions of the axes. This breaks assumptions about
// use of min/max operations as something that was to the left in view space is now returning a
// coordinate that for view z in front of the camera would be on the right, but at view z behind the
// camera is on the left. So, we just constrain view z to be < 0.0 and necessarily in front of the camera.
light_aabb_view_min.z = light_aabb_view_min.z.min(-f32::MIN_POSITIVE);
light_aabb_view_max.z = light_aabb_view_max.z.min(-f32::MIN_POSITIVE);
// Is there a cheaper way to do this? The problem is that because of perspective
// the point at max z but min xy may be less xy in screenspace, and similar. As
// such, projecting the min and max xy at both the closer and further z and taking
// the min and max of those projected points addresses this.
let (
light_aabb_view_xymin_near,
light_aabb_view_xymin_far,
light_aabb_view_xymax_near,
light_aabb_view_xymax_far,
) = (
light_aabb_view_min,
light_aabb_view_min.xy().extend(light_aabb_view_max.z),
light_aabb_view_max.xy().extend(light_aabb_view_min.z),
light_aabb_view_max,
);
let (
light_aabb_clip_xymin_near,
light_aabb_clip_xymin_far,
light_aabb_clip_xymax_near,
light_aabb_clip_xymax_far,
) = (
projection_matrix * light_aabb_view_xymin_near.extend(1.0),
projection_matrix * light_aabb_view_xymin_far.extend(1.0),
projection_matrix * light_aabb_view_xymax_near.extend(1.0),
projection_matrix * light_aabb_view_xymax_far.extend(1.0),
);
let (
light_aabb_ndc_xymin_near,
light_aabb_ndc_xymin_far,
light_aabb_ndc_xymax_near,
light_aabb_ndc_xymax_far,
) = (
light_aabb_clip_xymin_near.xyz() / light_aabb_clip_xymin_near.w,
light_aabb_clip_xymin_far.xyz() / light_aabb_clip_xymin_far.w,
light_aabb_clip_xymax_near.xyz() / light_aabb_clip_xymax_near.w,
light_aabb_clip_xymax_far.xyz() / light_aabb_clip_xymax_far.w,
);
let (light_aabb_ndc_min, light_aabb_ndc_max) = (
light_aabb_ndc_xymin_near
.min(light_aabb_ndc_xymin_far)
.min(light_aabb_ndc_xymax_near)
.min(light_aabb_ndc_xymax_far),
light_aabb_ndc_xymin_near
.max(light_aabb_ndc_xymin_far)
.max(light_aabb_ndc_xymax_near)
.max(light_aabb_ndc_xymax_far),
);
// pack unadjusted z depth into the vecs
let (aabb_min, aabb_max) = (
light_aabb_ndc_min.xy().extend(light_aabb_view_min.z),
light_aabb_ndc_max.xy().extend(light_aabb_view_max.z),
);
// clamp to ndc coords
(
aabb_min.clamp(
Vec3::new(-1.0, -1.0, f32::MIN),
Vec3::new(1.0, 1.0, f32::MAX),
),
aabb_max.clamp(
Vec3::new(-1.0, -1.0, f32::MIN),
Vec3::new(1.0, 1.0, f32::MAX),
),
)
}
// Sort point lights with shadows enabled first, then by a stable key so that the index
// can be used to limit the number of point light shadows to render based on the device and
// we keep a stable set of lights visible
pub(crate) fn point_light_order(
(entity_1, shadows_enabled_1): (&Entity, &bool),
(entity_2, shadows_enabled_2): (&Entity, &bool),
) -> std::cmp::Ordering {
shadows_enabled_1
.cmp(shadows_enabled_2)
.reverse()
.then_with(|| entity_1.cmp(entity_2))
}
#[derive(Clone, Copy)]
// data required for assigning lights to clusters
pub(crate) struct PointLightAssignmentData {
entity: Entity,
translation: Vec3,
range: f32,
shadows_enabled: bool,
}
#[derive(Default)]
pub struct GlobalVisiblePointLights {
entities: HashSet<Entity>,
}
impl GlobalVisiblePointLights {
#[inline]
pub fn iter(&self) -> impl Iterator<Item = &Entity> {
self.entities.iter()
}
#[inline]
pub fn contains(&self, entity: Entity) -> bool {
self.entities.contains(&entity)
}
}
// NOTE: Run this before update_point_light_frusta!
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
#[allow(clippy::too_many_arguments)]
pub(crate) fn assign_lights_to_clusters(
mut commands: Commands,
mut global_lights: ResMut<GlobalVisiblePointLights>,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
windows: Res<Windows>,
images: Res<Assets<Image>>,
mut views: Query<(
Entity,
&GlobalTransform,
&Camera,
&Frustum,
&ClusterConfig,
&mut Clusters,
Option<&mut VisiblePointLights>,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
)>,
lights_query: Query<(Entity, &GlobalTransform, &PointLight, &Visibility)>,
mut lights: Local<Vec<PointLightAssignmentData>>,
mut max_point_lights_warning_emitted: Local<bool>,
render_device: Option<Res<RenderDevice>>,
) {
let render_device = match render_device {
Some(render_device) => render_device,
None => return,
};
global_lights.entities.clear();
lights.clear();
// collect just the relevant light query data into a persisted vec to avoid reallocating each frame
lights.extend(
lights_query
.iter()
.filter(|(.., visibility)| visibility.is_visible)
.map(
|(entity, transform, light, _visibility)| PointLightAssignmentData {
entity,
translation: transform.translation,
shadows_enabled: light.shadows_enabled,
range: light.range,
},
),
);
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
let clustered_forward_buffer_binding_type =
render_device.get_supported_read_only_binding_type(CLUSTERED_FORWARD_STORAGE_BUFFER_COUNT);
let supports_storage_buffers = matches!(
clustered_forward_buffer_binding_type,
BufferBindingType::Storage { .. }
);
if lights.len() > MAX_UNIFORM_BUFFER_POINT_LIGHTS && !supports_storage_buffers {
lights.sort_by(|light_1, light_2| {
point_light_order(
(&light_1.entity, &light_1.shadows_enabled),
(&light_2.entity, &light_2.shadows_enabled),
)
});
// check each light against each view's frustum, keep only those that affect at least one of our views
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
let frusta: Vec<_> = views
.iter()
.map(|(_, _, _, frustum, _, _, _)| *frustum)
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
.collect();
let mut lights_in_view_count = 0;
lights.retain(|light| {
// take one extra light to check if we should emit the warning
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
if lights_in_view_count == MAX_UNIFORM_BUFFER_POINT_LIGHTS + 1 {
false
} else {
let light_sphere = Sphere {
center: Vec3A::from(light.translation),
radius: light.range,
};
let light_in_view = frusta
.iter()
.any(|frustum| frustum.intersects_sphere(&light_sphere, true));
if light_in_view {
lights_in_view_count += 1;
}
light_in_view
}
});
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
if lights.len() > MAX_UNIFORM_BUFFER_POINT_LIGHTS && !*max_point_lights_warning_emitted {
warn!(
"MAX_UNIFORM_BUFFER_POINT_LIGHTS ({}) exceeded",
MAX_UNIFORM_BUFFER_POINT_LIGHTS
);
*max_point_lights_warning_emitted = true;
}
Use storage buffers for clustered forward point lights (#3989) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2022-04-07 16:16:35 +00:00
lights.truncate(MAX_UNIFORM_BUFFER_POINT_LIGHTS);
}
for (view_entity, camera_transform, camera, frustum, config, clusters, mut visible_lights) in
views.iter_mut()
{
if matches!(config, ClusterConfig::None) && visible_lights.is_some() {
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
commands.entity(view_entity).remove::<VisiblePointLights>();
continue;
}
let clusters = clusters.into_inner();
let screen_size = camera.target.get_physical_size(&windows, &images);
clusters.lights.clear();
let screen_size = screen_size.unwrap_or_default();
let mut requested_cluster_dimensions = config.dimensions_for_screen_size(screen_size);
let view_transform = camera_transform.compute_matrix();
let inverse_view_transform = view_transform.inverse();
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
let is_orthographic = camera.projection_matrix.w_axis.w == 1.0;
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
let far_z = match config.far_z_mode() {
ClusterFarZMode::CameraFarPlane => camera.far,
ClusterFarZMode::MaxLightRange => {
let inverse_view_row_2 = inverse_view_transform.row(2);
lights
.iter()
.map(|light| {
-inverse_view_row_2.dot(light.translation.extend(1.0)) + light.range
})
.reduce(f32::max)
.unwrap_or(0.0)
}
ClusterFarZMode::Constant(far) => far,
};
let first_slice_depth = match (is_orthographic, requested_cluster_dimensions.z) {
(true, _) => camera.near,
(false, 1) => config.first_slice_depth().max(far_z),
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
_ => config.first_slice_depth(),
};
// NOTE: Ensure the far_z is at least as far as the first_depth_slice to avoid clustering problems.
let far_z = far_z.max(first_slice_depth);
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
let cluster_factors = calculate_cluster_factors(
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
first_slice_depth,
far_z,
requested_cluster_dimensions.z as f32,
bevy_pbr2: Fix clustering for orthographic projections (#3316) # Objective PBR lighting was broken in the new renderer when using orthographic projections due to the way the depth slicing works for the clusters. Fix it. ## Solution - The default orthographic projection near plane is 0.0. The perspective projection depth slicing does a division by the near plane which gives a floating point NaN and the clustering all breaks down. - Orthographic projections have a linear depth mapping, so it made intuitive sense to me to do depth slicing with a linear mapping too. The alternative I saw was to try to handle the near plane being at 0.0 and using the exponential depth slicing, but that felt like a hack that didn't make sense. - As such, I have added code that detects whether the projection is orthographic based on `projection[3][3] == 1.0` and then implemented the orthographic mapping case throughout (when computing cluster AABBs, and when mapping a view space position (or light) to a cluster id in both the rust and shader code). ## Screenshots Before: ![before](https://user-images.githubusercontent.com/302146/145847278-5b1bca74-fbad-4cc5-8b49-384f6a377fdc.png) After: <img width="1392" alt="Screenshot 2021-12-13 at 16 36 53" src="https://user-images.githubusercontent.com/302146/145847314-6f3a2035-5d87-4896-8032-0c3e35e15b7d.png"> Old renderer (slightly lighter due to slight difference in configured intensity): <img width="1392" alt="Screenshot 2021-12-13 at 16 42 23" src="https://user-images.githubusercontent.com/302146/145847391-6a5e6fe0-22da-4fc1-a6c7-440543689a63.png">
2021-12-14 23:42:35 +00:00
is_orthographic,
);
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
if config.dynamic_resizing() {
let mut cluster_index_estimate = 0.0;
for light in lights.iter() {
let light_sphere = Sphere {
center: Vec3A::from(light.translation),
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
radius: light.range,
};
// Check if the light is within the view frustum
if !frustum.intersects_sphere(&light_sphere, true) {
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
continue;
}
// calculate a conservative aabb estimate of number of clusters affected by this light
// this overestimates index counts by at most 50% (and typically much less) when the whole light range is in view
// it can overestimate more significantly when light ranges are only partially in view
let (light_aabb_min, light_aabb_max) = cluster_space_light_aabb(
inverse_view_transform,
camera.projection_matrix,
&light_sphere,
);
// since we won't adjust z slices we can calculate exact number of slices required in z dimension
let z_cluster_min = view_z_to_z_slice(
cluster_factors,
requested_cluster_dimensions.z,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
light_aabb_min.z,
is_orthographic,
);
let z_cluster_max = view_z_to_z_slice(
cluster_factors,
requested_cluster_dimensions.z,
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
light_aabb_max.z,
is_orthographic,
);
let z_count =
z_cluster_min.max(z_cluster_max) - z_cluster_min.min(z_cluster_max) + 1;
// calculate x/y count using floats to avoid overestimating counts due to large initial tile sizes
let xy_min = light_aabb_min.xy();
let xy_max = light_aabb_max.xy();
// multiply by 0.5 to move from [-1,1] to [-0.5, 0.5], max extent of 1 in each dimension
let xy_count = (xy_max - xy_min)
* 0.5
* Vec2::new(
requested_cluster_dimensions.x as f32,
requested_cluster_dimensions.y as f32,
);
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
// add up to 2 to each axis to account for overlap
let x_overlap = if xy_min.x <= -1.0 { 0.0 } else { 1.0 }
+ if xy_max.x >= 1.0 { 0.0 } else { 1.0 };
let y_overlap = if xy_min.y <= -1.0 { 0.0 } else { 1.0 }
+ if xy_max.y >= 1.0 { 0.0 } else { 1.0 };
cluster_index_estimate +=
(xy_count.x + x_overlap) * (xy_count.y + y_overlap) * z_count as f32;
}
if cluster_index_estimate > ViewClusterBindings::MAX_INDICES as f32 {
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
// scale x and y cluster count to be able to fit all our indices
// we take the ratio of the actual indices over the index estimate.
// this not not guaranteed to be small enough due to overlapped tiles, but
// the conservative estimate is more than sufficient to cover the
// difference
let index_ratio =
ViewClusterBindings::MAX_INDICES as f32 / cluster_index_estimate as f32;
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
let xy_ratio = index_ratio.sqrt();
requested_cluster_dimensions.x =
((requested_cluster_dimensions.x as f32 * xy_ratio).floor() as u32).max(1);
requested_cluster_dimensions.y =
((requested_cluster_dimensions.y as f32 * xy_ratio).floor() as u32).max(1);
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
}
}
clusters.update(screen_size, requested_cluster_dimensions);
clusters.near = first_slice_depth;
clusters.far = far_z;
// NOTE: Maximum 4096 clusters due to uniform buffer size constraints
debug_assert!(
clusters.dimensions.x * clusters.dimensions.y * clusters.dimensions.z <= 4096
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
);
let inverse_projection = camera.projection_matrix.inverse();
for lights in clusters.lights.iter_mut() {
lights.entities.clear();
}
clusters.lights.resize_with(
(clusters.dimensions.x * clusters.dimensions.y * clusters.dimensions.z) as usize,
VisiblePointLights::default,
);
if screen_size.x == 0 || screen_size.y == 0 {
continue;
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
}
// Calculate the x/y/z cluster frustum planes in view space
let mut x_planes = Vec::with_capacity(clusters.dimensions.x as usize + 1);
let mut y_planes = Vec::with_capacity(clusters.dimensions.y as usize + 1);
let mut z_planes = Vec::with_capacity(clusters.dimensions.z as usize + 1);
if is_orthographic {
let x_slices = clusters.dimensions.x as f32;
for x in 0..=clusters.dimensions.x {
let x_proportion = x as f32 / x_slices;
let x_pos = x_proportion * 2.0 - 1.0;
let view_x = clip_to_view(inverse_projection, Vec4::new(x_pos, 0.0, 1.0, 1.0)).x;
let normal = Vec3::X;
let d = view_x * normal.x;
x_planes.push(Plane::new(normal.extend(d)));
}
let y_slices = clusters.dimensions.y as f32;
for y in 0..=clusters.dimensions.y {
let y_proportion = 1.0 - y as f32 / y_slices;
let y_pos = y_proportion * 2.0 - 1.0;
let view_y = clip_to_view(inverse_projection, Vec4::new(0.0, y_pos, 1.0, 1.0)).y;
let normal = Vec3::Y;
let d = view_y * normal.y;
y_planes.push(Plane::new(normal.extend(d)));
}
} else {
let x_slices = clusters.dimensions.x as f32;
for x in 0..=clusters.dimensions.x {
let x_proportion = x as f32 / x_slices;
let x_pos = x_proportion * 2.0 - 1.0;
let nb = clip_to_view(inverse_projection, Vec4::new(x_pos, -1.0, 1.0, 1.0)).xyz();
let nt = clip_to_view(inverse_projection, Vec4::new(x_pos, 1.0, 1.0, 1.0)).xyz();
let normal = nb.cross(nt);
let d = nb.dot(normal);
x_planes.push(Plane::new(normal.extend(d)));
}
let y_slices = clusters.dimensions.y as f32;
for y in 0..=clusters.dimensions.y {
let y_proportion = 1.0 - y as f32 / y_slices;
let y_pos = y_proportion * 2.0 - 1.0;
let nl = clip_to_view(inverse_projection, Vec4::new(-1.0, y_pos, 1.0, 1.0)).xyz();
let nr = clip_to_view(inverse_projection, Vec4::new(1.0, y_pos, 1.0, 1.0)).xyz();
let normal = nr.cross(nl);
let d = nr.dot(normal);
y_planes.push(Plane::new(normal.extend(d)));
}
}
let z_slices = clusters.dimensions.z;
for z in 0..=z_slices {
let view_z = z_slice_to_view_z(first_slice_depth, far_z, z_slices, z, is_orthographic);
let normal = -Vec3::Z;
let d = view_z * normal.z;
z_planes.push(Plane::new(normal.extend(d)));
}
let mut visible_lights_scratch = Vec::new();
{
// reuse existing visible lights Vec, if it exists
let visible_lights = if let Some(visible_lights) = visible_lights.as_mut() {
visible_lights.entities.clear();
&mut visible_lights.entities
} else {
&mut visible_lights_scratch
};
for light in lights.iter() {
let light_sphere = Sphere {
center: Vec3A::from(light.translation),
radius: light.range,
};
// Check if the light is within the view frustum
if !frustum.intersects_sphere(&light_sphere, true) {
continue;
}
// NOTE: The light intersects the frustum so it must be visible and part of the global set
global_lights.entities.insert(light.entity);
visible_lights.push(light.entity);
// note: caching seems to be slower than calling twice for this aabb calculation
let (light_aabb_xy_ndc_z_view_min, light_aabb_xy_ndc_z_view_max) =
cluster_space_light_aabb(
inverse_view_transform,
camera.projection_matrix,
&light_sphere,
);
Dynamic light clusters (#3968) # Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
2022-03-08 04:56:42 +00:00
let min_cluster = ndc_position_to_cluster(
clusters.dimensions,
cluster_factors,
is_orthographic,
light_aabb_xy_ndc_z_view_min,
light_aabb_xy_ndc_z_view_min.z,
);
let max_cluster = ndc_position_to_cluster(
clusters.dimensions,
cluster_factors,
is_orthographic,
light_aabb_xy_ndc_z_view_max,
light_aabb_xy_ndc_z_view_max.z,
);
let (min_cluster, max_cluster) =
(min_cluster.min(max_cluster), min_cluster.max(max_cluster));
// What follows is the Iterative Sphere Refinement algorithm from Just Cause 3
// Persson et al, Practical Clustered Shading
// http://newq.net/dl/pub/s2015_practical.pdf
// NOTE: A sphere under perspective projection is no longer a sphere. It gets
// stretched and warped, which prevents simpler algorithms from being correct
// as they often assume that the widest part of the sphere under projection is the
// center point on the axis of interest plus the radius, and that is not true!
let view_light_sphere = Sphere {
center: Vec3A::from(inverse_view_transform * light_sphere.center.extend(1.0)),
radius: light_sphere.radius,
};
let light_center_clip =
camera.projection_matrix * view_light_sphere.center.extend(1.0);
let light_center_ndc = light_center_clip.xyz() / light_center_clip.w;
let cluster_coordinates = ndc_position_to_cluster(
clusters.dimensions,
cluster_factors,
is_orthographic,
light_center_ndc,
view_light_sphere.center.z,
);
let z_center = if light_center_ndc.z <= 1.0 {
Some(cluster_coordinates.z)
} else {
None
};
let y_center = if light_center_ndc.y > 1.0 {
None
} else if light_center_ndc.y < -1.0 {
Some(clusters.dimensions.y + 1)
} else {
Some(cluster_coordinates.y)
};
for z in min_cluster.z..=max_cluster.z {
let mut z_light = view_light_sphere.clone();
if z_center.is_none() || z != z_center.unwrap() {
// The z plane closer to the light has the larger radius circle where the
// light sphere intersects the z plane.
let z_plane = if z_center.is_some() && z < z_center.unwrap() {
z_planes[(z + 1) as usize]
} else {
z_planes[z as usize]
};
// Project the sphere to this z plane and use its radius as the radius of a
// new, refined sphere.
if let Some(projected) = project_to_plane_z(z_light, z_plane) {
z_light = projected;
} else {
continue;
}
}
for y in min_cluster.y..=max_cluster.y {
let mut y_light = z_light.clone();
if y_center.is_none() || y != y_center.unwrap() {
// The y plane closer to the light has the larger radius circle where the
// light sphere intersects the y plane.
let y_plane = if y_center.is_some() && y < y_center.unwrap() {
y_planes[(y + 1) as usize]
} else {
y_planes[y as usize]
};
// Project the refined sphere to this y plane and use its radius as the
// radius of a new, even more refined sphere.
if let Some(projected) =
project_to_plane_y(y_light, y_plane, is_orthographic)
{
y_light = projected;
} else {
continue;
}
}
// Loop from the left to find the first affected cluster
let mut min_x = min_cluster.x;
loop {
if min_x >= max_cluster.x
|| -get_distance_x(
x_planes[(min_x + 1) as usize],
y_light.center,
is_orthographic,
) + y_light.radius
> 0.0
{
break;
}
min_x += 1;
}
// Loop from the right to find the last affected cluster
let mut max_x = max_cluster.x;
loop {
if max_x <= min_x
|| get_distance_x(
x_planes[max_x as usize],
y_light.center,
is_orthographic,
) + y_light.radius
> 0.0
{
break;
}
max_x -= 1;
}
let mut cluster_index = ((y * clusters.dimensions.x + min_x)
* clusters.dimensions.z
+ z) as usize;
// Mark the clusters in the range as affected
for _ in min_x..=max_x {
clusters.lights[cluster_index].entities.push(light.entity);
cluster_index += clusters.dimensions.z as usize;
}
}
}
}
}
if visible_lights.is_none() {
commands.entity(view_entity).insert(VisiblePointLights {
entities: visible_lights_scratch,
});
}
}
}
// NOTE: This exploits the fact that a x-plane normal has only x and z components
fn get_distance_x(plane: Plane, point: Vec3A, is_orthographic: bool) -> f32 {
if is_orthographic {
point.x - plane.d()
} else {
// Distance from a point to a plane:
// signed distance to plane = (nx * px + ny * py + nz * pz + d) / n.length()
// NOTE: For a x-plane, ny and d are 0 and we have a unit normal
// = nx * px + nz * pz
plane.normal_d().xz().dot(point.xz())
}
}
// NOTE: This exploits the fact that a z-plane normal has only a z component
fn project_to_plane_z(z_light: Sphere, z_plane: Plane) -> Option<Sphere> {
// p = sphere center
// n = plane normal
// d = n.p if p is in the plane
// NOTE: For a z-plane, nx and ny are both 0
// d = px * nx + py * ny + pz * nz
// = pz * nz
// => pz = d / nz
let z = z_plane.d() / z_plane.normal_d().z;
let distance_to_plane = z - z_light.center.z;
if distance_to_plane.abs() > z_light.radius {
return None;
}
Some(Sphere {
center: Vec3A::from(z_light.center.xy().extend(z)),
// hypotenuse length = radius
// pythagorus = (distance to plane)^2 + b^2 = radius^2
radius: (z_light.radius * z_light.radius - distance_to_plane * distance_to_plane).sqrt(),
})
}
// NOTE: This exploits the fact that a y-plane normal has only y and z components
fn project_to_plane_y(y_light: Sphere, y_plane: Plane, is_orthographic: bool) -> Option<Sphere> {
let distance_to_plane = if is_orthographic {
y_plane.d() - y_light.center.y
} else {
-y_light.center.yz().dot(y_plane.normal_d().yz())
};
if distance_to_plane.abs() > y_light.radius {
return None;
}
Some(Sphere {
center: y_light.center + distance_to_plane * y_plane.normal(),
radius: (y_light.radius * y_light.radius - distance_to_plane * distance_to_plane).sqrt(),
})
}
pub fn update_directional_light_frusta(
mut views: Query<
(
&GlobalTransform,
&DirectionalLight,
&mut Frustum,
&Visibility,
),
Or<(Changed<GlobalTransform>, Changed<DirectionalLight>)>,
>,
) {
for (transform, directional_light, mut frustum, visibility) in views.iter_mut() {
// The frustum is used for culling meshes to the light for shadow mapping
// so if shadow mapping is disabled for this light, then the frustum is
// not needed.
if !directional_light.shadows_enabled || !visibility.is_visible {
continue;
}
let view_projection = directional_light.shadow_projection.get_projection_matrix()
* transform.compute_matrix().inverse();
*frustum = Frustum::from_view_projection(
&view_projection,
&transform.translation,
&transform.back(),
directional_light.shadow_projection.far(),
);
}
}
// NOTE: Run this after assign_lights_to_clusters!
pub fn update_point_light_frusta(
global_lights: Res<GlobalVisiblePointLights>,
mut views: Query<
(Entity, &GlobalTransform, &PointLight, &mut CubemapFrusta),
Or<(Changed<GlobalTransform>, Changed<PointLight>)>,
>,
) {
let projection =
Mat4::perspective_infinite_reverse_rh(std::f32::consts::FRAC_PI_2, 1.0, POINT_LIGHT_NEAR_Z);
let view_rotations = CUBE_MAP_FACES
.iter()
.map(|CubeMapFace { target, up }| GlobalTransform::identity().looking_at(*target, *up))
.collect::<Vec<_>>();
for (entity, transform, point_light, mut cubemap_frusta) in views.iter_mut() {
// The frusta are used for culling meshes to the light for shadow mapping
// so if shadow mapping is disabled for this light, then the frusta are
// not needed.
// Also, if the light is not relevant for any cluster, it will not be in the
// global lights set and so there is no need to update its frusta.
if !point_light.shadows_enabled || !global_lights.entities.contains(&entity) {
continue;
}
// ignore scale because we don't want to effectively scale light radius and range
// by applying those as a view transform to shadow map rendering of objects
// and ignore rotation because we want the shadow map projections to align with the axes
let view_translation = GlobalTransform::from_translation(transform.translation);
let view_backward = transform.back();
for (view_rotation, frustum) in view_rotations.iter().zip(cubemap_frusta.iter_mut()) {
let view = view_translation * *view_rotation;
let view_projection = projection * view.compute_matrix().inverse();
*frustum = Frustum::from_view_projection(
&view_projection,
&transform.translation,
&view_backward,
point_light.range,
);
}
}
}
pub fn check_light_mesh_visibility(
visible_point_lights: Query<&VisiblePointLights>,
mut point_lights: Query<(
&PointLight,
&GlobalTransform,
&CubemapFrusta,
&mut CubemapVisibleEntities,
Option<&RenderLayers>,
)>,
mut directional_lights: Query<(
&DirectionalLight,
&Frustum,
&mut VisibleEntities,
Option<&RenderLayers>,
&Visibility,
)>,
mut visible_entity_query: Query<
(
Entity,
&Visibility,
&mut ComputedVisibility,
Option<&RenderLayers>,
Option<&Aabb>,
Option<&GlobalTransform>,
),
Without<NotShadowCaster>,
>,
) {
// Directonal lights
for (directional_light, frustum, mut visible_entities, maybe_view_mask, visibility) in
directional_lights.iter_mut()
{
visible_entities.entities.clear();
// NOTE: If shadow mapping is disabled for the light then it must have no visible entities
if !directional_light.shadows_enabled || !visibility.is_visible {
continue;
}
let view_mask = maybe_view_mask.copied().unwrap_or_default();
for (
entity,
visibility,
mut computed_visibility,
maybe_entity_mask,
maybe_aabb,
maybe_transform,
) in visible_entity_query.iter_mut()
{
if !visibility.is_visible {
continue;
}
let entity_mask = maybe_entity_mask.copied().unwrap_or_default();
if !view_mask.intersects(&entity_mask) {
continue;
}
// If we have an aabb and transform, do frustum culling
if let (Some(aabb), Some(transform)) = (maybe_aabb, maybe_transform) {
if !frustum.intersects_obb(aabb, &transform.compute_matrix(), true) {
continue;
}
}
computed_visibility.is_visible = true;
visible_entities.entities.push(entity);
}
// TODO: check for big changes in visible entities len() vs capacity() (ex: 2x) and resize
// to prevent holding unneeded memory
}
// Point lights
for visible_lights in visible_point_lights.iter() {
for light_entity in visible_lights.entities.iter().copied() {
if let Ok((
point_light,
transform,
cubemap_frusta,
mut cubemap_visible_entities,
maybe_view_mask,
)) = point_lights.get_mut(light_entity)
{
for visible_entities in cubemap_visible_entities.iter_mut() {
visible_entities.entities.clear();
}
// NOTE: If shadow mapping is disabled for the light then it must have no visible entities
if !point_light.shadows_enabled {
continue;
}
let view_mask = maybe_view_mask.copied().unwrap_or_default();
let light_sphere = Sphere {
center: Vec3A::from(transform.translation),
radius: point_light.range,
};
for (
entity,
visibility,
mut computed_visibility,
maybe_entity_mask,
maybe_aabb,
maybe_transform,
) in visible_entity_query.iter_mut()
{
if !visibility.is_visible {
continue;
}
let entity_mask = maybe_entity_mask.copied().unwrap_or_default();
if !view_mask.intersects(&entity_mask) {
continue;
}
// If we have an aabb and transform, do frustum culling
if let (Some(aabb), Some(transform)) = (maybe_aabb, maybe_transform) {
let model_to_world = transform.compute_matrix();
// Do a cheap sphere vs obb test to prune out most meshes outside the sphere of the light
if !light_sphere.intersects_obb(aabb, &model_to_world) {
continue;
}
for (frustum, visible_entities) in cubemap_frusta
.iter()
.zip(cubemap_visible_entities.iter_mut())
{
if frustum.intersects_obb(aabb, &model_to_world, true) {
computed_visibility.is_visible = true;
visible_entities.entities.push(entity);
}
}
} else {
computed_visibility.is_visible = true;
for visible_entities in cubemap_visible_entities.iter_mut() {
visible_entities.entities.push(entity);
}
}
}
// TODO: check for big changes in visible entities len() vs capacity() (ex: 2x) and resize
// to prevent holding unneeded memory
}
}
}
}
#[cfg(test)]
mod test {
use super::*;
fn test_cluster_tiling(config: ClusterConfig, screen_size: UVec2) -> Clusters {
let dims = config.dimensions_for_screen_size(screen_size);
// note: near & far do not affect tiling
let mut clusters = Clusters::default();
clusters.update(screen_size, dims);
// check we cover the screen
assert!(clusters.tile_size.x * clusters.dimensions.x >= screen_size.x);
assert!(clusters.tile_size.y * clusters.dimensions.y >= screen_size.y);
// check a smaller number of clusters would not cover the screen
assert!(clusters.tile_size.x * (clusters.dimensions.x - 1) < screen_size.x);
assert!(clusters.tile_size.y * (clusters.dimensions.y - 1) < screen_size.y);
// check a smaller tilesize would not cover the screen
assert!((clusters.tile_size.x - 1) * clusters.dimensions.x < screen_size.x);
assert!((clusters.tile_size.y - 1) * clusters.dimensions.y < screen_size.y);
// check we don't have more clusters than pixels
assert!(clusters.dimensions.x <= screen_size.x);
assert!(clusters.dimensions.y <= screen_size.y);
clusters
}
#[test]
// check tiling for small screen sizes
fn test_default_cluster_setup_small_screensizes() {
for x in 1..100 {
for y in 1..100 {
let screen_size = UVec2::new(x, y);
let clusters = test_cluster_tiling(ClusterConfig::default(), screen_size);
assert!(
clusters.dimensions.x * clusters.dimensions.y * clusters.dimensions.z <= 4096
);
}
}
}
#[test]
// check tiling for long thin screen sizes
fn test_default_cluster_setup_small_x() {
for x in 1..10 {
for y in 1..5000 {
let screen_size = UVec2::new(x, y);
let clusters = test_cluster_tiling(ClusterConfig::default(), screen_size);
assert!(
clusters.dimensions.x * clusters.dimensions.y * clusters.dimensions.z <= 4096
);
let screen_size = UVec2::new(y, x);
let clusters = test_cluster_tiling(ClusterConfig::default(), screen_size);
assert!(
clusters.dimensions.x * clusters.dimensions.y * clusters.dimensions.z <= 4096
);
}
}
}
}