Robert Swain 2abf5cc618 Clustered forward rendering (#3153)
# Objective

Implement clustered-forward rendering.

## Solution

~~FIXME - in the interest of keeping the merge train moving, I'm submitting this PR now before the description is ready. I want to add in some comments into the code with references for the various bits and pieces and I want to describe some of the key decisions I made here. I'll do that as soon as I can.~~ Anyone reviewing is welcome to add review comments where you want to know more about how something or other works.

* In summary, the view frustum is divided into a grid of sub-volumes called clusters. Point lights are tested against each cluster to see whether they would affect that volume of the scene and, if so, are added to a list of lights affecting that cluster. Then, when shading a fragment (a point on the surface of a mesh within the scene), the point is mapped to a cluster and only the lights affecting that cluster are used in lighting calculations. This brings huge performance and scalability benefits: most of the time lights are placed so that few of them overlap in terms of their spheres of influence, even though there may be many distinct point lights visible in the scene. Doing the lighting calculations for all visible lights in the scene for every pixel on the screen quickly becomes a performance limitation. Clustered forward rendering lets us build an approximate list of the lights that affect each pixel, indeed each surface in the scene (as it works along the view z axis too, unlike tiled/forward+).
* WebGL2 is a platform we want to support and it does not support storage buffers. Uniform buffer bindings are limited to a maximum of 16384 bytes per binding. I used bit shifting and masking to pack the cluster light lists and various indices into a uniform buffer. The 16kB limit is very likely the first bottleneck in scaling the number of lights in a scene at the moment, particularly if the lights affect many clusters due to their range or proximity to the camera (there are a lot of clusters close to the camera, which is an area for improvement). We could store the information in textures instead of uniform buffers to remove this bottleneck, though I don’t know if there are performance implications to reading from textures instead of uniform buffers.
* Because of the uniform buffer binding size limitations, we can support a maximum of 256 lights with the current size of the PointLight struct.
* The z-slicing method (i.e. the mapping from view space z to a depth slice, which defines the near and far planes of a cluster) is the one used in Doom 2016. I need to add comments with references to this. It’s an exponential function that simplifies well for the purposes of optimising the fragment shader. The xy grid divisions are regular in screen space.
* Some optimisation work was done on the allocation of lights to clusters, which involves intersection tests, and for this number of clusters and lights the system has insignificant cost using a fairly naïve algorithm. I think for more lights / finer-grained clusters we could use a BVH, but at some point it would be just much better to use compute shaders and storage buffers.
* Something else to note is that it is absolutely infeasible to use plain cube map point light shadow mapping for many lights. It scales in terms of neither performance nor memory usage. There are some interesting methods I saw discussed in reference material (I will add a link) which render and update shadow maps piece-wise, but they also need compute shaders to work well. Basically, for now you need to sacrifice point light shadows for all but a handful of point lights if you don’t want to kill performance. I set the limit to 10, but that’s just what we had from before, where 10 was the maximum number of point lights before this PR.
* I added a couple of debug visualisations behind a shader def that were useful for seeing performance impact of light distribution - I should make the debug mode configurable without modifying the shader code. One mode shows the number of lights affecting each cluster by tinting toward red for few lights or green for many lights (maxes out at 16, but not sure that’s a reasonable max). The other shows which cluster the surface at a fragment belongs to by tinting it with a randomish colour. This can help to understand deeper performance issues due to screen space tiles spanning multiple clusters in depth with divergent shader execution times.
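
To make the exponential z-slicing concrete, here is a minimal sketch of the commonly cited form of the Doom 2016 depth-slice mapping. The description above only states that the function is exponential and simplifies well, so the exact formula, the function names and the parameters below are assumptions for illustration, not necessarily what the shader in this PR does.

```rust
/// Map a positive view-space depth to an exponential depth-slice index.
/// Hypothetical helper: names and signature are illustrative only.
/// Assumes `0 < near < far`, `depth > 0` and `z_slices >= 1`.
fn depth_to_z_slice(depth: f32, near: f32, far: f32, z_slices: u32) -> u32 {
    // Both factors depend only on the camera, so they can be computed once
    // per view. Per fragment that leaves one log, one multiply, one subtract.
    let scale = z_slices as f32 / (far / near).ln();
    let bias = scale * near.ln();
    let slice = (depth.ln() * scale - bias).floor();
    // Clamp so depths outside [near, far) still map to a valid slice.
    slice.clamp(0.0, (z_slices - 1) as f32) as u32
}

/// Inverse mapping: the depth at which slice `slice` begins.
/// Slice boundaries grow geometrically from `near` to `far`.
fn z_slice_to_depth(slice: u32, near: f32, far: f32, z_slices: u32) -> f32 {
    near * (far / near).powf(slice as f32 / z_slices as f32)
}
```

With `near = 1`, `far = 16` and 4 slices, the slice boundaries fall at depths 1, 2, 4, 8 and 16, which shows why many thin slices end up close to the camera.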

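The bit shifting and masking mentioned above could look roughly like the following sketch. The layout is an assumption chosen for illustration (a 24-bit list offset plus an 8-bit count in one `u32`, and four 8-bit light indices per `u32`); the description does not spell out the actual layout, and all function names here are hypothetical.

```rust
/// Pack a list offset (assumed to fit in 24 bits) and a light count
/// (assumed to fit in 8 bits) into a single u32.
fn pack_offset_and_count(offset: u32, count: u32) -> u32 {
    debug_assert!(offset < (1 << 24) && count < (1 << 8));
    ((offset & 0x00FF_FFFF) << 8) | (count & 0xFF)
}

/// Recover `(offset, count)` from a value made by `pack_offset_and_count`.
fn unpack_offset_and_count(packed: u32) -> (u32, u32) {
    (packed >> 8, packed & 0xFF)
}

/// Pack 8-bit light indices four to a u32, lowest byte first.
/// A final partial chunk is padded with zero bits.
fn pack_light_indices(indices: &[u8]) -> Vec<u32> {
    indices
        .chunks(4)
        .map(|chunk| {
            chunk
                .iter()
                .enumerate()
                .fold(0u32, |word, (i, &index)| word | ((index as u32) << (8 * i)))
        })
        .collect()
}

/// Read the `i`-th 8-bit light index back out of the packed words.
fn unpack_light_index(words: &[u32], i: usize) -> u8 {
    ((words[i / 4] >> (8 * (i % 4))) & 0xFF) as u8
}
```

Under these assumptions each light index costs one byte instead of four, which stretches the 16384-byte binding further at the cost of a shift and a mask per lookup in the shader.
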
Also, there are more things that could be done as improvements, and I will document those somewhere (I'm not sure where the best place will be... in a todo alongside the code, a GitHub issue, somewhere else?). But I think it works well enough, and brings significant enough performance and scalability benefits, that it's worth integrating now and then iterating on.
* Calculate the light’s effective range based on its intensity and physical falloff and either just use this, or take the minimum of the user-supplied range and this. This would avoid unnecessary lighting calculations for clusters that cannot be affected. This would need to take into account HDR tone mapping as in my not-fully-understanding-the-details understanding, the threshold is relative to how bright the scene is.
* Improve the z-slicing to use a larger first slice.
* More gracefully handle the cluster light list uniform buffer binding size limitations by prioritising which lights are included (some heuristic for most significant like closest to the camera, brightest, affecting the most pixels, …)
* Switch to using a texture instead of uniform buffer
* Figure out the / a better story for shadows
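
As a rough illustration of the first item, the sketch below derives an effective range from intensity, assuming pure inverse-square falloff (`illuminance = intensity / distance²`) and a caller-supplied cut-off threshold. The function name, the parameters and the idea of passing the threshold in directly are assumptions; how to choose that threshold relative to scene brightness and tone mapping is exactly the open question noted above.

```rust
/// Distance beyond which a point light's contribution drops below `threshold`,
/// assuming inverse-square falloff. Hypothetical helper for illustration.
/// If the user supplied a range, the smaller of the two values is used.
fn effective_range(intensity: f32, threshold: f32, user_range: Option<f32>) -> f32 {
    // Solve intensity / d^2 = threshold for d.
    let physical_range = (intensity / threshold).sqrt();
    match user_range {
        Some(range) => physical_range.min(range),
        None => physical_range,
    }
}
```

For example, with an intensity of 100 and a threshold of 1 the physical range is 10, so a user-supplied range of 4 would win while a user-supplied range of 20 would be clipped to 10.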

I will also probably add an example that demonstrates some of the issues:
* What situations exhaust the space available in the uniform buffers
  * Light range too large making lights affect many clusters and so exhausting the space for the lists of lights that affect clusters
  * Light range set to be too small producing visible artifacts where clusters the light would physically affect are not affected by the light
* Perhaps some performance issues
  * How many lights can be closely packed or affect large portions of the view before performance drops?
2021-12-09 03:08:54 +00:00
.cargo Add aarch64-apple-darwin to the config_fast_builds for Apple Silicon (#2739) 2021-08-30 21:56:12 +00:00
.github Ignore reddit when checking markdown links (#3223) 2021-11-29 20:55:12 +00:00
assets Shader Imports. Decouple Mesh logic from PBR (#3137) 2021-11-18 03:45:02 +00:00
benches Add readme to errors crate and clean up cargo files (#3125) 2021-11-13 23:06:48 +00:00
crates More Bevy ECS schedule spans (#3281) 2021-12-08 23:43:03 +00:00
docs default features from bevy_asset and bevy_ecs can actually be disabled (#3097) 2021-11-13 21:15:22 +00:00
errors Add readme to errors crate and clean up cargo files (#3125) 2021-11-13 23:06:48 +00:00
examples Added transparency to window builder (#3105) 2021-12-08 20:53:35 +00:00
pipelined Clustered forward rendering (#3153) 2021-12-09 03:08:54 +00:00
src Down with the system! (#2496) 2021-07-27 23:42:36 +00:00
tests Implement and require #[derive(Component)] on all component structs (#2254) 2021-10-03 19:23:44 +00:00
tools Add readme to errors crate and clean up cargo files (#3125) 2021-11-13 23:06:48 +00:00
.gitattributes Enforce linux-style line endings for .rs and .toml (#3197) 2021-11-26 21:05:35 +00:00
.gitignore add .cargo/config.toml to .gitignore 2020-12-12 17:17:35 -08:00
Cargo.toml Added transparency to window builder (#3105) 2021-12-08 20:53:35 +00:00
CHANGELOG.md update CHANGELOG for 0.5 (#1967) 2021-04-19 22:41:19 +00:00
CODE_OF_CONDUCT.md Update CODE_OF_CONDUCT.md 2020-08-19 20:25:58 +01:00
CONTRIBUTING.md Not me ... us (#2654) 2021-08-15 20:08:52 +00:00
CREDITS.md Cleanup of Markdown Files and add CI Checking (#1463) 2021-02-22 04:50:05 +00:00
deny.toml update ndk-glue to 0.4 (#2684) 2021-08-19 01:02:15 +00:00
LICENSE Relicense Bevy under the dual MIT or Apache-2.0 license (#2509) 2021-07-23 21:11:51 +00:00
README.md Relicense Bevy under the dual MIT or Apache-2.0 license (#2509) 2021-07-23 21:11:51 +00:00
rustfmt.toml Cargo fmt with unstable features (#1903) 2021-04-21 23:19:34 +00:00

Bevy

What is Bevy?

Bevy is a refreshingly simple data-driven game engine built in Rust. It is free and open-source forever!

WARNING

Bevy is still in the very early stages of development. APIs can and will change (now is the time to make suggestions!). Important features are missing. Documentation is sparse. Please don't build any serious projects in Bevy unless you are prepared to be broken by API changes constantly.

Design Goals

  • Capable: Offer a complete 2D and 3D feature set
  • Simple: Easy for newbies to pick up, but infinitely flexible for power users
  • Data Focused: Data-oriented architecture using the Entity Component System paradigm
  • Modular: Use only what you need. Replace what you don't like
  • Fast: App logic should run quickly, and when possible, in parallel
  • Productive: Changes should compile quickly ... waiting isn't fun

About

Docs

Community

Before contributing or participating in discussions with the community, you should familiarize yourself with our Code of Conduct and How to Contribute.

Getting Started

We recommend checking out The Bevy Book for a full tutorial.

Follow the Setup guide to ensure your development environment is set up correctly. Once set up, you can quickly try out the examples by cloning this repo and running the following commands:

# Switch to the correct version (latest release, default is main development branch)
git checkout latest
# Runs the "breakout" example
cargo run --example breakout

Fast Compiles

Bevy can be built just fine using the default configuration on stable Rust. However, for really fast iterative compiles, you should enable the "fast compiles" setup by following the instructions here.

Focus Areas

Bevy has the following Focus Areas. We are currently focusing our development efforts in these areas, and they will receive priority for Bevy developers' time. If you would like to contribute to Bevy, you are heavily encouraged to join in on these efforts:

Editor-Ready UI

PBR / Clustered Forward Rendering

Scenes

Libraries Used

Bevy is only possible because of the hard work put into these foundational technologies:

  • wgpu: modern / low-level / cross-platform graphics library inspired by Vulkan
  • glam-rs: a simple and fast 3D math library for games and graphics
  • winit: cross-platform window creation and management in Rust
  • spirv-reflect: Reflection API in rust for SPIR-V shader byte code

Bevy Cargo Features

This list outlines the different cargo features supported by Bevy. These allow you to customize the Bevy feature set for your use-case.

Third Party Plugins

Plugins are very welcome to extend Bevy's features. Guidelines are available to help integration and usage.

Thanks and Alternatives

Additionally, we would like to thank the Amethyst, macroquad, coffee, ggez, rg3d, and Piston projects for providing solid examples of game engine development in Rust. If you are looking for a Rust game engine, it is worth considering all of your options. Each engine has different design goals, and some will likely resonate with you more than others.

License

Bevy is free and open source! All code in this repository is dual-licensed under either:

  • MIT License
  • Apache License, Version 2.0

at your option. This means you can select the license you prefer! This dual-licensing approach is the de-facto standard in the Rust ecosystem and there are very good reasons to include both.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.