bevy/crates/bevy_pbr
Rafał Harabień 16ce8c6136
Optimize extract_clusters and prepare_clusters systems (#10633)
# Objective

When developing my game I realized `extract_clusters` and
`prepare_clusters` systems are taking a lot of time despite me creating
very little lights. Reducing number of clusters from the default 4096 to
2048 or less greatly improved performance and stabilized FPS (~300 ->
1000+). I debugged it and found out that the main reason for this is
cloning `VisiblePointLights` in `extract_clusters` system. It contains
light entities grouped by clusters that they affect. The problem is that
we clone 4096 (assuming the default clusters configuration) vectors
every frame. If many of them happen to be non-empty it starts to be a
bottleneck because there is a lot of heap allocation. It wouldn't be a
problem if we reused those vectors in following frames but we don't.

## Solution

Avoid cloning multiple vectors and instead build a single vector
containing data for all clusters.

I've recorded a trace in `3d_scene` example with disabled v-sync before
and after the change.
Mean FPS went from 424 to 990. Mean time for `extract_clusters` system
was reduced from 210 us to 24 us and `prepare_clusters` from 189 us to
87 us.


![image](https://github.com/bevyengine/bevy/assets/160391/ab66aa9d-1fa7-4993-9827-8be76b530972)

---

## Changelog

- Improved performance of `extract_clusters` and `prepare_clusters`
systems for scenes where lights affect a big part of it.
2024-01-29 17:50:22 +00:00
..
src Optimize extract_clusters and prepare_clusters systems (#10633) 2024-01-29 17:50:22 +00:00
Cargo.toml Update to wgpu 0.19 and raw-window-handle 0.6 (#11280) 2024-01-26 18:14:21 +00:00