Commit graph

135 commits

Author SHA1 Message Date
Patrick Walton
16531fb3e3
Implement GPU frustum culling. (#12889)
This commit implements opt-in GPU frustum culling, built on top of the
infrastructure in https://github.com/bevyengine/bevy/pull/12773. To
enable it on a camera, add the `GpuCulling` component to it. To
additionally disable CPU frustum culling, add the `NoCpuCulling`
component. Note that adding `GpuCulling` without `NoCpuCulling`
*currently* does nothing useful. The reason why `GpuCulling` doesn't
automatically imply `NoCpuCulling` is that I intend to follow this patch
up with GPU two-phase occlusion culling, and CPU frustum culling plus
GPU occlusion culling seems like a very commonly-desired mode.

Adding the `GpuCulling` component to a view puts that view into
*indirect mode*. This mode makes all drawcalls indirect, relying on the
mesh preprocessing shader to allocate instances dynamically. In indirect
mode, the `PreprocessWorkItem` `output_index` points not to a
`MeshUniform` instance slot but instead to a set of `wgpu`
`IndirectParameters`, from which it allocates an instance slot
dynamically if frustum culling succeeds. Batch building has been updated
to allocate and track indirect parameter slots, and the AABBs are now
supplied to the GPU as `MeshCullingData`.

A small amount of code relating to the frustum culling has been borrowed
from meshlets and moved into `maths.wgsl`. Note that standard Bevy
frustum culling uses AABBs, while meshlets use bounding spheres; this
means that not as much code can be shared as one might think.

This patch doesn't provide any way to perform GPU culling on shadow
maps, to avoid making this patch bigger than it already is. That can be
a followup.

## Changelog

### Added

* Frustum culling can now optionally be done on the GPU. To enable it,
add the `GpuCulling` component to a camera.
* To disable CPU frustum culling, add `NoCpuCulling` to a camera. Note
that `GpuCulling` doesn't automatically imply `NoCpuCulling`.
2024-04-28 12:50:00 +00:00
BD103
7b8d502083
Fix beta lints (#12980)
# Objective

- Fixes #12976

## Solution

This one is a doozy.

- Run `cargo +beta clippy --workspace --all-targets --all-features` and
fix all issues
- This includes:
- Moving inner attributes to be outer attributes, when the item in
question has both inner and outer attributes
  - Use `ptr::from_ref` in more scenarios
- Extend the valid idents list used by `clippy:doc_markdown` with more
names
  - Use `Clone::clone_from` when possible
  - Remove redundant `ron` import
  - Add backticks to **so many** identifiers and items
    - I'm sorry whoever has to review this

---

## Changelog

- Added links to more identifiers in documentation.
2024-04-16 02:46:46 +00:00
Patrick Walton
1141e731ff
Implement alpha to coverage (A2C) support. (#12970)
[Alpha to coverage] (A2C) replaces alpha blending with a
hardware-specific multisample coverage mask when multisample
antialiasing is in use. It's a simple form of [order-independent
transparency] that relies on MSAA. ["Anti-aliased Alpha Test: The
Esoteric Alpha To Coverage"] is a good summary of the motivation for and
best practices relating to A2C.

This commit implements alpha to coverage support as a new variant for
`AlphaMode`. You can supply `AlphaMode::AlphaToCoverage` as the
`alpha_mode` field in `StandardMaterial` to use it. When in use, the
standard material shader automatically applies the texture filtering
method from ["Anti-aliased Alpha Test: The Esoteric Alpha To Coverage"].
Objects with alpha-to-coverage materials are binned in the opaque pass,
as they're fully order-independent.

The `transparency_3d` example has been updated to feature an object with
alpha to coverage. Happily, the example was already using MSAA.

This is part of #2223, as far as I can tell.

[Alpha to coverage]: https://en.wikipedia.org/wiki/Alpha_to_coverage

[order-independent transparency]:
https://en.wikipedia.org/wiki/Order-independent_transparency

["Anti-aliased Alpha Test: The Esoteric Alpha To Coverage"]:
https://bgolus.medium.com/anti-aliased-alpha-test-the-esoteric-alpha-to-coverage-8b177335ae4f

---

## Changelog

### Added

* The `AlphaMode` enum now supports `AlphaToCoverage`, to provide
limited order-independent transparency when multisample antialiasing is
in use.
2024-04-15 20:37:52 +00:00
Patrick Walton
5caf085dac
Divide the single VisibleEntities list into separate lists for 2D meshes, 3D meshes, lights, and UI elements, for performance. (#12582)
This commit splits `VisibleEntities::entities` into four separate lists:
one for lights, one for 2D meshes, one for 3D meshes, and one for UI
elements. This allows `queue_material_meshes` and similar methods to
avoid examining entities that are obviously irrelevant. In particular,
this separation helps scenes with many skinned meshes, as the individual
bones are considered visible entities but have no rendered appearance.

Internally, `VisibleEntities::entities` is a `HashMap` from the `TypeId`
representing a `QueryFilter` to the appropriate `Entity` list. I had to
do this because `VisibleEntities` is located within an upstream crate
from the crates that provide lights (`bevy_pbr`) and 2D meshes
(`bevy_sprite`). As an added benefit, this setup allows apps to provide
their own types of renderable components, by simply adding a specialized
`check_visibility` to the schedule.

This provides a 16.23% end-to-end speedup on `many_foxes` with 10,000
foxes (24.06 ms/frame to 20.70 ms/frame).

## Migration guide

* `check_visibility` and `VisibleEntities` now store the four types of
renderable entities--2D meshes, 3D meshes, lights, and UI
elements--separately. If your custom rendering code examines
`VisibleEntities`, it will now need to specify which type of entity it's
interested in using the `WithMesh2d`, `WithMesh`, `WithLight`, and
`WithNode` types respectively. If your app introduces a new type of
renderable entity, you'll need to add an explicit call to
`check_visibility` to the schedule to accommodate your new component or
components.

## Analysis

`many_foxes`, 10,000 foxes: `main`:
![Screenshot 2024-03-31
114444](https://github.com/bevyengine/bevy/assets/157897/16ecb2ff-6e04-46c0-a4b0-b2fde2084bad)

`many_foxes`, 10,000 foxes, this branch:
![Screenshot 2024-03-31
114256](https://github.com/bevyengine/bevy/assets/157897/94dedae4-bd00-45b2-9aaf-dfc237004ddb)

`queue_material_meshes` (yellow = this branch, red = `main`):
![Screenshot 2024-03-31
114637](https://github.com/bevyengine/bevy/assets/157897/f90912bd-45bd-42c4-bd74-57d98a0f036e)

`queue_shadows` (yellow = this branch, red = `main`):
![Screenshot 2024-03-31
114607](https://github.com/bevyengine/bevy/assets/157897/6ce693e3-20c0-4234-8ec9-a6f191299e2d)
2024-04-11 20:33:20 +00:00
Patrick Walton
d59b1e71ef
Implement percentage-closer filtering (PCF) for point lights. (#12910)
I ported the two existing PCF techniques to the cubemap domain as best I
could. Generally, the technique is to create a 2D orthonormal basis
using Gram-Schmidt normalization, then apply the technique over that
basis. The results look fine, though the shadow bias often needs
adjusting.

For comparison, Unity uses a 4-tap pattern for PCF on point lights of
(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1). I tried this but
didn't like the look, so I went with the design above, which ports the
2D techniques to the 3D domain. There's surprisingly little material on
point light PCF.

I've gone through every example using point lights and verified that the
shadow maps look fine, adjusting biases as necessary.

Fixes #3628.

---

## Changelog

### Added
* Shadows from point lights now support percentage-closer filtering
(PCF), and as a result look less aliased.

### Changed
* `ShadowFilteringMethod::Castano13` and
`ShadowFilteringMethod::Jimenez14` have been renamed to
`ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal`
respectively.

## Migration Guide

* `ShadowFilteringMethod::Castano13` and
`ShadowFilteringMethod::Jimenez14` have been renamed to
`ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal`
respectively.
2024-04-10 20:16:08 +00:00
Patrick Walton
11817f4ba4
Generate MeshUniforms on the GPU via compute shader where available. (#12773)
Currently, `MeshUniform`s are rather large: 160 bytes. They're also
somewhat expensive to compute, because they involve taking the inverse
of a 3x4 matrix. Finally, if a mesh is present in multiple views, that
mesh will have a separate `MeshUniform` for each and every view, which
is wasteful.

This commit fixes these issues by introducing the concept of a *mesh
input uniform* and adding a *mesh uniform building* compute shader pass.
The `MeshInputUniform` is simply the minimum amount of data needed for
the GPU to compute the full `MeshUniform`. Most of this data is just the
transform and is therefore only 64 bytes. `MeshInputUniform`s are
computed during the *extraction* phase, much like skins are today, in
order to avoid needlessly copying transforms around on CPU. (In fact,
the render app has been changed to only store the translation of each
mesh; it no longer cares about any other part of the transform, which is
stored only on the GPU and the main world.) Before rendering, the
`build_mesh_uniforms` pass runs to expand the `MeshInputUniform`s to the
full `MeshUniform`.

The mesh uniform building pass does the following, all on GPU:

1. Copy the appropriate fields of the `MeshInputUniform` to the
`MeshUniform` slot. If a single mesh is present in multiple views, this
effectively duplicates it into each view.

2. Compute the inverse transpose of the model transform, used for
transforming normals.

3. If applicable, copy the mesh's transform from the previous frame for
TAA. To support this, we double-buffer the `MeshInputUniform`s over two
frames and swap the buffers each frame. The `MeshInputUniform`s for the
current frame contain the index of that mesh's `MeshInputUniform` for
the previous frame.

This commit produces wins in virtually every CPU part of the pipeline:
`extract_meshes`, `queue_material_meshes`,
`batch_and_prepare_render_phase`, and especially
`write_batched_instance_buffer` are all faster. Shrinking the amount of
CPU data that has to be shuffled around speeds up the entire rendering
process.

| Benchmark              | This branch | `main`  | Speedup |
|------------------------|-------------|---------|---------|
| `many_cubes -nfc`      |      17.259 |  24.529 |  42.12% |
| `many_cubes -nfc -vpi` |     302.116 | 312.123 |   3.31% |
| `many_foxes`           |       3.227 |   3.515 |   8.92% |

Because mesh uniform building requires compute shader, and WebGL 2 has
no compute shader, the existing CPU mesh uniform building code has been
left as-is. Many types now have both CPU mesh uniform building and GPU
mesh uniform building modes. Developers can opt into the old CPU mesh
uniform building by setting the `use_gpu_uniform_builder` option on
`PbrPlugin` to `false`.

Below are graphs of the CPU portions of `many-cubes
--no-frustum-culling`. Yellow is this branch, red is `main`.

`extract_meshes`:
![Screenshot 2024-04-02
124842](https://github.com/bevyengine/bevy/assets/157897/a6748ea4-dd05-47b6-9254-45d07d33cb10)
It's notable that we get a small win even though we're now writing to a
GPU buffer.

`queue_material_meshes`:
![Screenshot 2024-04-02
124911](https://github.com/bevyengine/bevy/assets/157897/ecb44d78-65dc-448d-ba85-2de91aa2ad94)
There's a bit of a regression here; not sure what's causing it. In any
case it's very outweighed by the other gains.

`batch_and_prepare_render_phase`:
![Screenshot 2024-04-02
125123](https://github.com/bevyengine/bevy/assets/157897/4e20fc86-f9dd-4e5c-8623-837e4258f435)
There's a huge win here, enough to make batching basically drop off the
profile.

`write_batched_instance_buffer`:
![Screenshot 2024-04-02
125237](https://github.com/bevyengine/bevy/assets/157897/401a5c32-9dc1-4991-996d-eb1cac6014b2)
There's a massive improvement here, as expected. Note that a lot of it
simply comes from the fact that `MeshInputUniform` is `Pod`. (This isn't
a maintainability problem in my view because `MeshInputUniform` is so
simple: just 16 tightly-packed words.)

## Changelog

### Added

* Per-mesh instance data is now generated on GPU with a compute shader
instead of CPU, resulting in rendering performance improvements on
platforms where compute shaders are supported.

## Migration guide

* Custom render phases now need multiple systems beyond just
`batch_and_prepare_render_phase`. Code that was previously creating
custom render phases should now add a `BinnedRenderPhasePlugin` or
`SortedRenderPhasePlugin` as appropriate instead of directly adding
`batch_and_prepare_render_phase`.
2024-04-10 05:33:32 +00:00
Robert Swain
ab7cbfa8fc
Consolidate Render(Ui)Materials(2d) into RenderAssets (#12827)
# Objective

- Replace `RenderMaterials` / `RenderMaterials2d` / `RenderUiMaterials`
with `RenderAssets` to enable implementing changes to one thing,
`RenderAssets`, that applies to all use cases rather than duplicating
changes everywhere for multiple things that should be one thing.
- Adopts #8149 

## Solution

- Make RenderAsset generic over the destination type rather than the
source type as in #8149
- Use `RenderAssets<PreparedMaterial<M>>` etc for render materials

---

## Changelog

- Changed:
- The `RenderAsset` trait is now implemented on the destination type.
Its `SourceAsset` associated type refers to the type of the source
asset.
- `RenderMaterials`, `RenderMaterials2d`, and `RenderUiMaterials` have
been replaced by `RenderAssets<PreparedMaterial<M>>` and similar.

## Migration Guide

- `RenderAsset` is now implemented for the destination type rather that
the source asset type. The source asset type is now the `RenderAsset`
trait's `SourceAsset` associated type.
2024-04-09 13:26:34 +00:00
Patrick Walton
37522fd0ae
Micro-optimize queue_material_meshes, primarily to remove bit manipulation. (#12791)
This commit makes the following optimizations:

## `MeshPipelineKey`/`BaseMeshPipelineKey` split

`MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives
in `bevy_render` and `MeshPipelineKey`, which lives in `bevy_pbr`.
Conceptually, `BaseMeshPipelineKey` is a superclass of
`MeshPipelineKey`. For `BaseMeshPipelineKey`, the bits start at the
highest (most significant) bit and grow downward toward the lowest bit;
for `MeshPipelineKey`, the bits start at the lowest bit and grow upward
toward the highest bit. This prevents them from colliding.

The goal of this is to avoid having to reassemble bits of the pipeline
key for every mesh every frame. Instead, we can just use a bitwise or
operation to combine the pieces that make up a `MeshPipelineKey`.

## `specialize_slow`

Previously, all of `specialize()` was marked as `#[inline]`. This
bloated `queue_material_meshes` unnecessarily, as a large chunk of it
ended up being a slow path that was rarely hit. This commit refactors
the function to move the slow path to `specialize_slow()`.

Together, these two changes shave about 5% off `queue_material_meshes`:

![Screenshot 2024-03-29
130002](https://github.com/bevyengine/bevy/assets/157897/a7e5a994-a807-4328-b314-9003429dcdd2)

## Migration Guide

- The `primitive_topology` field on `GpuMesh` is now an accessor method:
`GpuMesh::primitive_topology()`.
- For performance reasons, `MeshPipelineKey` has been split into
`BaseMeshPipelineKey`, which lives in `bevy_render`, and
`MeshPipelineKey`, which lives in `bevy_pbr`. These two should be
combined with bitwise-or to produce the final `MeshPipelineKey`.
2024-04-01 21:58:53 +00:00
Cameron
01649f13e2
Refactor App and SubApp internals for better separation (#9202)
# Objective

This is a necessary precursor to #9122 (this was split from that PR to
reduce the amount of code to review all at once).

Moving `!Send` resource ownership to `App` will make it unambiguously
`!Send`. `SubApp` must be `Send`, so it can't wrap `App`.

## Solution

Refactor `App` and `SubApp` to not have a recursive relationship. Since
`SubApp` no longer wraps `App`, once `!Send` resources are moved out of
`World` and into `App`, `SubApp` will become unambiguously `Send`.

There could be less code duplication between `App` and `SubApp`, but
that would break `App` method chaining.

## Changelog

- `SubApp` no longer wraps `App`.
- `App` fields are no longer publicly accessible.
- `App` can no longer be converted into a `SubApp`.
- Various methods now return references to a `SubApp` instead of an
`App`.
## Migration Guide

- To construct a sub-app, use `SubApp::new()`. `App` can no longer
convert into `SubApp`.
- If you implemented a trait for `App`, you may want to implement it for
`SubApp` as well.
- If you're accessing `app.world` directly, you now have to use
`app.world()` and `app.world_mut()`.
- `App::sub_app` now returns `&SubApp`.
- `App::sub_app_mut`  now returns `&mut SubApp`.
- `App::get_sub_app` now returns `Option<&SubApp>.`
- `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`
2024-03-31 03:16:10 +00:00
Patrick Walton
4dadebd9c4
Improve performance by binning together opaque items instead of sorting them. (#12453)
Today, we sort all entities added to all phases, even the phases that
don't strictly need sorting, such as the opaque and shadow phases. This
results in a performance loss because our `PhaseItem`s are rather large
in memory, so sorting is slow. Additionally, determining the boundaries
of batches is an O(n) process.

This commit makes Bevy instead applicable place phase items into *bins*
keyed by *bin keys*, which have the invariant that everything in the
same bin is potentially batchable. This makes determining batch
boundaries O(1), because everything in the same bin can be batched.
Instead of sorting each entity, we now sort only the bin keys. This
drops the sorting time to near-zero on workloads with few bins like
`many_cubes --no-frustum-culling`. Memory usage is improved too, with
batch boundaries and dynamic indices now implicit instead of explicit.
The improved memory usage results in a significant win even on
unbatchable workloads like `many_cubes --no-frustum-culling
--vary-material-data-per-instance`, presumably due to cache effects.

Not all phases can be binned; some, such as transparent and transmissive
phases, must still be sorted. To handle this, this commit splits
`PhaseItem` into `BinnedPhaseItem` and `SortedPhaseItem`. Most of the
logic that today deals with `PhaseItem`s has been moved to
`SortedPhaseItem`. `BinnedPhaseItem` has the new logic.

Frame time results (in ms/frame) are as follows:

| Benchmark                | `binning` | `main`  | Speedup |
| ------------------------ | --------- | ------- | ------- |
| `many_cubes -nfc -vpi` | 232.179     | 312.123   | 34.43%  |
| `many_cubes -nfc`        | 25.874 | 30.117 | 16.40%  |
| `many_foxes`             | 3.276 | 3.515 | 7.30%   |

(`-nfc` is short for `--no-frustum-culling`; `-vpi` is short for
`--vary-per-instance`.)

---

## Changelog

### Changed

* Render phases have been split into binned and sorted phases. Binned
phases, such as the common opaque phase, achieve improved CPU
performance by avoiding the sorting step.

## Migration Guide

- `PhaseItem` has been split into `BinnedPhaseItem` and
`SortedPhaseItem`. If your code has custom `PhaseItem`s, you will need
to migrate them to one of these two types. `SortedPhaseItem` requires
the fewest code changes, but you may want to pick `BinnedPhaseItem` if
your phase doesn't require sorting, as that enables higher performance.

## Tracy graphs

`many-cubes --no-frustum-culling`, `main` branch:
<img width="1064" alt="Screenshot 2024-03-12 180037"
src="https://github.com/bevyengine/bevy/assets/157897/e1180ce8-8e89-46d2-85e3-f59f72109a55">

`many-cubes --no-frustum-culling`, this branch:
<img width="1064" alt="Screenshot 2024-03-12 180011"
src="https://github.com/bevyengine/bevy/assets/157897/0899f036-6075-44c5-a972-44d95895f46c">

You can see that `batch_and_prepare_binned_render_phase` is a much
smaller fraction of the time. Zooming in on that function, with yellow
being this branch and red being `main`, we see:

<img width="1064" alt="Screenshot 2024-03-12 175832"
src="https://github.com/bevyengine/bevy/assets/157897/0dfc8d3f-49f4-496e-8825-a66e64d356d0">

The binning happens in `queue_material_meshes`. Again with yellow being
this branch and red being `main`:
<img width="1064" alt="Screenshot 2024-03-12 175755"
src="https://github.com/bevyengine/bevy/assets/157897/b9b20dc1-11c8-400c-a6cc-1c2e09c1bb96">

We can see that there is a small regression in `queue_material_meshes`
performance, but it's not nearly enough to outweigh the large gains in
`batch_and_prepare_binned_render_phase`.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-03-30 02:55:02 +00:00
JMS55
4f20faaa43
Meshlet rendering (initial feature) (#10164)
# Objective
- Implements a more efficient, GPU-driven
(https://github.com/bevyengine/bevy/issues/1342) rendering pipeline
based on meshlets.
- Meshes are split into small clusters of triangles called meshlets,
each of which acts as a mini index buffer into the larger mesh data.
Meshlets can be compressed, streamed, culled, and batched much more
efficiently than monolithic meshes.


![image](https://github.com/bevyengine/bevy/assets/47158642/cb2aaad0-7a9a-4e14-93b0-15d4e895b26a)

![image](https://github.com/bevyengine/bevy/assets/47158642/7534035b-1eb7-4278-9b99-5322e4401715)

# Misc
* Future work: https://github.com/bevyengine/bevy/issues/11518
* Nanite reference:
https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf
Two pass occlusion culling explained very well:
https://medium.com/@mil_kru/two-pass-occlusion-culling-4100edcad501

---------

Co-authored-by: Ricky Taylor <rickytaylor26@gmail.com>
Co-authored-by: vero <email@atlasdostal.com>
Co-authored-by: François <mockersf@gmail.com>
Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2024-03-25 19:08:27 +00:00
NiseVoid
ce75dec3b8
Add setting to enable/disable shadows to MaterialPlugin (#12538)
# Objective

- Not all materials need shadow, but a queue_shadows system is always
added to the `Render` schedule and executed

## Solution

- Make a setting for shadows, it defaults to true

## Changelog

- Added `shadows_enabled` setting to `MaterialPlugin`

## Migration Guide

- `MaterialPlugin` now has a `shadows_enabled` setting, if you didn't
spawn the plugin using `::default()` or `..default()`, you'll need to
set it. `shadows_enabled: true` is the same behavior as the previous
version, and also the default value.
2024-03-18 17:54:41 +00:00
robtfm
1323de7cd7
stop retrying removed assets (#12505)
# Objective

assets that don't load before they get removed are retried forever,
causing buffer churn and slowdown.

## Solution

stop trying to prepare dead assets.
2024-03-16 04:49:16 +00:00
Patrick Walton
f9cc91d5a1
Intern mesh vertex buffer layouts so that we don't have to compare them over and over. (#12216)
Although we cached hashes of `MeshVertexBufferLayout`, we were paying
the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every
entity, every frame. This patch changes that logic to place
`MeshVertexBufferLayout`s in `Arc`s so that they can be compared and
hashed by pointer. This results in a 28% speedup in the
`queue_material_meshes` phase of `many_cubes`, with frustum culling
disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level
of hash tables instead of two. This saves a hash lookup.

2. The example `many_cubes` has been given a `--no-frustum-culling`
flag, to aid in benchmarking.

See the Tracy profile:

<img width="1064" alt="Screenshot 2024-02-29 144406"
src="https://github.com/bevyengine/bevy/assets/157897/18632f1d-1fdd-4ac7-90ed-2d10306b2a1e">

## Migration guide

* Duplicate `MeshVertexBufferLayout`s are now combined into a single
object, `MeshVertexBufferLayoutRef`, which contains an
atomically-reference-counted pointer to the layout. Code that was using
`MeshVertexBufferLayout` may need to be updated to use
`MeshVertexBufferLayoutRef` instead.
2024-03-01 20:56:21 +00:00
Alice Cecile
599e5e4e76
Migrate from LegacyColor to bevy_color::Color (#12163)
# Objective

- As part of the migration process we need to a) see the end effect of
the migration on user ergonomics b) check for serious perf regressions
c) actually migrate the code
- To accomplish this, I'm going to attempt to migrate all of the
remaining user-facing usages of `LegacyColor` in one PR, being careful
to keep a clean commit history.
- Fixes #12056.

## Solution

I've chosen to use the polymorphic `Color` type as our standard
user-facing API.

- [x] Migrate `bevy_gizmos`.
- [x] Take `impl Into<Color>` in all `bevy_gizmos` APIs
- [x] Migrate sprites
- [x] Migrate UI
- [x] Migrate `ColorMaterial`
- [x] Migrate `MaterialMesh2D`
- [x] Migrate fog
- [x] Migrate lights
- [x] Migrate StandardMaterial
- [x] Migrate wireframes
- [x] Migrate clear color
- [x] Migrate text
- [x] Migrate gltf loader
- [x] Register color types for reflection
- [x] Remove `LegacyColor`
- [x] Make sure CI passes

Incidental improvements to ease migration:

- added `Color::srgba_u8`, `Color::srgba_from_array` and friends
- added `set_alpha`, `is_fully_transparent` and `is_fully_opaque` to the
`Alpha` trait
- add and immediately deprecate (lol) `Color::rgb` and friends in favor
of more explicit and consistent `Color::srgb`
- standardized on white and black for most example text colors
- added vector field traits to `LinearRgba`: ~~`Add`, `Sub`,
`AddAssign`, `SubAssign`,~~ `Mul<f32>` and `Div<f32>`. Multiplications
and divisions do not scale alpha. `Add` and `Sub` have been cut from
this PR.
- added `LinearRgba` and `Srgba` `RED/GREEN/BLUE`
- added `LinearRgba_to_f32_array` and `LinearRgba::to_u32`

## Migration Guide

Bevy's color types have changed! Wherever you used a
`bevy::render::Color`, a `bevy::color::Color` is used instead.

These are quite similar! Both are enums storing a color in a specific
color space (or to be more precise, using a specific color model).
However, each of the different color models now has its own type.

TODO...

- `Color::rgba`, `Color::rgb`, `Color::rbga_u8`, `Color::rgb_u8`,
`Color::rgb_from_array` are now `Color::srgba`, `Color::srgb`,
`Color::srgba_u8`, `Color::srgb_u8` and `Color::srgb_from_array`.
- `Color::set_a` and `Color::a` is now `Color::set_alpha` and
`Color::alpha`. These are part of the `Alpha` trait in `bevy_color`.
- `Color::is_fully_transparent` is now part of the `Alpha` trait in
`bevy_color`
- `Color::r`, `Color::set_r`, `Color::with_r` and the equivalents for
`g`, `b` `h`, `s` and `l` have been removed due to causing silent
relatively expensive conversions. Convert your `Color` into the desired
color space, perform your operations there, and then convert it back
into a polymorphic `Color` enum.
- `Color::hex` is now `Srgba::hex`. Call `.into` or construct a
`Color::Srgba` variant manually to convert it.
- `WireframeMaterial`, `ExtractedUiNode`, `ExtractedDirectionalLight`,
`ExtractedPointLight`, `ExtractedSpotLight` and `ExtractedSprite` now
store a `LinearRgba`, rather than a polymorphic `Color`
- `Color::rgb_linear` and `Color::rgba_linear` are now
`Color::linear_rgb` and `Color::linear_rgba`
- The various CSS color constants are no longer stored directly on
`Color`. Instead, they're defined in the `Srgba` color space, and
accessed via `bevy::color::palettes::css`. Call `.into()` on them to
convert them into a `Color` for quick debugging use, and consider using
the much prettier `tailwind` palette for prototyping.
- The `LIME_GREEN` color has been renamed to `LIMEGREEN` to comply with
the standard naming.
- Vector field arithmetic operations on `Color` (add, subtract, multiply
and divide by a f32) have been removed. Instead, convert your colors
into `LinearRgba` space, and perform your operations explicitly there.
This is particularly relevant when working with emissive or HDR colors,
whose color channel values are routinely outside of the ordinary 0 to 1
range.
- `Color::as_linear_rgba_f32` has been removed. Call
`LinearRgba::to_f32_array` instead, converting if needed.
- `Color::as_linear_rgba_u32` has been removed. Call
`LinearRgba::to_u32` instead, converting if needed.
- Several other color conversion methods to transform LCH or HSL colors
into float arrays or `Vec` types have been removed. Please reimplement
these externally or open a PR to re-add them if you found them
particularly useful.
- Various methods on `Color` such as `rgb` or `hsl` to convert the color
into a specific color space have been removed. Convert into
`LinearRgba`, then to the color space of your choice.
- Various implicitly-converting color value methods on `Color` such as
`r`, `g`, `b` or `h` have been removed. Please convert it into the color
space of your choice, then check these properties.
- `Color` no longer implements `AsBindGroup`. Store a `LinearRgba`
internally instead to avoid conversion costs.

---------

Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
Co-authored-by: Afonso Lage <lage.afonso@gmail.com>
Co-authored-by: Rob Parrett <robparrett@gmail.com>
Co-authored-by: Zachary Harrold <zac@harrold.com.au>
2024-02-29 19:35:12 +00:00
Elabajaba
78b6fa1f1b
sort alpha masked pipelines by pipeline & mesh instead of by distance (#12117)
# Objective

- followup to https://github.com/bevyengine/bevy/pull/11671
- I forgot to change the alpha masked phases.

## Solution

- Change the sorting for alpha mask phases to sort by pipeline+mesh
instead of distance, for much better batching for alpha masked
materials.

I also fixed some docs that I missed in the previous PR.

---

## Changelog
- Alpha masked materials are now sorted by pipeline and mesh.
2024-02-26 11:14:59 +00:00
eri
5f8f3b532c
Check cfg during CI and fix feature typos (#12103)
# Objective

- Add the new `-Zcheck-cfg` checks to catch more warnings
- Fixes #12091

## Solution

- Create a new `cfg-check` to the CI that runs `cargo check -Zcheck-cfg
--workspace` using cargo nightly (and fails if there are warnings)
- Fix all warnings generated by the new check

---

## Changelog

- Remove all redundant imports
- Fix cfg wasm32 targets
- Add 3 dead code exceptions (should StandardColor be unused?)
- Convert ios_simulator to a feature (I'm not sure if this is the right
way to do it, but the check complained before)

## Migration Guide

No breaking changes

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-02-25 15:19:27 +00:00
Alice Cecile
de004da8d5
Rename bevy_render::Color to LegacyColor (#12069)
# Objective

The migration process for `bevy_color` (#12013) will be fairly involved:
there will be hundreds of affected files, and a large number of APIs.

## Solution

To allow us to proceed granularly, we're going to keep both
`bevy_color::Color` (new) and `bevy_render::Color` (old) around until
the migration is complete.

However, simply doing this directly is confusing! They're both called
`Color`, making it very hard to tell when a portion of the code has been
ported.

As discussed in #12056, by renaming the old `Color` type, we can make it
easier to gradually migrate over, one API at a time.

## Migration Guide

THIS MIGRATION GUIDE INTENTIONALLY LEFT BLANK.

This change should not be shipped to end users: delete this section in
the final migration guide!

---------

Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
2024-02-24 21:35:32 +00:00
IceSentry
e79b9b62ce
Make more things pub in the renderer (#12053)
# Objective

- Some properties of public types are private but sometimes it's useful
to be able to set those

## Solution

- Make more stuff pub

---

## Changelog

- `MaterialBindGroupId` internal id is now pub and added a new()
constructor
- `ExtractedPointLight` and `ExtractedDirectionalLight` properties are
now all pub

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-02-23 06:13:37 +00:00
Ame
9d67edc3a6
fix some typos (#12038)
# Objective

Split - containing only the fixed typos

-
https://github.com/bevyengine/bevy/pull/12036#pullrequestreview-1894738751


# Migration Guide
In `crates/bevy_mikktspace/src/generated.rs` 

```rs
// before
pub struct SGroup {
    pub iVertexRepresentitive: i32,
    ..
}

// after
pub struct SGroup {
    pub iVertexRepresentative: i32,
    ..
}
```

In `crates/bevy_core_pipeline/src/core_2d/mod.rs`

```rs
// before
Node2D::ConstrastAdaptiveSharpening

// after
Node2D::ContrastAdaptiveSharpening
```

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: James Liu <contact@jamessliu.com>
Co-authored-by: François <mockersf@gmail.com>
2024-02-22 18:55:22 +00:00
James Liu
6d547d7ce6
Allow Mesh-related queue phase systems to parallelize (#11804)
# Objective
Partially addresses #3548. `queue_shadows` and `queue_material_meshes`
cannot parallelize because of the `ResMut<RenderMeshInstances>`
parameter for `queue_material_meshes`.

## Solution
Change the `material_bind_group` field to use atomics instead of needing
full mutable access. Change the `ResMut` to a `Res`, which should allow
both sets of systems to parallelize without issue.

## Performance
Tested against `many_foxes`, this has a significant improvement over the
entire render schedule. (Yellow is this PR, red is main)

![image](https://github.com/bevyengine/bevy/assets/3137680/6cc7f346-4f50-4f12-a383-682a9ce1daf6)

The use of atomics does seem to have a negative effect on
`queue_material_meshes` (roughly a 8.29% increase in time spent in the
system).

![image](https://github.com/bevyengine/bevy/assets/3137680/7907079a-863d-4760-aa5b-df68c006ea36)

`queue_shadows` seems to be ever so slightly slower (1.6% more time
spent) in the system.

![image](https://github.com/bevyengine/bevy/assets/3137680/6d90af73-b922-45e4-bae5-df200e8b9784)

`batch_and_prepare_render_phase` seems to be a mix, but overall seems to
be slightly *faster* by about 5%.

![image](https://github.com/bevyengine/bevy/assets/3137680/fac638ff-8c90-436b-9362-c6209b18957c)
2024-02-20 00:12:41 +00:00
Patrick Walton
3af8526786
Stop extracting mesh entities to the render world. (#11803)
This fixes a `FIXME` in `extract_meshes` and results in a performance
improvement.

As a result of this change, meshes in the render world might not be
attached to entities anymore. Therefore, the `entity` parameter to
`RenderCommand::render()` is now wrapped in an `Option`. Most
applications that use the render app's ECS can simply unwrap the
`Option`.

Note that for now sprites, gizmos, and UI elements still use the render
world as usual.

## Migration guide

* For efficiency reasons, some meshes in the render world may not have
corresponding `Entity` IDs anymore. As a result, the `entity` parameter
to `RenderCommand::render()` is now wrapped in an `Option`. Custom
rendering code may need to be updated to handle the case in which no
`Entity` exists for an object that is to be rendered.
2024-02-10 10:46:10 +00:00
Patrick Walton
4c15dd0fc5
Implement irradiance volumes. (#10268)
# Objective

Bevy could benefit from *irradiance volumes*, also known as *voxel
global illumination* or simply as light probes (though this term is not
preferred, as multiple techniques can be called light probes).
Irradiance volumes are a form of baked global illumination; they work by
sampling the light at the centers of each voxel within a cuboid. At
runtime, the voxels surrounding the fragment center are sampled and
interpolated to produce indirect diffuse illumination.

## Solution

This is divided into two sections. The first is copied and pasted from
the irradiance volume module documentation and describes the technique.
The second part consists of notes on the implementation.

### Overview

An *irradiance volume* is a cuboid voxel region consisting of
regularly-spaced precomputed samples of diffuse indirect light. They're
ideal if you have a dynamic object such as a character that can move
about
static non-moving geometry such as a level in a game, and you want that
dynamic object to be affected by the light bouncing off that static
geometry.

To use irradiance volumes, you need to precompute, or *bake*, the
indirect
light in your scene. Bevy doesn't currently come with a way to do this.
Fortunately, [Blender] provides a [baking tool] as part of the Eevee
renderer, and its irradiance volumes are compatible with those used by
Bevy.
The [`bevy-baked-gi`] project provides a tool, `export-blender-gi`, that
can
extract the baked irradiance volumes from the Blender `.blend` file and
package them up into a `.ktx2` texture for use by the engine. See the
documentation in the `bevy-baked-gi` project for more details as to this
workflow.

Like all light probes in Bevy, irradiance volumes are 1×1×1 cubes that
can
be arbitrarily scaled, rotated, and positioned in a scene with the
[`bevy_transform::components::Transform`] component. The 3D voxel grid
will
be stretched to fill the interior of the cube, and the illumination from
the
irradiance volume will apply to all fragments within that bounding
region.

Bevy's irradiance volumes are based on Valve's [*ambient cubes*] as used
in
*Half-Life 2* ([Mitchell 2006], slide 27). These encode a single color
of
light from the six 3D cardinal directions and blend the sides together
according to the surface normal.

The primary reason for choosing ambient cubes is to match Blender, so
that
its Eevee renderer can be used for baking. However, they also have some
advantages over the common second-order spherical harmonics approach:
ambient cubes don't suffer from ringing artifacts, they are smaller (6
colors for ambient cubes as opposed to 9 for spherical harmonics), and
evaluation is faster. A smaller basis allows for a denser grid of voxels
with the same storage requirements.

If you wish to use a tool other than `export-blender-gi` to produce the
irradiance volumes, you'll need to pack the irradiance volumes in the
following format. The irradiance volume of resolution *(Rx, Ry, Rz)* is
expected to be a 3D texture of dimensions *(Rx, 2Ry, 3Rz)*. The
unnormalized
texture coordinate *(s, t, p)* of the voxel at coordinate *(x, y, z)*
with
side *S* ∈ *{-X, +X, -Y, +Y, -Z, +Z}* is as follows:

```text
s = x

t = y + ⎰  0 if S ∈ {-X, -Y, -Z}
        ⎱ Ry if S ∈ {+X, +Y, +Z}

        ⎧   0 if S ∈ {-X, +X}
p = z + ⎨  Rz if S ∈ {-Y, +Y}
        ⎩ 2Rz if S ∈ {-Z, +Z}
```

Visually, in a left-handed coordinate system with Y up, viewed from the
right, the 3D texture looks like a stacked series of voxel grids, one
for
each cube side, in this order:

| **+X** | **+Y** | **+Z** |
| ------ | ------ | ------ |
| **-X** | **-Y** | **-Z** |

A terminology note: Other engines may refer to irradiance volumes as
*voxel
global illumination*, *VXGI*, or simply as *light probes*. Sometimes
*light
probe* refers to what Bevy calls a reflection probe. In Bevy, *light
probe*
is a generic term that encompasses all cuboid bounding regions that
capture
indirect illumination, whether based on voxels or not.

Note that, if binding arrays aren't supported (e.g. on WebGPU or WebGL
2),
then only the closest irradiance volume to the view will be taken into
account during rendering.

[*ambient cubes*]:
https://advances.realtimerendering.com/s2006/Mitchell-ShadingInValvesSourceEngine.pdf

[Mitchell 2006]:
https://advances.realtimerendering.com/s2006/Mitchell-ShadingInValvesSourceEngine.pdf

[Blender]: http://blender.org/

[baking tool]:
https://docs.blender.org/manual/en/latest/render/eevee/render_settings/indirect_lighting.html

[`bevy-baked-gi`]: https://github.com/pcwalton/bevy-baked-gi

### Implementation notes

This patch generalizes light probes so as to reuse as much code as
possible between irradiance volumes and the existing reflection probes.
This approach was chosen because both techniques share numerous
similarities:

1. Both irradiance volumes and reflection probes are cuboid bounding
regions.
2. Both are responsible for providing baked indirect light.
3. Both techniques involve presenting a variable number of textures to
the shader from which indirect light is sampled. (In the current
implementation, this uses binding arrays.)
4. Both irradiance volumes and reflection probes require gathering and
sorting probes by distance on CPU.
5. Both techniques require the GPU to search through a list of bounding
regions.
6. Both will eventually want to have falloff so that we can smoothly
blend as objects enter and exit the probes' influence ranges. (This is
not implemented yet to keep this patch relatively small and reviewable.)

To do this, we generalize most of the methods in the reflection probes
patch #11366 to be generic over a trait, `LightProbeComponent`. This
trait is implemented by both `EnvironmentMapLight` (for reflection
probes) and `IrradianceVolume` (for irradiance volumes). Using a trait
will allow us to add more types of light probes in the future. In
particular, I highly suspect we will want real-time reflection planes
for mirrors in the future, which can be easily slotted into this
framework.

## Changelog

> This section is optional. If this was a trivial fix, or has no
externally-visible impact, you can delete this section.

### Added
* A new `IrradianceVolume` asset type is available for baked voxelized
light probes. You can bake the global illumination using Blender or
another tool of your choice and use it in Bevy to apply indirect
illumination to dynamic objects.
2024-02-06 23:23:20 +00:00
Elabajaba
2a1ebc4ac4
sort by pipeline then mesh for non transparent passes for massively better batching (#11671)
# Objective

Bevy does ridiculous amount of drawcalls, and our batching isn't very
effective because we sort by distance and only batch if we get multiple
of the same object in a row. This can give us slightly better GPU
performance when not using the depth prepass (due to less overdraw), but
ends up being massively CPU bottlenecked due to doing thousands of
unnecessary drawcalls.

## Solution

Change the sort functions to sort by pipeline key then by mesh id for
large performance gains in more realistic scenes than our stress tests.

Pipelines changed:
- Opaque3d
- Opaque3dDeferred
- Opaque3dPrepass


![image](https://github.com/bevyengine/bevy/assets/177631/8c355256-ad86-4b47-81a0-f3906797fe7e)


---

## Changelog

- Opaque3d drawing order is now sorted by pipeline and mesh, rather than
by distance. This trades off a bit of GPU time in exchange for massively
better batching in scenes that aren't only drawing huge amounts of a
single object.
2024-02-05 22:12:22 +00:00
AxiomaticSemantics
2ebf5a303e
Remove TypeUuid (#11497)
# Objective
TypeUuid is deprecated, remove it.

## Migration Guide
Convert any uses of `#[derive(TypeUuid)]` with `#[derive(TypePath]` for
more complex uses see the relevant
[documentation](https://docs.rs/bevy/latest/bevy/prelude/trait.TypePath.html)
for more information.

---------

Co-authored-by: ebola <dev@axiomatic>
2024-01-25 16:16:58 +00:00
Alice Cecile
eb07d16871
Revert rendering-related associated type name changes (#11027)
# Objective

> Can anyone explain to me the reasoning of renaming all the types named
Query to Data. I'm talking about this PR
https://github.com/bevyengine/bevy/pull/10779 It doesn't make sense to
me that a bunch of types that are used to run queries aren't named Query
anymore. Like ViewQuery on the ViewNode is the type of the Query. I
don't really understand the point of the rename, it just seems like it
hides the fact that a query will run based on those types.


[@IceSentry](https://discord.com/channels/691052431525675048/692572690833473578/1184946251431694387)

## Solution

Revert several renames in #10779.

## Changelog

- `ViewNode::ViewData` is now `ViewNode::ViewQuery` again.

## Migration Guide

- This PR amends the migration guide in
https://github.com/bevyengine/bevy/pull/10779

---------

Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2024-01-22 15:01:55 +00:00
Patrick Walton
83d6600267
Implement minimal reflection probes (fixed macOS, iOS, and Android). (#11366)
This pull request re-submits #10057, which was backed out for breaking
macOS, iOS, and Android. I've tested this version on macOS and Android
and on the iOS simulator.

# Objective

This pull request implements *reflection probes*, which generalize
environment maps to allow for multiple environment maps in the same
scene, each of which has an axis-aligned bounding box. This is a
standard feature of physically-based renderers and was inspired by [the
corresponding feature in Blender's Eevee renderer].

## Solution

This is a minimal implementation of reflection probes that allows
artists to define cuboid bounding regions associated with environment
maps. For every view, on every frame, a system builds up a list of the
nearest 4 reflection probes that are within the view's frustum and
supplies that list to the shader. The PBR fragment shader searches
through the list, finds the first containing reflection probe, and uses
it for indirect lighting, falling back to the view's environment map if
none is found. Both forward and deferred renderers are fully supported.

A reflection probe is an entity with a pair of components, *LightProbe*
and *EnvironmentMapLight* (as well as the standard *SpatialBundle*, to
position it in the world). The *LightProbe* component (along with the
*Transform*) defines the bounding region, while the
*EnvironmentMapLight* component specifies the associated diffuse and
specular cubemaps.

A frequent question is "why two components instead of just one?" The
advantages of this setup are:

1. It's readily extensible to other types of light probes, in particular
*irradiance volumes* (also known as ambient cubes or voxel global
illumination), which use the same approach of bounding cuboids. With a
single component that applies to both reflection probes and irradiance
volumes, we can share the logic that implements falloff and blending
between multiple light probes between both of those features.

2. It reduces duplication between the existing *EnvironmentMapLight* and
these new reflection probes. Systems can treat environment maps attached
to cameras the same way they treat environment maps applied to
reflection probes if they wish.

Internally, we gather up all environment maps in the scene and place
them in a cubemap array. At present, this means that all environment
maps must have the same size, mipmap count, and texture format. A
warning is emitted if this restriction is violated. We could potentially
relax this in the future as part of the automatic mipmap generation
work, which could easily do texture format conversion as part of its
preprocessing.

An easy way to generate reflection probe cubemaps is to bake them in
Blender and use the `export-blender-gi` tool that's part of the
[`bevy-baked-gi`] project. This tool takes a `.blend` file containing
baked cubemaps as input and exports cubemap images, pre-filtered with an
embedded fork of the [glTF IBL Sampler], alongside a corresponding
`.scn.ron` file that the scene spawner can use to recreate the
reflection probes.

Note that this is intentionally a minimal implementation, to aid
reviewability. Known issues are:

* Reflection probes are basically unsupported on WebGL 2, because WebGL
2 has no cubemap arrays. (Strictly speaking, you can have precisely one
reflection probe in the scene if you have no other cubemaps anywhere,
but this isn't very useful.)

* Reflection probes have no falloff, so reflections will abruptly change
when objects move from one bounding region to another.

* As mentioned before, all cubemaps in the world of a given type
(diffuse or specular) must have the same size, format, and mipmap count.

Future work includes:

* Blending between multiple reflection probes.

* A falloff/fade-out region so that reflected objects disappear
gradually instead of vanishing all at once.

* Irradiance volumes for voxel-based global illumination. This should
reuse much of the reflection probe logic, as they're both GI techniques
based on cuboid bounding regions.

* Support for WebGL 2, by breaking batches when reflection probes are
used.

These issues notwithstanding, I think it's best to land this with
roughly the current set of functionality, because this patch is useful
as is and adding everything above would make the pull request
significantly larger and harder to review.

---

## Changelog

### Added

* A new *LightProbe* component is available that specifies a bounding
region that an *EnvironmentMapLight* applies to. The combination of a
*LightProbe* and an *EnvironmentMapLight* offers *reflection probe*
functionality similar to that available in other engines.

[the corresponding feature in Blender's Eevee renderer]:
https://docs.blender.org/manual/en/latest/render/eevee/light_probes/reflection_cubemaps.html

[`bevy-baked-gi`]: https://github.com/pcwalton/bevy-baked-gi

[glTF IBL Sampler]: https://github.com/KhronosGroup/glTF-IBL-Sampler
2024-01-19 07:33:52 +00:00
François
3d996639a0
Revert "Implement minimal reflection probes. (#10057)" (#11307)
# Objective

- Fix working on macOS, iOS, Android on main 
- Fixes #11281 
- Fixes #11282 
- Fixes #11283 
- Fixes #11299

## Solution

- Revert #10057
2024-01-12 20:41:51 +00:00
Jakob Hellermann
a657478675
resolve all internal ambiguities (#10411)
- ignore all ambiguities that are not a problem
- remove `.before(Assets::<Image>::track_assets),` that points into a
different schedule (-> should this be caught?)
- add some explicit orderings:
- run `poll_receivers` and `update_accessibility_nodes` after
`window_closed` in `bevy_winit::accessibility`
  - run `bevy_ui::accessibility::calc_bounds` after `CameraUpdateSystem`
- run ` bevy_text::update_text2d_layout` and `bevy_ui::text_system`
after `font_atlas_set::remove_dropped_font_atlas_sets`
- add `app.ignore_ambiguity(a, b)` function for cases where you want to
ignore an ambiguity between two independent plugins `A` and `B`
- add `IgnoreAmbiguitiesPlugin` in `DefaultPlugins` that allows
cross-crate ambiguities like `bevy_animation`/`bevy_ui`
- Fixes https://github.com/bevyengine/bevy/issues/9511

## Before
**Render**
![render_schedule_Render
dot](https://github.com/bevyengine/bevy/assets/22177966/1c677968-7873-40cc-848c-91fca4c8e383)

**PostUpdate**
![schedule_PostUpdate
dot](https://github.com/bevyengine/bevy/assets/22177966/8fc61304-08d4-4533-8110-c04113a7367a)

## After
**Render**
![render_schedule_Render
dot](https://github.com/bevyengine/bevy/assets/22177966/462f3b28-cef7-4833-8619-1f5175983485)
**PostUpdate**
![schedule_PostUpdate
dot](https://github.com/bevyengine/bevy/assets/22177966/8cfb3d83-7842-4a84-9082-46177e1a6c70)

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
Co-authored-by: François <mockersf@gmail.com>
2024-01-09 19:08:15 +00:00
Patrick Walton
54a943d232
Implement minimal reflection probes. (#10057)
# Objective

This pull request implements *reflection probes*, which generalize
environment maps to allow for multiple environment maps in the same
scene, each of which has an axis-aligned bounding box. This is a
standard feature of physically-based renderers and was inspired by [the
corresponding feature in Blender's Eevee renderer].

## Solution

This is a minimal implementation of reflection probes that allows
artists to define cuboid bounding regions associated with environment
maps. For every view, on every frame, a system builds up a list of the
nearest 4 reflection probes that are within the view's frustum and
supplies that list to the shader. The PBR fragment shader searches
through the list, finds the first containing reflection probe, and uses
it for indirect lighting, falling back to the view's environment map if
none is found. Both forward and deferred renderers are fully supported.

A reflection probe is an entity with a pair of components, *LightProbe*
and *EnvironmentMapLight* (as well as the standard *SpatialBundle*, to
position it in the world). The *LightProbe* component (along with the
*Transform*) defines the bounding region, while the
*EnvironmentMapLight* component specifies the associated diffuse and
specular cubemaps.

A frequent question is "why two components instead of just one?" The
advantages of this setup are:

1. It's readily extensible to other types of light probes, in particular
*irradiance volumes* (also known as ambient cubes or voxel global
illumination), which use the same approach of bounding cuboids. With a
single component that applies to both reflection probes and irradiance
volumes, we can share the logic that implements falloff and blending
between multiple light probes between both of those features.

2. It reduces duplication between the existing *EnvironmentMapLight* and
these new reflection probes. Systems can treat environment maps attached
to cameras the same way they treat environment maps applied to
reflection probes if they wish.

Internally, we gather up all environment maps in the scene and place
them in a cubemap array. At present, this means that all environment
maps must have the same size, mipmap count, and texture format. A
warning is emitted if this restriction is violated. We could potentially
relax this in the future as part of the automatic mipmap generation
work, which could easily do texture format conversion as part of its
preprocessing.

An easy way to generate reflection probe cubemaps is to bake them in
Blender and use the `export-blender-gi` tool that's part of the
[`bevy-baked-gi`] project. This tool takes a `.blend` file containing
baked cubemaps as input and exports cubemap images, pre-filtered with an
embedded fork of the [glTF IBL Sampler], alongside a corresponding
`.scn.ron` file that the scene spawner can use to recreate the
reflection probes.

Note that this is intentionally a minimal implementation, to aid
reviewability. Known issues are:

* Reflection probes are basically unsupported on WebGL 2, because WebGL
2 has no cubemap arrays. (Strictly speaking, you can have precisely one
reflection probe in the scene if you have no other cubemaps anywhere,
but this isn't very useful.)

* Reflection probes have no falloff, so reflections will abruptly change
when objects move from one bounding region to another.

* As mentioned before, all cubemaps in the world of a given type
(diffuse or specular) must have the same size, format, and mipmap count.

Future work includes:

* Blending between multiple reflection probes.

* A falloff/fade-out region so that reflected objects disappear
gradually instead of vanishing all at once.

* Irradiance volumes for voxel-based global illumination. This should
reuse much of the reflection probe logic, as they're both GI techniques
based on cuboid bounding regions.

* Support for WebGL 2, by breaking batches when reflection probes are
used.

These issues notwithstanding, I think it's best to land this with
roughly the current set of functionality, because this patch is useful
as is and adding everything above would make the pull request
significantly larger and harder to review.

---

## Changelog

### Added

* A new *LightProbe* component is available that specifies a bounding
region that an *EnvironmentMapLight* applies to. The combination of a
*LightProbe* and an *EnvironmentMapLight* offers *reflection probe*
functionality similar to that available in other engines.

[the corresponding feature in Blender's Eevee renderer]:
https://docs.blender.org/manual/en/latest/render/eevee/light_probes/reflection_cubemaps.html

[`bevy-baked-gi`]: https://github.com/pcwalton/bevy-baked-gi

[glTF IBL Sampler]: https://github.com/KhronosGroup/glTF-IBL-Sampler
2024-01-08 22:09:17 +00:00
JMS55
44424391fe
Unload render assets from RAM (#10520)
# Objective
- No point in keeping Meshes/Images in RAM once they're going to be sent
to the GPU, and kept in VRAM. This saves a _significant_ amount of
memory (several GBs) on scenes like bistro.
- References
  - https://github.com/bevyengine/bevy/pull/1782
  - https://github.com/bevyengine/bevy/pull/8624 

## Solution
- Augment RenderAsset with the capability to unload the underlying asset
after extracting to the render world.
- Mesh/Image now have a cpu_persistent_access field. If this field is
RenderAssetPersistencePolicy::Unload, the asset will be unloaded from
Assets<T>.
- A new AssetEvent is sent upon dropping the last strong handle for the
asset, which signals to the RenderAsset to remove the GPU version of the
asset.

---

## Changelog
- Added `AssetEvent::NoLongerUsed` and
`AssetEvent::is_no_longer_used()`. This event is sent when the last
strong handle of an asset is dropped.
- Rewrote the API for `RenderAsset` to allow for unloading the asset
data from the CPU.
- Added `RenderAssetPersistencePolicy`.
- Added `Mesh::cpu_persistent_access` for memory savings when the asset
is not needed except for on the GPU.
- Added `Image::cpu_persistent_access` for memory savings when the asset
is not needed except for on the GPU.
- Added `ImageLoaderSettings::cpu_persistent_access`.
- Added `ExrTextureLoaderSettings`.
- Added `HdrTextureLoaderSettings`.

## Migration Guide
- Asset loaders (GLTF, etc) now load meshes and textures without
`cpu_persistent_access`. These assets will be removed from
`Assets<Mesh>` and `Assets<Image>` once `RenderAssets<Mesh>` and
`RenderAssets<Image>` contain the GPU versions of these assets, in order
to reduce memory usage. If you require access to the asset data from the
CPU in future frames after the GLTF asset has been loaded, modify all
dependent `Mesh` and `Image` assets and set `cpu_persistent_access` to
`RenderAssetPersistencePolicy::Keep`.
- `Mesh` now requires a new `cpu_persistent_access` field. Set it to
`RenderAssetPersistencePolicy::Keep` to mimic the previous behavior.
- `Image` now requires a new `cpu_persistent_access` field. Set it to
`RenderAssetPersistencePolicy::Keep` to mimic the previous behavior.
- `MorphTargetImage::new()` now requires a new `cpu_persistent_access`
parameter. Set it to `RenderAssetPersistencePolicy::Keep` to mimic the
previous behavior.
- `DynamicTextureAtlasBuilder::add_texture()` now requires that the
`TextureAtlas` you pass has an `Image` with `cpu_persistent_access:
RenderAssetPersistencePolicy::Keep`. Ensure you construct the image
properly for the texture atlas.
- The `RenderAsset` trait has significantly changed, and requires
adapting your existing implementations.
  - The trait now requires `Clone`.
- The `ExtractedAsset` associated type has been removed (the type itself
is now extracted).
  - The signature of `prepare_asset()` is slightly different
- A new `persistence_policy()` method is now required (return
RenderAssetPersistencePolicy::Unload to match the previous behavior).
- Match on the new `NoLongerUsed` variant for exhaustive matches of
`AssetEvent`.
2024-01-03 03:31:04 +00:00
Patrick Walton
dd14f3a477
Implement lightmaps. (#10231)
![Screenshot](https://i.imgur.com/A4KzWFq.png)

# Objective

Lightmaps, textures that store baked global illumination, have been a
mainstay of real-time graphics for decades. Bevy currently has no
support for them, so this pull request implements them.

## Solution

The new `Lightmap` component can be attached to any entity that contains
a `Handle<Mesh>` and a `StandardMaterial`. When present, it will be
applied in the PBR shader. Because multiple lightmaps are frequently
packed into atlases, each lightmap may have its own UV boundaries within
its texture. An `exposure` field is also provided, to control the
brightness of the lightmap.

Note that this PR doesn't provide any way to bake the lightmaps. That
can be done with [The Lightmapper] or another solution, such as Unity's
Bakery.

---

## Changelog

### Added
* A new component, `Lightmap`, is available, for baked global
illumination. If your mesh has a second UV channel (UV1), and you attach
this component to the entity with that mesh, Bevy will apply the texture
referenced in the lightmap.

[The Lightmapper]: https://github.com/Naxela/The_Lightmapper

---------

Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2024-01-02 20:38:47 +00:00
Brian Reavis
846a871cb2
Export tonemapping_pipeline_key (2d), alpha_mode_pipeline_key (#11166)
This expands upon https://github.com/bevyengine/bevy/pull/11134.

I found myself needing `tonemapping_pipeline_key` for some custom 2d
draw functions. #11134 exported the 3d version of
`tonemapping_pipeline_key` and this PR exports the 2d version. I also
made `alpha_mode_pipeline_key` public for good measure.
2024-01-01 23:57:12 +00:00
Marco Buono
c2ab3a0402
Do not load prepass normals for transmissive materials (#11140)
Turns out whenever a normal prepass was active (which includes whenever
you use SSAO) we were attempting to read the normals from the prepass
for the specular transmissive material. Since transmissive materials
don't participate in the prepass (unlike opaque materials) we were
reading the normals from “behind” the mesh, producing really weird
visual results.

# Objective

- Fixes #11112.

## Solution

- We introduce a new `READS_VIEW_TRANSMISSION_TEXTURE` mesh pipeline
key;
- We set it whenever the material properties has the
`reads_view_transmission_texture` flag set; (i.e. the material is
transmissive)
- If this key is set we prevent the reading of normals from the prepass,
by not setting the `LOAD_PREPASS_NORMALS` shader def.

---

## Changelog

### Fixed

- Specular transmissive materials no longer attempt to erroneously load
prepass normals, and now work correctly even with the normal prepass
active (e.g. when using SSAO)
2024-01-01 17:04:20 +00:00
JMS55
3d3a065820
Misc cleanup (#11134)
Re-exports a few types/functions I need that have no reason to be
private, and some minor code quality changes.
2023-12-30 23:27:48 +00:00
Mantas
5af2f022d8
Rename WorldQueryData & WorldQueryFilter to QueryData & QueryFilter (#10779)
# Rename `WorldQueryData` & `WorldQueryFilter` to `QueryData` &
`QueryFilter`

Fixes #10776 

## Solution

Traits `WorldQueryData` & `WorldQueryFilter` were renamed to `QueryData`
and `QueryFilter`, respectively. Related Trait types were also renamed.

---

## Changelog

- Trait `WorldQueryData` has been renamed to `QueryData`. Derive macro's
`QueryData` attribute `world_query_data` has been renamed to
`query_data`.
- Trait `WorldQueryFilter` has been renamed to `QueryFilter`. Derive
macro's `QueryFilter` attribute `world_query_filter` has been renamed to
`query_filter`.
- Trait's `ExtractComponent` type `Query` has been renamed to `Data`.
- Trait's `GetBatchData` types `Query` & `QueryFilter` has been renamed
to `Data` & `Filter`, respectively.
- Trait's `ExtractInstance` type `Query` has been renamed to `Data`.
- Trait's `ViewNode` type `ViewQuery` has been renamed to `ViewData`.
- Trait's `RenderCommand` types `ViewWorldQuery` & `ItemWorldQuery` has
been renamed to `ViewData` & `ItemData`, respectively.

## Migration Guide

Note: if merged before 0.13 is released, this should instead modify the
migration guide of #10776 with the updated names.

- Rename `WorldQueryData` & `WorldQueryFilter` trait usages to
`QueryData` & `QueryFilter` and their respective derive macro attributes
`world_query_data` & `world_query_filter` to `query_data` &
`query_filter`.
- Rename the following trait type usages:
  - Trait's `ExtractComponent` type `Query` to `Data`.
  - Trait's `GetBatchData` type `Query` to `Data`.
  - Trait's `ExtractInstance` type `Query` to `Data`.
  - Trait's `ViewNode` type `ViewQuery` to `ViewData`'
- Trait's `RenderCommand` types `ViewWolrdQuery` & `ItemWorldQuery` to
`ViewData` & `ItemData`, respectively.

```rust
// Before
#[derive(WorldQueryData)]
#[world_query_data(derive(Debug))]
struct EmptyQuery {
    empty: (),
}

// After
#[derive(QueryData)]
#[query_data(derive(Debug))]
struct EmptyQuery {
    empty: (),
}

// Before
#[derive(WorldQueryFilter)]
struct CustomQueryFilter<T: Component, P: Component> {
    _c: With<ComponentC>,
    _d: With<ComponentD>,
    _or: Or<(Added<ComponentC>, Changed<ComponentD>, Without<ComponentZ>)>,
    _generic_tuple: (With<T>, With<P>),
}

// After
#[derive(QueryFilter)]
struct CustomQueryFilter<T: Component, P: Component> {
    _c: With<ComponentC>,
    _d: With<ComponentD>,
    _or: Or<(Added<ComponentC>, Changed<ComponentD>, Without<ComponentZ>)>,
    _generic_tuple: (With<T>, With<P>),
}

// Before
impl ExtractComponent for ContrastAdaptiveSharpeningSettings {
    type Query = &'static Self;
    type Filter = With<Camera>;
    type Out = (DenoiseCAS, CASUniform);

    fn extract_component(item: QueryItem<Self::Query>) -> Option<Self::Out> {
        //...
    }
}

// After
impl ExtractComponent for ContrastAdaptiveSharpeningSettings {
    type Data = &'static Self;
    type Filter = With<Camera>;
    type Out = (DenoiseCAS, CASUniform);

    fn extract_component(item: QueryItem<Self::Data>) -> Option<Self::Out> {
        //...
    }
}

// Before
impl GetBatchData for MeshPipeline {
    type Param = SRes<RenderMeshInstances>;
    type Query = Entity;
    type QueryFilter = With<Mesh3d>;
    type CompareData = (MaterialBindGroupId, AssetId<Mesh>);
    type BufferData = MeshUniform;

    fn get_batch_data(
        mesh_instances: &SystemParamItem<Self::Param>,
        entity: &QueryItem<Self::Query>,
    ) -> (Self::BufferData, Option<Self::CompareData>) {
        // ....
    }
}

// After
impl GetBatchData for MeshPipeline {
    type Param = SRes<RenderMeshInstances>;
    type Data = Entity;
    type Filter = With<Mesh3d>;
    type CompareData = (MaterialBindGroupId, AssetId<Mesh>);
    type BufferData = MeshUniform;

    fn get_batch_data(
        mesh_instances: &SystemParamItem<Self::Param>,
        entity: &QueryItem<Self::Data>,
    ) -> (Self::BufferData, Option<Self::CompareData>) {
        // ....
    }
}

// Before
impl<A> ExtractInstance for AssetId<A>
where
    A: Asset,
{
    type Query = Read<Handle<A>>;
    type Filter = ();

    fn extract(item: QueryItem<'_, Self::Query>) -> Option<Self> {
        Some(item.id())
    }
}

// After
impl<A> ExtractInstance for AssetId<A>
where
    A: Asset,
{
    type Data = Read<Handle<A>>;
    type Filter = ();

    fn extract(item: QueryItem<'_, Self::Data>) -> Option<Self> {
        Some(item.id())
    }
}

// Before
impl ViewNode for PostProcessNode {
    type ViewQuery = (
        &'static ViewTarget,
        &'static PostProcessSettings,
    );

    fn run(
        &self,
        _graph: &mut RenderGraphContext,
        render_context: &mut RenderContext,
        (view_target, _post_process_settings): QueryItem<Self::ViewQuery>,
        world: &World,
    ) -> Result<(), NodeRunError> {
        // ...
    }
}

// After
impl ViewNode for PostProcessNode {
    type ViewData = (
        &'static ViewTarget,
        &'static PostProcessSettings,
    );

    fn run(
        &self,
        _graph: &mut RenderGraphContext,
        render_context: &mut RenderContext,
        (view_target, _post_process_settings): QueryItem<Self::ViewData>,
        world: &World,
    ) -> Result<(), NodeRunError> {
        // ...
    }
}

// Before
impl<P: CachedRenderPipelinePhaseItem> RenderCommand<P> for SetItemPipeline {
    type Param = SRes<PipelineCache>;
    type ViewWorldQuery = ();
    type ItemWorldQuery = ();
    #[inline]
    fn render<'w>(
        item: &P,
        _view: (),
        _entity: (),
        pipeline_cache: SystemParamItem<'w, '_, Self::Param>,
        pass: &mut TrackedRenderPass<'w>,
    ) -> RenderCommandResult {
        // ...
    }
}

// After
impl<P: CachedRenderPipelinePhaseItem> RenderCommand<P> for SetItemPipeline {
    type Param = SRes<PipelineCache>;
    type ViewData = ();
    type ItemData = ();
    #[inline]
    fn render<'w>(
        item: &P,
        _view: (),
        _entity: (),
        pipeline_cache: SystemParamItem<'w, '_, Self::Param>,
        pass: &mut TrackedRenderPass<'w>,
    ) -> RenderCommandResult {
        // ...
    }
}
```
2023-12-12 19:45:50 +00:00
tygyh
fd308571c4
Remove unnecessary path prefixes (#10749)
# Objective

- Shorten paths by removing unnecessary prefixes

## Solution

- Remove the prefixes from many paths which do not need them. Finding
the paths was done automatically using built-in refactoring tools in
Jetbrains RustRover.
2023-11-28 23:43:40 +00:00
JMS55
4bf20e7d27
Swap material and mesh bind groups (#10485)
# Objective
- Materials should be a more frequent rebind then meshes (due to being
able to use a single vertex buffer, such as in #10164) and therefore
should be in a higher bind group.

---

## Changelog
- For 2d and 3d mesh/material setups (but not UI materials, or other
rendering setups such as gizmos, sprites, or text), mesh data is now in
bind group 1, and material data is now in bind group 2, which is swapped
from how they were before.

## Migration Guide
- Custom 2d and 3d mesh/material shaders should now use bind group 2
`@group(2) @binding(x)` for their bound resources, instead of bind group
1.
- Many internal pieces of rendering code have changed so that mesh data
is now in bind group 1, and material data is now in bind group 2.
Semi-custom rendering setups (that don't use the Material or Material2d
APIs) should adapt to these changes.
2023-11-28 22:26:22 +00:00
Marco Buono
44928e0df4
StandardMaterial Light Transmission (#8015)
# Objective

<img width="1920" alt="Screenshot 2023-04-26 at 01 07 34"
src="https://user-images.githubusercontent.com/418473/234467578-0f34187b-5863-4ea1-88e9-7a6bb8ce8da3.png">

This PR adds both diffuse and specular light transmission capabilities
to the `StandardMaterial`, with support for screen space refractions.
This enables realistically representing a wide range of real-world
materials, such as:

  - Glass; (Including frosted glass)
  - Transparent and translucent plastics;
  - Various liquids and gels;
  - Gemstones;
  - Marble;
  - Wax;
  - Paper;
  - Leaves;
  - Porcelain.

Unlike existing support for transparency, light transmission does not
rely on fixed function alpha blending, and therefore works with both
`AlphaMode::Opaque` and `AlphaMode::Mask` materials.

## Solution

- Introduces a number of transmission related fields in the
`StandardMaterial`;
- For specular transmission:
- Adds logic to take a view main texture snapshot after the opaque
phase; (in order to perform screen space refractions)
- Introduces a new `Transmissive3d` phase to the renderer, to which all
meshes with `transmission > 0.0` materials are sent.
- Calculates a light exit point (of the approximate mesh volume) using
`ior` and `thickness` properties
- Samples the snapshot texture with an adaptive number of taps across a
`roughness`-controlled radius enabling “blurry” refractions
- For diffuse transmission:
- Approximates transmitted diffuse light by using a second, flipped +
displaced, diffuse-only Lambertian lobe for each light source.

## To Do

- [x] Figure out where `fresnel_mix()` is taking place, if at all, and
where `dielectric_specular` is being calculated, if at all, and update
them to use the `ior` value (Not a blocker, just a nice-to-have for more
correct BSDF)
- To the _best of my knowledge, this is now taking place, after
964340cdd. The fresnel mix is actually "split" into two parts in our
implementation, one `(1 - fresnel(...))` in the transmission, and
`fresnel()` in the light implementations. A surface with more
reflectance now will produce slightly dimmer transmission towards the
grazing angle, as more of the light gets reflected.
- [x] Add `transmission_texture`
- [x] Add `diffuse_transmission_texture`
- [x] Add `thickness_texture`
- [x] Add `attenuation_distance` and `attenuation_color`
- [x] Connect values to glTF loader
  - [x] `transmission` and `transmission_texture`
  - [x] `thickness` and `thickness_texture`
  - [x] `ior`
- [ ] `diffuse_transmission` and `diffuse_transmission_texture` (needs
upstream support in `gltf` crate, not a blocker)
- [x] Add support for multiple screen space refraction “steps”
- [x] Conditionally create no transmission snapshot texture at all if
`steps == 0`
- [x] Conditionally enable/disable screen space refraction transmission
snapshots
- [x] Read from depth pre-pass to prevent refracting pixels in front of
the light exit point
- [x] Use `interleaved_gradient_noise()` function for sampling blur in a
way that benefits from TAA
- [x] Drill down a TAA `#define`, tweak some aspects of the effect
conditionally based on it
- [x] Remove const array that's crashing under HLSL (unless a new `naga`
release with https://github.com/gfx-rs/naga/pull/2496 comes out before
we merge this)
- [ ] Look into alternatives to the `switch` hack for dynamically
indexing the const array (might not be needed, compilers seem to be
decent at expanding it)
- [ ] Add pipeline keys for gating transmission (do we really want/need
this?)
- [x] Tweak some material field/function names?

## A Note on Texture Packing

_This was originally added as a comment to the
`specular_transmission_texture`, `thickness_texture` and
`diffuse_transmission_texture` documentation, I removed it since it was
more confusing than helpful, and will likely be made redundant/will need
to be updated once we have a better infrastructure for preprocessing
assets_

Due to how channels are mapped, you can more efficiently use a single
shared texture image
for configuring the following:

- R - `specular_transmission_texture`
- G - `thickness_texture`
- B - _unused_
- A - `diffuse_transmission_texture`

The `KHR_materials_diffuse_transmission` glTF extension also defines a
`diffuseTransmissionColorTexture`,
that _we don't currently support_. One might choose to pack the
intensity and color textures together,
using RGB for the color and A for the intensity, in which case this
packing advice doesn't really apply.

---

## Changelog

- Added a new `Transmissive3d` render phase for rendering specular
transmissive materials with screen space refractions
- Added rendering support for transmitted environment map light on the
`StandardMaterial` as a fallback for screen space refractions
- Added `diffuse_transmission`, `specular_transmission`, `thickness`,
`ior`, `attenuation_distance` and `attenuation_color` to the
`StandardMaterial`
- Added `diffuse_transmission_texture`, `specular_transmission_texture`,
`thickness_texture` to the `StandardMaterial`, gated behind a new
`pbr_transmission_textures` cargo feature (off by default, for maximum
hardware compatibility)
- Added `Camera3d::screen_space_specular_transmission_steps` for
controlling the number of “layers of transparency” rendered for
transmissive objects
- Added a `TransmittedShadowReceiver` component for enabling shadows in
(diffusely) transmitted light. (disabled by default, as it requires
carefully setting up the `thickness` to avoid self-shadow artifacts)
- Added support for the `KHR_materials_transmission`,
`KHR_materials_ior` and `KHR_materials_volume` glTF extensions
- Renamed items related to temporal jitter for greater consistency

## Migration Guide

- `SsaoPipelineKey::temporal_noise` has been renamed to
`SsaoPipelineKey::temporal_jitter`
- The `TAA` shader def (controlled by the presence of the
`TemporalAntiAliasSettings` component in the camera) has been replaced
with the `TEMPORAL_JITTER` shader def (controlled by the presence of the
`TemporalJitter` component in the camera)
- `MeshPipelineKey::TAA` has been replaced by
`MeshPipelineKey::TEMPORAL_JITTER`
- The `TEMPORAL_NOISE` shader def has been consolidated with
`TEMPORAL_JITTER`
2023-10-31 20:59:02 +00:00
Nicola Papale
66f72dd25b
Use wildcard imports in bevy_pbr (#9847)
# Objective

- the style of import used by bevy guarantees merge conflicts when any
file change
- This is especially true when import lists are large, such as in
`bevy_pbr`
- Merge conflicts are tricky to resolve. This bogs down rendering PRs
and makes contributing to bevy's rendering system more difficult than it
needs to

## Solution

- Use wildcard imports to replace multiline import list in `bevy_pbr`

I suspect this is controversial, but I'd like to hear alternatives.
Because this is one of many papercuts that makes developing render
features near impossible.
2023-10-25 08:40:55 +00:00
Griffin
1bd7e5a8e6
View Transformations (#9726)
# Objective

- Add functions for common view transformations.

---------

Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-10-24 21:26:19 +00:00
robtfm
c99351f7c2
allow extensions to StandardMaterial (#7820)
# Objective

allow extending `Material`s (including the built in `StandardMaterial`)
with custom vertex/fragment shaders and additional data, to easily get
pbr lighting with custom modifications, or otherwise extend a base
material.

# Solution

- added `ExtendedMaterial<B: Material, E: MaterialExtension>` which
contains a base material and a user-defined extension.
- added example `extended_material` showing how to use it
- modified AsBindGroup to have "unprepared" functions that return raw
resources / layout entries so that the extended material can combine
them

note: doesn't currently work with array resources, as i can't figure out
how to make the OwnedBindingResource::get_binding() work, as wgpu
requires a `&'a[&'a TextureView]` and i have a `Vec<TextureView>`.

# Migration Guide

manual implementations of `AsBindGroup` will need to be adjusted, the
changes are pretty straightforward and can be seen in the diff for e.g.
the `texture_binding_array` example.

---------

Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-10-17 21:28:08 +00:00
Marco Buono
5733d2403e
*_PREPASS Shader Def Cleanup (#10136)
# Objective

- This PR aims to make the various `*_PREPASS` shader defs we have
(`NORMAL_PREPASS`, `DEPTH_PREPASS`, `MOTION_VECTORS_PREPASS` AND
`DEFERRED_PREPASS`) easier to use and understand:
- So that their meaning is now consistent across all contexts; (“prepass
X is enabled for the current view”)
  - So that they're also consistently set across all contexts.
- It also aims to enable us to (with a follow up PR) to conditionally
gate the `BindGroupEntry` and `BindGroupLayoutEntry` items associated
with these prepasses, saving us up to 4 texture slots in WebGL
(currently globally limited to 16 per shader, regardless of bind groups)

## Solution

- We now consistently set these from `PrepassPipeline`, the
`MeshPipeline` and the `DeferredLightingPipeline`, we also set their
`MeshPipelineKey`s;
- We introduce `PREPASS_PIPELINE`, `MESH_PIPELINE` and
`DEFERRED_LIGHTING_PIPELINE` that can be used to detect where the code
is running, without overloading the meanings of the prepass shader defs;
- We also gate the WGSL functions in `bevy_pbr::prepass_utils` with
`#ifdef`s for their respective shader defs, so that shader code can
provide a fallback whenever they're not available.
- This allows us to conditionally include the bindings for these prepass
textures (My next PR, which will hopefully unblock #8015)
- @robtfm mentioned [these were being used to prevent accessing the same
binding as read/write in the
prepass](https://discord.com/channels/691052431525675048/743663924229963868/1163270458393759814),
however even after reversing the `#ifndef`s I had no issues running the
code, so perhaps the compiler is already smart enough even without tree
shaking to know they're not being used, thanks to `#ifdef
PREPASS_PIPELINE`?

## Comparison

### Before

| Shader Def | `PrepassPipeline` | `MeshPipeline` |
`DeferredLightingPipeline` |
| ------------------------ | ----------------- | -------------- |
-------------------------- |
| `NORMAL_PREPASS` | Yes | No | No |
| `DEPTH_PREPASS` | Yes | No | No |
| `MOTION_VECTORS_PREPASS` | Yes | No | No |
| `DEFERRED_PREPASS` | Yes | No | No |

| View Key | `PrepassPipeline` | `MeshPipeline` |
`DeferredLightingPipeline` |
| ------------------------ | ----------------- | -------------- |
-------------------------- |
| `NORMAL_PREPASS` | Yes | Yes | No |
| `DEPTH_PREPASS` | Yes | No | No |
| `MOTION_VECTORS_PREPASS` | Yes | No | No |
| `DEFERRED_PREPASS` | Yes | Yes\* | No |

\* Accidentally was being set twice, once with only
`deferred_prepass.is_some()` as a condition,
and once with `deferred_p repass.is_some() && !forward` as a condition.

### After

| Shader Def | `PrepassPipeline` | `MeshPipeline` |
`DeferredLightingPipeline` |
| ---------------------------- | ----------------- | --------------- |
-------------------------- |
| `NORMAL_PREPASS` | Yes | Yes | Yes |
| `DEPTH_PREPASS` | Yes | Yes | Yes |
| `MOTION_VECTORS_PREPASS` | Yes | Yes | Yes |
| `DEFERRED_PREPASS` | Yes | Yes | Unconditionally |
| `PREPASS_PIPELINE` | Unconditionally | No | No |
| `MESH_PIPELINE` | No | Unconditionally | No |
| `DEFERRED_LIGHTING_PIPELINE` | No | No | Unconditionally |

| View Key | `PrepassPipeline` | `MeshPipeline` |
`DeferredLightingPipeline` |
| ------------------------ | ----------------- | -------------- |
-------------------------- |
| `NORMAL_PREPASS` | Yes | Yes | Yes |
| `DEPTH_PREPASS` | Yes | Yes | Yes |
| `MOTION_VECTORS_PREPASS` | Yes | Yes | Yes |
| `DEFERRED_PREPASS` | Yes | Yes | Unconditionally |

---

## Changelog

- Cleaned up WGSL `*_PREPASS` shader defs so they're now consistently
used everywhere;
- Introduced `PREPASS_PIPELINE`, `MESH_PIPELINE` and
`DEFERRED_LIGHTING_PIPELINE` WGSL shader defs for conditionally
compiling logic based the current pipeline;
- WGSL functions from `bevy_pbr::prepass_utils` are now guarded with
`#ifdef` based on the currently enabled prepasses;

## Migration Guide

- When using functions from `bevy_pbr::prepass_utils`
(`prepass_depth()`, `prepass_normal()`, `prepass_motion_vector()`) in
contexts where these prepasses might be disabled, you should now wrap
your calls with the appropriate `#ifdef` guards, (`#ifdef
DEPTH_PREPASS`, `#ifdef NORMAL_PREPASS`, `#ifdef MOTION_VECTOR_PREPASS`)
providing fallback logic where applicable.

---------

Co-authored-by: Carter Anderson <mcanders1@gmail.com>
Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2023-10-17 00:16:21 +00:00
Edgar Geier
e23d7cf501
Explain usage of prepass shaders in docs for Material trait (#9025)
# Objective

- Fixes #8696.

## Solution

- Added a paragraph describing the usage of the `prepass_vertex_shader`
and `prepass_fragment_shader`.
2023-10-16 13:39:17 +00:00
Jan Češpivo
4a61f894b7
chore: Renamed RenderInstance trait to ExtractInstance (#10065)
# Objective

Fixes [#10061]

## Solution

Renamed `RenderInstance` to `ExtractInstance`, `RenderInstances` to
`ExtractedInstances` and `RenderInstancePlugin` to
`ExtractInstancesPlugin`
2023-10-13 17:06:53 +00:00
Griffin
a15d152635
Deferred Renderer (#9258)
# Objective

- Add a [Deferred
Renderer](https://en.wikipedia.org/wiki/Deferred_shading) to Bevy.
- This allows subsequent passes to access per pixel material information
before/during shading.
- Accessing this per pixel material information is needed for some
features, like GI. It also makes other features (ex. Decals) simpler to
implement and/or improves their capability. There are multiple
approaches to accomplishing this. The deferred shading approach works
well given the limitations of WebGPU and WebGL2.

Motivation: [I'm working on a GI solution for
Bevy](https://youtu.be/eH1AkL-mwhI)

# Solution
- The deferred renderer is implemented with a prepass and a deferred
lighting pass.
- The prepass renders opaque objects into the Gbuffer attachment
(`Rgba32Uint`). The PBR shader generates a `PbrInput` in mostly the same
way as the forward implementation and then [packs it into the
Gbuffer](ec1465559f/crates/bevy_pbr/src/render/pbr.wgsl (L168)).
- The deferred lighting pass unpacks the `PbrInput` and [feeds it into
the pbr()
function](ec1465559f/crates/bevy_pbr/src/deferred/deferred_lighting.wgsl (L65)),
then outputs the shaded color data.

- There is now a resource
[DefaultOpaqueRendererMethod](ec1465559f/crates/bevy_pbr/src/material.rs (L599))
that can be used to set the default render method for opaque materials.
If materials return `None` from
[opaque_render_method()](ec1465559f/crates/bevy_pbr/src/material.rs (L131))
the `DefaultOpaqueRendererMethod` will be used. Otherwise, custom
materials can also explicitly choose to only support Deferred or Forward
by returning the respective
[OpaqueRendererMethod](ec1465559f/crates/bevy_pbr/src/material.rs (L603))

- Deferred materials can be used seamlessly along with both opaque and
transparent forward rendered materials in the same scene. The [deferred
rendering
example](https://github.com/DGriffin91/bevy/blob/deferred/examples/3d/deferred_rendering.rs)
does this.

- The deferred renderer does not support MSAA. If any deferred materials
are used, MSAA must be disabled. Both TAA and FXAA are supported.

- Deferred rendering supports WebGL2/WebGPU. 

## Custom deferred materials
- Custom materials can support both deferred and forward at the same
time. The
[StandardMaterial](ec1465559f/crates/bevy_pbr/src/render/pbr.wgsl (L166))
does this. So does [this
example](https://github.com/DGriffin91/bevy_glowy_orb_tutorial/blob/deferred/assets/shaders/glowy.wgsl#L56).
- Custom deferred materials that require PBR lighting can create a
`PbrInput`, write it to the deferred GBuffer and let it be rendered by
the `PBRDeferredLightingPlugin`.
- Custom deferred materials that require custom lighting have two
options:
1. Use the base_color channel of the `PbrInput` combined with the
`STANDARD_MATERIAL_FLAGS_UNLIT_BIT` flag.
[Example.](https://github.com/DGriffin91/bevy_glowy_orb_tutorial/blob/deferred/assets/shaders/glowy.wgsl#L56)
(If the unlit bit is set, the base_color is stored as RGB9E5 for extra
precision)
2. A Custom Deferred Lighting pass can be created, either overriding the
default, or running in addition. The a depth buffer is used to limit
rendering to only the required fragments for each deferred lighting
pass. Materials can set their respective depth id via the
[deferred_lighting_pass_id](b79182d2a3/crates/bevy_pbr/src/prepass/prepass_io.wgsl (L95))
attachment. The custom deferred lighting pass plugin can then set [its
corresponding
depth](ec1465559f/crates/bevy_pbr/src/deferred/deferred_lighting.wgsl (L37)).
Then with the lighting pass using
[CompareFunction::Equal](ec1465559f/crates/bevy_pbr/src/deferred/mod.rs (L335)),
only the fragments with a depth that equal the corresponding depth
written in the material will be rendered.

Custom deferred lighting plugins can also be created to render the
StandardMaterial. The default deferred lighting plugin can be bypassed
with `DefaultPlugins.set(PBRDeferredLightingPlugin { bypass: true })`

---------

Co-authored-by: nickrart <nickolas.g.russell@gmail.com>
2023-10-12 22:10:38 +00:00
Patrick Walton
e67d63aa79
Refactor the render instance logic in #9903 so that it's easier for other components to adopt. (#10002)
# Objective

Currently, the only way for custom components that participate in
rendering to opt into the higher-performance extraction method in #9903
is to implement the `RenderInstances` data structure and the extraction
logic manually. This is inconvenient compared to the `ExtractComponent`
API.

## Solution

This commit creates a new `RenderInstance` trait that mirrors the
existing `ExtractComponent` method but uses the higher-performance
approach that #9903 uses. Additionally, `RenderInstance` is more
flexible than `ExtractComponent`, because it can extract multiple
components at once. This makes high-performance rendering components
essentially as easy to write as the existing ones based on component
extraction.

---

## Changelog

### Added

A new `RenderInstance` trait is available mirroring `ExtractComponent`,
but using a higher-performance method to extract one or more components
to the render world. If you have custom components that rendering takes
into account, you may consider migration from `ExtractComponent` to
`RenderInstance` for higher performance.
2023-10-08 10:34:44 +00:00
JMS55
1f95a484ed
PCF For DirectionalLight/SpotLight Shadows (#8006)
# Objective

- Improve antialiasing for non-point light shadow edges.
- Very partially addresses
https://github.com/bevyengine/bevy/issues/3628.

## Solution

- Implements "The Witness"'s shadow map sampling technique.
  - Ported from @superdump's old branch, all credit to them :)
- Implements "Call of Duty: Advanced Warfare"'s stochastic shadow map
sampling technique when the velocity prepass is enabled, for use with
TAA.
- Uses interleaved gradient noise to generate a random angle, and then
averages 8 samples in a spiral pattern, rotated by the random angle.
- I also tried spatiotemporal blue noise, but it was far too noisy to be
filtered by TAA alone. In the future, we should try spatiotemporal blue
noise + a specialized shadow denoiser such as
https://gpuopen.com/fidelityfx-denoiser/#shadow. This approach would
also be useful for hybrid rasterized applications with raytraced
shadows.
- The COD presentation has an interesting temporal dithering of the
noise for use with temporal supersampling that we should revisit when we
get DLSS/FSR/other TSR.

---

## Changelog

* Added `ShadowFilteringMethod`. Improved directional light and
spotlight shadow edges to be less aliased.

## Migration Guide

* Shadows cast by directional lights or spotlights now have smoother
edges. To revert to the old behavior, add
`ShadowFilteringMethod::Hardware2x2` to your cameras.

---------

Co-authored-by: IceSentry <c.giguere42@gmail.com>
Co-authored-by: Daniel Chia <danstryder@gmail.com>
Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
Co-authored-by: Brandon Dyer <brandondyer64@gmail.com>
Co-authored-by: Edgar Geier <geieredgar@gmail.com>
Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>
Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2023-10-07 17:13:29 +00:00
Robert Swain
b6ead2be95
Use EntityHashMap<Entity, T> for render world entity storage for better performance (#9903)
# Objective

- Improve rendering performance, particularly by avoiding the large
system commands costs of using the ECS in the way that the render world
does.

## Solution

- Define `EntityHasher` that calculates a hash from the
`Entity.to_bits()` by `i | (i.wrapping_mul(0x517cc1b727220a95) << 32)`.
`0x517cc1b727220a95` is something like `u64::MAX / N` for N that gives a
value close to π and that works well for hashing. Thanks for @SkiFire13
for the suggestion and to @nicopap for alternative suggestions and
discussion. This approach comes from `rustc-hash` (a.k.a. `FxHasher`)
with some tweaks for the case of hashing an `Entity`. `FxHasher` and
`SeaHasher` were also tested but were significantly slower.
- Define `EntityHashMap` type that uses the `EntityHashser`
- Use `EntityHashMap<Entity, T>` for render world entity storage,
including:
- `RenderMaterialInstances` - contains the `AssetId<M>` of the material
associated with the entity. Also for 2D.
- `RenderMeshInstances` - contains mesh transforms, flags and properties
about mesh entities. Also for 2D.
- `SkinIndices` and `MorphIndices` - contains the skin and morph index
for an entity, respectively
  - `ExtractedSprites`
  - `ExtractedUiNodes`

## Benchmarks

All benchmarks have been conducted on an M1 Max connected to AC power.
The tests are run for 1500 frames. The 1000th frame is captured for
comparison to check for visual regressions. There were none.

### 2D Meshes

`bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d`

#### `--ordered-z`

This test spawns the 2D meshes with z incrementing back to front, which
is the ideal arrangement allocation order as it matches the sorted
render order which means lookups have a high cache hit rate.

<img width="1112" alt="Screenshot 2023-09-27 at 07 50 45"
src="https://github.com/bevyengine/bevy/assets/302146/e140bc98-7091-4a3b-8ae1-ab75d16d2ccb">

-39.1% median frame time.

#### Random

This test spawns the 2D meshes with random z. This not only makes the
batching and transparent 2D pass lookups get a lot of cache misses, it
also currently means that the meshes are almost certain to not be
batchable.

<img width="1108" alt="Screenshot 2023-09-27 at 07 51 28"
src="https://github.com/bevyengine/bevy/assets/302146/29c2e813-645a-43ce-982a-55df4bf7d8c4">

-7.2% median frame time.

### 3D Meshes

`many_cubes --benchmark`

<img width="1112" alt="Screenshot 2023-09-27 at 07 51 57"
src="https://github.com/bevyengine/bevy/assets/302146/1a729673-3254-4e2a-9072-55e27c69f0fc">

-7.7% median frame time.

### Sprites

**NOTE: On `main` sprites are using `SparseSet<Entity, T>`!**

`bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite`

#### `--ordered-z`

This test spawns the sprites with z incrementing back to front, which is
the ideal arrangement allocation order as it matches the sorted render
order which means lookups have a high cache hit rate.

<img width="1116" alt="Screenshot 2023-09-27 at 07 52 31"
src="https://github.com/bevyengine/bevy/assets/302146/bc8eab90-e375-4d31-b5cd-f55f6f59ab67">

+13.0% median frame time.

#### Random

This test spawns the sprites with random z. This makes the batching and
transparent 2D pass lookups get a lot of cache misses.

<img width="1109" alt="Screenshot 2023-09-27 at 07 53 01"
src="https://github.com/bevyengine/bevy/assets/302146/22073f5d-99a7-49b0-9584-d3ac3eac3033">

+0.6% median frame time.

### UI

**NOTE: On `main` UI is using `SparseSet<Entity, T>`!**

`many_buttons`

<img width="1111" alt="Screenshot 2023-09-27 at 07 53 26"
src="https://github.com/bevyengine/bevy/assets/302146/66afd56d-cbe4-49e7-8b64-2f28f6043d85">

+15.1% median frame time.

## Alternatives

- Cart originally suggested trying out `SparseSet<Entity, T>` and indeed
that is slightly faster under ideal conditions. However,
`PassHashMap<Entity, T>` has better worst case performance when data is
randomly distributed, rather than in sorted render order, and does not
have the worst case memory usage that `SparseSet`'s dense `Vec<usize>`
that maps from the `Entity` index to sparse index into `Vec<T>`. This
dense `Vec` has to be as large as the largest Entity index used with the
`SparseSet`.
- I also tested `PassHashMap<u32, T>`, intending to use `Entity.index()`
as the key, but this proved to sometimes be slower and mostly no
different.
- The only outstanding approach that has not been implemented and tested
is to _not_ clear the render world of its entities each frame. That has
its own problems, though they could perhaps be solved.
- Performance-wise, if the entities and their component data were not
cleared, then they would incur table moves on spawn, and should not
thereafter, rather just their component data would be overwritten.
Ideally we would have a neat way of either updating data in-place via
`&mut T` queries, or inserting components if not present. This would
likely be quite cumbersome to have to remember to do everywhere, but
perhaps it only needs to be done in the more performance-sensitive
systems.
- The main problem to solve however is that we want to both maintain a
mapping between main world entities and render world entities, be able
to run the render app and world in parallel with the main app and world
for pipelined rendering, and at the same time be able to spawn entities
in the render world in such a way that those Entity ids do not collide
with those spawned in the main world. This is potentially quite
solvable, but could well be a lot of ECS work to do it in a way that
makes sense.

---

## Changelog

- Changed: Component data for entities to be drawn are no longer stored
on entities in the render world. Instead, data is stored in a
`EntityHashMap<Entity, T>` in various resources. This brings significant
performance benefits due to the way the render app clears entities every
frame. Resources of most interest are `RenderMeshInstances` and
`RenderMaterialInstances`, and their 2D counterparts.

## Migration Guide

Previously the render app extracted mesh entities and their component
data from the main world and stored them as entities and components in
the render world. Now they are extracted into essentially
`EntityHashMap<Entity, T>` where `T` are structs containing an
appropriate group of data. This means that while extract set systems
will continue to run extract queries against the main world they will
store their data in hash maps. Also, systems in later sets will either
need to look up entities in the available resources such as
`RenderMeshInstances`, or maintain their own `EntityHashMap<Entity, T>`
for their own data.

Before:
```rust
fn queue_custom(
    material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<InstanceMaterialData>>,
) {
    ...
    for (entity, mesh_transforms, mesh_handle) in &material_meshes {
        ...
    }
}
```

After:
```rust
fn queue_custom(
    render_mesh_instances: Res<RenderMeshInstances>,
    instance_entities: Query<Entity, With<InstanceMaterialData>>,
) {
    ...
    for entity in &instance_entities {
        let Some(mesh_instance) = render_mesh_instances.get(&entity) else { continue; };
        // The mesh handle in `AssetId<Mesh>` form, and the `MeshTransforms` can now
        // be found in `mesh_instance` which is a `RenderMeshInstance`
        ...
    }
}
```

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
2023-09-27 08:28:28 +00:00
Robert Swain
5c884c5a15
Automatic batching/instancing of draw commands (#9685)
# Objective

- Implement the foundations of automatic batching/instancing of draw
commands as the next step from #89
- NOTE: More performance improvements will come when more data is
managed and bound in ways that do not require rebinding such as mesh,
material, and texture data.

## Solution

- The core idea for batching of draw commands is to check whether any of
the information that has to be passed when encoding a draw command
changes between two things that are being drawn according to the sorted
render phase order. These should be things like the pipeline, bind
groups and their dynamic offsets, index/vertex buffers, and so on.
  - The following assumptions have been made:
- Only entities with prepared assets (pipelines, materials, meshes) are
queued to phases
- View bindings are constant across a phase for a given draw function as
phases are per-view
- `batch_and_prepare_render_phase` is the only system that performs this
batching and has sole responsibility for preparing the per-object data.
As such the mesh binding and dynamic offsets are assumed to only vary as
a result of the `batch_and_prepare_render_phase` system, e.g. due to
having to split data across separate uniform bindings within the same
buffer due to the maximum uniform buffer binding size.
- Implement `GpuArrayBuffer` for `Mesh2dUniform` to store Mesh2dUniform
in arrays in GPU buffers rather than each one being at a dynamic offset
in a uniform buffer. This is the same optimisation that was made for 3D
not long ago.
- Change batch size for a range in `PhaseItem`, adding API for getting
or mutating the range. This is more flexible than a size as the length
of the range can be used in place of the size, but the start and end can
be otherwise whatever is needed.
- Add an optional mesh bind group dynamic offset to `PhaseItem`. This
avoids having to do a massive table move just to insert
`GpuArrayBufferIndex` components.

## Benchmarks

All tests have been run on an M1 Max on AC power. `bevymark` and
`many_cubes` were modified to use 1920x1080 with a scale factor of 1. I
run a script that runs a separate Tracy capture process, and then runs
the bevy example with `--features bevy_ci_testing,trace_tracy` and
`CI_TESTING_CONFIG=../benchmark.ron` with the contents of
`../benchmark.ron`:
```rust
(
    exit_after: Some(1500)
)
```
...in order to run each test for 1500 frames.

The recent changes to `many_cubes` and `bevymark` added reproducible
random number generation so that with the same settings, the same rng
will occur. They also added benchmark modes that use a fixed delta time
for animations. Combined this means that the same frames should be
rendered both on main and on the branch.

The graphs compare main (yellow) to this PR (red).

### 3D Mesh `many_cubes --benchmark`

<img width="1411" alt="Screenshot 2023-09-03 at 23 42 10"
src="https://github.com/bevyengine/bevy/assets/302146/2088716a-c918-486c-8129-090b26fd2bc4">
The mesh and material are the same for all instances. This is basically
the best case for the initial batching implementation as it results in 1
draw for the ~11.7k visible meshes. It gives a ~30% reduction in median
frame time.

The 1000th frame is identical using the flip tool:

![flip many_cubes-main-mesh3d many_cubes-batching-mesh3d 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/2511f37a-6df8-481a-932f-706ca4de7643)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4615 seconds
```

### 3D Mesh `many_cubes --benchmark --material-texture-count 10`

<img width="1404" alt="Screenshot 2023-09-03 at 23 45 18"
src="https://github.com/bevyengine/bevy/assets/302146/5ee9c447-5bd2-45c6-9706-ac5ff8916daf">
This run uses 10 different materials by varying their textures. The
materials are randomly selected, and there is no sorting by material
bind group for opaque 3D so any batching is 'random'. The PR produces a
~5% reduction in median frame time. If we were to sort the opaque phase
by the material bind group, then this should be a lot faster. This
produces about 10.5k draws for the 11.7k visible entities. This makes
sense as randomly selecting from 10 materials gives a chance that two
adjacent entities randomly select the same material and can be batched.

The 1000th frame is identical in flip:

![flip many_cubes-main-mesh3d-mtc10 many_cubes-batching-mesh3d-mtc10
67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/2b3a8614-9466-4ed8-b50c-d4aa71615dbb)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4537 seconds
```

### 3D Mesh `many_cubes --benchmark --vary-per-instance`

<img width="1394" alt="Screenshot 2023-09-03 at 23 48 44"
src="https://github.com/bevyengine/bevy/assets/302146/f02a816b-a444-4c18-a96a-63b5436f3b7f">
This run varies the material data per instance by randomly-generating
its colour. This is the worst case for batching and that it performs
about the same as `main` is a good thing as it demonstrates that the
batching has minimal overhead when dealing with ~11k visible mesh
entities.

The 1000th frame is identical according to flip:

![flip many_cubes-main-mesh3d-vpi many_cubes-batching-mesh3d-vpi 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/ac5f5c14-9bda-4d1a-8219-7577d4aac68c)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4568 seconds
```

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d`

<img width="1412" alt="Screenshot 2023-09-03 at 23 59 56"
src="https://github.com/bevyengine/bevy/assets/302146/cb02ae07-237b-4646-ae9f-fda4dafcbad4">
This spawns 160 waves of 1000 quad meshes that are shaded with
ColorMaterial. Each wave has a different material so 160 waves currently
should result in 160 batches. This results in a 50% reduction in median
frame time.

Capturing a screenshot of the 1000th frame main vs PR gives:

![flip bevymark-main-mesh2d bevymark-batching-mesh2d 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/80102728-1217-4059-87af-14d05044df40)

```
     Mean: 0.001222
     Weighted median: 0.750432
     1st weighted quartile: 0.453494
     3rd weighted quartile: 0.969758
     Min: 0.000000
     Max: 0.990296
     Evaluation time: 0.4255 seconds
```

So they seem to produce the same results. I also double-checked the
number of draws. `main` does 160000 draws, and the PR does 160, as
expected.

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d --material-texture-count 10`

<img width="1392" alt="Screenshot 2023-09-04 at 00 09 22"
src="https://github.com/bevyengine/bevy/assets/302146/4358da2e-ce32-4134-82df-3ab74c40849c">
This generates 10 textures and generates materials for each of those and
then selects one material per wave. The median frame time is reduced by
50%. Similar to the plain run above, this produces 160 draws on the PR
and 160000 on `main` and the 1000th frame is identical (ignoring the fps
counter text overlay).

![flip bevymark-main-mesh2d-mtc10 bevymark-batching-mesh2d-mtc10 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/ebed2822-dce7-426a-858b-b77dc45b986f)

```
     Mean: 0.002877
     Weighted median: 0.964980
     1st weighted quartile: 0.668871
     3rd weighted quartile: 0.982749
     Min: 0.000000
     Max: 0.992377
     Evaluation time: 0.4301 seconds
```

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d --vary-per-instance`

<img width="1396" alt="Screenshot 2023-09-04 at 00 13 53"
src="https://github.com/bevyengine/bevy/assets/302146/b2198b18-3439-47ad-919a-cdabe190facb">
This creates unique materials per instance by randomly-generating the
material's colour. This is the worst case for 2D batching. Somehow, this
PR manages a 7% reduction in median frame time. Both main and this PR
issue 160000 draws.

The 1000th frame is the same:

![flip bevymark-main-mesh2d-vpi bevymark-batching-mesh2d-vpi 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/a2ec471c-f576-4a36-a23b-b24b22578b97)

```
     Mean: 0.001214
     Weighted median: 0.937499
     1st weighted quartile: 0.635467
     3rd weighted quartile: 0.979085
     Min: 0.000000
     Max: 0.988971
     Evaluation time: 0.4462 seconds
```

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite`

<img width="1396" alt="Screenshot 2023-09-04 at 12 21 12"
src="https://github.com/bevyengine/bevy/assets/302146/8b31e915-d6be-4cac-abf5-c6a4da9c3d43">
This just spawns 160 waves of 1000 sprites. There should be and is no
notable difference between main and the PR.

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite --material-texture-count 10`

<img width="1389" alt="Screenshot 2023-09-04 at 12 36 08"
src="https://github.com/bevyengine/bevy/assets/302146/45fe8d6d-c901-4062-a349-3693dd044413">
This spawns the sprites selecting a texture at random per instance from
the 10 generated textures. This has no significant change vs main and
shouldn't.

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite --vary-per-instance`

<img width="1401" alt="Screenshot 2023-09-04 at 12 29 52"
src="https://github.com/bevyengine/bevy/assets/302146/762c5c60-352e-471f-8dbe-bbf10e24ebd6">
This sets the sprite colour as being unique per instance. This can still
all be drawn using one batch. There should be no difference but the PR
produces median frame times that are 4% higher. Investigation showed no
clear sources of cost, rather a mix of give and take that should not
happen. It seems like noise in the results.

### Summary

| Benchmark  | % change in median frame time |
| ------------- | ------------- |
| many_cubes  | 🟩 -30%  |
| many_cubes 10 materials  | 🟩 -5%  |
| many_cubes unique materials  | 🟩 ~0%  |
| bevymark mesh2d  | 🟩 -50%  |
| bevymark mesh2d 10 materials  | 🟩 -50%  |
| bevymark mesh2d unique materials  | 🟩 -7%  |
| bevymark sprite  | 🟥 2%  |
| bevymark sprite 10 materials  | 🟥 0.6%  |
| bevymark sprite unique materials  | 🟥 4.1%  |

---

## Changelog

- Added: 2D and 3D mesh entities that share the same mesh and material
(same textures, same data) are now batched into the same draw command
for better performance.

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
Co-authored-by: Nicola Papale <nico@nicopap.ch>
2023-09-21 22:12:34 +00:00