Commit graph

291 commits

Author SHA1 Message Date
Patrick Walton
d59b1e71ef
Implement percentage-closer filtering (PCF) for point lights. (#12910)
I ported the two existing PCF techniques to the cubemap domain as best I
could. Generally, the technique is to create a 2D orthonormal basis
using Gram-Schmidt normalization, then apply the technique over that
basis. The results look fine, though the shadow bias often needs
adjusting.

For comparison, Unity uses a 4-tap pattern for PCF on point lights of
(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1). I tried this but
didn't like the look, so I went with the design above, which ports the
2D techniques to the 3D domain. There's surprisingly little material on
point light PCF.

I've gone through every example using point lights and verified that the
shadow maps look fine, adjusting biases as necessary.

Fixes #3628.

---

## Changelog

### Added
* Shadows from point lights now support percentage-closer filtering
(PCF), and as a result look less aliased.

### Changed
* `ShadowFilteringMethod::Castano13` and
`ShadowFilteringMethod::Jimenez14` have been renamed to
`ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal`
respectively.

## Migration Guide

* `ShadowFilteringMethod::Castano13` and
`ShadowFilteringMethod::Jimenez14` have been renamed to
`ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal`
respectively.
2024-04-10 20:16:08 +00:00
Patrick Walton
11817f4ba4
Generate MeshUniforms on the GPU via compute shader where available. (#12773)
Currently, `MeshUniform`s are rather large: 160 bytes. They're also
somewhat expensive to compute, because they involve taking the inverse
of a 3x4 matrix. Finally, if a mesh is present in multiple views, that
mesh will have a separate `MeshUniform` for each and every view, which
is wasteful.

This commit fixes these issues by introducing the concept of a *mesh
input uniform* and adding a *mesh uniform building* compute shader pass.
The `MeshInputUniform` is simply the minimum amount of data needed for
the GPU to compute the full `MeshUniform`. Most of this data is just the
transform and is therefore only 64 bytes. `MeshInputUniform`s are
computed during the *extraction* phase, much like skins are today, in
order to avoid needlessly copying transforms around on CPU. (In fact,
the render app has been changed to only store the translation of each
mesh; it no longer cares about any other part of the transform, which is
stored only on the GPU and the main world.) Before rendering, the
`build_mesh_uniforms` pass runs to expand the `MeshInputUniform`s to the
full `MeshUniform`.

The mesh uniform building pass does the following, all on GPU:

1. Copy the appropriate fields of the `MeshInputUniform` to the
`MeshUniform` slot. If a single mesh is present in multiple views, this
effectively duplicates it into each view.

2. Compute the inverse transpose of the model transform, used for
transforming normals.

3. If applicable, copy the mesh's transform from the previous frame for
TAA. To support this, we double-buffer the `MeshInputUniform`s over two
frames and swap the buffers each frame. The `MeshInputUniform`s for the
current frame contain the index of that mesh's `MeshInputUniform` for
the previous frame.

This commit produces wins in virtually every CPU part of the pipeline:
`extract_meshes`, `queue_material_meshes`,
`batch_and_prepare_render_phase`, and especially
`write_batched_instance_buffer` are all faster. Shrinking the amount of
CPU data that has to be shuffled around speeds up the entire rendering
process.

| Benchmark              | This branch | `main`  | Speedup |
|------------------------|-------------|---------|---------|
| `many_cubes -nfc`      |      17.259 |  24.529 |  42.12% |
| `many_cubes -nfc -vpi` |     302.116 | 312.123 |   3.31% |
| `many_foxes`           |       3.227 |   3.515 |   8.92% |

Because mesh uniform building requires compute shader, and WebGL 2 has
no compute shader, the existing CPU mesh uniform building code has been
left as-is. Many types now have both CPU mesh uniform building and GPU
mesh uniform building modes. Developers can opt into the old CPU mesh
uniform building by setting the `use_gpu_uniform_builder` option on
`PbrPlugin` to `false`.

Below are graphs of the CPU portions of `many-cubes
--no-frustum-culling`. Yellow is this branch, red is `main`.

`extract_meshes`:
![Screenshot 2024-04-02
124842](https://github.com/bevyengine/bevy/assets/157897/a6748ea4-dd05-47b6-9254-45d07d33cb10)
It's notable that we get a small win even though we're now writing to a
GPU buffer.

`queue_material_meshes`:
![Screenshot 2024-04-02
124911](https://github.com/bevyengine/bevy/assets/157897/ecb44d78-65dc-448d-ba85-2de91aa2ad94)
There's a bit of a regression here; not sure what's causing it. In any
case it's very outweighed by the other gains.

`batch_and_prepare_render_phase`:
![Screenshot 2024-04-02
125123](https://github.com/bevyengine/bevy/assets/157897/4e20fc86-f9dd-4e5c-8623-837e4258f435)
There's a huge win here, enough to make batching basically drop off the
profile.

`write_batched_instance_buffer`:
![Screenshot 2024-04-02
125237](https://github.com/bevyengine/bevy/assets/157897/401a5c32-9dc1-4991-996d-eb1cac6014b2)
There's a massive improvement here, as expected. Note that a lot of it
simply comes from the fact that `MeshInputUniform` is `Pod`. (This isn't
a maintainability problem in my view because `MeshInputUniform` is so
simple: just 16 tightly-packed words.)

## Changelog

### Added

* Per-mesh instance data is now generated on GPU with a compute shader
instead of CPU, resulting in rendering performance improvements on
platforms where compute shaders are supported.

## Migration guide

* Custom render phases now need multiple systems beyond just
`batch_and_prepare_render_phase`. Code that was previously creating
custom render phases should now add a `BinnedRenderPhasePlugin` or
`SortedRenderPhasePlugin` as appropriate instead of directly adding
`batch_and_prepare_render_phase`.
2024-04-10 05:33:32 +00:00
Robert Swain
ab7cbfa8fc
Consolidate Render(Ui)Materials(2d) into RenderAssets (#12827)
# Objective

- Replace `RenderMaterials` / `RenderMaterials2d` / `RenderUiMaterials`
with `RenderAssets` to enable implementing changes to one thing,
`RenderAssets`, that applies to all use cases rather than duplicating
changes everywhere for multiple things that should be one thing.
- Adopts #8149 

## Solution

- Make RenderAsset generic over the destination type rather than the
source type as in #8149
- Use `RenderAssets<PreparedMaterial<M>>` etc for render materials

---

## Changelog

- Changed:
- The `RenderAsset` trait is now implemented on the destination type.
Its `SourceAsset` associated type refers to the type of the source
asset.
- `RenderMaterials`, `RenderMaterials2d`, and `RenderUiMaterials` have
been replaced by `RenderAssets<PreparedMaterial<M>>` and similar.

## Migration Guide

- `RenderAsset` is now implemented for the destination type rather that
the source asset type. The source asset type is now the `RenderAsset`
trait's `SourceAsset` associated type.
2024-04-09 13:26:34 +00:00
UkoeHB
2ee69807b1
Fix potential out-of-bounds access in pbr_functions.wgsl (#12585)
# Objective

- Fix a potential out-of-bounds access in the `pbr_functions.wgsl`
shader.

## Solution

- Correctly compute the `GpuLights::directional_lights` array length.

## Comments

I think this solves this comment in the code, but need someone to test
it:
```rust
//NOTE: When running bevy on Adreno GPU chipsets in WebGL, any value above 1 will result in a crash
// when loading the wgsl "pbr_functions.wgsl" in the function apply_fog.
```
2024-04-08 17:00:09 +00:00
Martín Maita
3fc0c6869d
Bump crate-ci/typos from 1.19.0 to 1.20.4 (#12907)
# Objective

- Adopting https://github.com/bevyengine/bevy/pull/12903.

## Solution

- Bump crate-ci/typos from 1.19.0 to 1.20.4.
- Fixed a typo in `crates/bevy_pbr/src/render/pbr_functions.wgsl` file.
- Added "PNG", "iy" and "SME" as exceptions to prevent false positives.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-08 15:31:11 +00:00
JMS55
31b5943ad4
Add previous_view_uniforms.inverse_view (#12902)
# Objective
- Upload previous frame's inverse_view matrix to the GPU for use with
https://github.com/bevyengine/bevy/pull/12898.

---

## Changelog
- Added `prepass_bindings::previous_view_uniforms.inverse_view`.
- Renamed `prepass_bindings::previous_view_proj` to
`prepass_bindings::previous_view_uniforms.view_proj`.
- Renamed `PreviousViewProjectionUniformOffset` to
`PreviousViewUniformOffset`.
- Renamed `PreviousViewProjection` to `PreviousViewData`.

## Migration Guide
- Renamed `prepass_bindings::previous_view_proj` to
`prepass_bindings::previous_view_uniforms.view_proj`.
- Renamed `PreviousViewProjectionUniformOffset` to
`PreviousViewUniformOffset`.
- Renamed `PreviousViewProjection` to `PreviousViewData`.
2024-04-07 18:59:16 +00:00
robtfm
452821dd52
more robust gpu image use (#12606)
# Objective

make morph targets and tonemapping more tolerant of delayed image
loading.

neither of these actually fail currently unless using a bespoke loader
(and even then it would be rare), but i am working on adding throttling
for asset gpu uploads (as a stopgap until we can do proper asset
streaming) and they break with that.

## Solution

when a mesh with morph targets is uploaded to the gpu, the prepare
function uploads the morph target texture if it's available, otherwise
it uploads without morph targets. this is generally fine as long as
morph targets are typically loaded from bytes (in gltf loader), but may
fail for a custom loader if the asset server async-loads the target
texture and the texture is not available yet. the mesh fails to render
and doesn't update when the image is loaded
-> if morph targets are specified but not ready yet, retry mesh upload
next frame

tonemapping `unwrap`s on the lookup table image. this is never a problem
since the image is added via `include_bytes!`, but could be a problem in
future with asset gpu throttling/streaming.
-> if the lookup texture is not yet available, use a fallback
-> in the node, check if the fallback was used before caching the bind
group
2024-04-07 17:18:58 +00:00
François Mockers
a9964f442d
fix msaa shift with irradiance volumes in mesh pipeline key (#12845)
# Objective

- #12791 broke example `irradiance_volumes`
- Fixes #12876 

```
wgpu error: Validation Error

Caused by:
    In Device::create_render_pipeline
      note: label = `pbr_opaque_mesh_pipeline`
    Color state [0] is invalid
    Sample count 8 is not supported by format Rgba8UnormSrgb on this device. The WebGPU spec guarentees [1, 4] samples are supported by this format. With the TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES feature your device supports [1, 2, 4].
```

## Solution

- Shift bits a bit more
2024-04-05 17:50:23 +00:00
James Liu
a4ed1b88b8
Relax BufferVec's type constraints (#12866)
# Objective
Since BufferVec was first introduced, `bytemuck` has added additional
traits with fewer restrictions than `Pod`. Within BufferVec, we only
rely on the constraints of `bytemuck::cast_slice` to a `u8` slice, which
now only requires `T: NoUninit` which is a strict superset of `Pod`
types.

## Solution
Change out the `Pod` generic type constraint with `NoUninit`. Also
taking the opportunity to substitute `cast_slice` with
`must_cast_slice`, which avoids a runtime panic in place of a compile
time failure if `T` cannot be used.

---

## Changelog
Changed: `BufferVec` now supports working with types containing
`NoUninit` but not `Pod` members.
Changed: `BufferVec` will now fail to compile if used with a type that
cannot be safely read from. Most notably, this includes ZSTs, which
would previously always panic at runtime.
2024-04-05 02:11:41 +00:00
Patrick Walton
37522fd0ae
Micro-optimize queue_material_meshes, primarily to remove bit manipulation. (#12791)
This commit makes the following optimizations:

## `MeshPipelineKey`/`BaseMeshPipelineKey` split

`MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives
in `bevy_render` and `MeshPipelineKey`, which lives in `bevy_pbr`.
Conceptually, `BaseMeshPipelineKey` is a superclass of
`MeshPipelineKey`. For `BaseMeshPipelineKey`, the bits start at the
highest (most significant) bit and grow downward toward the lowest bit;
for `MeshPipelineKey`, the bits start at the lowest bit and grow upward
toward the highest bit. This prevents them from colliding.

The goal of this is to avoid having to reassemble bits of the pipeline
key for every mesh every frame. Instead, we can just use a bitwise or
operation to combine the pieces that make up a `MeshPipelineKey`.

## `specialize_slow`

Previously, all of `specialize()` was marked as `#[inline]`. This
bloated `queue_material_meshes` unnecessarily, as a large chunk of it
ended up being a slow path that was rarely hit. This commit refactors
the function to move the slow path to `specialize_slow()`.

Together, these two changes shave about 5% off `queue_material_meshes`:

![Screenshot 2024-03-29
130002](https://github.com/bevyengine/bevy/assets/157897/a7e5a994-a807-4328-b314-9003429dcdd2)

## Migration Guide

- The `primitive_topology` field on `GpuMesh` is now an accessor method:
`GpuMesh::primitive_topology()`.
- For performance reasons, `MeshPipelineKey` has been split into
`BaseMeshPipelineKey`, which lives in `bevy_render`, and
`MeshPipelineKey`, which lives in `bevy_pbr`. These two should be
combined with bitwise-or to produce the final `MeshPipelineKey`.
2024-04-01 21:58:53 +00:00
Cameron
01649f13e2
Refactor App and SubApp internals for better separation (#9202)
# Objective

This is a necessary precursor to #9122 (this was split from that PR to
reduce the amount of code to review all at once).

Moving `!Send` resource ownership to `App` will make it unambiguously
`!Send`. `SubApp` must be `Send`, so it can't wrap `App`.

## Solution

Refactor `App` and `SubApp` to not have a recursive relationship. Since
`SubApp` no longer wraps `App`, once `!Send` resources are moved out of
`World` and into `App`, `SubApp` will become unambiguously `Send`.

There could be less code duplication between `App` and `SubApp`, but
that would break `App` method chaining.

## Changelog

- `SubApp` no longer wraps `App`.
- `App` fields are no longer publicly accessible.
- `App` can no longer be converted into a `SubApp`.
- Various methods now return references to a `SubApp` instead of an
`App`.
## Migration Guide

- To construct a sub-app, use `SubApp::new()`. `App` can no longer
convert into `SubApp`.
- If you implemented a trait for `App`, you may want to implement it for
`SubApp` as well.
- If you're accessing `app.world` directly, you now have to use
`app.world()` and `app.world_mut()`.
- `App::sub_app` now returns `&SubApp`.
- `App::sub_app_mut`  now returns `&mut SubApp`.
- `App::get_sub_app` now returns `Option<&SubApp>.`
- `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`
2024-03-31 03:16:10 +00:00
Patrick Walton
4dadebd9c4
Improve performance by binning together opaque items instead of sorting them. (#12453)
Today, we sort all entities added to all phases, even the phases that
don't strictly need sorting, such as the opaque and shadow phases. This
results in a performance loss because our `PhaseItem`s are rather large
in memory, so sorting is slow. Additionally, determining the boundaries
of batches is an O(n) process.

This commit makes Bevy instead applicable place phase items into *bins*
keyed by *bin keys*, which have the invariant that everything in the
same bin is potentially batchable. This makes determining batch
boundaries O(1), because everything in the same bin can be batched.
Instead of sorting each entity, we now sort only the bin keys. This
drops the sorting time to near-zero on workloads with few bins like
`many_cubes --no-frustum-culling`. Memory usage is improved too, with
batch boundaries and dynamic indices now implicit instead of explicit.
The improved memory usage results in a significant win even on
unbatchable workloads like `many_cubes --no-frustum-culling
--vary-material-data-per-instance`, presumably due to cache effects.

Not all phases can be binned; some, such as transparent and transmissive
phases, must still be sorted. To handle this, this commit splits
`PhaseItem` into `BinnedPhaseItem` and `SortedPhaseItem`. Most of the
logic that today deals with `PhaseItem`s has been moved to
`SortedPhaseItem`. `BinnedPhaseItem` has the new logic.

Frame time results (in ms/frame) are as follows:

| Benchmark                | `binning` | `main`  | Speedup |
| ------------------------ | --------- | ------- | ------- |
| `many_cubes -nfc -vpi` | 232.179     | 312.123   | 34.43%  |
| `many_cubes -nfc`        | 25.874 | 30.117 | 16.40%  |
| `many_foxes`             | 3.276 | 3.515 | 7.30%   |

(`-nfc` is short for `--no-frustum-culling`; `-vpi` is short for
`--vary-per-instance`.)

---

## Changelog

### Changed

* Render phases have been split into binned and sorted phases. Binned
phases, such as the common opaque phase, achieve improved CPU
performance by avoiding the sorting step.

## Migration Guide

- `PhaseItem` has been split into `BinnedPhaseItem` and
`SortedPhaseItem`. If your code has custom `PhaseItem`s, you will need
to migrate them to one of these two types. `SortedPhaseItem` requires
the fewest code changes, but you may want to pick `BinnedPhaseItem` if
your phase doesn't require sorting, as that enables higher performance.

## Tracy graphs

`many-cubes --no-frustum-culling`, `main` branch:
<img width="1064" alt="Screenshot 2024-03-12 180037"
src="https://github.com/bevyengine/bevy/assets/157897/e1180ce8-8e89-46d2-85e3-f59f72109a55">

`many-cubes --no-frustum-culling`, this branch:
<img width="1064" alt="Screenshot 2024-03-12 180011"
src="https://github.com/bevyengine/bevy/assets/157897/0899f036-6075-44c5-a972-44d95895f46c">

You can see that `batch_and_prepare_binned_render_phase` is a much
smaller fraction of the time. Zooming in on that function, with yellow
being this branch and red being `main`, we see:

<img width="1064" alt="Screenshot 2024-03-12 175832"
src="https://github.com/bevyengine/bevy/assets/157897/0dfc8d3f-49f4-496e-8825-a66e64d356d0">

The binning happens in `queue_material_meshes`. Again with yellow being
this branch and red being `main`:
<img width="1064" alt="Screenshot 2024-03-12 175755"
src="https://github.com/bevyengine/bevy/assets/157897/b9b20dc1-11c8-400c-a6cc-1c2e09c1bb96">

We can see that there is a small regression in `queue_material_meshes`
performance, but it's not nearly enough to outweigh the large gains in
`batch_and_prepare_binned_render_phase`.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-03-30 02:55:02 +00:00
JMS55
4f20faaa43
Meshlet rendering (initial feature) (#10164)
# Objective
- Implements a more efficient, GPU-driven
(https://github.com/bevyengine/bevy/issues/1342) rendering pipeline
based on meshlets.
- Meshes are split into small clusters of triangles called meshlets,
each of which acts as a mini index buffer into the larger mesh data.
Meshlets can be compressed, streamed, culled, and batched much more
efficiently than monolithic meshes.


![image](https://github.com/bevyengine/bevy/assets/47158642/cb2aaad0-7a9a-4e14-93b0-15d4e895b26a)

![image](https://github.com/bevyengine/bevy/assets/47158642/7534035b-1eb7-4278-9b99-5322e4401715)

# Misc
* Future work: https://github.com/bevyengine/bevy/issues/11518
* Nanite reference:
https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf
Two pass occlusion culling explained very well:
https://medium.com/@mil_kru/two-pass-occlusion-culling-4100edcad501

---------

Co-authored-by: Ricky Taylor <rickytaylor26@gmail.com>
Co-authored-by: vero <email@atlasdostal.com>
Co-authored-by: François <mockersf@gmail.com>
Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2024-03-25 19:08:27 +00:00
LeshaInc
737b719dda
Add pipeline statistics (#9135)
# Objective

It's useful to have access to render pipeline statistics, since they
provide more information than FPS alone. For example, the number of
drawn triangles can be used to debug culling and LODs. The number of
fragment shader invocations can provide a more stable alternative metric
than GPU elapsed time.

See also: Render node GPU timing overlay #8067, which doesn't provide
pipeline statistics, but adds a nice overlay.

## Solution

Add `RenderDiagnosticsPlugin`, which enables collecting pipeline
statistics and CPU & GPU timings.

---

## Changelog

- Add `RenderDiagnosticsPlugin`
- Add `RenderContext::diagnostic_recorder` method

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-03-17 20:29:35 +00:00
robtfm
cca4ab3663
try_insert NoAutomaticBatching (#12396)
# Objective

fix occasional crash from commands.insert when quickly spawning and
despawning skinned/morphed meshes
 
## Solution

use `try_insert` instead of `insert`. if the entity is deleted we don't
mind failing to add the `NoAutomaticBatching` marker.
2024-03-10 02:14:33 +00:00
James Liu
512b7463a3
Disentangle bevy_utils/bevy_core's reexported dependencies (#12313)
# Objective
Make bevy_utils less of a compilation bottleneck. Tackle #11478.

## Solution
* Move all of the directly reexported dependencies and move them to
where they're actually used.
* Remove the UUID utilities that have gone unused since `TypePath` took
over for `TypeUuid`.
* There was also a extraneous bytemuck dependency on `bevy_core` that
has not been used for a long time (since `encase` became the primary way
to prepare GPU buffers).
* Remove the `all_tuples` macro reexport from bevy_ecs since it's
accessible from `bevy_utils`.

---

## Changelog
Removed: Many of the reexports from bevy_utils (petgraph, uuid, nonmax,
smallvec, and thiserror).
Removed: bevy_core's reexports of bytemuck.

## Migration Guide
bevy_utils' reexports of petgraph, uuid, nonmax, smallvec, and thiserror
have been removed.

bevy_core' reexports of bytemuck's types has been removed. 

Add them as dependencies in your own crate instead.
2024-03-07 02:30:15 +00:00
vero
13d37c534f
Fix directional light shadow frustum culling near clip plane to infinity (#12342)
# Objective

- Fix slightly wrong logic from #11442
- Directional lights should not have a near clip plane

## Solution

- Push near clip out to infinity, so that the frustum normal is still
available if its needed for whatever reason in shader
- also opportunistically nabs a typo
2024-03-06 19:47:12 +00:00
Patrick Walton
f9cc91d5a1
Intern mesh vertex buffer layouts so that we don't have to compare them over and over. (#12216)
Although we cached hashes of `MeshVertexBufferLayout`, we were paying
the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every
entity, every frame. This patch changes that logic to place
`MeshVertexBufferLayout`s in `Arc`s so that they can be compared and
hashed by pointer. This results in a 28% speedup in the
`queue_material_meshes` phase of `many_cubes`, with frustum culling
disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level
of hash tables instead of two. This saves a hash lookup.

2. The example `many_cubes` has been given a `--no-frustum-culling`
flag, to aid in benchmarking.

See the Tracy profile:

<img width="1064" alt="Screenshot 2024-02-29 144406"
src="https://github.com/bevyengine/bevy/assets/157897/18632f1d-1fdd-4ac7-90ed-2d10306b2a1e">

## Migration guide

* Duplicate `MeshVertexBufferLayout`s are now combined into a single
object, `MeshVertexBufferLayoutRef`, which contains an
atomically-reference-counted pointer to the layout. Code that was using
`MeshVertexBufferLayout` may need to be updated to use
`MeshVertexBufferLayoutRef` instead.
2024-03-01 20:56:21 +00:00
Alice Cecile
599e5e4e76
Migrate from LegacyColor to bevy_color::Color (#12163)
# Objective

- As part of the migration process we need to a) see the end effect of
the migration on user ergonomics b) check for serious perf regressions
c) actually migrate the code
- To accomplish this, I'm going to attempt to migrate all of the
remaining user-facing usages of `LegacyColor` in one PR, being careful
to keep a clean commit history.
- Fixes #12056.

## Solution

I've chosen to use the polymorphic `Color` type as our standard
user-facing API.

- [x] Migrate `bevy_gizmos`.
- [x] Take `impl Into<Color>` in all `bevy_gizmos` APIs
- [x] Migrate sprites
- [x] Migrate UI
- [x] Migrate `ColorMaterial`
- [x] Migrate `MaterialMesh2D`
- [x] Migrate fog
- [x] Migrate lights
- [x] Migrate StandardMaterial
- [x] Migrate wireframes
- [x] Migrate clear color
- [x] Migrate text
- [x] Migrate gltf loader
- [x] Register color types for reflection
- [x] Remove `LegacyColor`
- [x] Make sure CI passes

Incidental improvements to ease migration:

- added `Color::srgba_u8`, `Color::srgba_from_array` and friends
- added `set_alpha`, `is_fully_transparent` and `is_fully_opaque` to the
`Alpha` trait
- add and immediately deprecate (lol) `Color::rgb` and friends in favor
of more explicit and consistent `Color::srgb`
- standardized on white and black for most example text colors
- added vector field traits to `LinearRgba`: ~~`Add`, `Sub`,
`AddAssign`, `SubAssign`,~~ `Mul<f32>` and `Div<f32>`. Multiplications
and divisions do not scale alpha. `Add` and `Sub` have been cut from
this PR.
- added `LinearRgba` and `Srgba` `RED/GREEN/BLUE`
- added `LinearRgba_to_f32_array` and `LinearRgba::to_u32`

## Migration Guide

Bevy's color types have changed! Wherever you used a
`bevy::render::Color`, a `bevy::color::Color` is used instead.

These are quite similar! Both are enums storing a color in a specific
color space (or to be more precise, using a specific color model).
However, each of the different color models now has its own type.

TODO...

- `Color::rgba`, `Color::rgb`, `Color::rbga_u8`, `Color::rgb_u8`,
`Color::rgb_from_array` are now `Color::srgba`, `Color::srgb`,
`Color::srgba_u8`, `Color::srgb_u8` and `Color::srgb_from_array`.
- `Color::set_a` and `Color::a` is now `Color::set_alpha` and
`Color::alpha`. These are part of the `Alpha` trait in `bevy_color`.
- `Color::is_fully_transparent` is now part of the `Alpha` trait in
`bevy_color`
- `Color::r`, `Color::set_r`, `Color::with_r` and the equivalents for
`g`, `b` `h`, `s` and `l` have been removed due to causing silent
relatively expensive conversions. Convert your `Color` into the desired
color space, perform your operations there, and then convert it back
into a polymorphic `Color` enum.
- `Color::hex` is now `Srgba::hex`. Call `.into` or construct a
`Color::Srgba` variant manually to convert it.
- `WireframeMaterial`, `ExtractedUiNode`, `ExtractedDirectionalLight`,
`ExtractedPointLight`, `ExtractedSpotLight` and `ExtractedSprite` now
store a `LinearRgba`, rather than a polymorphic `Color`
- `Color::rgb_linear` and `Color::rgba_linear` are now
`Color::linear_rgb` and `Color::linear_rgba`
- The various CSS color constants are no longer stored directly on
`Color`. Instead, they're defined in the `Srgba` color space, and
accessed via `bevy::color::palettes::css`. Call `.into()` on them to
convert them into a `Color` for quick debugging use, and consider using
the much prettier `tailwind` palette for prototyping.
- The `LIME_GREEN` color has been renamed to `LIMEGREEN` to comply with
the standard naming.
- Vector field arithmetic operations on `Color` (add, subtract, multiply
and divide by a f32) have been removed. Instead, convert your colors
into `LinearRgba` space, and perform your operations explicitly there.
This is particularly relevant when working with emissive or HDR colors,
whose color channel values are routinely outside of the ordinary 0 to 1
range.
- `Color::as_linear_rgba_f32` has been removed. Call
`LinearRgba::to_f32_array` instead, converting if needed.
- `Color::as_linear_rgba_u32` has been removed. Call
`LinearRgba::to_u32` instead, converting if needed.
- Several other color conversion methods to transform LCH or HSL colors
into float arrays or `Vec` types have been removed. Please reimplement
these externally or open a PR to re-add them if you found them
particularly useful.
- Various methods on `Color` such as `rgb` or `hsl` to convert the color
into a specific color space have been removed. Convert into
`LinearRgba`, then to the color space of your choice.
- Various implicitly-converting color value methods on `Color` such as
`r`, `g`, `b` or `h` have been removed. Please convert it into the color
space of your choice, then check these properties.
- `Color` no longer implements `AsBindGroup`. Store a `LinearRgba`
internally instead to avoid conversion costs.

---------

Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
Co-authored-by: Afonso Lage <lage.afonso@gmail.com>
Co-authored-by: Rob Parrett <robparrett@gmail.com>
Co-authored-by: Zachary Harrold <zac@harrold.com.au>
2024-02-29 19:35:12 +00:00
JMS55
40bfce556a
Add random shader utils, fix cluster_debug_visualization (#11956)
# Objective
- Partially addresses https://github.com/bevyengine/bevy/issues/11470
(I'd like to add Spatiotemporal Blue Noise in the future, but that's a
bit more controversial).
- Fix cluster_debug_visualization which has not compiled for a while

---

## Changelog
- Added random white noise shader functions to `bevy_pbr::utils`

## Migration Guide
- The `bevy_pbr::utils::random1D` shader function has been replaced by
the similar `bevy_pbr::utils::rand_f`.
2024-02-26 15:59:44 +00:00
Jan Hohenheim
ad5d790e9e
Fix WebGL not rendering StandardMaterial (#12110)
# Objective

- Fixes #12081

## Solution

Passing the `Affine2` as a neatly packed `mat3x2` breaks WebGL with
`drawElementsInstanced: Buffer for uniform block is smaller than
UNIFORM_BLOCK_DATA_SIZE.`
I fixed this by using a `mat3x3` instead.
Alternative solutions that come to mind:
- Pass in a `mat3x2` on non-webgl targets and a `mat3x3` otherwise. I
guess I could use `#ifdef SIXTEEN_BYTE_ALIGNMENT` for this, but it
doesn't seem quite right? This would be more efficient, but decrease
code quality.
- Do something about `UNIFORM_BLOCK_DATA_SIZE`. I don't know how, so I'd
need some guidance here.

@superdump let me know if you'd like me to implement other variants.
Otherwise, I vote for merging this as a quick fix for `main` and then
improving the packing in subsequent PRs :)

## Additional notes

Ideally we should merge this before @JMS55 rebases #10164 so that they
don't have to rebase everything a second time.
2024-02-25 22:42:28 +00:00
James Liu
fd91c61d72
Cleanup: Use Parallel in extract_meshes (#12084)
# Objective
#7348 added `bevy_utils::Parallel` and replaced the usage of the
`ThreadLocal<Cell<Vec<...>>>` in `check_visibility`, but we were also
using it in `extract_meshes`.

## Solution
Refactor the system to use `Parallel` instead.
2024-02-25 19:06:54 +00:00
Alex
a7be8a2655
Prefer UVec2 when working with texture dimensions (#11698)
# Objective

The physical width and height (pixels) of an image is always integers,
but for `GpuImage` bevy currently stores them as `Vec2` (`f32`).
Switching to `UVec2` makes this more consistent with the [underlying
texture data](https://docs.rs/wgpu/latest/wgpu/struct.Extent3d.html).

I'm not sure if this is worth the change in the surface level API. If
not, feel free to close this PR.

## Solution

- Replace uses of `Vec2` with `UVec2` when referring to texture
dimensions.
- Use integer types for the texture atlas dimensions and sections.


[`Sprite::rect`](a81a2d1da3/crates/bevy_sprite/src/sprite.rs (L29))
remains unchanged, so manually specifying a sub-pixel region of an image
is still possible.

---

## Changelog

- `GpuImage` now stores its size as `UVec2` instead of `Vec2`.
- Texture atlases store their size and sections as `UVec2` and `URect`
respectively.
- `UiImageSize` stores its size as `UVec2`.

## Migration Guide

- Change floating point types (`Vec2`, `Rect`) to their respective
unsigned integer versions (`UVec2`, `URect`) when using `GpuImage`,
`TextureAtlasLayout`, `TextureAtlasBuilder`,
`DynamicAtlasTextureBuilder` or `FontAtlas`.
2024-02-25 15:23:04 +00:00
eri
5f8f3b532c
Check cfg during CI and fix feature typos (#12103)
# Objective

- Add the new `-Zcheck-cfg` checks to catch more warnings
- Fixes #12091

## Solution

- Create a new `cfg-check` to the CI that runs `cargo check -Zcheck-cfg
--workspace` using cargo nightly (and fails if there are warnings)
- Fix all warnings generated by the new check

---

## Changelog

- Remove all redundant imports
- Fix cfg wasm32 targets
- Add 3 dead code exceptions (should StandardColor be unused?)
- Convert ios_simulator to a feature (I'm not sure if this is the right
way to do it, but the check complained before)

## Migration Guide

No breaking changes

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-02-25 15:19:27 +00:00
Alice Cecile
de004da8d5
Rename bevy_render::Color to LegacyColor (#12069)
# Objective

The migration process for `bevy_color` (#12013) will be fairly involved:
there will be hundreds of affected files, and a large number of APIs.

## Solution

To allow us to proceed granularly, we're going to keep both
`bevy_color::Color` (new) and `bevy_render::Color` (old) around until
the migration is complete.

However, simply doing this directly is confusing! They're both called
`Color`, making it very hard to tell when a portion of the code has been
ported.

As discussed in #12056, by renaming the old `Color` type, we can make it
easier to gradually migrate over, one API at a time.

## Migration Guide

THIS MIGRATION GUIDE INTENTIONALLY LEFT BLANK.

This change should not be shipped to end users: delete this section in
the final migration guide!

---------

Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
2024-02-24 21:35:32 +00:00
IceSentry
e79b9b62ce
Make more things pub in the renderer (#12053)
# Objective

- Some properties of public types are private but sometimes it's useful
to be able to set those

## Solution

- Make more stuff pub

---

## Changelog

- `MaterialBindGroupId` internal id is now pub and added a new()
constructor
- `ExtractedPointLight` and `ExtractedDirectionalLight` properties are
now all pub

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-02-23 06:13:37 +00:00
Sam Pettersson
caa7ec68d4
FIX: iOS Simulator not rendering due to missing CUBE_ARRAY_TEXTURES (#12052)
This PR closes #11978

# Objective

Fix rendering on iOS Simulators.

iOS Simulator doesn't support the capability CUBE_ARRAY_TEXTURES, since
0.13 this started to make iOS Simulator not render anything with the
following message being outputted:

```
2024-02-19T14:59:34.896266Z ERROR bevy_render::render_resource::pipeline_cache: failed to create shader module: Validation Error

Caused by:
    In Device::create_shader_module
    
Shader validation error: 


    Type [40] '' is invalid
    Capability Capabilities(CUBE_ARRAY_TEXTURES) is required
```

## Solution

- Split up NO_ARRAY_TEXTURES_SUPPORT into both NO_ARRAY_TEXTURES_SUPPORT
and NO_CUBE_ARRAY_TEXTURES_SUPPORT and correctly apply
NO_ARRAY_TEXTURES_SUPPORT for iOS Simulator using the cfg flag
introduced in #10178.

---

## Changelog

### Fixed
- Rendering on iOS Simulator due to missing CUBE_ARRAY_TEXTURES support.

---------

Co-authored-by: Sam Pettersson <sam.pettersson@geoguessr.com>
2024-02-23 01:24:59 +00:00
IceSentry
a513493dcc
Make Globals visible in vertex shaders (#12032)
# Objective

- Globals are supposed to be available in vertex shader but that was
mistakenly removed in 0.13

## Solution

- Configure the visibility of the globals correctly

Fixes https://github.com/bevyengine/bevy/issues/12015
2024-02-21 23:16:43 +00:00
Jan Hohenheim
8531033b31
Add support for KHR_texture_transform (#11904)
Adopted #8266, so copy-pasting the description from there:

# Objective

Support the KHR_texture_transform extension for the glTF loader.

- Fixes #6335
- Fixes #11869 
- Implements part of #11350
- Implements the GLTF part of #399 

## Solution

As is, this only supports a single transform. Looking at Godot's source,
they support one transform with an optional second one for detail, AO,
and emission. glTF specifies one per texture. The public domain
materials I looked at seem to share the same transform. So maybe having
just one is acceptable for now. I tried to include a warning if multiple
different transforms exist for the same material.

Note the gltf crate doesn't expose the texture transform for the normal
and occlusion textures, which it should, so I just ignored those for
now. (note by @janhohenheim: this is still the case)

Via `cargo run --release --example scene_viewer
~/src/clone/glTF-Sample-Models/2.0/TextureTransformTest/glTF/TextureTransformTest.gltf`:


![texture_transform](https://user-images.githubusercontent.com/283864/228938298-aa2ef524-555b-411d-9637-fd0dac226fb0.png)

## Changelog

Support for the
[KHR_texture_transform](https://github.com/KhronosGroup/glTF/tree/main/extensions/2.0/Khronos/KHR_texture_transform)
extension added. Texture UVs that were scaled, rotated, or offset in a
GLTF are now properly handled.

---------

Co-authored-by: Al McElrath <hello@yrns.org>
Co-authored-by: Kanabenki <lucien.menassol@gmail.com>
2024-02-21 01:11:28 +00:00
Robert Swain
1d0ea78f36
Save 16 bytes per MeshUniform in uniform/storage buffers (#11999)
# Objective

- Save 16 bytes per MeshUniform in uniform/storage buffers.

## Solution

- Reorder members of MeshUniform to capitalise on alignment and size
rules for tighter data packing. Before the size of a MeshUniform was 160
bytes, and after it is 144 bytes, saving 16 bytes of unused padding for
alignment.

---

## Changelog

- Reduced the size of MeshUniform by 16 bytes.
2024-02-20 16:25:25 +00:00
James Liu
6d547d7ce6
Allow Mesh-related queue phase systems to parallelize (#11804)
# Objective
Partially addresses #3548. `queue_shadows` and `queue_material_meshes`
cannot parallelize because of the `ResMut<RenderMeshInstances>`
parameter for `queue_material_meshes`.

## Solution
Change the `material_bind_group` field to use atomics instead of needing
full mutable access. Change the `ResMut` to a `Res`, which should allow
both sets of systems to parallelize without issue.

## Performance
Tested against `many_foxes`, this has a significant improvement over the
entire render schedule. (Yellow is this PR, red is main)

![image](https://github.com/bevyengine/bevy/assets/3137680/6cc7f346-4f50-4f12-a383-682a9ce1daf6)

The use of atomics does seem to have a negative effect on
`queue_material_meshes` (roughly a 8.29% increase in time spent in the
system).

![image](https://github.com/bevyengine/bevy/assets/3137680/7907079a-863d-4760-aa5b-df68c006ea36)

`queue_shadows` seems to be ever so slightly slower (1.6% more time
spent) in the system.

![image](https://github.com/bevyengine/bevy/assets/3137680/6d90af73-b922-45e4-bae5-df200e8b9784)

`batch_and_prepare_render_phase` seems to be a mix, but overall seems to
be slightly *faster* by about 5%.

![image](https://github.com/bevyengine/bevy/assets/3137680/fac638ff-8c90-436b-9362-c6209b18957c)
2024-02-20 00:12:41 +00:00
Patrick Walton
3058c17d6a
Disable irradiance volumes on WebGL and WebGPU. (#11909)
They cause the number of texture bindings to overflow on those
platforms. Ultimately, we shouldn't unconditionally disable them, but
this fixes a crash blocking 0.13.

Closes #11885.
2024-02-17 01:49:46 +00:00
Patrick Walton
7883eea54f
Add MeshPipelineKey::LIGHTMAPPED as applicable during the shadow map pass. (#11910)
I did this during the prepass, but I neglected to do it during the
shadow map pass, causing a panic when directional lights with shadows
were enabled with lightmapped meshes present. This patch fixes the
issue.

Closes #11898.
2024-02-17 00:25:32 +00:00
Robin KAY
4ebc560dfb
Change MeshUniform::new() to be public. (#11880)
# Objective

Provide a public replacement for `Into<MeshUniform>` trait impl which
was removed by #10231.

I made use of this in the `bevy_mod_outline` crate and will have to
duplicate this function if it's not accessible.

## Solution

Change the MeshUniform::new() method to be public.
2024-02-15 22:13:17 +00:00
robtfm
73bf730da9
fix shadow batching (#11645)
# Objective

`RenderMeshInstance::material_bind_group_id` is only set from
`queue_material_meshes::<M>`. this field is used (only) for determining
batch groups, so some items may be batched incorrectly if they have
never been in the camera's view or if they don't use the Material
abstraction.

in particular, shadow views render more meshes than the main camera, and
currently batch some meshes where the object has never entered the
camera view together. this is quite hard to trigger, but should occur in
a scene with out-of-view alpha-mask materials (so that the material
instance actually affects the shadow) in the path of a light.

this is also a footgun for custom pipelines: failing to set the
material_bind_group_id will result in all meshes being batched together
and all using the closest/furthest material to the camera (depending on
sort order).

## Solution

- queue_shadows now sets the material_bind_group_id correctly
- `MeshPipeline` doesn't attempt to batch meshes if the
material_bind_group_id has not been set. custom pipelines still need to
set this field to take advantage of batching, but will at least render
correctly if it is not set
2024-02-14 00:31:45 +00:00
Doonv
1c67e020f7
Move EntityHash related types into bevy_ecs (#11498)
# Objective

Reduce the size of `bevy_utils`
(https://github.com/bevyengine/bevy/issues/11478)

## Solution

Move `EntityHash` related types into `bevy_ecs`. This also allows us
access to `Entity`, which means we no longer need `EntityHashMap`'s
first generic argument.

---

## Changelog

- Moved `bevy::utils::{EntityHash, EntityHasher, EntityHashMap,
EntityHashSet}` into `bevy::ecs::entity::hash` .
- Removed `EntityHashMap`'s first generic argument. It is now hardcoded
to always be `Entity`.

## Migration Guide

- Uses of `bevy::utils::{EntityHash, EntityHasher, EntityHashMap,
EntityHashSet}` now have to be imported from `bevy::ecs::entity::hash`.
- Uses of `EntityHashMap` no longer have to specify the first generic
parameter. It is now hardcoded to always be `Entity`.
2024-02-12 15:02:24 +00:00
Patrick Walton
3af8526786
Stop extracting mesh entities to the render world. (#11803)
This fixes a `FIXME` in `extract_meshes` and results in a performance
improvement.

As a result of this change, meshes in the render world might not be
attached to entities anymore. Therefore, the `entity` parameter to
`RenderCommand::render()` is now wrapped in an `Option`. Most
applications that use the render app's ECS can simply unwrap the
`Option`.

Note that for now sprites, gizmos, and UI elements still use the render
world as usual.

## Migration guide

* For efficiency reasons, some meshes in the render world may not have
corresponding `Entity` IDs anymore. As a result, the `entity` parameter
to `RenderCommand::render()` is now wrapped in an `Option`. Custom
rendering code may need to be updated to handle the case in which no
`Entity` exists for an object that is to be rendered.
2024-02-10 10:46:10 +00:00
JMS55
f4dab8a4e8
Multithreaded render command encoding (#9172)
# Objective
- Encoding many GPU commands (such as in a renderpass with many draws,
such as the main opaque pass) onto a `wgpu::CommandEncoder` is very
expensive, and takes a long time.
- To improve performance, we want to perform the command encoding for
these heavy passes in parallel.

## Solution
- `RenderContext` can now queue up "command buffer generation tasks"
which are closures that will generate a command buffer when called.
- When finalizing the render context to produce the final list of
command buffers, these tasks are run in parallel on the
`ComputeTaskPool` to produce their corresponding command buffers.
- The general idea is that the node graph will run in serial, but in a
node, instead of doing rendering work, you can add tasks to do render
work in parallel with other node's tasks that get ran at the end of the
graph execution.

## Nodes Parallelized
- `MainOpaquePass3dNode`
- `PrepassNode`
- `DeferredGBufferPrepassNode`
- `ShadowPassNode` (One task per view)


## Future Work
- For large number of draws calls, might be worth further subdividing
passes into 2+ tasks.
- Extend this to UI, 2d, transparent, and transmissive nodes?
- Needs testing - small command buffers are inefficient - it may be
worth reverting to the serial command encoder usage for render phases
with few items.
- All "serial" (traditional) rendering work must finish before parallel
rendering tasks (the new stuff) can start to run.
- There is still only one submission to the graphics queue at the end of
the graph execution. There is still no ability to submit work earlier.

## Performance Improvement
Thanks to @Elabajaba for testing on Bistro.


![image](https://github.com/bevyengine/bevy/assets/47158642/be50dafa-85eb-4da5-a5cd-c0a044f1e76f)


TLDR: Without shadow mapping, this PR has no impact. _With_ shadow
mapping, this PR gives **~40 more fps** than main.

---

## Changelog
- `MainOpaquePass3dNode`, `PrepassNode`, `DeferredGBufferPrepassNode`,
and each shadow map within `ShadowPassNode` are now encoded in parallel,
giving _greatly_ increased CPU performance, mainly when shadow mapping
is enabled.
  - Does not work on WASM or AMD+Windows+Vulkan.
- Added `RenderContext::add_command_buffer_generation_task()`.
- `RenderContext::new()` now takes adapter info
- Some render graph and Node related types and methods now have
additional lifetime constraints.


## Migration Guide
`RenderContext::new()` now takes adapter info
- Some render graph and Node related types and methods now have
additional lifetime constraints.

---------

Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>
Co-authored-by: François <mockersf@gmail.com>
2024-02-09 07:35:35 +00:00
Patrick Walton
4c15dd0fc5
Implement irradiance volumes. (#10268)
# Objective

Bevy could benefit from *irradiance volumes*, also known as *voxel
global illumination* or simply as light probes (though this term is not
preferred, as multiple techniques can be called light probes).
Irradiance volumes are a form of baked global illumination; they work by
sampling the light at the centers of each voxel within a cuboid. At
runtime, the voxels surrounding the fragment center are sampled and
interpolated to produce indirect diffuse illumination.

## Solution

This is divided into two sections. The first is copied and pasted from
the irradiance volume module documentation and describes the technique.
The second part consists of notes on the implementation.

### Overview

An *irradiance volume* is a cuboid voxel region consisting of
regularly-spaced precomputed samples of diffuse indirect light. They're
ideal if you have a dynamic object such as a character that can move
about
static non-moving geometry such as a level in a game, and you want that
dynamic object to be affected by the light bouncing off that static
geometry.

To use irradiance volumes, you need to precompute, or *bake*, the
indirect
light in your scene. Bevy doesn't currently come with a way to do this.
Fortunately, [Blender] provides a [baking tool] as part of the Eevee
renderer, and its irradiance volumes are compatible with those used by
Bevy.
The [`bevy-baked-gi`] project provides a tool, `export-blender-gi`, that
can
extract the baked irradiance volumes from the Blender `.blend` file and
package them up into a `.ktx2` texture for use by the engine. See the
documentation in the `bevy-baked-gi` project for more details as to this
workflow.

Like all light probes in Bevy, irradiance volumes are 1×1×1 cubes that
can
be arbitrarily scaled, rotated, and positioned in a scene with the
[`bevy_transform::components::Transform`] component. The 3D voxel grid
will
be stretched to fill the interior of the cube, and the illumination from
the
irradiance volume will apply to all fragments within that bounding
region.

Bevy's irradiance volumes are based on Valve's [*ambient cubes*] as used
in
*Half-Life 2* ([Mitchell 2006], slide 27). These encode a single color
of
light from the six 3D cardinal directions and blend the sides together
according to the surface normal.

The primary reason for choosing ambient cubes is to match Blender, so
that
its Eevee renderer can be used for baking. However, they also have some
advantages over the common second-order spherical harmonics approach:
ambient cubes don't suffer from ringing artifacts, they are smaller (6
colors for ambient cubes as opposed to 9 for spherical harmonics), and
evaluation is faster. A smaller basis allows for a denser grid of voxels
with the same storage requirements.

If you wish to use a tool other than `export-blender-gi` to produce the
irradiance volumes, you'll need to pack the irradiance volumes in the
following format. The irradiance volume of resolution *(Rx, Ry, Rz)* is
expected to be a 3D texture of dimensions *(Rx, 2Ry, 3Rz)*. The
unnormalized
texture coordinate *(s, t, p)* of the voxel at coordinate *(x, y, z)*
with
side *S* ∈ *{-X, +X, -Y, +Y, -Z, +Z}* is as follows:

```text
s = x

t = y + ⎰  0 if S ∈ {-X, -Y, -Z}
        ⎱ Ry if S ∈ {+X, +Y, +Z}

        ⎧   0 if S ∈ {-X, +X}
p = z + ⎨  Rz if S ∈ {-Y, +Y}
        ⎩ 2Rz if S ∈ {-Z, +Z}
```

Visually, in a left-handed coordinate system with Y up, viewed from the
right, the 3D texture looks like a stacked series of voxel grids, one
for
each cube side, in this order:

| **+X** | **+Y** | **+Z** |
| ------ | ------ | ------ |
| **-X** | **-Y** | **-Z** |

A terminology note: Other engines may refer to irradiance volumes as
*voxel
global illumination*, *VXGI*, or simply as *light probes*. Sometimes
*light
probe* refers to what Bevy calls a reflection probe. In Bevy, *light
probe*
is a generic term that encompasses all cuboid bounding regions that
capture
indirect illumination, whether based on voxels or not.

Note that, if binding arrays aren't supported (e.g. on WebGPU or WebGL
2),
then only the closest irradiance volume to the view will be taken into
account during rendering.

[*ambient cubes*]:
https://advances.realtimerendering.com/s2006/Mitchell-ShadingInValvesSourceEngine.pdf

[Mitchell 2006]:
https://advances.realtimerendering.com/s2006/Mitchell-ShadingInValvesSourceEngine.pdf

[Blender]: http://blender.org/

[baking tool]:
https://docs.blender.org/manual/en/latest/render/eevee/render_settings/indirect_lighting.html

[`bevy-baked-gi`]: https://github.com/pcwalton/bevy-baked-gi

### Implementation notes

This patch generalizes light probes so as to reuse as much code as
possible between irradiance volumes and the existing reflection probes.
This approach was chosen because both techniques share numerous
similarities:

1. Both irradiance volumes and reflection probes are cuboid bounding
regions.
2. Both are responsible for providing baked indirect light.
3. Both techniques involve presenting a variable number of textures to
the shader from which indirect light is sampled. (In the current
implementation, this uses binding arrays.)
4. Both irradiance volumes and reflection probes require gathering and
sorting probes by distance on CPU.
5. Both techniques require the GPU to search through a list of bounding
regions.
6. Both will eventually want to have falloff so that we can smoothly
blend as objects enter and exit the probes' influence ranges. (This is
not implemented yet to keep this patch relatively small and reviewable.)

To do this, we generalize most of the methods in the reflection probes
patch #11366 to be generic over a trait, `LightProbeComponent`. This
trait is implemented by both `EnvironmentMapLight` (for reflection
probes) and `IrradianceVolume` (for irradiance volumes). Using a trait
will allow us to add more types of light probes in the future. In
particular, I highly suspect we will want real-time reflection planes
for mirrors in the future, which can be easily slotted into this
framework.

## Changelog

> This section is optional. If this was a trivial fix, or has no
externally-visible impact, you can delete this section.

### Added
* A new `IrradianceVolume` asset type is available for baked voxelized
light probes. You can bake the global illumination using Blender or
another tool of your choice and use it in Bevy to apply indirect
illumination to dynamic objects.
2024-02-06 23:23:20 +00:00
Kanabenki
312df3cec7
Use warn_once where relevant instead of manually implementing a single warn check (#11693)
# Objective

- Some places manually use a `bool` /`AtomicBool` to warn once.

## Solution

- Use the `warn_once` macro which internally creates an `AtomicBool`.

Downside: in some case the warning state would have been reset after
recreating the struct carrying the warn state, whereas now it will
always warn only once per program run (For example, if all
`MeshPipeline`s are dropped or the `World` is recreated for
`Local<bool>`/ a `bool` resource, which shouldn't happen over the course
of a standard `App` run).


---

## Changelog

### Removed

- `FontAtlasWarning` has been removed, but the corresponding warning is
still emitted.
2024-02-05 21:05:43 +00:00
Marco Buono
91c467ebfc
Gate diffuse and specular transmission behind shader defs (#11627)
# Objective

- Address #10338

## Solution

- When implementing specular and diffuse transmission, I inadvertently
introduced a performance regression. On high-end hardware it is barely
noticeable, but **for lower-end hardware it can be pretty brutal**. If I
understand it correctly, this is likely due to use of masking by the GPU
to implement control flow, which means that you still pay the price for
the branches you don't take;
- To avoid that, this PR introduces new shader defs (controlled via
`StandardMaterialKey`) that conditionally include the transmission
logic, that way the shader code for both types of transmission isn't
even sent to the GPU if you're not using them;
- This PR also renames ~~`STANDARDMATERIAL_NORMAL_MAP`~~ to
`STANDARD_MATERIAL_NORMAL_MAP` for consistency with the naming
convention used elsewhere in the codebase. (Drive-by fix)

---

## Changelog

- Added new shader defs, set when using transmission in the
`StandardMaterial`:
  - `STANDARD_MATERIAL_SPECULAR_TRANSMISSION`;
  - `STANDARD_MATERIAL_DIFFUSE_TRANSMISSION`;
  - `STANDARD_MATERIAL_SPECULAR_OR_DIFFUSE_TRANSMISSION`.
- Fixed performance regression caused by the introduction of
transmission, by gating transmission shader logic behind the newly
introduced shader defs;
- Renamed ~~`STANDARDMATERIAL_NORMAL_MAP`~~ to
`STANDARD_MATERIAL_NORMAL_MAP` for consistency;

## Migration Guide

- If you were using `#ifdef STANDARDMATERIAL_NORMAL_MAP` on your shader
code, make sure to update the name to `STANDARD_MATERIAL_NORMAL_MAP`;
(with an underscore between `STANDARD` and `MATERIAL`)
2024-02-02 15:01:56 +00:00
Rafał Harabień
16ce8c6136
Optimize extract_clusters and prepare_clusters systems (#10633)
# Objective

When developing my game I realized `extract_clusters` and
`prepare_clusters` systems are taking a lot of time despite me creating
very little lights. Reducing number of clusters from the default 4096 to
2048 or less greatly improved performance and stabilized FPS (~300 ->
1000+). I debugged it and found out that the main reason for this is
cloning `VisiblePointLights` in `extract_clusters` system. It contains
light entities grouped by clusters that they affect. The problem is that
we clone 4096 (assuming the default clusters configuration) vectors
every frame. If many of them happen to be non-empty it starts to be a
bottleneck because there is a lot of heap allocation. It wouldn't be a
problem if we reused those vectors in following frames but we don't.

## Solution

Avoid cloning multiple vectors and instead build a single vector
containing data for all clusters.

I've recorded a trace in `3d_scene` example with disabled v-sync before
and after the change.
Mean FPS went from 424 to 990. Mean time for `extract_clusters` system
was reduced from 210 us to 24 us and `prepare_clusters` from 189 us to
87 us.


![image](https://github.com/bevyengine/bevy/assets/160391/ab66aa9d-1fa7-4993-9827-8be76b530972)

---

## Changelog

- Improved performance of `extract_clusters` and `prepare_clusters`
systems for scenes where lights affect a big part of it.
2024-01-29 17:50:22 +00:00
vero
45967b03b5
Fix specular envmap in deferred (#11534)
# Objective

- Fixes #11414

## Solution

- Add specular occlusion to g-buffer so PbrInput can be properly
reconstructed for shading with a non-zero value allowing the spec envmap
to be seen


![image](https://github.com/bevyengine/bevy/assets/11307157/84aa8312-7c06-4dc7-92da-5d94b54b133d)

---------

Co-authored-by: JMS55 <47158642+JMS55@users.noreply.github.com>
2024-01-29 16:39:49 +00:00
Elabajaba
35ac1b152e
Update to wgpu 0.19 and raw-window-handle 0.6 (#11280)
# Objective

Keep core dependencies up to date.

## Solution

Update the dependencies.

wgpu 0.19 only supports raw-window-handle (rwh) 0.6, so bumping that was
included in this.

The rwh 0.6 version bump is just the simplest way of doing it. There
might be a way we can take advantage of wgpu's new safe surface creation
api, but I'm not familiar enough with bevy's window management to
untangle it and my attempt ended up being a mess of lifetimes and rustc
complaining about missing trait impls (that were implemented). Thanks to
@MiniaczQ for the (much simpler) rwh 0.6 version bump code.

Unblocks https://github.com/bevyengine/bevy/pull/9172 and
https://github.com/bevyengine/bevy/pull/10812

~~This might be blocked on cpal and oboe updating their ndk versions to
0.8, as they both currently target ndk 0.7 which uses rwh 0.5.2~~ Tested
on android, and everything seems to work correctly (audio properly stops
when minimized, and plays when re-focusing the app).

---

## Changelog

- `wgpu` has been updated to 0.19! The long awaited arcanization has
been merged (for more info, see
https://gfx-rs.github.io/2023/11/24/arcanization.html), and Vulkan
should now be working again on Intel GPUs.
- Targeting WebGPU now requires that you add the new `webgpu` feature
(setting the `RUSTFLAGS` environment variable to
`--cfg=web_sys_unstable_apis` is still required). This feature currently
overrides the `webgl2` feature if you have both enabled (the `webgl2`
feature is enabled by default), so it is not recommended to add it as a
default feature to libraries without putting it behind a flag that
allows library users to opt out of it! In the future we plan on
supporting wasm binaries that can target both webgl2 and webgpu now that
wgpu added support for doing so (see
https://github.com/bevyengine/bevy/issues/11505).
- `raw-window-handle` has been updated to version 0.6.

## Migration Guide

- `bevy_render::instance_index::get_instance_index()` has been removed
as the webgl2 workaround is no longer required as it was fixed upstream
in wgpu. The `BASE_INSTANCE_WORKAROUND` shaderdef has also been removed.
- WebGPU now requires the new `webgpu` feature to be enabled. The
`webgpu` feature currently overrides the `webgl2` feature so you no
longer need to disable all default features and re-add them all when
targeting `webgpu`, but binaries built with both the `webgpu` and
`webgl2` features will only target the webgpu backend, and will only
work on browsers that support WebGPU.
- Places where you conditionally compiled things for webgl2 need to be
updated because of this change, eg:
- `#[cfg(any(not(feature = "webgl"), not(target_arch = "wasm32")))]`
becomes `#[cfg(any(not(feature = "webgl") ,not(target_arch = "wasm32"),
feature = "webgpu"))]`
- `#[cfg(all(feature = "webgl", target_arch = "wasm32"))]` becomes
`#[cfg(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))]`
- `if cfg!(all(feature = "webgl", target_arch = "wasm32"))` becomes `if
cfg!(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))`
- `create_texture_with_data` now also takes a `TextureDataOrder`. You
can probably just set this to `TextureDataOrder::default()`
- `TextureFormat`'s `block_size` has been renamed to `block_copy_size`
- See the `wgpu` changelog for anything I might've missed:
https://github.com/gfx-rs/wgpu/blob/trunk/CHANGELOG.md

---------

Co-authored-by: François <mockersf@gmail.com>
2024-01-26 18:14:21 +00:00
JMS55
a796d53a05
Meshlet prep (#11442)
# Objective

- Prep for https://github.com/bevyengine/bevy/pull/10164
- Make deferred_lighting_pass_id a ColorAttachment
- Correctly extract shadow view frusta so that the view uniforms get
populated
- Make some needed things public
- Misc formatting
2024-01-22 15:28:33 +00:00
Alice Cecile
eb07d16871
Revert rendering-related associated type name changes (#11027)
# Objective

> Can anyone explain to me the reasoning of renaming all the types named
Query to Data. I'm talking about this PR
https://github.com/bevyengine/bevy/pull/10779 It doesn't make sense to
me that a bunch of types that are used to run queries aren't named Query
anymore. Like ViewQuery on the ViewNode is the type of the Query. I
don't really understand the point of the rename, it just seems like it
hides the fact that a query will run based on those types.


[@IceSentry](https://discord.com/channels/691052431525675048/692572690833473578/1184946251431694387)

## Solution

Revert several renames in #10779.

## Changelog

- `ViewNode::ViewData` is now `ViewNode::ViewQuery` again.

## Migration Guide

- This PR amends the migration guide in
https://github.com/bevyengine/bevy/pull/10779

---------

Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2024-01-22 15:01:55 +00:00
re0312
04aedf12fa
optimize batch_and_prepare_render_phase (#11323)
# Objective

- since #9685  ,bevy introduce automatic batching of draw commands, 
- `batch_and_prepare_render_phase` take the responsibility for batching
`phaseItem`,
- `GetBatchData` trait is used for indentify each phaseitem how to
batch. it defines a associated type `Data `used for Query to fetch data
from world.

- however,the impl of `GetBatchData ` in bevy always set ` type
Data=Entity` then we acually get following code
`let entity:Entity =query.get(item.entity())` that cause unnecessary
overhead .

## Solution

- remove associated type `Data ` and `Filter` from `GetBatchData `,
- change the type of the `query_item ` parameter in get_batch_data from`
Self::Data` to `Entity`.
- `batch_and_prepare_render_phase ` no longer takes a query using
`F::Data, F::Filter`
- `get_batch_data `now returns `Option<(Self::BufferData,
Option<Self::CompareData>)>`

---

## Performance
based in main merged with #11290 
Window 11 ,Intel 13400kf, NV 4070Ti

![image](https://github.com/bevyengine/bevy/assets/45868716/f63b9d98-6aee-4057-a2c7-a2162b2db765)
frame time from 3.34ms to 3 ms,  ~ 10%


![image](https://github.com/bevyengine/bevy/assets/45868716/a06eea9c-f79e-4324-8392-8d321560c5ba)
`batch_and_prepare_render_phase` from 800us ~ 400 us  

## Migration Guide
trait `GetBatchData` no longer hold associated type  `Data `and `Filter`
`get_batch_data` `query_item `type from `Self::Data` to `Entity` and
return `Option<(Self::BufferData, Option<Self::CompareData>)>`
`batch_and_prepare_render_phase`  should not have a query
2024-01-20 09:30:44 +00:00
Patrick Walton
83d6600267
Implement minimal reflection probes (fixed macOS, iOS, and Android). (#11366)
This pull request re-submits #10057, which was backed out for breaking
macOS, iOS, and Android. I've tested this version on macOS and Android
and on the iOS simulator.

# Objective

This pull request implements *reflection probes*, which generalize
environment maps to allow for multiple environment maps in the same
scene, each of which has an axis-aligned bounding box. This is a
standard feature of physically-based renderers and was inspired by [the
corresponding feature in Blender's Eevee renderer].

## Solution

This is a minimal implementation of reflection probes that allows
artists to define cuboid bounding regions associated with environment
maps. For every view, on every frame, a system builds up a list of the
nearest 4 reflection probes that are within the view's frustum and
supplies that list to the shader. The PBR fragment shader searches
through the list, finds the first containing reflection probe, and uses
it for indirect lighting, falling back to the view's environment map if
none is found. Both forward and deferred renderers are fully supported.

A reflection probe is an entity with a pair of components, *LightProbe*
and *EnvironmentMapLight* (as well as the standard *SpatialBundle*, to
position it in the world). The *LightProbe* component (along with the
*Transform*) defines the bounding region, while the
*EnvironmentMapLight* component specifies the associated diffuse and
specular cubemaps.

A frequent question is "why two components instead of just one?" The
advantages of this setup are:

1. It's readily extensible to other types of light probes, in particular
*irradiance volumes* (also known as ambient cubes or voxel global
illumination), which use the same approach of bounding cuboids. With a
single component that applies to both reflection probes and irradiance
volumes, we can share the logic that implements falloff and blending
between multiple light probes between both of those features.

2. It reduces duplication between the existing *EnvironmentMapLight* and
these new reflection probes. Systems can treat environment maps attached
to cameras the same way they treat environment maps applied to
reflection probes if they wish.

Internally, we gather up all environment maps in the scene and place
them in a cubemap array. At present, this means that all environment
maps must have the same size, mipmap count, and texture format. A
warning is emitted if this restriction is violated. We could potentially
relax this in the future as part of the automatic mipmap generation
work, which could easily do texture format conversion as part of its
preprocessing.

An easy way to generate reflection probe cubemaps is to bake them in
Blender and use the `export-blender-gi` tool that's part of the
[`bevy-baked-gi`] project. This tool takes a `.blend` file containing
baked cubemaps as input and exports cubemap images, pre-filtered with an
embedded fork of the [glTF IBL Sampler], alongside a corresponding
`.scn.ron` file that the scene spawner can use to recreate the
reflection probes.

Note that this is intentionally a minimal implementation, to aid
reviewability. Known issues are:

* Reflection probes are basically unsupported on WebGL 2, because WebGL
2 has no cubemap arrays. (Strictly speaking, you can have precisely one
reflection probe in the scene if you have no other cubemaps anywhere,
but this isn't very useful.)

* Reflection probes have no falloff, so reflections will abruptly change
when objects move from one bounding region to another.

* As mentioned before, all cubemaps in the world of a given type
(diffuse or specular) must have the same size, format, and mipmap count.

Future work includes:

* Blending between multiple reflection probes.

* A falloff/fade-out region so that reflected objects disappear
gradually instead of vanishing all at once.

* Irradiance volumes for voxel-based global illumination. This should
reuse much of the reflection probe logic, as they're both GI techniques
based on cuboid bounding regions.

* Support for WebGL 2, by breaking batches when reflection probes are
used.

These issues notwithstanding, I think it's best to land this with
roughly the current set of functionality, because this patch is useful
as is and adding everything above would make the pull request
significantly larger and harder to review.

---

## Changelog

### Added

* A new *LightProbe* component is available that specifies a bounding
region that an *EnvironmentMapLight* applies to. The combination of a
*LightProbe* and an *EnvironmentMapLight* offers *reflection probe*
functionality similar to that available in other engines.

[the corresponding feature in Blender's Eevee renderer]:
https://docs.blender.org/manual/en/latest/render/eevee/light_probes/reflection_cubemaps.html

[`bevy-baked-gi`]: https://github.com/pcwalton/bevy-baked-gi

[glTF IBL Sampler]: https://github.com/KhronosGroup/glTF-IBL-Sampler
2024-01-19 07:33:52 +00:00
JMS55
fcd7c0fc3d
Exposure settings (adopted) (#11347)
Rebased and finished version of
https://github.com/bevyengine/bevy/pull/8407. Huge thanks to @GitGhillie
for adjusting all the examples, and the many other people who helped
write this PR (@superdump , @coreh , among others) :)

Fixes https://github.com/bevyengine/bevy/issues/8369

---

## Changelog
- Added a `brightness` control to `Skybox`.
- Added an `intensity` control to `EnvironmentMapLight`.
- Added `ExposureSettings` and `PhysicalCameraParameters` for
controlling exposure of 3D cameras.
- Removed the baked-in `DirectionalLight` exposure Bevy previously
hardcoded internally.

## Migration Guide
- If using a `Skybox` or `EnvironmentMapLight`, use the new `brightness`
and `intensity` controls to adjust their strength.
- All 3D scene will now have different apparent brightnesses due to Bevy
implementing proper exposure controls. You will have to adjust the
intensity of your lights and/or your camera exposure via the new
`ExposureSettings` component to compensate.

---------

Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: GitGhillie <jillisnoordhoek@gmail.com>
Co-authored-by: Marco Buono <thecoreh@gmail.com>
Co-authored-by: vero <email@atlasdostal.com>
Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2024-01-16 14:53:21 +00:00
Aevyrie
839d2f8353
Approximate indirect specular occlusion (#11152)
# Objective

- The current PBR renderer over-brightens indirect specular reflections,
which tends to cause objects to appear to glow, because specular
occlusion is not accounted for.

## Solution

- Attenuate indirect specular term with an approximation for specular
occlusion, using [[Lagarde et al., 2014] (pg.
76)](https://seblagarde.files.wordpress.com/2015/07/course_notes_moving_frostbite_to_pbr_v32.pdf).

| Before | After | Animation |
| --- | --- | --- |
| <img width="1840" alt="before bike"
src="https://github.com/bevyengine/bevy/assets/2632925/b6e10d15-a998-4a94-875a-1c2b1e98348a">
| <img width="1840" alt="after bike"
src="https://github.com/bevyengine/bevy/assets/2632925/53b1479c-b1e4-427f-b140-53df26ca7193">
|
![ezgif-1-fbcbaf272b](https://github.com/bevyengine/bevy/assets/2632925/c2dece1c-eb3d-4e05-92a2-46cf83052c7c)
|
| <img width="1840" alt="classroom before"
src="https://github.com/bevyengine/bevy/assets/2632925/b16c0e74-741e-4f40-a7df-8863eaa62596">
| <img width="1840" alt="classroom after"
src="https://github.com/bevyengine/bevy/assets/2632925/26f9e971-0c63-4ee9-9544-964e5703d65e">
|
![ezgif-1-0f390edd06](https://github.com/bevyengine/bevy/assets/2632925/d8894e52-380f-4528-aa0d-1ca249108178)
|

---

## Changelog

- Ambient occlusion now applies to indirect specular reflections to
approximate how objects occlude specular light.

## Migration Guide

- Renamed `PbrInput::occlusion` to `diffuse_occlusion`, and added
`specular_occlusion`.
2024-01-15 16:10:55 +00:00