[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.
To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight`, `PointLight`, or `SpotLight`, as appropriate. This
shadow size value represents the size of the light and should be tuned
as appropriate for your scene. Higher values result in a wider penumbra
(i.e. blurrier shadows).
When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR #13003 to
land first).
In addition to PCSS, this commit makes the near Z plane for the shadow
map configurable on a per-light basis. Previously, it had been hardcoded
to 0.1 meters. This change was necessary to make the point light shadow
map in the example look reasonable, as otherwise the shadows appeared
far too aliased.
A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, spot lights, non-temporal operation, and temporal operation. The
assets are my original work.
Both temporal and non-temporal shadows are rather noisy in the example,
and, as mentioned before, this is unavoidable without downsampling the
depth buffer, which we can't do yet. Note also that the shadows don't
look particularly great for point lights; the example simply isn't an
ideal scene for them. Nevertheless, I felt that the benefits of the
ability to do a side-by-side comparison of directional and point lights
outweighed the unsightliness of the point light shadows in that example,
so I kept the point light feature in.
Fixes#3631.
[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf
## Changelog
### Added
* Percentage-closer soft shadows (PCSS) are now supported, allowing
shadows to become blurrier as they stretch away from objects. To use
them, set the `soft_shadow_size` field in `DirectionalLight`,
`PointLight`, or `SpotLight`, as applicable.
* The near Z value for shadow maps is now customizable via the
`shadow_map_near_z` field in `DirectionalLight`, `PointLight`, and
`SpotLight`.
## Screenshots
PCSS off:
![Screenshot 2024-05-24
120012](https://github.com/bevyengine/bevy/assets/157897/0d35fe98-245b-44fb-8a43-8d0272a73b86)
PCSS on:
![Screenshot 2024-05-24
115959](https://github.com/bevyengine/bevy/assets/157897/83397ef8-1317-49dd-bfb3-f8286d7610cd)
---------
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com>
# Objective
- Fixes#15106
## Solution
- Trivial refactor to rename the method. The duplicate method `push` was
removed as well. This will simpify the API and make the semantics more
clear. `Add` implies that the action happens immediately, whereas in
reality, the command is queued to be run eventually.
- `ChildBuilder::add_command` has similarly been renamed to
`queue_command`.
## Testing
Unit tests should suffice for this simple refactor.
---
## Migration Guide
- `Commands::add` and `Commands::push` have been replaced with
`Commnads::queue`.
- `ChildBuilder::add_command` has been renamed to
`ChildBuilder::queue_command`.
# Objective
- Fixes#15236
## Solution
- Use bevy_math::ops instead of std floating point operations.
## Testing
- Did you test these changes? If so, how?
Unit tests and `cargo run -p ci -- test`
- How can other people (reviewers) test your changes? Is there anything
specific they need to know?
Execute `cargo run -p ci -- test` on Windows.
- If relevant, what platforms did you test these changes on, and are
there any important ones you can't test?
Windows
## Migration Guide
- Not a breaking change
- Projects should use bevy math where applicable
---------
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: IQuick 143 <IQuick143cz@gmail.com>
Co-authored-by: Joona Aalto <jondolf.dev@gmail.com>
# Objective
The names of numerous rendering components in Bevy are inconsistent and
a bit confusing. Relevant names include:
- `AutoExposureSettings`
- `AutoExposureSettingsUniform`
- `BloomSettings`
- `BloomUniform` (no `Settings`)
- `BloomPrefilterSettings`
- `ChromaticAberration` (no `Settings`)
- `ContrastAdaptiveSharpeningSettings`
- `DepthOfFieldSettings`
- `DepthOfFieldUniform` (no `Settings`)
- `FogSettings`
- `SmaaSettings`, `Fxaa`, `TemporalAntiAliasSettings` (really
inconsistent??)
- `ScreenSpaceAmbientOcclusionSettings`
- `ScreenSpaceReflectionsSettings`
- `VolumetricFogSettings`
Firstly, there's a lot of inconsistency between `Foo`/`FooSettings` and
`FooUniform`/`FooSettingsUniform` and whether names are abbreviated or
not.
Secondly, the `Settings` post-fix seems unnecessary and a bit confusing
semantically, since it makes it seem like the component is mostly just
auxiliary configuration instead of the core *thing* that actually
enables the feature. This will be an even bigger problem once bundles
like `TemporalAntiAliasBundle` are deprecated in favor of required
components, as users will expect a component named `TemporalAntiAlias`
(or similar), not `TemporalAntiAliasSettings`.
## Solution
Drop the `Settings` post-fix from the component names, and change some
names to be more consistent.
- `AutoExposure`
- `AutoExposureUniform`
- `Bloom`
- `BloomUniform`
- `BloomPrefilter`
- `ChromaticAberration`
- `ContrastAdaptiveSharpening`
- `DepthOfField`
- `DepthOfFieldUniform`
- `DistanceFog`
- `Smaa`, `Fxaa`, `TemporalAntiAliasing` (note: we might want to change
to `Taa`, see "Discussion")
- `ScreenSpaceAmbientOcclusion`
- `ScreenSpaceReflections`
- `VolumetricFog`
I kept the old names as deprecated type aliases to make migration a bit
less painful for users. We should remove them after the next release.
(And let me know if I should just... not add them at all)
I also added some very basic docs for a few types where they were
missing, like on `Fxaa` and `DepthOfField`.
## Discussion
- `TemporalAntiAliasing` is still inconsistent with `Smaa` and `Fxaa`.
Consensus [on
Discord](https://discord.com/channels/691052431525675048/743663924229963868/1280601167209955431)
seemed to be that renaming to `Taa` would probably be fine, but I think
it's a bit more controversial, and it would've required renaming a lot
of related types like `TemporalAntiAliasNode`,
`TemporalAntiAliasBundle`, and `TemporalAntiAliasPlugin`, so I think
it's better to leave to a follow-up.
- I think `Fog` should probably have a more specific name like
`DistanceFog` considering it seems to be distinct from `VolumetricFog`.
~~This should probably be done in a follow-up though, so I just removed
the `Settings` post-fix for now.~~ (done)
---
## Migration Guide
Many rendering components have been renamed for improved consistency and
clarity.
- `AutoExposureSettings` → `AutoExposure`
- `BloomSettings` → `Bloom`
- `BloomPrefilterSettings` → `BloomPrefilter`
- `ContrastAdaptiveSharpeningSettings` → `ContrastAdaptiveSharpening`
- `DepthOfFieldSettings` → `DepthOfField`
- `FogSettings` → `DistanceFog`
- `SmaaSettings` → `Smaa`
- `TemporalAntiAliasSettings` → `TemporalAntiAliasing`
- `ScreenSpaceAmbientOcclusionSettings` → `ScreenSpaceAmbientOcclusion`
- `ScreenSpaceReflectionsSettings` → `ScreenSpaceReflections`
- `VolumetricFogSettings` → `VolumetricFog`
---------
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
# Objective
`NoFrustumCulling` prevents meshes from being considered out of view
based on AABBs (sometimes useful for skinned meshes which don't
recalculate AABBs currently). it currently only applies for primary view
rendering, not for shadow rendering which can result in missing shadows.
## Solution
Add checks for `NoFrustumCulling` to `check_dir_light_mesh_visibility`
and `check_point_light_mesh_visibility` so that `NoFrustumCulling`
entities are rendered to all shadow views as well as all primary views.
# Objective
Make choosing of diffuse indirect lighting explicit, instead of using
numerical conditions like `all(indirect_light == vec3(0.0f))`, as using
that may lead to unwanted light leakage.
## Solution
Use an explicit `found_diffuse_indirect` condition to indicate the found
indirect lighting source.
## Testing
I have tested examples `lightmaps`, `irradiance_volumes` and
`reflection_probes`, there are no visual changes. For further testing,
consider a "cave" scene with lightmaps and irradiance volumes. In the
cave there are some purly dark occluded area, those dark area will
sample the irradiance volume, and that is easy to leak light.
Hello,
I'd like to contribute to this project by adding some useful constants
and improving the documentation for the AspectRatio struct. Here's a
summary of the changes I've made:
1. Added new constants for common aspect ratios:
- SIXTEEN_NINE (16:9)
- FOUR_THREE (4:3)
- ULTRAWIDE (21:9)
2. Enhanced the overall documentation:
- Improved module-level documentation with an overview and use cases
- Expanded explanation of the AspectRatio struct with examples
- Added detailed descriptions and examples for all methods (both
existing and new)
- Included explanations for the newly introduced constant values
- Added clarifications for From trait implementations
These changes aim to make the AspectRatio API more user-friendly and
easier to understand. The new constants provide convenient access to
commonly used aspect ratios, which I believe will be helpful in many
scenarios.
---------
Co-authored-by: Gonçalo Rica Pais da Silva <bluefinger@gmail.com>
Co-authored-by: Lixou <82600264+DasLixou@users.noreply.github.com>
Since `StandardMaterial::emissive_exposure_weight` does not get packed
into the gbuffer in the deferred case, unpacking uses an implicit
default value for emissive's alpha channel.
This resulted in divergent behavior between the forward and deferred
renderers when using standard materials with default
emissive_exposure_weight, this value defaulting to `0.0` in the forward
case and `1.0` in the other.
This patch changes the implicit value in the deferred case to `0.0` in
order to match the behavior of the forward renderer. However, this still
does not solve the case where `emissive_exposure_weight` is not `0.0`.
### Builder changes
- Increased meshlet max vertices/triangles from 64v/64t to 255v/128t
(meshoptimizer won't allow 256v sadly). This gives us a much greater
percentage of meshlets with max triangle count (128). Still not perfect,
we still end up with some tiny <=10 triangle meshlets that never really
get simplified, but it's progress.
- Removed the error target limit. Now we allow meshoptimizer to simplify
as much as possible. No reason to cap this out, as the cluster culling
code will choose a good LOD level anyways. Again leads to higher quality
LOD trees.
- After some discussion and consulting the Nanite slides again, changed
meshlet group error from _adding_ the max child's error to the group
error, to doing `group_error = max(group_error, max_child_error)`. Error
is already cumulative between LODs as the edges we're collapsing during
simplification get longer each time.
- Bumped the 65% simplification threshold to allow up to 95% of the
original geometry (e.g. accept simplification as valid even if we only
simplified 5% of the triangles). This gives us closer to
log2(initial_meshlet_count) LOD levels, and fewer meshlet roots in the
DAG.
Still more work to be done in the future here. Maybe trying METIS for
meshlet building instead of meshoptimizer.
Using ~8 clusters per group instead of ~4 might also make a big
difference. The Nanite slides say that they have 8-32 meshlets per
group, suggesting some kind of heuristic. Unfortunately meshopt's
compute_cluster_bounds won't work with large groups atm
(https://github.com/zeux/meshoptimizer/discussions/750#discussioncomment-10562641)
so hard to test.
Based on discussion from
https://github.com/bevyengine/bevy/discussions/14998,
https://github.com/zeux/meshoptimizer/discussions/750, and discord.
### Runtime changes
- cluster:triangle packed IDs are now stored 25:7 instead of 26:6 bits,
as max triangles per cluster are now 128 instead of 64
- Hardware raster now spawns 128 * 3 vertices instead of 64 * 3 vertices
to account for the new max triangles limit
- Hardware raster now outputs NaN triangles (0 / 0) instead of
zero-positioned triangles for extra vertex invocations over the cluster
triangle count. Shouldn't really be a difference idt, but I did it
anyways.
- Software raster now does 128 threads per workgroup instead of 64
threads. Each thread now loads, projects, and caches a vertex (vertices
0-127), and then if needed does so again (vertices 128-254). Each thread
then rasterizes one of 128 triangles.
- Fixed a bug with `needs_dispatch_remap`. I had the condition backwards
in my last PR, I probably committed it by accident after testing the
non-default code path on my GPU.
# Objective
- Crate-level prelude modules, such as `bevy_ecs::prelude`, are plagued
with inconsistency! Let's fix it!
## Solution
Format all preludes based on the following rules:
1. All preludes should have brief documentation in the format of:
> The _name_ prelude.
>
> This includes the most common types in this crate, re-exported for
your convenience.
2. All documentation should be outer, not inner. (`///` instead of
`//!`.)
3. No prelude modules should be annotated with `#[doc(hidden)]`. (Items
within them may, though I'm not sure why this was done.)
## Testing
- I manually searched for the term `mod prelude` and updated all
occurrences by hand. 🫠
---------
Co-authored-by: Gino Valente <49806985+MrGVSV@users.noreply.github.com>
# Objective
As discussed in https://github.com/bevyengine/bevy/issues/7386, system
order ambiguities within `DefaultPlugins` are a source of bugs in the
engine and badly pollute diagnostic output for users.
We should eliminate them!
This PR is an alternative to #15027: with all external ambiguities
silenced, this should be much less prone to merge conflicts and the test
output should be much easier for authors to understand.
Note that system order ambiguities are still permitted in the
`RenderApp`: these need a bit of thought in terms of how to test them,
and will be fairly involved to fix. While these aren't *good*, they'll
generally only cause graphical bugs, not logic ones.
## Solution
All remaining system order ambiguities have been resolved.
Review this PR commit-by-commit to see how each of these problems were
fixed.
## Testing
`cargo run --example ambiguity_detection` passes with no panics or
logging!
# Objective
- Fixes#14974
## Solution
- Replace all* instances of `NonZero*` with `NonZero<*>`
## Testing
- CI passed locally.
---
## Notes
Within the `bevy_reflect` implementations for `std` types,
`impl_reflect_value!()` will continue to use the type aliases instead,
as it inappropriately parses the concrete type parameter as a generic
argument. If the `ZeroablePrimitive` trait was stable, or the macro
could be modified to accept a finite list of types, then we could fully
migrate.
# Objective
Fixes#14883
## Solution
Pretty simple update to `EntityCommands` methods to consume `self` and
return it rather than taking `&mut self`. The things probably worth
noting:
* I added `#[allow(clippy::should_implement_trait)]` to the `add` method
because it causes a linting conflict with `std::ops::Add`.
* `despawn` and `log_components` now return `Self`. I'm not sure if
that's exactly the desired behavior so I'm happy to adjust if that seems
wrong.
## Testing
Tested with `cargo run -p ci`. I think that should be sufficient to call
things good.
## Migration Guide
The most likely migration needed is changing code from this:
```
let mut entity = commands.get_or_spawn(entity);
if depth_prepass {
entity.insert(DepthPrepass);
}
if normal_prepass {
entity.insert(NormalPrepass);
}
if motion_vector_prepass {
entity.insert(MotionVectorPrepass);
}
if deferred_prepass {
entity.insert(DeferredPrepass);
}
```
to this:
```
let mut entity = commands.get_or_spawn(entity);
if depth_prepass {
entity = entity.insert(DepthPrepass);
}
if normal_prepass {
entity = entity.insert(NormalPrepass);
}
if motion_vector_prepass {
entity = entity.insert(MotionVectorPrepass);
}
if deferred_prepass {
entity.insert(DeferredPrepass);
}
```
as can be seen in several of the example code updates here. There will
probably also be instances where mutable `EntityCommands` vars no longer
need to be mutable.
# Objective
- Faster meshlet rasterization path for small triangles
- Avoid having to allocate and write out a triangle buffer
- Refactor gpu_scene.rs
## Solution
- Replace the 32bit visbuffer texture with a 64bit visbuffer buffer,
where the left 32 bits encode depth, and the right 32 bits encode the
existing cluster + triangle IDs. Can't use 64bit textures, wgpu/naga
doesn't support atomic ops on textures yet.
- Instead of writing out a buffer of packed cluster + triangle IDs (per
triangle) to raster, the culling pass now writes out a buffer of just
cluster IDs (per cluster, so less memory allocated, cheaper to write
out).
- Clusters for software raster are allocated from the left side
- Clusters for hardware raster are allocated in the same buffer, from
the right side
- The buffer size is fixed at MeshletPlugin build time, and should be
set to a reasonable value for your scene (no warning on overflow, and no
good way to determine what value you need outside of renderdoc - I plan
to fix this in a future PR adding a meshlet stats overlay)
- Currently I don't have a heuristic for software vs hardware raster
selection for each cluster. The existing code is just a placeholder. I
need to profile on a release scene and come up with a heuristic,
probably in a future PR.
- The culling shader is getting pretty hard to follow at this point, but
I don't want to spend time improving it as the entire shader/pass is
getting rewritten/replaced in the near future.
- Software raster is a compute workgroup per-cluster. Each workgroup
loads and transforms the <=64 vertices of the cluster, and then
rasterizes the <=64 triangles of the cluster.
- Two variants are implemented: Scanline for clusters with any larger
triangles (still smaller than hardware is good at), and brute-force for
very very tiny triangles
- Once the shader determines that a pixel should be filled in, it does
an atomicMax() on the visbuffer to store the results, copying how Nanite
works
- On devices with a low max workgroups per dispatch limit, an extra
compute pass is inserted before software raster to convert from a 1d to
2d dispatch (I don't think 3d would ever be necessary).
- I haven't implemented the top-left rule or subpixel precision yet, I'm
leaving that for a future PR since I get usable results without it for
now
- Resources used:
https://kristoffer-dyrkorn.github.io/triangle-rasterizer and chapters
6-8 of
https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index
- Hardware raster now spawns 64*3 vertex invocations per meshlet,
instead of the actual meshlet vertex count. Extra invocations just
early-exit.
- While this is slower than the existing system, hardware draws should
be rare now that software raster is usable, and it saves a ton of memory
using the unified cluster ID buffer. This would be fixed if wgpu had
support for mesh shaders.
- Instead of writing to a color+depth attachment, the hardware raster
pass also does the same atomic visbuffer writes that software raster
uses.
- We have to bind a dummy render target anyways, as wgpu doesn't
currently support render passes without any attachments
- Material IDs are no longer written out during the main rasterization
passes.
- If we had async compute queues, we could overlap the software and
hardware raster passes.
- New material and depth resolve passes run at the end of the visbuffer
node, and write out view depth and material ID depth textures
### Misc changes
- Fixed cluster culling importing, but never actually using the previous
view uniforms when doing occlusion culling
- Fixed incorrectly adding the LOD error twice when building the meshlet
mesh
- Splitup gpu_scene module into meshlet_mesh_manager, instance_manager,
and resource_manager
- resource_manager is still too complex and inefficient (extract and
prepare are way too expensive). I plan on improving this in a future PR,
but for now ResourceManager is mostly a 1:1 port of the leftover
MeshletGpuScene bits.
- Material draw passes have been renamed to the more accurate material
shade pass, as well as some other misc renaming (in the future, these
will be compute shaders even, and not actual draw calls)
---
## Migration Guide
- TBD (ask me at the end of the release for meshlet changes as a whole)
---------
Co-authored-by: vero <email@atlasdostal.com>
# Objective
Adding more features to `AsBindGroup` proc macro means making the trait
arguments uglier. Downstream implementors of the trait without the proc
macro might want to do different things than our default arguments.
## Solution
Make `AsBindGroup` take an associated `Param` type.
## Migration Guide
`AsBindGroup` now allows the user to specify a `SystemParam` to be used
for creating bind groups.
# Objective
- There is a flaw in the implementation of `FogVolume`'s
`density_texture_offset` from #14868. Because of the way I am wrapping
the UVW coordinates in the volumetric fog shader, a seam is visible when
the 3d texture is wrapping around from one side to the other:
![density_texture_offset_seam](https://github.com/user-attachments/assets/89527ef2-5e1b-4b90-8e73-7a3e607697d4)
## Solution
- This PR fixes the issue by removing the wrapping from the shader and
instead leaving it to the user to configure the 3d noise texture to use
`ImageAddressMode::Repeat` if they want it to repeat. Using
`ImageAddressMode::Repeat` is the proper solution to avoid the obvious
seam:
![density_texture_seam_fixed](https://github.com/user-attachments/assets/06e871a6-2db1-4501-b425-4141605f9b26)
- The sampler cannot be implicitly configured to use
`ImageAddressMode::Repeat` because that's not always desirable. For
example, the `fog_volumes` example wouldn't work properly because the
texture from the edges of the volume would overflow to the other sides,
which would be bad in this instance (but it's good in the case of the
`scrolling_fog` example). So leaving it to the user to decide on their
own whether they want the density texture to repeat seems to be the best
solution.
## Testing
- The `scrolling_fog` example still looks the same, it was just changed
to explicitly declare that the density texture should be repeating when
loading the asset. The `fog_volumes` example is unaffected.
<details>
<summary>Minimal reproduction example on current main</summary>
<pre>
use bevy::core_pipeline::experimental::taa::{TemporalAntiAliasBundle,
TemporalAntiAliasPlugin};
use bevy::pbr::{FogVolume, VolumetricFogSettings, VolumetricLight};
use bevy::prelude::*;<br>
fn main() {
App::new()
.add_plugins((DefaultPlugins, TemporalAntiAliasPlugin))
.add_systems(Startup, setup)
.run();
}<br>
fn setup(mut commands: Commands, assets: Res<AssetServer>) {
commands.spawn((
Camera3dBundle {
transform: Transform::from_xyz(3.5, -1.0, 0.4)
.looking_at(Vec3::new(0.0, 0.0, 0.4), Vec3::Y),
msaa: Msaa::Off,
..default()
},
TemporalAntiAliasBundle::default(),
VolumetricFogSettings {
ambient_intensity: 0.0,
jitter: 0.5,
..default()
},
));<br>
commands.spawn((
DirectionalLightBundle {
transform: Transform::from_xyz(-6.0, 5.0, -9.0)
.looking_at(Vec3::new(0.0, 0.0, 0.0), Vec3::Y),
directional_light: DirectionalLight {
illuminance: 32_000.0,
shadows_enabled: true,
..default()
},
..default()
},
VolumetricLight,
));<br>
commands.spawn((
SpatialBundle {
visibility: Visibility::Visible,
transform: Transform::from_xyz(0.0, 0.0,
0.0).with_scale(Vec3::splat(3.0)),
..default()
},
FogVolume {
density_texture: Some(assets.load("volumes/fog_noise.ktx2")),
density_texture_offset: Vec3::new(0.0, 0.0, 0.4),
scattering: 1.0,
..default()
},
));
}
</pre>
</details>
# Objective
- The goal of this PR is to make it possible to move the density texture
of a `FogVolume` over time in order to create dynamic effects like fog
moving in the wind.
- You could theoretically move the `FogVolume` itself, but this is not
ideal, because the `FogVolume` AABB would eventually leave the area. If
you want an area to remain foggy while also creating the impression that
the fog is moving in the wind, a scrolling density texture is a better
solution.
## Solution
- The PR adds a `density_texture_offset` field to the `FogVolume`
component. This offset is in the UVW coordinates of the density texture,
meaning that a value of `(0.5, 0.0, 0.0)` moves the 3d texture by half
along the x-axis.
- Values above 1.0 are wrapped, a 1.5 offset is the same as a 0.5
offset. This makes it so that the density texture wraps around on the
other side, meaning that a repeating 3d noise texture can seamlessly
scroll forever. It also makes it easy to move the density texture over
time by simply increasing the offset every frame.
## Testing
- A `scrolling_fog` example has been added to demonstrate the feature.
It uses the offset to scroll a repeating 3d noise density texture to
create the impression of fog moving in the wind.
- The camera is looking at a pillar with the sun peaking behind it. This
highlights the effect the changing density has on the volumetric
lighting interactions.
- Temporal anti-aliasing combined with the `jitter` option of
`VolumetricFogSettings` is used to improve the quality of the effect.
---
## Showcase
https://github.com/user-attachments/assets/3aa50ebd-771c-4c99-ab5d-255c0c3be1a8
# Objective
Fixes#14782
## Solution
Enable the lint and fix all upcoming hints (`--fix`). Also tried to
figure out the false-positive (see review comment). Maybe split this PR
up into multiple parts where only the last one enables the lint, so some
can already be merged resulting in less many files touched / less
potential for merge conflicts?
Currently, there are some cases where it might be easier to read the
code with the qualifier, so perhaps remove the import of it and adapt
its cases? In the current stage it's just a plain adoption of the
suggestions in order to have a base to discuss.
## Testing
`cargo clippy` and `cargo run -p ci` are happy.
# Objective
currently if we use an image with the wrong sampler type in a material,
wgpu panics with an invalid texture format. turn this into a warning and
fail more gracefully.
## Solution
the expected sampler type is specified in the AsBindGroup derive, so we
can just check the image sampler is what it should be.
i am not totally sure about the mapping of image sampler type to
#[sampler(type)], i assumed:
```
"filtering" => [ TextureSampleType::Float { filterable: true } ],
"non_filtering" => [
TextureSampleType::Float { filterable: false },
TextureSampleType::Sint,
TextureSampleType::Uint,
],
"comparison" => [ TextureSampleType::Depth ],
```
# Objective
Fixes#14365
## Migration Guide
- When using the iterator returned by `Mesh::attributes` or
`Mesh::attributes_mut` the first value of the tuple is not the
`MeshVertexAttribute` instead of `MeshVertexAttributeId`. To access the
`MeshVertexAttributeId` use the `MeshVertexAttribute.id` field.
Signed-off-by: Sarthak Singh <sarthak.singh99@gmail.com>
Basically it's https://github.com/bevyengine/bevy/pull/13792 with the
bumped versions of `encase` and `hexasphere`.
---------
Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
# Objective
- Fix issue #2611
## Solution
- Add `--generate-link-to-definition` to all the `rustdoc-args` arrays
in the `Cargo.toml`s (for docs.rs)
- Add `--generate-link-to-definition` to the `RUSTDOCFLAGS` environment
variable in the docs workflow (for dev-docs.bevyengine.org)
- Document all the workspace crates in the docs workflow (needed because
otherwise only the source code of the `bevy` package will be included,
making the argument useless)
- I think this also fixes#3662, since it fixes the bug on
dev-docs.bevyengine.org, while on docs.rs it has been fixed for a while
on their side.
---
## Changelog
- The source code viewer on docs.rs now includes links to the
definitions.
# Objective
- It's possible to have errors in a draw command, but these errors are
ignored
## Solution
- Return a result with the error
## Changelog
Renamed `RenderCommandResult::Failure` to `RenderCommandResult::Skip`
Added a `reason` string parameter to `RenderCommandResult::Failure`
## Migration Guide
If you were using `RenderCommandResult::Failure` to just ignore an error
and retry later, use `RenderCommandResult::Skip` instead.
This wasn't intentional, but this PR should also help with
https://github.com/bevyengine/bevy/issues/12660 since we can turn a few
unwraps into error messages now.
---------
Co-authored-by: Charlotte McElwain <charlotte.c.mcelwain@gmail.com>
The "uberbuffers" PR #14257 caused some examples to fail intermittently
for different reasons:
1. `morph_targets` could fail because vertex displacements for morph
targets are keyed off the vertex index. With buffer packing, the vertex
index can vary based on the position in the buffer, which caused the
morph targets to be potentially incorrect. The solution is to include
the first vertex index with the `MeshUniform` (and `MeshInputUniform` if
GPU preprocessing is in use), so that the shader can calculate the true
vertex index before performing the morph operation. This results in
wasted space in `MeshUniform`, which is unfortunate, but we'll soon be
filling in the padding with the ID of the material when bindless
textures land, so this had to happen sooner or later anyhow.
Including the vertex index in the `MeshInputUniform` caused an ordering
problem. The `MeshInputUniform` was created during the extraction phase,
before the allocations occurred, so the extraction logic didn't know
where the mesh vertex data was going to end up. The solution is to move
the `MeshInputUniform` creation (the `collect_meshes_for_gpu_building`
system) to after the allocations phase. This should be better for
parallelism anyhow, because it allows the extraction phase to finish
quicker. It's also something we'll have to do for bindless in any event.
2. The `lines` and `fog_volumes` examples could fail because their
custom drawing nodes weren't updated to supply the vertex and index
offsets in their `draw_indexed` and `draw` calls. This commit fixes this
oversight.
Fixes#14366.
Switches `Msaa` from being a globally configured resource to a per
camera view component.
Closes#7194
# Objective
Allow individual views to describe their own MSAA settings. For example,
when rendering to different windows or to different parts of the same
view.
## Solution
Make `Msaa` a component that is required on all camera bundles.
## Testing
Ran a variety of examples to ensure that nothing broke.
TODO:
- [ ] Make sure android still works per previous comment in
`extract_windows`.
---
## Migration Guide
`Msaa` is no longer configured as a global resource, and should be
specified on each spawned camera if a non-default setting is desired.
---------
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: François Mockers <francois.mockers@vleue.com>
# Objective
- Fixes: https://github.com/bevyengine/bevy/issues/14036
## Solution
- Add a world space transformation for the environment sample direction.
## Testing
- I have tested the newly added `transform` field using the newly added
`rotate_environment_map` example.
https://github.com/user-attachments/assets/2de77c65-14bc-48ee-b76a-fb4e9782dbdb
## Migration Guide
- Since we have added a new filed to the `EnvironmentMapLight` struct,
users will need to include `..default()` or some rotation value in their
initialization code.
This commit uses the [`offset-allocator`] crate to combine vertex and
index arrays from different meshes into single buffers. Since the
primary source of `wgpu` overhead is from validation and synchronization
when switching buffers, this significantly improves Bevy's rendering
performance on many scenes.
This patch is a more flexible version of #13218, which also used slabs.
Unlike #13218, which used slabs of a fixed size, this commit implements
slabs that start small and can grow. In addition to reducing memory
usage, supporting slab growth reduces the number of vertex and index
buffer switches that need to happen during rendering, leading to
improved performance. To prevent pathological fragmentation behavior,
slabs are capped to a maximum size, and mesh arrays that are too large
get their own dedicated slabs.
As an additional improvement over #13218, this commit allows the
application to customize all allocator heuristics. The
`MeshAllocatorSettings` resource contains values that adjust the minimum
and maximum slab sizes, the cutoff point at which meshes get their own
dedicated slabs, and the rate at which slabs grow. Hopefully-sensible
defaults have been chosen for each value.
Unfortunately, WebGL 2 doesn't support the *base vertex* feature, which
is necessary to pack vertex arrays from different meshes into the same
buffer. `wgpu` represents this restriction as the downlevel flag
`BASE_VERTEX`. This patch detects that bit and ensures that all vertex
buffers get dedicated slabs on that platform. Even on WebGL 2, though,
we can combine all *index* arrays into single buffers to reduce buffer
changes, and we do so.
The following measurements are on Bistro:
Overall frame time improves from 8.74 ms to 5.53 ms (1.58x speedup):
![Screenshot 2024-07-09
163521](https://github.com/bevyengine/bevy/assets/157897/5d83c824-c0ee-434c-bbaf-218ff7212c48)
Render system time improves from 6.57 ms to 3.54 ms (1.86x speedup):
![Screenshot 2024-07-09
163559](https://github.com/bevyengine/bevy/assets/157897/d94e2273-c3a0-496a-9f88-20d394129610)
Opaque pass time improves from 4.64 ms to 2.33 ms (1.99x speedup):
![Screenshot 2024-07-09
163536](https://github.com/bevyengine/bevy/assets/157897/e4ef6e48-d60e-44ae-9a71-b9a731c99d9a)
## Migration Guide
### Changed
* Vertex and index buffers for meshes may now be packed alongside other
buffers, for performance.
* `GpuMesh` has been renamed to `RenderMesh`, to reflect the fact that
it no longer directly stores handles to GPU objects.
* Because meshes no longer have their own vertex and index buffers, the
responsibility for the buffers has moved from `GpuMesh` (now called
`RenderMesh`) to the `MeshAllocator` resource. To access the vertex data
for a mesh, use `MeshAllocator::mesh_vertex_slice`. To access the index
data for a mesh, use `MeshAllocator::mesh_index_slice`.
[`offset-allocator`]: https://github.com/pcwalton/offset-allocator
Currently, volumetric fog is global and affects the entire scene
uniformly. This is inadequate for many use cases, such as local smoke
effects. To address this problem, this commit introduces *fog volumes*,
which are axis-aligned bounding boxes (AABBs) that specify fog
parameters inside their boundaries. Such volumes can also specify a
*density texture*, a 3D texture of voxels that specifies the density of
the fog at each point.
To create a fog volume, add a `FogVolume` component to an entity (which
is included in the new `FogVolumeBundle` convenience bundle). Like light
probes, a fog volume is conceptually a 1×1×1 cube centered on the
origin; a transform can be used to position and resize this region. Many
of the fields on the existing `VolumetricFogSettings` have migrated to
the new `FogVolume` component. `VolumetricFogSettings` on a camera is
still needed to enable volumetric fog. However, by itself
`VolumetricFogSettings` is no longer sufficient to enable volumetric
fog; a `FogVolume` must be present. Applications that wish to retain the
old global fog behavior can simply surround the scene with a large fog
volume.
By way of implementation, this commit converts the volumetric fog shader
from a full-screen shader to one applied to a mesh. The strategy is
different depending on whether the camera is inside or outside the fog
volume. If the camera is inside the fog volume, the mesh is simply a
plane scaled to the viewport, effectively falling back to a full-screen
pass. If the camera is outside the fog volume, the mesh is a cube
transformed to coincide with the boundaries of the fog volume's AABB.
Importantly, in the latter case, only the front faces of the cuboid are
rendered. Instead of treating the boundaries of the fog as a sphere
centered on the camera position, as we did prior to this patch, we
raytrace the far planes of the AABB to determine the portion of each ray
contained within the fog volume. We then raymarch in shadow map space as
usual. If a density texture is present, we modulate the fixed density
value with the trilinearly-interpolated value from that texture.
Furthermore, this patch introduces optional jitter to fog volumes,
intended for use with TAA. This modifies the position of the ray from
frame to frame using interleaved gradient noise, in order to reduce
aliasing artifacts. Many implementations of volumetric fog in games use
this technique. Note that this patch makes no attempt to write a motion
vector; this is because when a view ray intersects multiple voxels
there's no single direction of motion. Consequently, fog volumes can
have ghosting artifacts, but because fog is "ghostly" by its nature,
these artifacts are less objectionable than they would be for opaque
objects.
A new example, `fog_volumes`, has been added. It demonstrates a single
fog volume containing a voxelized representation of the Stanford bunny.
The existing `volumetric_fog` example has been updated to use the new
local volumetrics API.
## Changelog
### Added
* Local `FogVolume`s are now supported, to localize fog to specific
regions. They can optionally have 3D density voxel textures for precise
control over the distribution of the fog.
### Changed
* `VolumetricFogSettings` on a camera no longer enables volumetric fog;
instead, it simply enables the processing of `FogVolume`s within the
scene.
## Migration Guide
* A `FogVolume` is now necessary in order to enable volumetric fog, in
addition to `VolumetricFogSettings` on the camera. Existing uses of
volumetric fog can be migrated by placing a large `FogVolume`
surrounding the scene.
---------
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: François Mockers <mockersf@gmail.com>
# Objective
- Using bincode to deserialize binary into a MeshletMesh is expensive
(~77ms for a 5mb file).
## Solution
- Write a custom deserializer using bytemuck's Pod types and slice
casting.
- Total asset load time has gone from ~102ms to ~12ms.
- Change some types I never meant to be public to private and other misc
cleanup.
## Testing
- Ran the meshlet example and added timing spans to the asset loader.
---
## Changelog
- Improved `MeshletMesh` loading speed
- The `MeshletMesh` disk format has changed, and
`MESHLET_MESH_ASSET_VERSION` has been bumped
- `MeshletMesh` fields are now private
- Renamed `MeshletMeshSaverLoad` to `MeshletMeshSaverLoader`
- The `Meshlet`, `MeshletBoundingSpheres`, and `MeshletBoundingSphere`
types are now private
- Removed `MeshletMeshSaveOrLoadError::SerializationOrDeserialization`
- Added `MeshletMeshSaveOrLoadError::WrongFileType`
## Migration Guide
- Regenerate your `MeshletMesh` assets, as the disk format has changed,
and `MESHLET_MESH_ASSET_VERSION` has been bumped
- `MeshletMesh` fields are now private
- `MeshletMeshSaverLoad` is now named `MeshletMeshSaverLoader`
- The `Meshlet`, `MeshletBoundingSpheres`, and `MeshletBoundingSphere`
types are now private
- `MeshletMeshSaveOrLoadError::SerializationOrDeserialization` has been
removed
- Added `MeshletMeshSaveOrLoadError::WrongFileType`, match on this
variant if you match on `MeshletMeshSaveOrLoadError`
# Objective
- After #11804 , The queue_prepass_material_meshes function is now
executed in parallel with other queue_* systems. This optimization
introduced a potential issue where mesh_instance.should_batch() could
return false in queue_prepass_material_meshes due to an unset
material_bind_group_id.
# Objective
- After #13894, I noticed the performance of `many_lights `dropped from
120+ to 60+. I reviewed the PR but couldn't identify any mistakes. After
profiling, I discovered that `Hashmap::Clone `was very slow when its not
empty, causing `extract_light` to increase from 3ms to 8ms.
- Lighting only checks visibility for 3D Meshes. We don't need to
maintain a TypeIdMap for this, as it not only impacts performance
negatively but also reduces ergonomics.
## Solution
- use VisibleMeshEntities for lighint visibility checking.
## Performance
cargo run --release --example many_lights --features bevy/trace_tracy
name="bevy_pbr::light::check_point_light_mesh_visibility"}
![image](https://github.com/bevyengine/bevy/assets/45868716/8bad061a-f936-45a0-9bb9-4fbdaceec08b)
system{name="bevy_pbr::render::light::extract_lights"}
![image](https://github.com/bevyengine/bevy/assets/45868716/ca75b46c-b4ad-45d3-8c8d-66442447b753)
## Migration Guide
> now `SpotLightBundle` , `CascadesVisibleEntities `and
`CubemapVisibleEntities `use VisibleMeshEntities instead of
`VisibleEntities`
---------
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
# Objective
- Bevy currently has lot of invalid intra-doc links, let's fix them!
- Also make CI test them, to avoid future regressions.
- Helps with #1983 (but doesn't fix it, as there could still be explicit
links to docs.rs that are broken)
## Solution
- Make `cargo r -p ci -- doc-check` check fail on warnings (could also
be changed to just some specific lints)
- Manually fix all the warnings (note that in some cases it was unclear
to me what the fix should have been, I'll try to highlight them in a
self-review)
Bump version after release
This PR has been auto-generated
Co-authored-by: Bevy Auto Releaser <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: François Mockers <mockersf@gmail.com>
# Objective
Both `Material` and `MaterialExtension` (base and extension) can derive
Debug, so there's no reason to not allow `ExtendedMaterial` to derive it
## Solution
- Describe the solution used to achieve the objective above.
Add `Debug` to the list of derived traits
## Testing
- Did you test these changes? If so, how?
I compiled my test project on latest commit, making sure it actually
compiles
- How can other people (reviewers) test your changes? Is there anything
specific they need to know?
Create an ExtendedMaterial instance, try to `println!("{:?}",
material);`
Co-authored-by: NWPlayer123 <NWPlayer123@users.noreply.github.com>
# Objective
- Standard Material is starting to run out of samplers (currently uses
13 with no additional features off, I think in 0.13 it was 12).
- This change adds a new feature switch, modelled on the other ones
which add features to Standard Material, to turn off the new anisotropy
feature by default.
## Solution
- feature + texture define
## Testing
- Anisotropy example still works fine
- Other samples work fine
- Standard Material now takes 12 samplers by default on my Mac instead
of 13
## Migration Guide
- Add feature pbr_anisotropy_texture if you are using that texture in
any standard materials.
---------
Co-authored-by: John Payne <20407779+johngpayne@users.noreply.github.com>
# Objective
The `AssetReader` trait allows customizing the behavior of fetching
bytes for an `AssetPath`, and expects implementors to return `dyn
AsyncRead + AsyncSeek`. This gives implementors of `AssetLoader` great
flexibility to tightly integrate their asset loading behavior with the
asynchronous task system.
However, almost all implementors of `AssetLoader` don't use the async
functionality at all, and just call `AsyncReadExt::read_to_end(&mut
Vec<u8>)`. This is incredibly inefficient, as this method repeatedly
calls `poll_read` on the trait object, filling the vector 32 bytes at a
time. At my work we have assets that are hundreds of megabytes which
makes this a meaningful overhead.
## Solution
Turn the `Reader` type alias into an actual trait, with a provided
method `read_to_end`. This provided method should be more efficient than
the existing extension method, as the compiler will know the underlying
type of `Reader` when generating this function, which removes the
repeated dynamic dispatches and allows the compiler to make further
optimizations after inlining. Individual implementors are able to
override the provided implementation -- for simple asset readers that
just copy bytes from one buffer to another, this allows removing a large
amount of overhead from the provided implementation.
Now that `Reader` is an actual trait, I also improved the ergonomics for
implementing `AssetReader`. Currently, implementors are expected to box
their reader and return it as a trait object, which adds unnecessary
boilerplate to implementations. This PR changes that trait method to
return a pseudo trait alias, which allows implementors to return `impl
Reader` instead of `Box<dyn Reader>`. Now, the boilerplate for boxing
occurs in `ErasedAssetReader`.
## Testing
I made identical changes to my company's fork of bevy. Our app, which
makes heavy use of `read_to_end` for asset loading, still worked
properly after this. I am not aware if we have a more systematic way of
testing asset loading for correctness.
---
## Migration Guide
The trait method `bevy_asset::io::AssetReader::read` (and `read_meta`)
now return an opaque type instead of a boxed trait object. Implementors
of these methods should change the type signatures appropriately
```rust
impl AssetReader for MyReader {
// Before
async fn read<'a>(&'a self, path: &'a Path) -> Result<Box<Reader<'a>>, AssetReaderError> {
let reader = // construct a reader
Box::new(reader) as Box<Reader<'a>>
}
// After
async fn read<'a>(&'a self, path: &'a Path) -> Result<impl Reader + 'a, AssetReaderError> {
// create a reader
}
}
```
`bevy::asset::io::Reader` is now a trait, rather than a type alias for a
trait object. Implementors of `AssetLoader::load` will need to adjust
the method signature accordingly
```rust
impl AssetLoader for MyLoader {
async fn load<'a>(
&'a self,
// Before:
reader: &'a mut bevy::asset::io::Reader,
// After:
reader: &'a mut dyn bevy::asset::io::Reader,
_: &'a Self::Settings,
load_context: &'a mut LoadContext<'_>,
) -> Result<Self::Asset, Self::Error> {
}
```
Additionally, implementors of `AssetReader` that return a type
implementing `futures_io::AsyncRead` and `AsyncSeek` might need to
explicitly implement `bevy::asset::io::Reader` for that type.
```rust
impl bevy::asset::io::Reader for MyAsyncReadAndSeek {}
```
# Objective
- Fixes#14059
- `morphed_skinned_mesh_layout` is the same as
`morphed_skinned_motion_mesh_layout` but shouldn't have the skin / morph
from previous frame, as they're used for motion
## Solution
- Remove the extra entries
## Testing
- Run with the glTF file reproducing #14059, it works
As reported in #14004, many third-party plugins, such as Hanabi, enqueue
entities that don't have meshes into render phases. However, the
introduction of indirect mode added a dependency on mesh-specific data,
breaking this workflow. This is because GPU preprocessing requires that
the render phases manage indirect draw parameters, which don't apply to
objects that aren't meshes. The existing code skips over binned entities
that don't have indirect draw parameters, which causes the rendering to
be skipped for such objects.
To support this workflow, this commit adds a new field,
`non_mesh_items`, to `BinnedRenderPhase`. This field contains a simple
list of (bin key, entity) pairs. After drawing batchable and unbatchable
objects, the non-mesh items are drawn one after another. Bevy itself
doesn't enqueue any items into this list; it exists solely for the
application and/or plugins to use.
Additionally, this commit switches the asset ID in the standard bin keys
to be an untyped asset ID rather than that of a mesh. This allows more
flexibility, allowing bins to be keyed off any type of asset.
This patch adds a new example, `custom_phase_item`, which simultaneously
serves to demonstrate how to use this new feature and to act as a
regression test so this doesn't break again.
Fixes#14004.
## Changelog
### Added
* `BinnedRenderPhase` now contains a `non_mesh_items` field for plugins
to add custom items to.
The comment was incorrect - we are already looking at the pyramid
texture so we do not need to transform the size in any way. Doing that
resulted in a mip that was too fine to be selected in certain cases,
which resulted in a 2x2 pixel footprint not actually fully covering the
cluster sphere - sometimes this could lead to a non-conservative depth
value being computed which resulted in the cluster being marked as
invisible incorrectly.
This change updates meshopt-rs to 0.3 to take advantage of the newly
added sparse simplification mode: by default, simplifier assumes that
the entire mesh is simplified and runs a set of calculations that are
O(vertex count), but in our case we simplify many small mesh subsets
which is inefficient.
Sparse mode instead assumes that the simplified subset is only using a
portion of the vertex buffer, and optimizes accordingly. This changes
the meaning of the error (as it becomes relative to the subset, in our
case a meshlet group); to ensure consistent error selection, we also use
the ErrorAbsolute mode which allows us to operate in mesh coordinate
space.
Additionally, meshopt 0.3 runs optimizeMeshlet automatically as part of
`build_meshlets` so we no longer need to call it ourselves.
This reduces the time to build meshlet representation for Stanford Bunny
mesh from ~1.65s to ~0.45s (3.7x) in optimized builds.