Mirrors/bevy

mirror of https://github.com/bevyengine/bevy synced 2024-12-26 21:13:09 +00:00

Author	SHA1	Message	Date
Benjamin Brienen	29508f065f	Fix floating point math (#15239 ) # Objective - Fixes #15236 ## Solution - Use bevy_math::ops instead of std floating point operations. ## Testing - Did you test these changes? If so, how? Unit tests and `cargo run -p ci -- test` - How can other people (reviewers) test your changes? Is there anything specific they need to know? Execute `cargo run -p ci -- test` on Windows. - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? Windows ## Migration Guide - Not a breaking change - Projects should use bevy math where applicable --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: IQuick 143 <IQuick143cz@gmail.com> Co-authored-by: Joona Aalto <jondolf.dev@gmail.com>	2024-09-16 23:28:12 +00:00
Joona Aalto	afbbbd7335	Rename rendering components for improved consistency and clarity (#15035 ) # Objective The names of numerous rendering components in Bevy are inconsistent and a bit confusing. Relevant names include: - `AutoExposureSettings` - `AutoExposureSettingsUniform` - `BloomSettings` - `BloomUniform` (no `Settings`) - `BloomPrefilterSettings` - `ChromaticAberration` (no `Settings`) - `ContrastAdaptiveSharpeningSettings` - `DepthOfFieldSettings` - `DepthOfFieldUniform` (no `Settings`) - `FogSettings` - `SmaaSettings`, `Fxaa`, `TemporalAntiAliasSettings` (really inconsistent??) - `ScreenSpaceAmbientOcclusionSettings` - `ScreenSpaceReflectionsSettings` - `VolumetricFogSettings` Firstly, there's a lot of inconsistency between `Foo`/`FooSettings` and `FooUniform`/`FooSettingsUniform` and whether names are abbreviated or not. Secondly, the `Settings` post-fix seems unnecessary and a bit confusing semantically, since it makes it seem like the component is mostly just auxiliary configuration instead of the core thing that actually enables the feature. This will be an even bigger problem once bundles like `TemporalAntiAliasBundle` are deprecated in favor of required components, as users will expect a component named `TemporalAntiAlias` (or similar), not `TemporalAntiAliasSettings`. ## Solution Drop the `Settings` post-fix from the component names, and change some names to be more consistent. - `AutoExposure` - `AutoExposureUniform` - `Bloom` - `BloomUniform` - `BloomPrefilter` - `ChromaticAberration` - `ContrastAdaptiveSharpening` - `DepthOfField` - `DepthOfFieldUniform` - `DistanceFog` - `Smaa`, `Fxaa`, `TemporalAntiAliasing` (note: we might want to change to `Taa`, see "Discussion") - `ScreenSpaceAmbientOcclusion` - `ScreenSpaceReflections` - `VolumetricFog` I kept the old names as deprecated type aliases to make migration a bit less painful for users. We should remove them after the next release. (And let me know if I should just... 
not add them at all) I also added some very basic docs for a few types where they were missing, like on `Fxaa` and `DepthOfField`. ## Discussion - `TemporalAntiAliasing` is still inconsistent with `Smaa` and `Fxaa`. Consensus [on Discord](https://discord.com/channels/691052431525675048/743663924229963868/1280601167209955431) seemed to be that renaming to `Taa` would probably be fine, but I think it's a bit more controversial, and it would've required renaming a lot of related types like `TemporalAntiAliasNode`, `TemporalAntiAliasBundle`, and `TemporalAntiAliasPlugin`, so I think it's better to leave to a follow-up. - I think `Fog` should probably have a more specific name like `DistanceFog` considering it seems to be distinct from `VolumetricFog`. ~~This should probably be done in a follow-up though, so I just removed the `Settings` post-fix for now.~~ (done) --- ## Migration Guide Many rendering components have been renamed for improved consistency and clarity. - `AutoExposureSettings` → `AutoExposure` - `BloomSettings` → `Bloom` - `BloomPrefilterSettings` → `BloomPrefilter` - `ContrastAdaptiveSharpeningSettings` → `ContrastAdaptiveSharpening` - `DepthOfFieldSettings` → `DepthOfField` - `FogSettings` → `DistanceFog` - `SmaaSettings` → `Smaa` - `TemporalAntiAliasSettings` → `TemporalAntiAliasing` - `ScreenSpaceAmbientOcclusionSettings` → `ScreenSpaceAmbientOcclusion` - `ScreenSpaceReflectionsSettings` → `ScreenSpaceReflections` - `VolumetricFogSettings` → `VolumetricFog` --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2024-09-10 01:11:46 +00:00
JMS55	a0faf9cd01	More triangles/vertices per meshlet (#15023 ) ### Builder changes - Increased meshlet max vertices/triangles from 64v/64t to 255v/128t (meshoptimizer won't allow 256v sadly). This gives us a much greater percentage of meshlets with max triangle count (128). Still not perfect, we still end up with some tiny <=10 triangle meshlets that never really get simplified, but it's progress. - Removed the error target limit. Now we allow meshoptimizer to simplify as much as possible. No reason to cap this out, as the cluster culling code will choose a good LOD level anyways. Again leads to higher quality LOD trees. - After some discussion and consulting the Nanite slides again, changed meshlet group error from _adding_ the max child's error to the group error, to doing `group_error = max(group_error, max_child_error)`. Error is already cumulative between LODs as the edges we're collapsing during simplification get longer each time. - Bumped the 65% simplification threshold to allow up to 95% of the original geometry (e.g. accept simplification as valid even if we only simplified 5% of the triangles). This gives us closer to log2(initial_meshlet_count) LOD levels, and fewer meshlet roots in the DAG. Still more work to be done in the future here. Maybe trying METIS for meshlet building instead of meshoptimizer. Using ~8 clusters per group instead of ~4 might also make a big difference. The Nanite slides say that they have 8-32 meshlets per group, suggesting some kind of heuristic. Unfortunately meshopt's compute_cluster_bounds won't work with large groups atm (https://github.com/zeux/meshoptimizer/discussions/750#discussioncomment-10562641) so hard to test. Based on discussion from https://github.com/bevyengine/bevy/discussions/14998, https://github.com/zeux/meshoptimizer/discussions/750, and discord. 
### Runtime changes - cluster:triangle packed IDs are now stored 25:7 instead of 26:6 bits, as max triangles per cluster are now 128 instead of 64 - Hardware raster now spawns 128 * 3 vertices instead of 64 * 3 vertices to account for the new max triangles limit - Hardware raster now outputs NaN triangles (0 / 0) instead of zero-positioned triangles for extra vertex invocations over the cluster triangle count. Shouldn't really be a difference idt, but I did it anyways. - Software raster now does 128 threads per workgroup instead of 64 threads. Each thread now loads, projects, and caches a vertex (vertices 0-127), and then if needed does so again (vertices 128-254). Each thread then rasterizes one of 128 triangles. - Fixed a bug with `needs_dispatch_remap`. I had the condition backwards in my last PR, I probably committed it by accident after testing the non-default code path on my GPU.	2024-09-08 17:55:57 +00:00
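A minimal sketch of the 25:7 cluster:triangle packing described in the runtime changes above. It assumes the cluster ID occupies the upper 25 bits and the triangle ID the lower 7 bits of a `u32`; the function and constant names are hypothetical and not taken from the Bevy source.

```rust
/// Low bits reserved for the triangle index (up to 128 triangles needs 7 bits).
const TRIANGLE_BITS: u32 = 7;
/// Mask selecting only the triangle bits.
const TRIANGLE_MASK: u32 = (1 << TRIANGLE_BITS) - 1;

/// Packs a cluster ID (assumed upper 25 bits) and a triangle ID (assumed lower 7 bits).
fn pack_cluster_triangle(cluster_id: u32, triangle_id: u32) -> u32 {
    debug_assert!(cluster_id < (1 << (32 - TRIANGLE_BITS)));
    debug_assert!(triangle_id <= TRIANGLE_MASK);
    (cluster_id << TRIANGLE_BITS) | triangle_id
}

/// Splits a packed value back into `(cluster_id, triangle_id)`.
fn unpack_cluster_triangle(packed: u32) -> (u32, u32) {
    (packed >> TRIANGLE_BITS, packed & TRIANGLE_MASK)
}
```

Moving one bit from the cluster field to the triangle field halves the number of addressable clusters (2^25 instead of 2^26) in exchange for doubling the triangles per cluster.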
Zachary Harrold	bc13161416	Migrated `NonZero` to `NonZero<>` (#14978 ) # Objective - Fixes #14974 ## Solution - Replace all* instances of `NonZero` with `NonZero<>` ## Testing - CI passed locally. --- ## Notes Within the `bevy_reflect` implementations for `std` types, `impl_reflect_value!()` will continue to use the type aliases instead, as it inappropriately parses the concrete type parameter as a generic argument. If the `ZeroablePrimitive` trait was stable, or the macro could be modified to accept a finite list of types, then we could fully migrate.	2024-08-30 02:37:47 +00:00
JMS55	6cc96f4c1f	Meshlet software raster + start of cleanup (#14623 ) # Objective - Faster meshlet rasterization path for small triangles - Avoid having to allocate and write out a triangle buffer - Refactor gpu_scene.rs ## Solution - Replace the 32bit visbuffer texture with a 64bit visbuffer buffer, where the left 32 bits encode depth, and the right 32 bits encode the existing cluster + triangle IDs. Can't use 64bit textures, wgpu/naga doesn't support atomic ops on textures yet. - Instead of writing out a buffer of packed cluster + triangle IDs (per triangle) to raster, the culling pass now writes out a buffer of just cluster IDs (per cluster, so less memory allocated, cheaper to write out). - Clusters for software raster are allocated from the left side - Clusters for hardware raster are allocated in the same buffer, from the right side - The buffer size is fixed at MeshletPlugin build time, and should be set to a reasonable value for your scene (no warning on overflow, and no good way to determine what value you need outside of renderdoc - I plan to fix this in a future PR adding a meshlet stats overlay) - Currently I don't have a heuristic for software vs hardware raster selection for each cluster. The existing code is just a placeholder. I need to profile on a release scene and come up with a heuristic, probably in a future PR. - The culling shader is getting pretty hard to follow at this point, but I don't want to spend time improving it as the entire shader/pass is getting rewritten/replaced in the near future. - Software raster is a compute workgroup per-cluster. Each workgroup loads and transforms the <=64 vertices of the cluster, and then rasterizes the <=64 triangles of the cluster. 
- Two variants are implemented: Scanline for clusters with any larger triangles (still smaller than hardware is good at), and brute-force for very very tiny triangles - Once the shader determines that a pixel should be filled in, it does an atomicMax() on the visbuffer to store the results, copying how Nanite works - On devices with a low max workgroups per dispatch limit, an extra compute pass is inserted before software raster to convert from a 1d to 2d dispatch (I don't think 3d would ever be necessary). - I haven't implemented the top-left rule or subpixel precision yet, I'm leaving that for a future PR since I get usable results without it for now - Resources used: https://kristoffer-dyrkorn.github.io/triangle-rasterizer and chapters 6-8 of https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index - Hardware raster now spawns 64*3 vertex invocations per meshlet, instead of the actual meshlet vertex count. Extra invocations just early-exit. - While this is slower than the existing system, hardware draws should be rare now that software raster is usable, and it saves a ton of memory using the unified cluster ID buffer. This would be fixed if wgpu had support for mesh shaders. - Instead of writing to a color+depth attachment, the hardware raster pass also does the same atomic visbuffer writes that software raster uses. - We have to bind a dummy render target anyways, as wgpu doesn't currently support render passes without any attachments - Material IDs are no longer written out during the main rasterization passes. - If we had async compute queues, we could overlap the software and hardware raster passes. 
- New material and depth resolve passes run at the end of the visbuffer node, and write out view depth and material ID depth textures ### Misc changes - Fixed cluster culling importing, but never actually using the previous view uniforms when doing occlusion culling - Fixed incorrectly adding the LOD error twice when building the meshlet mesh - Splitup gpu_scene module into meshlet_mesh_manager, instance_manager, and resource_manager - resource_manager is still too complex and inefficient (extract and prepare are way too expensive). I plan on improving this in a future PR, but for now ResourceManager is mostly a 1:1 port of the leftover MeshletGpuScene bits. - Material draw passes have been renamed to the more accurate material shade pass, as well as some other misc renaming (in the future, these will be compute shaders even, and not actual draw calls) --- ## Migration Guide - TBD (ask me at the end of the release for meshlet changes as a whole) --------- Co-authored-by: vero <email@atlasdostal.com>	2024-08-26 17:54:34 +00:00
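The 64-bit visbuffer write described in this commit can be sketched on the CPU as follows. This is an illustration only, not the actual WGSL shader: it assumes a non-negative depth where larger values are nearer the camera (reverse-Z), so that comparing the `f32` bit patterns with an atomic max keeps the nearest fragment. All names here are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Depth goes in the upper 32 bits, packed cluster + triangle IDs in the lower 32 bits.
/// Assumption: depth is non-negative, so its bit pattern orders the same way as its value.
fn pack_visbuffer(depth: f32, packed_ids: u32) -> u64 {
    ((depth.to_bits() as u64) << 32) | packed_ids as u64
}

/// Keeps whichever fragment has the larger depth (assumed nearest under reverse-Z).
fn write_pixel(pixel: &AtomicU64, depth: f32, packed_ids: u32) {
    pixel.fetch_max(pack_visbuffer(depth, packed_ids), Ordering::Relaxed);
}

/// Reads back `(depth, packed_ids)` for a pixel.
fn read_pixel(pixel: &AtomicU64) -> (f32, u32) {
    let value = pixel.load(Ordering::Relaxed);
    (f32::from_bits((value >> 32) as u32), value as u32)
}
```

Because depth sits in the most significant bits, a single atomic max performs the depth test and the ID write together, which is why no separate depth attachment is needed for this pass.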
Sarthak Singh	2c4ef37b76	Changed `Mesh::attributes*` functions to return `MeshVertexAttribute` (#14394 ) # Objective Fixes #14365 ## Migration Guide - When using the iterator returned by `Mesh::attributes` or `Mesh::attributes_mut` the first value of the tuple is now the `MeshVertexAttribute` instead of `MeshVertexAttributeId`. To access the `MeshVertexAttributeId` use the `MeshVertexAttribute.id` field. Signed-off-by: Sarthak Singh <sarthak.singh99@gmail.com>	2024-08-12 15:54:28 +00:00
Jan Hohenheim	6f7c554daa	Fix common capitalization errors in documentation (#14562 ) WASM -> Wasm MacOS -> macOS Nothing important, just something that annoyed me for a while :)	2024-07-31 21:16:05 +00:00
charlotte	abaea01e30	Fixup Msaa docs. (#14442 ) Minor doc fixes missed in #14273	2024-07-22 21:37:25 +00:00
Patrick Walton	d235d41af1	Fix the example regressions from packed growable buffers. (#14375 ) The "uberbuffers" PR #14257 caused some examples to fail intermittently for different reasons: 1. `morph_targets` could fail because vertex displacements for morph targets are keyed off the vertex index. With buffer packing, the vertex index can vary based on the position in the buffer, which caused the morph targets to be potentially incorrect. The solution is to include the first vertex index with the `MeshUniform` (and `MeshInputUniform` if GPU preprocessing is in use), so that the shader can calculate the true vertex index before performing the morph operation. This results in wasted space in `MeshUniform`, which is unfortunate, but we'll soon be filling in the padding with the ID of the material when bindless textures land, so this had to happen sooner or later anyhow. Including the vertex index in the `MeshInputUniform` caused an ordering problem. The `MeshInputUniform` was created during the extraction phase, before the allocations occurred, so the extraction logic didn't know where the mesh vertex data was going to end up. The solution is to move the `MeshInputUniform` creation (the `collect_meshes_for_gpu_building` system) to after the allocations phase. This should be better for parallelism anyhow, because it allows the extraction phase to finish quicker. It's also something we'll have to do for bindless in any event. 2. The `lines` and `fog_volumes` examples could fail because their custom drawing nodes weren't updated to supply the vertex and index offsets in their `draw_indexed` and `draw` calls. This commit fixes this oversight. Fixes #14366.	2024-07-22 18:55:51 +00:00
charlotte	03fd1b46ef	Move `Msaa` to component (#14273 ) Switches `Msaa` from being a globally configured resource to a per camera view component. Closes #7194 # Objective Allow individual views to describe their own MSAA settings. For example, when rendering to different windows or to different parts of the same view. ## Solution Make `Msaa` a component that is required on all camera bundles. ## Testing Ran a variety of examples to ensure that nothing broke. TODO: - [ ] Make sure android still works per previous comment in `extract_windows`. --- ## Migration Guide `Msaa` is no longer configured as a global resource, and should be specified on each spawned camera if a non-default setting is desired. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: François Mockers <francois.mockers@vleue.com>	2024-07-22 18:28:23 +00:00
Sou1gh0st	9da18cce2a	Add support for environment map transformation (#14290 ) # Objective - Fixes: https://github.com/bevyengine/bevy/issues/14036 ## Solution - Add a world space transformation for the environment sample direction. ## Testing - I have tested the newly added `transform` field using the newly added `rotate_environment_map` example. https://github.com/user-attachments/assets/2de77c65-14bc-48ee-b76a-fb4e9782dbdb ## Migration Guide - Since we have added a new field to the `EnvironmentMapLight` struct, users will need to include `..default()` or some rotation value in their initialization code.	2024-07-19 15:00:50 +00:00
JMS55	6e8d43a037	Faster MeshletMesh deserialization (#14193 ) # Objective - Using bincode to deserialize binary into a MeshletMesh is expensive (~77ms for a 5mb file). ## Solution - Write a custom deserializer using bytemuck's Pod types and slice casting. - Total asset load time has gone from ~102ms to ~12ms. - Change some types I never meant to be public to private and other misc cleanup. ## Testing - Ran the meshlet example and added timing spans to the asset loader. --- ## Changelog - Improved `MeshletMesh` loading speed - The `MeshletMesh` disk format has changed, and `MESHLET_MESH_ASSET_VERSION` has been bumped - `MeshletMesh` fields are now private - Renamed `MeshletMeshSaverLoad` to `MeshletMeshSaverLoader` - The `Meshlet`, `MeshletBoundingSpheres`, and `MeshletBoundingSphere` types are now private - Removed `MeshletMeshSaveOrLoadError::SerializationOrDeserialization` - Added `MeshletMeshSaveOrLoadError::WrongFileType` ## Migration Guide - Regenerate your `MeshletMesh` assets, as the disk format has changed, and `MESHLET_MESH_ASSET_VERSION` has been bumped - `MeshletMesh` fields are now private - `MeshletMeshSaverLoad` is now named `MeshletMeshSaverLoader` - The `Meshlet`, `MeshletBoundingSpheres`, and `MeshletBoundingSphere` types are now private - `MeshletMeshSaveOrLoadError::SerializationOrDeserialization` has been removed - Added `MeshletMeshSaveOrLoadError::WrongFileType`, match on this variant if you match on `MeshletMeshSaveOrLoadError`	2024-07-15 15:06:02 +00:00
Joseph	5876352206	Optimize common usages of `AssetReader` (#14082 ) # Objective The `AssetReader` trait allows customizing the behavior of fetching bytes for an `AssetPath`, and expects implementors to return `dyn AsyncRead + AsyncSeek`. This gives implementors of `AssetLoader` great flexibility to tightly integrate their asset loading behavior with the asynchronous task system. However, almost all implementors of `AssetLoader` don't use the async functionality at all, and just call `AsyncReadExt::read_to_end(&mut Vec<u8>)`. This is incredibly inefficient, as this method repeatedly calls `poll_read` on the trait object, filling the vector 32 bytes at a time. At my work we have assets that are hundreds of megabytes which makes this a meaningful overhead. ## Solution Turn the `Reader` type alias into an actual trait, with a provided method `read_to_end`. This provided method should be more efficient than the existing extension method, as the compiler will know the underlying type of `Reader` when generating this function, which removes the repeated dynamic dispatches and allows the compiler to make further optimizations after inlining. Individual implementors are able to override the provided implementation -- for simple asset readers that just copy bytes from one buffer to another, this allows removing a large amount of overhead from the provided implementation. Now that `Reader` is an actual trait, I also improved the ergonomics for implementing `AssetReader`. Currently, implementors are expected to box their reader and return it as a trait object, which adds unnecessary boilerplate to implementations. This PR changes that trait method to return a pseudo trait alias, which allows implementors to return `impl Reader` instead of `Box<dyn Reader>`. Now, the boilerplate for boxing occurs in `ErasedAssetReader`. ## Testing I made identical changes to my company's fork of bevy. 
Our app, which makes heavy use of `read_to_end` for asset loading, still worked properly after this. I am not aware if we have a more systematic way of testing asset loading for correctness. --- ## Migration Guide The trait method `bevy_asset::io::AssetReader::read` (and `read_meta`) now return an opaque type instead of a boxed trait object. Implementors of these methods should change the type signatures appropriately ```rust impl AssetReader for MyReader { // Before async fn read<'a>(&'a self, path: &'a Path) -> Result<Box<Reader<'a>>, AssetReaderError> { let reader = // construct a reader Box::new(reader) as Box<Reader<'a>> } // After async fn read<'a>(&'a self, path: &'a Path) -> Result<impl Reader + 'a, AssetReaderError> { // create a reader } } ``` `bevy::asset::io::Reader` is now a trait, rather than a type alias for a trait object. Implementors of `AssetLoader::load` will need to adjust the method signature accordingly ```rust impl AssetLoader for MyLoader { async fn load<'a>( &'a self, // Before: reader: &'a mut bevy::asset::io::Reader, // After: reader: &'a mut dyn bevy::asset::io::Reader, _: &'a Self::Settings, load_context: &'a mut LoadContext<'_>, ) -> Result<Self::Asset, Self::Error> { } ``` Additionally, implementors of `AssetReader` that return a type implementing `futures_io::AsyncRead` and `AsyncSeek` might need to explicitly implement `bevy::asset::io::Reader` for that type. ```rust impl bevy::asset::io::Reader for MyAsyncReadAndSeek {} ```	2024-07-01 19:59:42 +00:00
Arseny Kapoulkine	9148847589	Fix incorrect computation of mips for cluster occlusion lookup (#14042 ) The comment was incorrect - we are already looking at the pyramid texture so we do not need to transform the size in any way. Doing that resulted in a mip that was too fine to be selected in certain cases, which resulted in a 2x2 pixel footprint not actually fully covering the cluster sphere - sometimes this could lead to a non-conservative depth value being computed which resulted in the cluster being marked as invisible incorrectly.	2024-06-27 05:57:01 +00:00
Arseny Kapoulkine	4cd188568a	Improve MeshletMesh::from_mesh performance further (#14038 ) This change updates meshopt-rs to 0.3 to take advantage of the newly added sparse simplification mode: by default, simplifier assumes that the entire mesh is simplified and runs a set of calculations that are O(vertex count), but in our case we simplify many small mesh subsets which is inefficient. Sparse mode instead assumes that the simplified subset is only using a portion of the vertex buffer, and optimizes accordingly. This changes the meaning of the error (as it becomes relative to the subset, in our case a meshlet group); to ensure consistent error selection, we also use the ErrorAbsolute mode which allows us to operate in mesh coordinate space. Additionally, meshopt 0.3 runs optimizeMeshlet automatically as part of `build_meshlets` so we no longer need to call it ourselves. This reduces the time to build meshlet representation for Stanford Bunny mesh from ~1.65s to ~0.45s (3.7x) in optimized builds.	2024-06-27 00:06:22 +00:00
JMS55	158ccc6d6a	Fix meshlet interactions with regular shading passes (#13816 ) * Fixes https://github.com/bevyengine/bevy/issues/13813 * Fixes https://github.com/bevyengine/bevy/issues/13810 Tested a combined scene with both regular meshes and meshlet meshes with: * Regular forward setup * Forward + normal/motion vector prepasses * Deferred (with depth prepass since that's required) * Deferred + depth/normal/motion vector prepasses Still broken: * Using meshlet meshes rendering in deferred and regular meshes rendering in forward + depth/normal prepass. I don't know how to fix this at the moment, so for now I've just added instructions to not mix them.	2024-06-21 19:06:08 +00:00
Arseny Kapoulkine	6eec73a9a5	Make meshlet processing deterministic (#13913 ) This is a followup to https://github.com/bevyengine/bevy/pull/13904 based on the discussion there, and switches two HashMaps that used meshlet ids as keys to Vec. In addition to a small further performance boost for `from_mesh` (1.66s => 1.60s), this makes processing deterministic modulo threading issues wrt CRT rand described in the linked PR. This is valuable for debugging, as you can visually or programmatically inspect the meshlet distribution before/after making changes that should not change the output, whereas previously every asset rebuild would change the meshlet structure. Tested with https://github.com/bevyengine/bevy/pull/13431; after this change, the visual output of meshlets is consistent between asset rebuilds, and the MD5 of the output GLB file does not change either, which was not the case before.	2024-06-20 00:58:43 +00:00
Arseny Kapoulkine	001cc147c6	Improve MeshletMesh::from_mesh performance (#13904 ) This change reworks `find_connected_meshlets` to scale more linearly with the mesh size, which significantly reduces the cost of building meshlet representations. As a small extra complexity reduction, it moves `simplify_scale` call out of the loop so that it's called once (it only depends on the vertex data => is safe to cache). The new implementation of connectivity analysis builds edge=>meshlet list data structure, which allows us to only iterate through `tuple_combinations` of a (usually) small list. There is still some redundancy as if two meshlets share two edges, they will be represented in the meshlet lists twice, but it's overall much faster. Since the hash traversal is non-deterministic, to keep this part of the algorithm deterministic for reproducible results we sort the output adjacency lists. Overall this reduces the time to process bunny mesh from ~4.2s to ~1.7s when using release; in unoptimized builds the delta is even more significant. This was tested by using https://github.com/bevyengine/bevy/pull/13431 and: a) comparing the result of `find_connected_meshlets` using old and new code; they are equal in all steps of the clustering process b) comparing the rendered result of the old code vs new code after making the rest of the algorithm deterministic: right now the loop that iterates through the result of `group_meshlets()` call executes in different order between program runs. This is orthogonal to this change and can be fixed separately. Note: a future change can shrink the processing time further from ~1.7s to ~0.4s with a small diff but that requires an update to meshopt crate which is pending in https://github.com/gwihlidal/meshopt-rs/pull/42. This change is independent.	2024-06-18 08:29:17 +00:00
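The edge=>meshlet approach described in this commit can be sketched roughly as below. This is a simplified illustration, not the code from the PR: it records only whether two meshlets share an edge (no shared-edge counts), represents a meshlet as a list of triangles, and uses a `BTreeSet` instead of an explicit sort to get deterministic, sorted adjacency lists.

```rust
use std::collections::{BTreeSet, HashMap};

/// For each meshlet, returns the sorted list of meshlets sharing at least one edge with it.
/// `meshlets[i]` is the triangle list (vertex indices) of meshlet `i` (hypothetical layout).
fn find_connected_meshlets(meshlets: &[Vec<[u32; 3]>]) -> Vec<Vec<usize>> {
    // Map each undirected edge (min vertex, max vertex) to the meshlets that use it.
    let mut edge_to_meshlets: HashMap<(u32, u32), Vec<usize>> = HashMap::new();
    for (meshlet_id, triangles) in meshlets.iter().enumerate() {
        for tri in triangles {
            for i in 0..3 {
                let (a, b) = (tri[i], tri[(i + 1) % 3]);
                let users = edge_to_meshlets.entry((a.min(b), a.max(b))).or_default();
                // Meshlets are visited in order, so checking the last entry avoids duplicates.
                if users.last() != Some(&meshlet_id) {
                    users.push(meshlet_id);
                }
            }
        }
    }

    // Only meshlets listed under the same edge need to be paired up.
    let mut adjacency: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); meshlets.len()];
    for users in edge_to_meshlets.values() {
        for (i, &a) in users.iter().enumerate() {
            for &b in &users[i + 1..] {
                adjacency[a].insert(b);
                adjacency[b].insert(a);
            }
        }
    }
    adjacency.into_iter().map(|set| set.into_iter().collect()).collect()
}
```

Since most edges are used by only one or two meshlets, the pairing loop touches short lists, rather than comparing every meshlet against every other one.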
Jan Hohenheim	6273227e09	Fix lints introduced in Rust beta 1.80 (#13899 ) Resolves #13895 Mostly just involves being more explicit about which parts of the docs belong to a list and which begin a new paragraph. - found a few docs that were malformed because of exactly this, so I fixed that by introducing a paragraph - added indentation to nearly all multiline lists - fixed a few minor typos - added `#[allow(dead_code)]` to types that are needed to test annotations but are never constructed ([here](https://github.com/bevyengine/bevy/pull/13899/files#diff-b02b63604e569c8577c491e7a2030d456886d8f6716eeccd46b11df8aac75dafR1514) and [here](https://github.com/bevyengine/bevy/pull/13899/files#diff-b02b63604e569c8577c491e7a2030d456886d8f6716eeccd46b11df8aac75dafR1523)) - verified that `cargo +beta run -p ci -- lints` passes - verified that `cargo +beta run -p ci -- test` passes	2024-06-17 17:22:01 +00:00
JMS55	fd30e0c67d	Fix meshlet vertex attribute interpolation (#13775 ) # Objective - Mikktspace requires that we normalize world normals/tangents _before_ interpolation across vertices, and then do _not_ normalize after. I had it backwards. - We do not (am not supposed to?) need a second set of barycentrics for motion vectors. If you think about the typical raster pipeline, in the vertex shader we calculate previous_world_position, and then it gets interpolated using the current triangle's barycentrics. ## Solution - Fix normal/tangent processing - Reuse barycentrics for motion vector calculations - Not implementing this for 0.14, but long term I aim to remove explicit vertex tangents and calculate them in the shader on the fly. ## Testing - I tested out some of the normal maps we have in repo. Didn't seem to make a difference, but mikktspace is all about correctness across various baking tools. I probably just didn't have any of the ones that would cause it to break. - Didn't test motion vectors as there's a known bug with the depth buffer and meshlets that I'm waiting on the render graph rewrite to fix.	2024-06-10 20:18:43 +00:00
JMS55	50ee483665	Meshlet misc (#13761 ) - Copy module docs so that they show up in the re-export - Change meshlet_id to cluster_id in the debug visualization - Small doc tweaks	2024-06-10 13:06:08 +00:00
JMS55	175e146228	Misc meshlet changes (#13705 ) * Rename cull_meshlets -> cull_clusters * Rename meshlet_visible -> cluster_visible * Add an if statement around meshlet_second_pass_candidates writes, maybe a small bit of performance.	2024-06-06 17:10:07 +00:00
Ricky Taylor	9b9d3d81cb	Normalise matrix naming (#13489 ) # Objective - Fixes #10909 - Fixes #8492 ## Solution - Name all matrices `x_from_y`, for example `world_from_view`. ## Testing - I've tested most of the 3D examples. The `lighting` example particularly should hit a lot of the changes and appears to run fine. --- ## Changelog - Renamed matrices across the engine to follow a `y_from_x` naming, making the space conversion more obvious. ## Migration Guide - `Frustum`'s `from_view_projection`, `from_view_projection_custom_far` and `from_view_projection_no_far` were renamed to `from_clip_from_world`, `from_clip_from_world_custom_far` and `from_clip_from_world_no_far`. - `ComputedCameraValues::projection_matrix` was renamed to `clip_from_view`. - `CameraProjection::get_projection_matrix` was renamed to `get_clip_from_view` (this affects implementations on `Projection`, `PerspectiveProjection` and `OrthographicProjection`). - `ViewRangefinder3d::from_view_matrix` was renamed to `from_world_from_view`. - `PreviousViewData`'s members were renamed to `view_from_world` and `clip_from_world`. - `ExtractedView`'s `projection`, `transform` and `view_projection` were renamed to `clip_from_view`, `world_from_view` and `clip_from_world`. - `ViewUniform`'s `view_proj`, `unjittered_view_proj`, `inverse_view_proj`, `view`, `inverse_view`, `projection` and `inverse_projection` were renamed to `clip_from_world`, `unjittered_clip_from_world`, `world_from_clip`, `world_from_view`, `view_from_world`, `clip_from_view` and `view_from_clip`. - `GpuDirectionalCascade::view_projection` was renamed to `clip_from_world`. - `MeshTransforms`' `transform` and `previous_transform` were renamed to `world_from_local` and `previous_world_from_local`. 
- `MeshUniform`'s `transform`, `previous_transform`, `inverse_transpose_model_a` and `inverse_transpose_model_b` were renamed to `world_from_local`, `previous_world_from_local`, `local_from_world_transpose_a` and `local_from_world_transpose_b` (the `Mesh` type in WGSL mirrors this, however `transform` and `previous_transform` were named `model` and `previous_model`). - `Mesh2dTransforms::transform` was renamed to `world_from_local`. - `Mesh2dUniform`'s `transform`, `inverse_transpose_model_a` and `inverse_transpose_model_b` were renamed to `world_from_local`, `local_from_world_transpose_a` and `local_from_world_transpose_b` (the `Mesh2d` type in WGSL mirrors this). - In WGSL, in `bevy_pbr::mesh_functions`, `get_model_matrix` and `get_previous_model_matrix` were renamed to `get_world_from_local` and `get_previous_world_from_local`. - In WGSL, `bevy_sprite::mesh2d_functions::get_model_matrix` was renamed to `get_world_from_local`.	2024-06-03 16:56:53 +00:00
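To illustrate why the `target_from_source` naming helps, here is a small self-contained sketch using 2x2 row-major matrices as a stand-in for the engine's 4x4 types. The `Mat2` alias and helper functions are hypothetical and exist only for this example.

```rust
/// Row-major 2x2 matrix, a toy stand-in for a 4x4 transform.
type Mat2 = [[f32; 2]; 2];

/// Standard matrix product `a * b`.
fn mul(a: Mat2, b: Mat2) -> Mat2 {
    let mut out = [[0.0; 2]; 2];
    for r in 0..2 {
        for c in 0..2 {
            for k in 0..2 {
                out[r][c] += a[r][k] * b[k][c];
            }
        }
    }
    out
}

/// Applies a matrix to a column vector.
fn transform(m: Mat2, v: [f32; 2]) -> [f32; 2] {
    [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]
}

/// Adjacent names cancel: clip_from_view * view_from_world = clip_from_world.
fn clip_from_world(clip_from_view: Mat2, view_from_world: Mat2) -> Mat2 {
    mul(clip_from_view, view_from_world)
}
```

Reading right to left, a world-space point is first taken into view space and then into clip space, and a mismatched pair such as `clip_from_view * world_from_local` stands out as an error when the inner names do not line up.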
JMS55	5536079945	Meshlet single pass depth downsampling (SPD) (#13003 ) # Objective - Using multiple raster passes to generate the depth pyramid is extremely slow - Pulling data from the source image is the largest bottleneck, it's important to sample in a cache-aware pattern - Barriers and pipeline drain between the raster passes is the second largest bottleneck - Each separate RenderPass on the CPU is _really_ expensive ## Solution - Port [FidelityFX SPD](https://gpuopen.com/fidelityfx-spd) to WGSL, replacing meshlet's existing multiple raster passes with ~~a single~~ two compute dispatches. Lack of coherent buffers means we have to do the last 64x64 tile from mip 7+ in a separate dispatch to ensure the mip 6 writes were flushed :( - Workgroup shared memory version only at the moment, as the subgroup operation is blocked by our upgrade to wgpu 0.20 #13186 - Don't enforce a power-of-2 depth pyramid texture size, simply scaling by 0.5 is fine	2024-06-03 12:41:14 +00:00
Aevyrie	b45786df41	Add Skybox Motion Vectors (#13617 ) # Objective - Add motion vector support to the skybox - This fixes the last remaining "gap" to complete the motion blur feature ## Solution - Add a pipeline for the skybox to write motion vectors to the prepass ## Testing - Used examples to test motion vectors using motion blur https://github.com/bevyengine/bevy/assets/2632925/74c0778a-7e77-4e68-8111-05791e4bfdd2 --------- Co-authored-by: Patrick Walton <pcwalton@mimiga.net>	2024-06-02 16:09:28 +00:00
Patrick Walton	f398674e51	Implement opt-in sharp screen-space reflections for the deferred renderer, with improved raymarching code. (#13418 ) This commit, a revamp of #12959, implements screen-space reflections (SSR), which approximate real-time reflections based on raymarching through the depth buffer and copying samples from the final rendered frame. This patch is a relatively minimal implementation of SSR, so as to provide a flexible base on which to customize and build in the future. However, it's based on the production-quality [raymarching code by Tomasz Stachowiak](https://gist.github.com/h3r2tic/9c8356bdaefbe80b1a22ae0aaee192db). For a general basic overview of screen-space reflections, see [1](https://lettier.github.io/3d-game-shaders-for-beginners/screen-space-reflection.html). The raymarching shader uses the basic algorithm of tracing forward in large steps, refining that trace in smaller increments via binary search, and then using the secant method. No temporal filtering or roughness blurring is performed at all; for this reason, SSR currently only operates on very shiny surfaces. No acceleration via the hierarchical Z-buffer is implemented (though note that https://github.com/bevyengine/bevy/pull/12899 will add the infrastructure for this). Reflections are traced at full resolution, which is often considered slow. All of these improvements and more can be follow-ups. SSR is built on top of the deferred renderer and is currently only supported in that mode. Forward screen-space reflections are possible albeit uncommon (though e.g. Doom Eternal uses them); however, they require tracing from the previous frame, which would add complexity. This patch leaves the door open to implementing SSR in the forward rendering path but doesn't itself have such an implementation. 
Screen-space reflections aren't supported in WebGL 2, because they require sampling from the depth buffer, which Naga can't do because of a bug (`sampler2DShadow` is incorrectly generated instead of `sampler2D`; this is the same reason why depth of field is disabled on that platform). To add screen-space reflections to a camera, use the `ScreenSpaceReflectionsBundle` bundle or the `ScreenSpaceReflectionsSettings` component. In addition to `ScreenSpaceReflectionsSettings`, `DepthPrepass` and `DeferredPrepass` must also be present for the reflections to show up. The `ScreenSpaceReflectionsSettings` component contains several settings that artists can tweak, and also comes with sensible defaults. A new example, `ssr`, has been added. It's loosely based on the [three.js ocean sample](https://threejs.org/examples/webgl_shaders_ocean.html), but all the assets are original. Note that the three.js demo has no screen-space reflections and instead renders a mirror world. In contrast to #12959, this demo tests not only a cube but also a more complex model (the flight helmet). ## Changelog ### Added * Screen-space reflections can be enabled for very smooth surfaces by adding the `ScreenSpaceReflections` component to a camera. Deferred rendering must be enabled for the reflections to appear. ![Screenshot 2024-05-18 143555](https://github.com/bevyengine/bevy/assets/157897/b8675b39-8a89-433e-a34e-1b9ee1233267) ![Screenshot 2024-05-18 143606](https://github.com/bevyengine/bevy/assets/157897/cc9e1cd0-9951-464a-9a08-e589210e5606)	2024-05-27 13:43:40 +00:00
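The three-stage search named in the commit above (large forward steps, binary-search refinement, then the secant method) can be illustrated on a one-dimensional problem. This is a rough sketch under stated assumptions, not the shader from the commit: `find_hit` and its parameters are hypothetical, and `f(t)` stands for "ray depth minus scene depth" at ray parameter `t`, negative while the ray is in front of the surface.

```rust
/// Finds the parameter `t` in `(0, max_t]` where `f` first changes from negative
/// to non-negative, or `None` if no coarse sample crosses the surface.
fn find_hit(
    f: impl Fn(f32) -> f32,
    max_t: f32,
    coarse_steps: u32,
    bisect_steps: u32,
) -> Option<f32> {
    // Stage 1: march forward in large steps until the sign changes.
    let step = max_t / coarse_steps as f32;
    let mut lo = 0.0;
    let mut hi = None;
    for i in 1..=coarse_steps {
        let t = step * i as f32;
        if f(t) >= 0.0 {
            hi = Some(t);
            break;
        }
        lo = t;
    }
    let mut hi = hi?;

    // Stage 2: binary search inside the bracketing interval [lo, hi].
    for _ in 0..bisect_steps {
        let mid = 0.5 * (lo + hi);
        if f(mid) >= 0.0 {
            hi = mid;
        } else {
            lo = mid;
        }
    }

    // Stage 3: one secant step, i.e. linear interpolation between the bracket ends.
    let (f_lo, f_hi) = (f(lo), f(hi));
    let denom = f_hi - f_lo;
    if denom.abs() < f32::EPSILON {
        return Some(0.5 * (lo + hi));
    }
    Some(lo - f_lo * (hi - lo) / denom)
}
```

For example, `find_hit(|t| t - 0.37, 1.0, 8, 4)` brackets the crossing with eight coarse steps and narrows it with four bisections; because this `f` is linear, the final secant step lands on the crossing up to rounding.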
IceSentry	aa907d5437	Remove unnecessary .view_layouts (#13394 ) # Objective - The volumetric fog PR originally needed to be modified to use `.view_layouts` but that was changed in another PR. The merge with main still kept those around. ## Solution - Remove them because they aren't necessary	2024-05-16 19:12:36 +00:00
Patrick Walton	19bfa41768	Implement volumetric fog and volumetric lighting, also known as light shafts or god rays. (#13057 ) This commit implements a more physically-accurate, but slower, form of fog than the `bevy_pbr::fog` module does. Notably, this volumetric fog allows for light beams from directional lights to shine through, creating what is known as light shafts or god rays. To add volumetric fog to a scene, add `VolumetricFogSettings` to the camera, and add `VolumetricLight` to directional lights that you wish to be volumetric. `VolumetricFogSettings` has numerous settings that allow you to define the accuracy of the simulation, as well as the look of the fog. Currently, only interaction with directional lights that have shadow maps is supported. Note that the overhead of the effect scales directly with the number of directional lights in use, so apply `VolumetricLight` sparingly for the best results. The overall algorithm, which is implemented as a postprocessing effect, is a combination of the techniques described in [Scratchapixel] and [this blog post]. It uses raymarching in screen space, transformed into shadow map space for sampling and combined with physically-based modeling of absorption and scattering. Bevy employs the widely-used [Henyey-Greenstein phase function] to model asymmetry; this essentially allows light shafts to fade into and out of existence as the user views them. Volumetric rendering is a huge subject, and I deliberately kept the scope of this commit small. Possible follow-ups include: 1. Raymarching at a lower resolution. 2. A post-processing blur (especially useful when combined with (1)). 3. Supporting point lights and spot lights. 4. Supporting lights with no shadow maps. 5. Supporting irradiance volumes and reflection probes. 6. Voxel components that reuse the volumetric fog code to create voxel shapes. 7. Horizon: Zero Dawn-style clouds. 
These are all useful, but out of scope of this patch for now, to keep things tidy and easy to review. A new example, `volumetric_fog`, has been added to demonstrate the effect. ## Changelog ### Added * A new component, `VolumetricFog`, is available, to allow for a more physically-accurate, but more resource-intensive, form of fog. * A new component, `VolumetricLight`, can be placed on directional lights to make them interact with `VolumetricFog`. Notably, this allows such lights to emit light shafts/god rays. ![Screenshot 2024-04-21 162808](https://github.com/bevyengine/bevy/assets/157897/7a1fc81d-eed5-4735-9419-286c496391a9) ![Screenshot 2024-04-21 132005](https://github.com/bevyengine/bevy/assets/157897/e6d3b5ca-8f59-488d-a3de-15e95aaf4995) [Scratchapixel]: https://www.scratchapixel.com/lessons/3d-basic-rendering/volume-rendering-for-developers/intro-volume-rendering.html [this blog post]: https://www.alexandre-pestana.com/volumetric-lights/ [Henyey-Greenstein phase function]: https://www.pbr-book.org/4ed/Volume_Scattering/Phase_Functions#TheHenyeyndashGreensteinPhaseFunction	2024-05-16 17:13:18 +00:00
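For readers unfamiliar with the Henyey-Greenstein phase function mentioned in the commit above, here is a small sketch of its usual closed form, $p(\cos\theta) = \frac{1 - g^2}{4\pi\,(1 + g^2 - 2g\cos\theta)^{3/2}}$. It is not taken from Bevy's shader: the function name and the sign convention for `cos_theta` are assumptions, and some references (including the linked PBR book) define the angle with the opposite sign.

```rust
use std::f32::consts::PI;

/// Henyey-Greenstein phase function.
///
/// `cos_theta` is the cosine of the angle between the direction the light is
/// travelling and the direction it is scattered into (assumed convention).
/// `g` in (-1, 1) is the asymmetry parameter: 0 is isotropic, positive values
/// favour forward scattering, negative values favour backward scattering.
fn henyey_greenstein(cos_theta: f32, g: f32) -> f32 {
    let denom = 1.0 + g * g - 2.0 * g * cos_theta;
    // denom * sqrt(denom) is denom raised to the power 3/2.
    (1.0 - g * g) / (4.0 * PI * denom * denom.sqrt())
}
```

With `g = 0.0` the function returns `1 / (4π)` for every angle; with a positive `g`, looking toward the light (`cos_theta` near 1) gives a much larger value than looking away from it, which is what makes shafts appear brighter from some viewpoints than others.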
JMS55	77ebabc4fe	Meshlet remove per-cluster data upload (#13125 ) # Objective - Per-cluster (instance of a meshlet) data upload is ridiculously expensive in both CPU and GPU time (8 bytes per cluster, millions of clusters, you very quickly run into PCIE bandwidth maximums, and lots of CPU-side copies and malloc). - We need to be uploading only per-instance/entity data. Anything else needs to be done on the GPU. ## Solution - Per instance, upload: - `meshlet_instance_meshlet_counts_prefix_sum` - An exclusive prefix sum over the count of how many clusters each instance has. - `meshlet_instance_meshlet_slice_starts` - The starting index of the meshlets for each instance within the `meshlets` buffer. - A new `fill_cluster_buffers` pass once at the start of the frame has a thread per cluster, and finds its instance ID and meshlet ID via a binary search of `meshlet_instance_meshlet_counts_prefix_sum` to find what instance it belongs to, and then uses that plus `meshlet_instance_meshlet_slice_starts` to find what number meshlet within the instance it is. The shader then writes out the per-cluster instance/meshlet ID buffers for later passes to quickly read from. - I've gone from 45 -> 180 FPS in my stress test scene, and saved ~30ms/frame of overall CPU/GPU time.	2024-05-04 19:56:19 +00:00
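The lookup that `fill_cluster_buffers` performs per cluster, as described in the commit above, can be sketched on the CPU. This is only an illustration in Rust, not the WGSL from the commit: the function names are hypothetical, and the assumption that the meshlet ID equals the instance's slice start plus the cluster's index within the instance is inferred from the description.

```rust
/// Builds an exclusive prefix sum: entry `i` is the total cluster count of all
/// instances before instance `i`.
fn exclusive_prefix_sum(counts: &[u32]) -> Vec<u32> {
    let mut sum = 0;
    counts
        .iter()
        .map(|&count| {
            let start = sum;
            sum += count;
            start
        })
        .collect()
}

/// Maps a global cluster index to `(instance_id, meshlet_id)`.
///
/// `cluster_id` must be smaller than the total number of clusters.
fn locate_cluster(
    cluster_id: u32,
    counts_prefix_sum: &[u32],
    slice_starts: &[u32],
) -> (usize, u32) {
    // Binary search: count the leading entries that start at or before this
    // cluster. The last of them is the owning instance, which also skips
    // instances that have zero clusters.
    let instance_id = counts_prefix_sum.partition_point(|&start| start <= cluster_id) - 1;
    // Position of this cluster among its instance's clusters.
    let index_in_instance = cluster_id - counts_prefix_sum[instance_id];
    // Offset into the shared `meshlets` buffer (assumed semantics).
    let meshlet_id = slice_starts[instance_id] + index_in_instance;
    (instance_id, meshlet_id)
}
```

Only two `u32` values per instance are needed as input, which is the point of the change: the per-cluster IDs are derived on the GPU instead of being uploaded.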
BD103	45bb6253e2	Restore dragons to their seat of power (#13124 ) # Objective - There is an unfortunate lack of dragons in the meshlet docs. - Dragons are symbolic of majesty, power, storms, and meshlets. - A dragon habitat such as our docs requires cultivation to ensure each winged lizard reaches their fullest, fiery selves. ## Solution - Fix the link to the dragon image. - The link originally targeted the `meshlet` branch, but that was later deleted after it was merged into `main`. --- ## Changelog - Added a dragon back into the `MeshletPlugin` documentation.	2024-04-28 07:20:16 +00:00
JMS55	e1a0da0fa6	Meshlet LOD-compatible two-pass occlusion culling (#12898 ) Keeping track of explicit visibility per cluster between frames does not work with LODs, and leads to worse culling (using the final depth buffer from the previous frame is more accurate). Instead, we need to generate a second depth pyramid after the second raster pass, and then use that in the first culling pass in the next frame to test if a cluster would have been visible last frame or not. As part of these changes, the write_index_buffer pass has been folded into the culling pass for a large performance gain, and to avoid tracking a lot of extra state that would be needed between passes. Prepass previous model/view stuff was adapted to work with meshlets as well. Also fixed a bug with materials, and other misc improvements. --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com> Co-authored-by: vero <email@atlasdostal.com> Co-authored-by: Patrick Walton <pcwalton@mimiga.net> Co-authored-by: Robert Swain <robert.swain@gmail.com>	2024-04-28 05:30:20 +00:00
JMS55	6d6810c90d	Meshlet continuous LOD (#12755 ) Adds a basic level of detail system to meshlets. An extremely brief summary is as follows: * In `from_mesh.rs`, once we've built the first level of clusters, we group clusters, simplify the new mega-clusters, and then split the simplified groups back into regular sized clusters. Repeat several times (ideally until you can't anymore). This forms a directed acyclic graph (DAG), where the children are the meshlets from the previous level, and the parents are the more simplified versions of their children. The leaf nodes are meshlets formed from the original mesh. * In `cull_meshlets.wgsl`, each cluster selects whether to render or not based on the LOD bounding sphere (different than the culling bounding sphere) of the current meshlet, the LOD bounding sphere of its parent (the meshlet group from simplification), and the simplification error relative to its children of both the current meshlet and its parent meshlet. This kind of breaks two pass occlusion culling, which will be fixed in a future PR by using an HZB from the previous frame to get the initial list of occluders. Many, _many_ improvements to be done in the future https://github.com/bevyengine/bevy/issues/11518, not least of which is code quality and speed. I don't even expect this to work on many types of input meshes. This is just a basic implementation/draft for collaboration. Arguable how much we want to do in this PR, I'll leave that up to maintainers. I've erred on the side of "as basic as possible". 
References: * Slides 27-77 (video available on youtube) https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf * https://blog.traverseresearch.nl/creating-a-directed-acyclic-graph-from-a-mesh-1329e57286e5 * https://jglrxavpok.github.io/2024/01/19/recreating-nanite-lod-generation.html, https://jglrxavpok.github.io/2024/03/12/recreating-nanite-faster-lod-generation.html, https://jglrxavpok.github.io/2024/04/02/recreating-nanite-runtime-lod-selection.html, and https://github.com/jglrxavpok/Carrot * https://github.com/gents83/INOX/tree/master/crates/plugins/binarizer/src * https://cs418.cs.illinois.edu/website/text/nanite.html ![image](https://github.com/bevyengine/bevy/assets/47158642/e40bff9b-7d0c-4a19-a3cc-2aad24965977) ![image](https://github.com/bevyengine/bevy/assets/47158642/442c7da3-7761-4da7-9acd-37f15dd13e26) --------- Co-authored-by: Ricky Taylor <rickytaylor26@gmail.com> Co-authored-by: vero <email@atlasdostal.com> Co-authored-by: François <mockersf@gmail.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com> Co-authored-by: Patrick Walton <pcwalton@mimiga.net>	2024-04-23 21:43:53 +00:00
Brezak	f68bc01544	Run `CheckVisibility` after all the other visibility system sets have… (#12962 ) # Objective Make visibility system ordering explicit. Fixes #12953. ## Solution Specify `CheckVisibility` happens after all other `VisibilitySystems` sets have happened. --------- Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>	2024-04-18 20:33:29 +00:00
BD103	7b8d502083	Fix beta lints (#12980 ) # Objective - Fixes #12976 ## Solution This one is a doozy. - Run `cargo +beta clippy --workspace --all-targets --all-features` and fix all issues - This includes: - Moving inner attributes to be outer attributes, when the item in question has both inner and outer attributes - Use `ptr::from_ref` in more scenarios - Extend the valid idents list used by `clippy::doc_markdown` with more names - Use `Clone::clone_from` when possible - Remove redundant `ron` import - Add backticks to so many identifiers and items - I'm sorry whoever has to review this --- ## Changelog - Added links to more identifiers in documentation.	2024-04-16 02:46:46 +00:00
Patrick Walton	8577a448f7	Fix rendering of sprites, text, and meshlets after #12582 . (#12945 ) `Sprite`, `Text`, and `Handle<MeshletMesh>` were types of renderable entities that the new segregated visible entity system didn't handle, so they didn't appear. Because `bevy_text` depends on `bevy_sprite`, and the visibility computation of text happens in the latter crate, I had to introduce a new marker component, `SpriteSource`. `SpriteSource` marks entities that aren't themselves sprites but become sprites during rendering. I added this component to `Text2dBundle`. Unfortunately, this is technically a breaking change, although I suspect it won't break anybody in practice except perhaps editors. Fixes #12935. ## Changelog ### Changed * `Text2dBundle` now includes a new marker component, `SpriteSource`. Bevy uses this internally to optimize visibility calculation. ## Migration Guide * `Text` now requires a `SpriteSource` marker component in order to appear. This component has been added to `Text2dBundle`.	2024-04-13 14:15:00 +00:00
Patrick Walton	d59b1e71ef	Implement percentage-closer filtering (PCF) for point lights. (#12910 ) I ported the two existing PCF techniques to the cubemap domain as best I could. Generally, the technique is to create a 2D orthonormal basis using Gram-Schmidt normalization, then apply the technique over that basis. The results look fine, though the shadow bias often needs adjusting. For comparison, Unity uses a 4-tap pattern for PCF on point lights of (1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1). I tried this but didn't like the look, so I went with the design above, which ports the 2D techniques to the 3D domain. There's surprisingly little material on point light PCF. I've gone through every example using point lights and verified that the shadow maps look fine, adjusting biases as necessary. Fixes #3628. --- ## Changelog ### Added * Shadows from point lights now support percentage-closer filtering (PCF), and as a result look less aliased. ### Changed * `ShadowFilteringMethod::Castano13` and `ShadowFilteringMethod::Jimenez14` have been renamed to `ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal` respectively. ## Migration Guide * `ShadowFilteringMethod::Castano13` and `ShadowFilteringMethod::Jimenez14` have been renamed to `ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal` respectively.	2024-04-10 20:16:08 +00:00
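The basis construction mentioned in the commit above can be sketched as follows: given the direction from the light to the shaded point, build two perpendicular unit vectors spanning the plane orthogonal to it, then apply the 2D filter taps along those two vectors. This is an illustrative Rust version, not the shader from the commit; the helper-axis choice and all function names are assumptions.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn scale(a: Vec3, s: f32) -> Vec3 {
    [a[0] * s, a[1] * s, a[2] * s]
}

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn normalize(a: Vec3) -> Vec3 {
    scale(a, 1.0 / dot(a, a).sqrt())
}

/// Returns two unit vectors perpendicular to `dir` (non-zero) and to each other.
fn orthonormal_basis(dir: Vec3) -> (Vec3, Vec3) {
    let n = normalize(dir);
    // Pick a helper axis that is not nearly parallel to `n`.
    let helper = if n[1].abs() < 0.999 {
        [0.0, 1.0, 0.0]
    } else {
        [1.0, 0.0, 0.0]
    };
    // Gram-Schmidt step: subtract the helper's component along `n`.
    let u = normalize(sub(helper, scale(n, dot(helper, n))));
    // The cross product completes the basis.
    let v = cross(n, u);
    (u, v)
}
```

A 2D sample offset `(x, y)` then becomes the 3D lookup direction `dir + u * x + v * y`, which is how a planar PCF kernel can be reused for a cubemap shadow lookup.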
Robert Swain	ab7cbfa8fc	Consolidate Render(Ui)Materials(2d) into RenderAssets (#12827 ) # Objective - Replace `RenderMaterials` / `RenderMaterials2d` / `RenderUiMaterials` with `RenderAssets` to enable implementing changes to one thing, `RenderAssets`, that applies to all use cases rather than duplicating changes everywhere for multiple things that should be one thing. - Adopts #8149 ## Solution - Make RenderAsset generic over the destination type rather than the source type as in #8149 - Use `RenderAssets<PreparedMaterial<M>>` etc for render materials --- ## Changelog - Changed: - The `RenderAsset` trait is now implemented on the destination type. Its `SourceAsset` associated type refers to the type of the source asset. - `RenderMaterials`, `RenderMaterials2d`, and `RenderUiMaterials` have been replaced by `RenderAssets<PreparedMaterial<M>>` and similar. ## Migration Guide - `RenderAsset` is now implemented for the destination type rather than the source asset type. The source asset type is now the `RenderAsset` trait's `SourceAsset` associated type.	2024-04-09 13:26:34 +00:00
JMS55	31b5943ad4	Add previous_view_uniforms.inverse_view (#12902 ) # Objective - Upload previous frame's inverse_view matrix to the GPU for use with https://github.com/bevyengine/bevy/pull/12898. --- ## Changelog - Added `prepass_bindings::previous_view_uniforms.inverse_view`. - Renamed `prepass_bindings::previous_view_proj` to `prepass_bindings::previous_view_uniforms.view_proj`. - Renamed `PreviousViewProjectionUniformOffset` to `PreviousViewUniformOffset`. - Renamed `PreviousViewProjection` to `PreviousViewData`. ## Migration Guide - Renamed `prepass_bindings::previous_view_proj` to `prepass_bindings::previous_view_uniforms.view_proj`. - Renamed `PreviousViewProjectionUniformOffset` to `PreviousViewUniformOffset`. - Renamed `PreviousViewProjection` to `PreviousViewData`.	2024-04-07 18:59:16 +00:00
James Liu	a4ed1b88b8	Relax BufferVec's type constraints (#12866 ) # Objective Since BufferVec was first introduced, `bytemuck` has added additional traits with fewer restrictions than `Pod`. Within BufferVec, we only rely on the constraints of `bytemuck::cast_slice` to a `u8` slice, which now only requires `T: NoUninit` which is a strict superset of `Pod` types. ## Solution Change out the `Pod` generic type constraint with `NoUninit`. Also taking the opportunity to substitute `cast_slice` with `must_cast_slice`, which replaces a runtime panic with a compile-time failure if `T` cannot be used. --- ## Changelog Changed: `BufferVec` now supports working with types containing `NoUninit` but not `Pod` members. Changed: `BufferVec` will now fail to compile if used with a type that cannot be safely read from. Most notably, this includes ZSTs, which would previously always panic at runtime.	2024-04-05 02:11:41 +00:00
Cameron	01649f13e2	Refactor `App` and `SubApp` internals for better separation (#9202 ) # Objective This is a necessary precursor to #9122 (this was split from that PR to reduce the amount of code to review all at once). Moving `!Send` resource ownership to `App` will make it unambiguously `!Send`. `SubApp` must be `Send`, so it can't wrap `App`. ## Solution Refactor `App` and `SubApp` to not have a recursive relationship. Since `SubApp` no longer wraps `App`, once `!Send` resources are moved out of `World` and into `App`, `SubApp` will become unambiguously `Send`. There could be less code duplication between `App` and `SubApp`, but that would break `App` method chaining. ## Changelog - `SubApp` no longer wraps `App`. - `App` fields are no longer publicly accessible. - `App` can no longer be converted into a `SubApp`. - Various methods now return references to a `SubApp` instead of an `App`. ## Migration Guide - To construct a sub-app, use `SubApp::new()`. `App` can no longer convert into `SubApp`. - If you implemented a trait for `App`, you may want to implement it for `SubApp` as well. - If you're accessing `app.world` directly, you now have to use `app.world()` and `app.world_mut()`. - `App::sub_app` now returns `&SubApp`. - `App::sub_app_mut` now returns `&mut SubApp`. - `App::get_sub_app` now returns `Option<&SubApp>.` - `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`	2024-03-31 03:16:10 +00:00
James Liu	e62a01f403	Make PersistentGpuBufferable a safe trait (#12744 ) # Objective Fixes #12727. All parts that `PersistentGpuBuffer` interacts with should be 100% safe both on the CPU and the GPU: `Queue::write_buffer_with` zeroes out the slice being written to and when uploading to the GPU, and all slice writes are bounds checked on the CPU side. ## Solution Make `PersistentGpuBufferable` a safe trait. Enforce its correct implementation via assertions. Re-enable `forbid(unsafe_code)` on `bevy_pbr`.	2024-03-29 13:14:34 +00:00
James Liu	56bcbb0975	Forbid unsafe in most crates in the engine (#12684 ) # Objective Resolves #3824. `unsafe` code should be the exception, not the norm in Rust. It's obviously needed for various use cases as it's interfacing with platforms and essentially running the borrow checker at runtime in the ECS, but one of the touted benefits of Bevy is that we are able to heavily leverage Rust's safety, and we should be holding ourselves accountable to that by minimizing our unsafe footprint. ## Solution Deny `unsafe_code` workspace wide. Add explicit exceptions for the following crates, and forbid it in almost all of the others. * bevy_ecs - Obvious given how much unsafe is needed to achieve performant results * bevy_ptr - Works with raw pointers, even more low level than bevy_ecs. * bevy_render - due to needing to integrate with wgpu * bevy_window - due to needing to integrate with raw_window_handle * bevy_utils - Several unsafe utilities used by bevy_ecs. Ideally moved into bevy_ecs instead of made publicly usable. * bevy_reflect - Required for the unsafe type casting it's doing. * bevy_transform - for the parallel transform propagation * bevy_gizmos - For the SystemParam impls it has. * bevy_assets - To support reflection. Might not be required, not 100% sure yet. * bevy_mikktspace - due to being a conversion from a C library. Pending safe rewrite. * bevy_dynamic_plugin - Inherently unsafe due to the dynamic loading nature. Several uses of unsafe were rewritten, as they did not need to be using them: * bevy_text - a case of `Option::unchecked` could be rewritten as a normal for loop and match instead of an iterator. * bevy_color - the Pod/Zeroable implementations were replaceable with bytemuck's derive macros.	2024-03-27 03:30:08 +00:00
JMS55	4f20faaa43	Meshlet rendering (initial feature) (#10164 ) # Objective - Implements a more efficient, GPU-driven (https://github.com/bevyengine/bevy/issues/1342) rendering pipeline based on meshlets. - Meshes are split into small clusters of triangles called meshlets, each of which acts as a mini index buffer into the larger mesh data. Meshlets can be compressed, streamed, culled, and batched much more efficiently than monolithic meshes. ![image](https://github.com/bevyengine/bevy/assets/47158642/cb2aaad0-7a9a-4e14-93b0-15d4e895b26a) ![image](https://github.com/bevyengine/bevy/assets/47158642/7534035b-1eb7-4278-9b99-5322e4401715) # Misc * Future work: https://github.com/bevyengine/bevy/issues/11518 * Nanite reference: https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf Two pass occlusion culling explained very well: https://medium.com/@mil_kru/two-pass-occlusion-culling-4100edcad501 --------- Co-authored-by: Ricky Taylor <rickytaylor26@gmail.com> Co-authored-by: vero <email@atlasdostal.com> Co-authored-by: François <mockersf@gmail.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com>	2024-03-25 19:08:27 +00:00

43 commits