Mirrors/bevy

mirror of https://github.com/bevyengine/bevy synced 2024-11-22 12:43:34 +00:00

Author	SHA1	Message	Date
moonlightaria	3f2cc244d7	Add color conversions #13224 (#13276) # Objective Fixes #13224. Adds conversions for Vec3 and Vec4, since these appear so often. ## Solution Added Covert trait (couldn't think of good name) for [f32; 4], [f32; 3], Vec4, and Vec3, along with the symmetric implementation. ## Changelog Added conversions between arrays and vectors to colors and vice versa. ## Migration Guide LinearRgba appears to have already had implicit conversions for [f32; 4] and Vec4.	2024-05-09 18:01:52 +00:00
Patrick Walton	0dddfa07ab	Fix the WebGL 2 backend by giving the `visibility_ranges` array a fixed length. (#13210 ) WebGL 2 doesn't support variable-length uniform buffer arrays. So we arbitrarily set the length of the visibility ranges field to 64 on that platform. --------- Co-authored-by: IceSentry <c.giguere42@gmail.com>	2024-05-08 07:34:59 +00:00
IceSentry	4737106bdd	Extract mesh view layouts logic (#13266) Copied almost verbatim from the volumetric fog PR # Objective - Managing mesh view layouts is complicated ## Solution - Extract it to its own struct - This was done as part of #13057 and is copied almost verbatim. I wanted to keep this part of the PR as its own atomic commit in case we ever have to revert fog or run a bisect. This change is good whether or not we have volumetric fog. Co-Authored-By: @pcwalton	2024-05-07 06:46:41 +00:00
Fpgu	60a73fa60b	Use `Dir3` for local axis methods in `GlobalTransform` (#13264 ) Switched the return type from `Vec3` to `Dir3` for directional axis methods within the `GlobalTransform` component. ## Migration Guide The `GlobalTransform` component's directional axis methods (e.g., `right()`, `left()`, `up()`, `down()`, `back()`, `forward()`) have been updated from returning `Vec3` to `Dir3`.	2024-05-06 20:52:05 +00:00
Patrick Walton	59b52fc94e	Modulate the emissive texture by the emissive color again. (#13251 ) Fixes a regression introduced by #13031.	2024-05-06 20:06:10 +00:00
Patrick Walton	77ed72bc16	Implement clearcoat per the Filament and the `KHR_materials_clearcoat` specifications. (#13031 ) Clearcoat is a separate material layer that represents a thin translucent layer of a material. Examples include (from the [Filament spec]) car paint, soda cans, and lacquered wood. This commit implements support for clearcoat following the Filament and Khronos specifications, marking the beginnings of support for multiple PBR layers in Bevy. The [`KHR_materials_clearcoat`] specification describes the clearcoat support in glTF. In Blender, applying a clearcoat to the Principled BSDF node causes the clearcoat settings to be exported via this extension. As of this commit, Bevy parses and reads the extension data when present in glTF. Note that the `gltf` crate has no support for `KHR_materials_clearcoat`; this patch therefore implements the JSON semantics manually. Clearcoat is integrated with `StandardMaterial`, but the code is behind a series of `#ifdef`s that only activate when clearcoat is present. Additionally, the `pbr_feature_layer_material_textures` Cargo feature must be active in order to enable support for clearcoat factor maps, clearcoat roughness maps, and clearcoat normal maps. This approach mirrors the same pattern used by the existing transmission feature and exists to avoid running out of texture bindings on platforms like WebGL and WebGPU. Note that constant clearcoat factors and roughness values are supported in the browser; only the relatively-less-common maps are disabled on those platforms. This patch refactors the lighting code in `StandardMaterial` significantly in order to better support multiple layers in a natural way. That code was due for a refactor in any case, so this is a nice improvement. A new demo, `clearcoat`, has been added. It's based on [the corresponding three.js demo], but all the assets (aside from the skybox and environment map) are my original work. 
[Filament spec]: https://google.github.io/filament/Filament.html#materialsystem/clearcoatmodel [`KHR_materials_clearcoat`]: https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_materials_clearcoat/README.md [the corresponding three.js demo]: https://threejs.org/examples/webgl_materials_physical_clearcoat.html ![Screenshot 2024-04-19 101143](https://github.com/bevyengine/bevy/assets/157897/3444bcb5-5c20-490c-b0ad-53759bd47ae2) ![Screenshot 2024-04-19 102054](https://github.com/bevyengine/bevy/assets/157897/6e953944-75b8-49ef-bc71-97b0a53b3a27) ## Changelog ### Added * `StandardMaterial` now supports a clearcoat layer, which represents a thin translucent layer over an underlying material. * The glTF loader now supports the `KHR_materials_clearcoat` extension, representing materials with clearcoat layers. ## Migration Guide * The lighting functions in the `pbr_lighting` WGSL module now have clearcoat parameters, if `STANDARD_MATERIAL_CLEARCOAT` is defined. * The `R` reflection vector parameter has been removed from some lighting functions, as it was unused.	2024-05-05 22:57:05 +00:00
JMS55	77ebabc4fe	Meshlet remove per-cluster data upload (#13125 ) # Objective - Per-cluster (instance of a meshlet) data upload is ridiculously expensive in both CPU and GPU time (8 bytes per cluster, millions of clusters, you very quickly run into PCIE bandwidth maximums, and lots of CPU-side copies and malloc). - We need to be uploading only per-instance/entity data. Anything else needs to be done on the GPU. ## Solution - Per instance, upload: - `meshlet_instance_meshlet_counts_prefix_sum` - An exclusive prefix sum over the count of how many clusters each instance has. - `meshlet_instance_meshlet_slice_starts` - The starting index of the meshlets for each instance within the `meshlets` buffer. - A new `fill_cluster_buffers` pass once at the start of the frame has a thread per cluster, and finds its instance ID and meshlet ID via a binary search of `meshlet_instance_meshlet_counts_prefix_sum` to find what instance it belongs to, and then uses that plus `meshlet_instance_meshlet_slice_starts` to find what number meshlet within the instance it is. The shader then writes out the per-cluster instance/meshlet ID buffers for later passes to quickly read from. - I've gone from 45 -> 180 FPS in my stress test scene, and saved ~30ms/frame of overall CPU/GPU time.	2024-05-04 19:56:19 +00:00
arcashka	6027890a11	move wgsl color operations from bevy_pbr to bevy_render (#13209 ) # Objective `bevy_pbr/utils.wgsl` shader file contains mathematical constants and color conversion functions. Both of those should be accessible without enabling `bevy_pbr` feature. For example, tonemapping can be done in non pbr scenario, and it uses color conversion functions. Fixes #13207 ## Solution * Move mathematical constants (such as PI, E) from `bevy_pbr/src/render/utils.wgsl` into `bevy_render/src/maths.wgsl` * Move color conversion functions from `bevy_pbr/src/render/utils.wgsl` into new file `bevy_render/src/color_operations.wgsl` ## Testing Ran multiple examples, checked they are working: * tonemapping * color_grading * 3d_scene * animated_material * deferred_rendering * 3d_shapes * fog * irradiance_volumes * meshlet * parallax_mapping * pbr * reflection_probes * shadow_biases * 2d_gizmos * light_gizmos --- ## Changelog * Moved mathematical constants (such as PI, E) from `bevy_pbr/src/render/utils.wgsl` into `bevy_render/src/maths.wgsl` * Moved color conversion functions from `bevy_pbr/src/render/utils.wgsl` into new file `bevy_render/src/color_operations.wgsl` ## Migration Guide In user's shader code replace usage of mathematical constants from `bevy_pbr::utils` to the usage of the same constants from `bevy_render::maths`.	2024-05-04 10:30:23 +00:00
Kristoffer Søholm	2089a28717	Add BufferVec, a higher-performance alternative to StorageBuffer, and make GpuArrayBuffer use it. (#13199) This is an adoption of #12670 plus some documentation fixes. See that PR for more details. --- ## Changelog * Renamed `BufferVec` to `RawBufferVec` and added a new `BufferVec` type. ## Migration Guide `BufferVec` has been renamed to `RawBufferVec` and a new similar type has taken the `BufferVec` name. --------- Co-authored-by: Patrick Walton <pcwalton@mimiga.net> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>	2024-05-03 11:39:21 +00:00
Patrick Walton	31835ff76d	Implement visibility ranges, also known as hierarchical levels of detail (HLODs). (#12916 ) Implement visibility ranges, also known as hierarchical levels of detail (HLODs). This commit introduces a new component, `VisibilityRange`, which allows developers to specify camera distances in which meshes are to be shown and hidden. Hiding meshes happens early in the rendering pipeline, so this feature can be used for level of detail optimization. Additionally, this feature is properly evaluated per-view, so different views can show different levels of detail. This feature differs from proper mesh LODs, which can be implemented later. Engines generally implement true mesh LODs later in the pipeline; they're typically more efficient than HLODs with GPU-driven rendering. However, mesh LODs are more limited than HLODs, because they require the lower levels of detail to be meshes with the same vertex layout and shader (and perhaps the same material) as the original mesh. Games often want to use objects other than meshes to replace distant models, such as octahedral imposters or billboard imposters. The reason why the feature is called hierarchical level of detail is that HLODs can replace multiple meshes with a single mesh when the camera is far away. This can be useful for reducing drawcall count. Note that `VisibilityRange` doesn't automatically propagate down to children; it must be placed on every mesh. Crossfading between different levels of detail is supported, using the standard 4x4 ordered dithering pattern from [1]. The shader code to compute the dithering patterns should be well-optimized. The dithering code is only active when visibility ranges are in use for the mesh in question, so that we don't lose early Z. Cascaded shadow maps show the HLOD level of the view they're associated with. Point light and spot light shadow maps, which have no CSMs, display all HLOD levels that are visible in any view. 
To support this efficiently and avoid doing visibility checks multiple times, we precalculate all visible HLOD levels for each entity with a `VisibilityRange` during the `check_visibility_range` system. A new example, `visibility_range`, has been added to the tree, as well as a new low-poly version of the flight helmet model to go with it. It demonstrates use of the visibility range feature to provide levels of detail. [1]: https://en.wikipedia.org/wiki/Ordered_dithering#Threshold_map [^1]: Unreal doesn't have a feature that exactly corresponds to visibility ranges, but Unreal's HLOD system serves roughly the same purpose. ## Changelog ### Added * A new `VisibilityRange` component is available to conditionally enable entity visibility at camera distances, with optional crossfade support. This can be used to implement different levels of detail (LODs). ## Screenshots High-poly model: ![Screenshot 2024-04-09 185541](https://github.com/bevyengine/bevy/assets/157897/7e8be017-7187-4471-8866-974e2d8f2623) Low-poly model up close: ![Screenshot 2024-04-09 185546](https://github.com/bevyengine/bevy/assets/157897/429603fe-6bb7-4246-8b4e-b4888fd1d3a0) Crossfading between the two: ![Screenshot 2024-04-09 185604](https://github.com/bevyengine/bevy/assets/157897/86d0d543-f8f3-49ec-8fe5-caa4d0784fd4) --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2024-05-03 00:11:35 +00:00
BD103	e357b63448	Add `README.md` to all crates (#13184 ) # Objective - `README.md` is a common file that usually gives an overview of the folder it is in. - When on <https://crates.io>, `README.md` is rendered as the main description. - Many crates in this repository are lacking `README.md` files, which makes it more difficult to understand their purpose. <img width="1552" alt="image" src="https://github.com/bevyengine/bevy/assets/59022059/78ebf91d-b0c4-4b18-9874-365d6310640f"> - There are also a few inconsistencies with `README.md` files that this PR and its follow-ups intend to fix. ## Solution - Create a `README.md` file for all crates that do not have one. - This file only contains the title of the crate (underscores removed, proper capitalization, acronyms expanded) and the <https://shields.io> badges. - Remove the `readme` field in `Cargo.toml` for `bevy` and `bevy_reflect`. - This field is redundant because [Cargo automatically detects `README.md` files](https://doc.rust-lang.org/cargo/reference/manifest.html#the-readme-field). The field is only there if you name it something else, like `INFO.md`. - Fix capitalization of `bevy_utils`'s `README.md`. - It was originally `Readme.md`, which is inconsistent with the rest of the project. - I created two commits renaming it to `README.md`, because Git appears to be case-insensitive. - Expand acronyms in title of `bevy_ptr` and `bevy_utils`. - In the commit where I created all the new `README.md` files, I preferred using expanded acronyms in the titles. (E.g. "Bevy Developer Tools" instead of "Bevy Dev Tools".) - This commit changes the title of existing `README.md` files to follow the same scheme. - I do not feel strongly about this change, please comment if you disagree and I can revert it. - Add <https://shields.io> badges to `bevy_time` and `bevy_transform`, which are the only crates currently lacking them. --- ## Changelog - Added `README.md` files to all crates missing it.	2024-05-02 18:56:00 +00:00
Patrick Walton	961b24deaf	Implement filmic color grading. (#13121 ) This commit expands Bevy's existing tonemapping feature to a complete set of filmic color grading tools, matching those of engines like Unity, Unreal, and Godot. The following features are supported: * White point adjustment. This is inspired by Unity's implementation of the feature, but simplified and optimized. Temperature and tint control the adjustments to the x and y chromaticity values of [CIE 1931]. Following Unity, the adjustments are made relative to the [D65 standard illuminant] in the [LMS color space]. * Hue rotation. This simply converts the RGB value to [HSV], alters the hue, and converts back. * Color correction. This allows the gamma, gain, and lift values to be adjusted according to the standard [ASC CDL combined function]. * Separate color correction for shadows, midtones, and highlights. Blender's source code was used as a reference for the implementation of this. The midtone ranges can be adjusted by the user. To avoid abrupt color changes, a small crossfade is used between the different sections of the image, again following Blender's formulas. A new example, `color_grading`, has been added, offering a GUI to change all the color grading settings. It uses the same test scene as the existing `tonemapping` example, which has been factored out into a shared glTF scene. [CIE 1931]: https://en.wikipedia.org/wiki/CIE_1931_color_space [D65 standard illuminant]: https://en.wikipedia.org/wiki/Standard_illuminant#Illuminant_series_D [LMS color space]: https://en.wikipedia.org/wiki/LMS_color_space [HSV]: https://en.wikipedia.org/wiki/HSL_and_HSV [ASC CDL combined function]: https://en.wikipedia.org/wiki/ASC_CDL#Combined_Function ## Changelog ### Added * Many new filmic color grading options have been added to the `ColorGrading` component. 
## Migration Guide * `ColorGrading::gamma` and `ColorGrading::pre_saturation` are now set separately for the `shadows`, `midtones`, and `highlights` sections. You can migrate code with the `ColorGrading::all_sections` and `ColorGrading::all_sections_mut` functions, which access and/or update all sections at once. * `ColorGrading::post_saturation` and `ColorGrading::exposure` are now fields of `ColorGrading::global`. ## Screenshots ![Screenshot 2024-04-27 143144](https://github.com/bevyengine/bevy/assets/157897/c1de5894-917d-4101-b5c9-e644d141a941) ![Screenshot 2024-04-27 143216](https://github.com/bevyengine/bevy/assets/157897/da393c8a-d747-42f5-b47c-6465044c788d)	2024-05-02 12:18:59 +00:00
Patrick Walton	16531fb3e3	Implement GPU frustum culling. (#12889 ) This commit implements opt-in GPU frustum culling, built on top of the infrastructure in https://github.com/bevyengine/bevy/pull/12773. To enable it on a camera, add the `GpuCulling` component to it. To additionally disable CPU frustum culling, add the `NoCpuCulling` component. Note that adding `GpuCulling` without `NoCpuCulling` currently does nothing useful. The reason why `GpuCulling` doesn't automatically imply `NoCpuCulling` is that I intend to follow this patch up with GPU two-phase occlusion culling, and CPU frustum culling plus GPU occlusion culling seems like a very commonly-desired mode. Adding the `GpuCulling` component to a view puts that view into indirect mode. This mode makes all drawcalls indirect, relying on the mesh preprocessing shader to allocate instances dynamically. In indirect mode, the `PreprocessWorkItem` `output_index` points not to a `MeshUniform` instance slot but instead to a set of `wgpu` `IndirectParameters`, from which it allocates an instance slot dynamically if frustum culling succeeds. Batch building has been updated to allocate and track indirect parameter slots, and the AABBs are now supplied to the GPU as `MeshCullingData`. A small amount of code relating to the frustum culling has been borrowed from meshlets and moved into `maths.wgsl`. Note that standard Bevy frustum culling uses AABBs, while meshlets use bounding spheres; this means that not as much code can be shared as one might think. This patch doesn't provide any way to perform GPU culling on shadow maps, to avoid making this patch bigger than it already is. That can be a followup. ## Changelog ### Added * Frustum culling can now optionally be done on the GPU. To enable it, add the `GpuCulling` component to a camera. * To disable CPU frustum culling, add `NoCpuCulling` to a camera. Note that `GpuCulling` doesn't automatically imply `NoCpuCulling`.	2024-04-28 12:50:00 +00:00
BD103	45bb6253e2	Restore dragons to their seat of power (#13124 ) # Objective - There is an unfortunate lack of dragons in the meshlet docs. - Dragons are symbolic of majesty, power, storms, and meshlets. - A dragon habitat such as our docs requires cultivation to ensure each winged lizard reaches their fullest, fiery selves. ## Solution - Fix the link to the dragon image. - The link originally targeted the `meshlet` branch, but that was later deleted after it was merged into `main`. --- ## Changelog - Added a dragon back into the `MeshletPlugin` documentation.	2024-04-28 07:20:16 +00:00
JMS55	e1a0da0fa6	Meshlet LOD-compatible two-pass occlusion culling (#12898 ) Keeping track of explicit visibility per cluster between frames does not work with LODs, and leads to worse culling (using the final depth buffer from the previous frame is more accurate). Instead, we need to generate a second depth pyramid after the second raster pass, and then use that in the first culling pass in the next frame to test if a cluster would have been visible last frame or not. As part of these changes, the write_index_buffer pass has been folded into the culling pass for a large performance gain, and to avoid tracking a lot of extra state that would be needed between passes. Prepass previous model/view stuff was adapted to work with meshlets as well. Also fixed a bug with materials, and other misc improvements. --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com> Co-authored-by: vero <email@atlasdostal.com> Co-authored-by: Patrick Walton <pcwalton@mimiga.net> Co-authored-by: Robert Swain <robert.swain@gmail.com>	2024-04-28 05:30:20 +00:00
Doonv	de9dc9c204	Fix `CameraProjection` panic and improve `CameraProjectionPlugin` (#11808 ) # Objective Fix https://github.com/bevyengine/bevy/issues/11799 and improve `CameraProjectionPlugin` ## Solution `CameraProjectionPlugin` is now an all-in-one plugin for adding a custom `CameraProjection`. I also added `PbrProjectionPlugin` which is like `CameraProjectionPlugin` but for PBR. P.S. I'd like to get this merged after https://github.com/bevyengine/bevy/pull/11766. --- ## Changelog - Changed `CameraProjectionPlugin` to be an all-in-one plugin for adding a `CameraProjection` - Removed `VisibilitySystems::{UpdateOrthographicFrusta, UpdatePerspectiveFrusta, UpdateProjectionFrusta}`, now replaced with `VisibilitySystems::UpdateFrusta` - Added `PbrProjectionPlugin` for projection-specific PBR functionality. ## Migration Guide `VisibilitySystems`'s `UpdateOrthographicFrusta`, `UpdatePerspectiveFrusta`, and `UpdateProjectionFrusta` variants were removed, they were replaced with `VisibilitySystems::UpdateFrusta`	2024-04-26 23:52:09 +00:00
re0312	92928f13ed	Cleanup extract_meshes (#13026) # Objective - Clean up extract_mesh_(gpu/cpu)_building ## Solution - gpu_building no longer needs to hold `prev_render_mesh_instances` - Use `insert_unique_unchecked` instead of a simple insert, as we know all entities are unique - Directly get `previous_input_index` in par_loop ## Performance This should also bring a slight performance win. cargo run --release --example many_cubes --features bevy/trace_tracy -- --no-frustum-culling `extract_meshes_for_gpu_building` ![image](https://github.com/bevyengine/bevy/assets/45868716/a5425e8a-258b-482d-afda-170363ee6479) --------- Co-authored-by: Patrick Walton <pcwalton@mimiga.net>	2024-04-26 23:49:32 +00:00
findmyhappy	36a3e53e10	chore: fix some comments (#13083 ) # Objective remove repetitive words Signed-off-by: findmyhappy <findhappy@sohu.com>	2024-04-25 19:09:16 +00:00
JMS55	6d6810c90d	Meshlet continuous LOD (#12755 ) Adds a basic level of detail system to meshlets. An extremely brief summary is as follows: * In `from_mesh.rs`, once we've built the first level of clusters, we group clusters, simplify the new mega-clusters, and then split the simplified groups back into regular sized clusters. Repeat several times (ideally until you can't anymore). This forms a directed acyclic graph (DAG), where the children are the meshlets from the previous level, and the parents are the more simplified versions of their children. The leaf nodes are meshlets formed from the original mesh. * In `cull_meshlets.wgsl`, each cluster selects whether to render or not based on the LOD bounding sphere (different than the culling bounding sphere) of the current meshlet, the LOD bounding sphere of its parent (the meshlet group from simplification), and the simplification error relative to its children of both the current meshlet and its parent meshlet. This kind of breaks two pass occlusion culling, which will be fixed in a future PR by using an HZB from the previous frame to get the initial list of occluders. Many, _many_ improvements to be done in the future https://github.com/bevyengine/bevy/issues/11518, not least of which is code quality and speed. I don't even expect this to work on many types of input meshes. This is just a basic implementation/draft for collaboration. Arguable how much we want to do in this PR, I'll leave that up to maintainers. I've erred on the side of "as basic as possible". 
References: * Slides 27-77 (video available on youtube) https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf * https://blog.traverseresearch.nl/creating-a-directed-acyclic-graph-from-a-mesh-1329e57286e5 * https://jglrxavpok.github.io/2024/01/19/recreating-nanite-lod-generation.html, https://jglrxavpok.github.io/2024/03/12/recreating-nanite-faster-lod-generation.html, https://jglrxavpok.github.io/2024/04/02/recreating-nanite-runtime-lod-selection.html, and https://github.com/jglrxavpok/Carrot * https://github.com/gents83/INOX/tree/master/crates/plugins/binarizer/src * https://cs418.cs.illinois.edu/website/text/nanite.html ![image](https://github.com/bevyengine/bevy/assets/47158642/e40bff9b-7d0c-4a19-a3cc-2aad24965977) ![image](https://github.com/bevyengine/bevy/assets/47158642/442c7da3-7761-4da7-9acd-37f15dd13e26) --------- Co-authored-by: Ricky Taylor <rickytaylor26@gmail.com> Co-authored-by: vero <email@atlasdostal.com> Co-authored-by: François <mockersf@gmail.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com> Co-authored-by: Patrick Walton <pcwalton@mimiga.net>	2024-04-23 21:43:53 +00:00
JMS55	17633c1f75	Remove unused push constants (#13076 ) The shader code was removed in #11280, but we never cleaned up the rust code.	2024-04-23 21:43:46 +00:00
re0312	0f27500e46	Improve par_iter and Parallel (#12904) # Objective - Bevy usually uses `Parallel::scope` to collect items from `par_iter`, but `scope` is called for every satisfied item, which causes a lot of unnecessary lookups. ## Solution - Similar to Rayon, we introduce `for_each_init` for `par_iter`, which is only invoked when spawning a task for a group of items. --- ## Changelog - Added `for_each_init` ## Performance `check_visibility` in `many_foxes` ![image](https://github.com/bevyengine/bevy/assets/45868716/030c41cf-0d2f-4a36-a071-35097d93e494) ~40% performance gain in `check_visibility`. --------- Co-authored-by: James Liu <contact@jamessliu.com>	2024-04-23 12:05:34 +00:00
François Mockers	c40b485095	use a u64 for MeshPipelineKey (#13015) # Objective - `MeshPipelineKey` uses some bits for two things - The first commit in this PR adds an assertion that currently fails on main - This leads to some mesh topologies not working anymore, for example `LineStrip` - In the `lines` example, there should be two groups of lines; the blue one currently doesn't display ## Solution - Change the `MeshPipelineKey` to be backed by a `u64` instead, to have enough bits	2024-04-21 20:01:45 +00:00
IceSentry	8403c41c67	Use WireframeColor to override global color (#13034) # Objective - The docs say that WireframeColor is supposed to override the default global color, but it doesn't. ## Solution - Use WireframeColor to override the global color, as the docs say it should. - Updated the example to document this feature - I also took the opportunity to clean up the code a bit Fixes #13032	2024-04-20 13:59:12 +00:00
Brezak	f68bc01544	Run `CheckVisibility` after all the other visibility system sets have… (#12962 ) # Objective Make visibility system ordering explicit. Fixes #12953. ## Solution Specify `CheckVisibility` happens after all other `VisibilitySystems` sets have happened. --------- Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>	2024-04-18 20:33:29 +00:00
charlotte	ef7bafa68e	Add missing Default impl to ExtendedMaterial. (#13008) # Objective When trying to be generic over `Material + Default`, the lack of a `Default` impl for `ExtendedMaterial`, even when both of its components implement `Default`, is problematic. I think it was probably just overlooked. ## Solution Impl `Default` if the material and extension both impl `Default`. --- ## Changelog ## Migration Guide	2024-04-18 12:57:14 +00:00
Brezak	368c5cef1a	Implement clone for most bundles. (#12993) # Objective Closes #12985. ## Solution - Derive clone for most types with bundle in their name. - Bundle types missing clone: - [`TextBundle`](https://docs.rs/bevy/latest/bevy/prelude/struct.TextBundle.html) (Contains [`ContentSize`](https://docs.rs/bevy/latest/bevy/ui/struct.ContentSize.html), which can't be cloned because it itself contains an `Option<MeasureFunc>` where [`MeasureFunc`](https://docs.rs/taffy/0.3.18/taffy/node/enum.MeasureFunc.html) isn't clone) - [`ImageBundle`](https://docs.rs/bevy/latest/bevy/prelude/struct.ImageBundle.html) (Same as `TextBundle`) - [`AtlasImageBundle`](https://docs.rs/bevy/latest/bevy/prelude/struct.AtlasImageBundle.html) (Will be deprecated in 0.14, so there's no point)	2024-04-16 16:37:09 +00:00
Patrick Walton	6003a317b8	Add `Cascades` to the type registry, fixing lights in glTF. (#12989 ) glTF files that contain lights currently panic when loaded into Bevy, because Bevy tries to reflect on `Cascades`, which accidentally wasn't registered.	2024-04-16 05:16:45 +00:00
BD103	7b8d502083	Fix beta lints (#12980) # Objective - Fixes #12976 ## Solution This one is a doozy. - Run `cargo +beta clippy --workspace --all-targets --all-features` and fix all issues - This includes: - Moving inner attributes to be outer attributes, when the item in question has both inner and outer attributes - Use `ptr::from_ref` in more scenarios - Extend the valid idents list used by `clippy::doc_markdown` with more names - Use `Clone::clone_from` when possible - Remove redundant `ron` import - Add backticks to so many identifiers and items - I'm sorry whoever has to review this --- ## Changelog - Added links to more identifiers in documentation.	2024-04-16 02:46:46 +00:00
Patrick Walton	1141e731ff	Implement alpha to coverage (A2C) support. (#12970 ) [Alpha to coverage] (A2C) replaces alpha blending with a hardware-specific multisample coverage mask when multisample antialiasing is in use. It's a simple form of [order-independent transparency] that relies on MSAA. ["Anti-aliased Alpha Test: The Esoteric Alpha To Coverage"] is a good summary of the motivation for and best practices relating to A2C. This commit implements alpha to coverage support as a new variant for `AlphaMode`. You can supply `AlphaMode::AlphaToCoverage` as the `alpha_mode` field in `StandardMaterial` to use it. When in use, the standard material shader automatically applies the texture filtering method from ["Anti-aliased Alpha Test: The Esoteric Alpha To Coverage"]. Objects with alpha-to-coverage materials are binned in the opaque pass, as they're fully order-independent. The `transparency_3d` example has been updated to feature an object with alpha to coverage. Happily, the example was already using MSAA. This is part of #2223, as far as I can tell. [Alpha to coverage]: https://en.wikipedia.org/wiki/Alpha_to_coverage [order-independent transparency]: https://en.wikipedia.org/wiki/Order-independent_transparency ["Anti-aliased Alpha Test: The Esoteric Alpha To Coverage"]: https://bgolus.medium.com/anti-aliased-alpha-test-the-esoteric-alpha-to-coverage-8b177335ae4f --- ## Changelog ### Added * The `AlphaMode` enum now supports `AlphaToCoverage`, to provide limited order-independent transparency when multisample antialiasing is in use.	2024-04-15 20:37:52 +00:00
re0312	09a1f94d14	fix shadow pass trace (#12977 ) # Objective - shadow pass trace does not work correctly ## Solution - enable it.	2024-04-15 15:55:39 +00:00
Robert Swain	5f05e75a70	Fix 2D BatchedInstanceBuffer clear (#12922 ) # Objective - `cargo run --release --example bevymark -- --benchmark --waves 160 --per-wave 1000 --mode mesh2d` runs slower and slower over time due to `no_gpu_preprocessing::write_batched_instance_buffer<bevy_sprite::mesh2d::mesh::Mesh2dPipeline>` taking longer and longer because the `BatchedInstanceBuffer` is not cleared ## Solution - Split the `clear_batched_instance_buffers` system into CPU and GPU versions - Use the CPU version for 2D meshes	2024-04-15 05:00:43 +00:00
Patrick Walton	8577a448f7	Fix rendering of sprites, text, and meshlets after #12582. (#12945) `Sprite`, `Text`, and `Handle<MeshletMesh>` were types of renderable entities that the new segregated visible entity system didn't handle, so they didn't appear. Because `bevy_text` depends on `bevy_sprite`, and the visibility computation of text happens in the latter crate, I had to introduce a new marker component, `SpriteSource`. `SpriteSource` marks entities that aren't themselves sprites but become sprites during rendering. I added this component to `Text2dBundle`. Unfortunately, this is technically a breaking change, although I suspect it won't break anybody in practice except perhaps editors. Fixes #12935. ## Changelog ### Changed * `Text2dBundle` now includes a new marker component, `SpriteSource`. Bevy uses this internally to optimize visibility calculation. ## Migration Guide * `Text` now requires a `SpriteSource` marker component in order to appear. This component has been added to `Text2dBundle`.	2024-04-13 14:15:00 +00:00
Patrick Walton	5caf085dac	Divide the single `VisibleEntities` list into separate lists for 2D meshes, 3D meshes, lights, and UI elements, for performance. (#12582 ) This commit splits `VisibleEntities::entities` into four separate lists: one for lights, one for 2D meshes, one for 3D meshes, and one for UI elements. This allows `queue_material_meshes` and similar methods to avoid examining entities that are obviously irrelevant. In particular, this separation helps scenes with many skinned meshes, as the individual bones are considered visible entities but have no rendered appearance. Internally, `VisibleEntities::entities` is a `HashMap` from the `TypeId` representing a `QueryFilter` to the appropriate `Entity` list. I had to do this because `VisibleEntities` is located within an upstream crate from the crates that provide lights (`bevy_pbr`) and 2D meshes (`bevy_sprite`). As an added benefit, this setup allows apps to provide their own types of renderable components, by simply adding a specialized `check_visibility` to the schedule. This provides a 16.23% end-to-end speedup on `many_foxes` with 10,000 foxes (24.06 ms/frame to 20.70 ms/frame). ## Migration guide * `check_visibility` and `VisibleEntities` now store the four types of renderable entities--2D meshes, 3D meshes, lights, and UI elements--separately. If your custom rendering code examines `VisibleEntities`, it will now need to specify which type of entity it's interested in using the `WithMesh2d`, `WithMesh`, `WithLight`, and `WithNode` types respectively. If your app introduces a new type of renderable entity, you'll need to add an explicit call to `check_visibility` to the schedule to accommodate your new component or components. 
## Analysis `many_foxes`, 10,000 foxes: `main`: ![Screenshot 2024-03-31 114444](https://github.com/bevyengine/bevy/assets/157897/16ecb2ff-6e04-46c0-a4b0-b2fde2084bad) `many_foxes`, 10,000 foxes, this branch: ![Screenshot 2024-03-31 114256](https://github.com/bevyengine/bevy/assets/157897/94dedae4-bd00-45b2-9aaf-dfc237004ddb) `queue_material_meshes` (yellow = this branch, red = `main`): ![Screenshot 2024-03-31 114637](https://github.com/bevyengine/bevy/assets/157897/f90912bd-45bd-42c4-bd74-57d98a0f036e) `queue_shadows` (yellow = this branch, red = `main`): ![Screenshot 2024-03-31 114607](https://github.com/bevyengine/bevy/assets/157897/6ce693e3-20c0-4234-8ec9-a6f191299e2d)	2024-04-11 20:33:20 +00:00
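The per-type list layout described in #12582 can be sketched with the standard library alone. This is a minimal illustration, not Bevy's implementation: `Entity` is a plain integer alias, `VisibleLists` is a hypothetical name, and the `WithMesh`, `WithLight`, and `WithNode` structs below are local stand-ins that only borrow the names of the filter types mentioned in the migration guide.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Stand-in for an ECS entity id (assumption; not Bevy's `Entity`).
type Entity = u32;

/// Local marker types used purely as map keys (stand-ins, not Bevy's filters).
struct WithMesh;
struct WithLight;
struct WithNode;

/// Hypothetical container: one entity list per marker type, keyed by `TypeId`.
#[derive(Default)]
struct VisibleLists {
    lists: HashMap<TypeId, Vec<Entity>>,
}

impl VisibleLists {
    /// Record `entity` as visible in the list that belongs to marker `F`.
    fn push<F: 'static>(&mut self, entity: Entity) {
        self.lists.entry(TypeId::of::<F>()).or_default().push(entity);
    }

    /// Return only the entities recorded for marker `F` (empty if none).
    fn get<F: 'static>(&self) -> &[Entity] {
        self.lists
            .get(&TypeId::of::<F>())
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

Because the key is a `TypeId`, a downstream crate can introduce its own marker type without the container knowing about it in advance, which is the property the commit message relies on.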
Patrick Walton	d59b1e71ef	Implement percentage-closer filtering (PCF) for point lights. (#12910 ) I ported the two existing PCF techniques to the cubemap domain as best I could. Generally, the technique is to create a 2D orthonormal basis using Gram-Schmidt normalization, then apply the technique over that basis. The results look fine, though the shadow bias often needs adjusting. For comparison, Unity uses a 4-tap pattern for PCF on point lights of (1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1). I tried this but didn't like the look, so I went with the design above, which ports the 2D techniques to the 3D domain. There's surprisingly little material on point light PCF. I've gone through every example using point lights and verified that the shadow maps look fine, adjusting biases as necessary. Fixes #3628. --- ## Changelog ### Added * Shadows from point lights now support percentage-closer filtering (PCF), and as a result look less aliased. ### Changed * `ShadowFilteringMethod::Castano13` and `ShadowFilteringMethod::Jimenez14` have been renamed to `ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal` respectively. ## Migration Guide * `ShadowFilteringMethod::Castano13` and `ShadowFilteringMethod::Jimenez14` have been renamed to `ShadowFilteringMethod::Gaussian` and `ShadowFilteringMethod::Temporal` respectively.	2024-04-10 20:16:08 +00:00
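The basis construction mentioned in #12910 can be sketched as follows. This is a CPU-side illustration under stated assumptions, not the shader code from the commit: the function names (`tangent_basis`, `offset_lookup`) and the choice of helper axis are hypothetical, and the input direction is assumed to be non-zero.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Scale `v` to unit length. Assumes `v` is non-zero.
fn normalize(v: Vec3) -> Vec3 {
    let len = dot(v, v).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

/// Build two unit vectors that are perpendicular to `dir` and to each other.
fn tangent_basis(dir: Vec3) -> (Vec3, Vec3) {
    let n = normalize(dir);
    // Pick a helper axis that is not close to parallel with `n`.
    let helper: Vec3 = if n[1].abs() < 0.9 {
        [0.0, 1.0, 0.0]
    } else {
        [1.0, 0.0, 0.0]
    };
    // Gram-Schmidt step: subtract the component of `helper` along `n`.
    let d = dot(helper, n);
    let u = normalize([
        helper[0] - d * n[0],
        helper[1] - d * n[1],
        helper[2] - d * n[2],
    ]);
    // The cross product of two perpendicular unit vectors completes the basis.
    let v = cross(n, u);
    (u, v)
}

/// Apply a 2D filter-tap offset to a cubemap lookup direction.
fn offset_lookup(dir: Vec3, offset: [f32; 2]) -> Vec3 {
    let n = normalize(dir);
    let (u, v) = tangent_basis(dir);
    [
        n[0] + offset[0] * u[0] + offset[1] * v[0],
        n[1] + offset[0] * u[1] + offset[1] * v[1],
        n[2] + offset[0] * u[2] + offset[1] * v[2],
    ]
}
```

With such a basis in hand, a 2D sampling pattern can be reused by mapping each tap's `(x, y)` offset through `offset_lookup` before sampling the cubemap.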
Vitaliy Sapronenko	ddcbb3cc80	flipping texture coords methods has been added to the StandardMaterial (#12917 ) # Objective Fixes #11996 The role of the deprecated `Quad` shape's `flip` field has migrated to `StandardMaterial`'s `flip`/`flipped` methods. ## Solution The `flip`/`flipped` methods of `StandardMaterial` are applicable to any mesh. --- ## Changelog - Added `flip` and `flipped` methods to the `StandardMaterial` implementation - Added FLIP_HORIZONTAL, FLIP_VERTICAL, FLIP_X, FLIP_Y, FLIP_Z constants ## Migration Guide Instead of using the `Quad::flip` field, call the `flipped(true, false)` method on the `StandardMaterial` instance when adding the mesh. --------- Co-authored-by: BD103 <59022059+BD103@users.noreply.github.com>	2024-04-10 18:23:55 +00:00
Patrick Walton	11817f4ba4	Generate `MeshUniform`s on the GPU via compute shader where available. (#12773 ) Currently, `MeshUniform`s are rather large: 160 bytes. They're also somewhat expensive to compute, because they involve taking the inverse of a 3x4 matrix. Finally, if a mesh is present in multiple views, that mesh will have a separate `MeshUniform` for each and every view, which is wasteful. This commit fixes these issues by introducing the concept of a mesh input uniform and adding a mesh uniform building compute shader pass. The `MeshInputUniform` is simply the minimum amount of data needed for the GPU to compute the full `MeshUniform`. Most of this data is just the transform and is therefore only 64 bytes. `MeshInputUniform`s are computed during the extraction phase, much like skins are today, in order to avoid needlessly copying transforms around on CPU. (In fact, the render app has been changed to only store the translation of each mesh; it no longer cares about any other part of the transform, which is stored only on the GPU and the main world.) Before rendering, the `build_mesh_uniforms` pass runs to expand the `MeshInputUniform`s to the full `MeshUniform`. The mesh uniform building pass does the following, all on GPU: 1. Copy the appropriate fields of the `MeshInputUniform` to the `MeshUniform` slot. If a single mesh is present in multiple views, this effectively duplicates it into each view. 2. Compute the inverse transpose of the model transform, used for transforming normals. 3. If applicable, copy the mesh's transform from the previous frame for TAA. To support this, we double-buffer the `MeshInputUniform`s over two frames and swap the buffers each frame. The `MeshInputUniform`s for the current frame contain the index of that mesh's `MeshInputUniform` for the previous frame. 
This commit produces wins in virtually every CPU part of the pipeline: `extract_meshes`, `queue_material_meshes`, `batch_and_prepare_render_phase`, and especially `write_batched_instance_buffer` are all faster. Shrinking the amount of CPU data that has to be shuffled around speeds up the entire rendering process. \| Benchmark \| This branch \| `main` \| Speedup \| \|------------------------\|-------------\|---------\|---------\| \| `many_cubes -nfc` \| 17.259 \| 24.529 \| 42.12% \| \| `many_cubes -nfc -vpi` \| 302.116 \| 312.123 \| 3.31% \| \| `many_foxes` \| 3.227 \| 3.515 \| 8.92% \| Because mesh uniform building requires compute shader, and WebGL 2 has no compute shader, the existing CPU mesh uniform building code has been left as-is. Many types now have both CPU mesh uniform building and GPU mesh uniform building modes. Developers can opt into the old CPU mesh uniform building by setting the `use_gpu_uniform_builder` option on `PbrPlugin` to `false`. Below are graphs of the CPU portions of `many-cubes --no-frustum-culling`. Yellow is this branch, red is `main`. `extract_meshes`: ![Screenshot 2024-04-02 124842](https://github.com/bevyengine/bevy/assets/157897/a6748ea4-dd05-47b6-9254-45d07d33cb10) It's notable that we get a small win even though we're now writing to a GPU buffer. `queue_material_meshes`: ![Screenshot 2024-04-02 124911](https://github.com/bevyengine/bevy/assets/157897/ecb44d78-65dc-448d-ba85-2de91aa2ad94) There's a bit of a regression here; not sure what's causing it. In any case it's very outweighed by the other gains. `batch_and_prepare_render_phase`: ![Screenshot 2024-04-02 125123](https://github.com/bevyengine/bevy/assets/157897/4e20fc86-f9dd-4e5c-8623-837e4258f435) There's a huge win here, enough to make batching basically drop off the profile. `write_batched_instance_buffer`: ![Screenshot 2024-04-02 125237](https://github.com/bevyengine/bevy/assets/157897/401a5c32-9dc1-4991-996d-eb1cac6014b2) There's a massive improvement here, as expected. 
Note that a lot of it simply comes from the fact that `MeshInputUniform` is `Pod`. (This isn't a maintainability problem in my view because `MeshInputUniform` is so simple: just 16 tightly-packed words.) ## Changelog ### Added * Per-mesh instance data is now generated on GPU with a compute shader instead of CPU, resulting in rendering performance improvements on platforms where compute shaders are supported. ## Migration guide * Custom render phases now need multiple systems beyond just `batch_and_prepare_render_phase`. Code that was previously creating custom render phases should now add a `BinnedRenderPhasePlugin` or `SortedRenderPhasePlugin` as appropriate instead of directly adding `batch_and_prepare_render_phase`.	2024-04-10 05:33:32 +00:00
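Step 2 of the mesh uniform building pass in #12773 (the inverse transpose used for normals) can be illustrated on the CPU. The commit performs this work in a compute shader; the sketch below is plain Rust on a row-major 3x3 matrix, and `Mat3` and `inverse_transpose` are hypothetical names. It uses the identity that the inverse transpose equals the cofactor matrix divided by the determinant.

```rust
/// Row-major 3x3 matrix (assumption made for this sketch).
type Mat3 = [[f32; 3]; 3];

/// Return the inverse transpose of `m`, or `None` if `m` is (nearly) singular.
fn inverse_transpose(m: Mat3) -> Option<Mat3> {
    // Cofactor matrix: c[i][j] = (-1)^(i+j) * minor(i, j).
    let c = [
        [
            m[1][1] * m[2][2] - m[1][2] * m[2][1],
            m[1][2] * m[2][0] - m[1][0] * m[2][2],
            m[1][0] * m[2][1] - m[1][1] * m[2][0],
        ],
        [
            m[0][2] * m[2][1] - m[0][1] * m[2][2],
            m[0][0] * m[2][2] - m[0][2] * m[2][0],
            m[0][1] * m[2][0] - m[0][0] * m[2][1],
        ],
        [
            m[0][1] * m[1][2] - m[0][2] * m[1][1],
            m[0][2] * m[1][0] - m[0][0] * m[1][2],
            m[0][0] * m[1][1] - m[0][1] * m[1][0],
        ],
    ];
    // Determinant by expansion along the first row.
    let det = m[0][0] * c[0][0] + m[0][1] * c[0][1] + m[0][2] * c[0][2];
    if det.abs() < f32::EPSILON {
        return None;
    }
    // inverse = transpose(cofactor) / det, so inverse transpose = cofactor / det.
    let mut out = [[0.0f32; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            out[i][j] = c[i][j] / det;
        }
    }
    Some(out)
}
```

For a non-uniform scale such as `diag(2, 4, 8)`, the result is `diag(0.5, 0.25, 0.125)`, which is why normals cannot simply be multiplied by the model matrix itself.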
Robert Swain	ab7cbfa8fc	Consolidate Render(Ui)Materials(2d) into RenderAssets (#12827 ) # Objective - Replace `RenderMaterials` / `RenderMaterials2d` / `RenderUiMaterials` with `RenderAssets` so that a change to one thing, `RenderAssets`, applies to all use cases, rather than duplicating changes across multiple things that should be one thing. - Adopts #8149 ## Solution - Make RenderAsset generic over the destination type rather than the source type as in #8149 - Use `RenderAssets<PreparedMaterial<M>>` etc for render materials --- ## Changelog - Changed: - The `RenderAsset` trait is now implemented on the destination type. Its `SourceAsset` associated type refers to the type of the source asset. - `RenderMaterials`, `RenderMaterials2d`, and `RenderUiMaterials` have been replaced by `RenderAssets<PreparedMaterial<M>>` and similar. ## Migration Guide - `RenderAsset` is now implemented for the destination type rather than the source asset type. The source asset type is now the `RenderAsset` trait's `SourceAsset` associated type.	2024-04-09 13:26:34 +00:00
James Liu	934f2cfadf	Clean up some low level dependencies (#12858 ) # Objective Minimize the number of dependencies low in the tree. ## Solution * Remove the dependency on rustc-hash in bevy_ecs (not used) and bevy_macro_utils (only used in one spot). * Deduplicate the dependency on `sha1_smol` with the existing blake3 dependency already being used for bevy_asset. * Remove the unused `ron` dependency on `bevy_app` * Make the `serde` dependency for `bevy_ecs` optional. It's only used for serializing Entity. * Change the `wgpu` dependency to `wgpu-types`, and make it optional for `bevy_color`. * Remove the unused `thread-local` dependency on `bevy_render`. * Make multiple dependencies for `bevy_tasks` optional and enabled only when running with the `multi-threaded` feature. Preferably they'd be disabled all the time on wasm, but I couldn't find a clean way to do this. --- ## Changelog TODO ## Migration Guide TODO	2024-04-08 19:45:42 +00:00
UkoeHB	2ee69807b1	Fix potential out-of-bounds access in pbr_functions.wgsl (#12585 ) # Objective - Fix a potential out-of-bounds access in the `pbr_functions.wgsl` shader. ## Solution - Correctly compute the `GpuLights::directional_lights` array length. ## Comments I think this solves this comment in the code, but need someone to test it: ```rust //NOTE: When running bevy on Adreno GPU chipsets in WebGL, any value above 1 will result in a crash // when loading the wgsl "pbr_functions.wgsl" in the function apply_fog. ```	2024-04-08 17:00:09 +00:00
Martín Maita	3fc0c6869d	Bump crate-ci/typos from 1.19.0 to 1.20.4 (#12907 ) # Objective - Adopting https://github.com/bevyengine/bevy/pull/12903. ## Solution - Bump crate-ci/typos from 1.19.0 to 1.20.4. - Fixed a typo in `crates/bevy_pbr/src/render/pbr_functions.wgsl` file. - Added "PNG", "iy" and "SME" as exceptions to prevent false positives. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-08 15:31:11 +00:00
JMS55	31b5943ad4	Add previous_view_uniforms.inverse_view (#12902 ) # Objective - Upload previous frame's inverse_view matrix to the GPU for use with https://github.com/bevyengine/bevy/pull/12898. --- ## Changelog - Added `prepass_bindings::previous_view_uniforms.inverse_view`. - Renamed `prepass_bindings::previous_view_proj` to `prepass_bindings::previous_view_uniforms.view_proj`. - Renamed `PreviousViewProjectionUniformOffset` to `PreviousViewUniformOffset`. - Renamed `PreviousViewProjection` to `PreviousViewData`. ## Migration Guide - Renamed `prepass_bindings::previous_view_proj` to `prepass_bindings::previous_view_uniforms.view_proj`. - Renamed `PreviousViewProjectionUniformOffset` to `PreviousViewUniformOffset`. - Renamed `PreviousViewProjection` to `PreviousViewData`.	2024-04-07 18:59:16 +00:00
robtfm	452821dd52	more robust gpu image use (#12606 ) # Objective make morph targets and tonemapping more tolerant of delayed image loading. neither of these actually fail currently unless using a bespoke loader (and even then it would be rare), but i am working on adding throttling for asset gpu uploads (as a stopgap until we can do proper asset streaming) and they break with that. ## Solution when a mesh with morph targets is uploaded to the gpu, the prepare function uploads the morph target texture if it's available, otherwise it uploads without morph targets. this is generally fine as long as morph targets are typically loaded from bytes (in gltf loader), but may fail for a custom loader if the asset server async-loads the target texture and the texture is not available yet. the mesh fails to render and doesn't update when the image is loaded -> if morph targets are specified but not ready yet, retry mesh upload next frame tonemapping `unwrap`s on the lookup table image. this is never a problem since the image is added via `include_bytes!`, but could be a problem in future with asset gpu throttling/streaming. -> if the lookup texture is not yet available, use a fallback -> in the node, check if the fallback was used before caching the bind group	2024-04-07 17:18:58 +00:00
François Mockers	a9964f442d	fix msaa shift with irradiance volumes in mesh pipeline key (#12845 ) # Objective - #12791 broke example `irradiance_volumes` - Fixes #12876 ``` wgpu error: Validation Error Caused by: In Device::create_render_pipeline note: label = `pbr_opaque_mesh_pipeline` Color state [0] is invalid Sample count 8 is not supported by format Rgba8UnormSrgb on this device. The WebGPU spec guarentees [1, 4] samples are supported by this format. With the TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES feature your device supports [1, 2, 4]. ``` ## Solution - Shift bits a bit more	2024-04-05 17:50:23 +00:00
James Liu	a4ed1b88b8	Relax BufferVec's type constraints (#12866 ) # Objective Since BufferVec was first introduced, `bytemuck` has added additional traits with fewer restrictions than `Pod`. Within BufferVec, we only rely on the constraints of `bytemuck::cast_slice` to a `u8` slice, which now only requires `T: NoUninit`, which is a strict superset of `Pod` types. ## Solution Replace the `Pod` generic type constraint with `NoUninit`. Also take the opportunity to substitute `cast_slice` with `must_cast_slice`, which replaces a runtime panic with a compile-time failure if `T` cannot be used. --- ## Changelog Changed: `BufferVec` now supports working with types containing `NoUninit` but not `Pod` members. Changed: `BufferVec` will now fail to compile if used with a type that cannot be safely read from. Most notably, this includes ZSTs, which would previously always panic at runtime.	2024-04-05 02:11:41 +00:00
Patrick Walton	37522fd0ae	Micro-optimize `queue_material_meshes`, primarily to remove bit manipulation. (#12791 ) This commit makes the following optimizations: ## `MeshPipelineKey`/`BaseMeshPipelineKey` split `MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives in `bevy_render` and `MeshPipelineKey`, which lives in `bevy_pbr`. Conceptually, `BaseMeshPipelineKey` is a superclass of `MeshPipelineKey`. For `BaseMeshPipelineKey`, the bits start at the highest (most significant) bit and grow downward toward the lowest bit; for `MeshPipelineKey`, the bits start at the lowest bit and grow upward toward the highest bit. This prevents them from colliding. The goal of this is to avoid having to reassemble bits of the pipeline key for every mesh every frame. Instead, we can just use a bitwise or operation to combine the pieces that make up a `MeshPipelineKey`. ## `specialize_slow` Previously, all of `specialize()` was marked as `#[inline]`. This bloated `queue_material_meshes` unnecessarily, as a large chunk of it ended up being a slow path that was rarely hit. This commit refactors the function to move the slow path to `specialize_slow()`. Together, these two changes shave about 5% off `queue_material_meshes`: ![Screenshot 2024-03-29 130002](https://github.com/bevyengine/bevy/assets/157897/a7e5a994-a807-4328-b314-9003429dcdd2) ## Migration Guide - The `primitive_topology` field on `GpuMesh` is now an accessor method: `GpuMesh::primitive_topology()`. - For performance reasons, `MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives in `bevy_render`, and `MeshPipelineKey`, which lives in `bevy_pbr`. These two should be combined with bitwise-or to produce the final `MeshPipelineKey`.	2024-04-01 21:58:53 +00:00
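The two-ended bit allocation described in #12791 can be sketched like this. The flag names, bit positions, and the `u64` width are all assumptions chosen for illustration; they are not Bevy's actual key layout.

```rust
// Hypothetical "base" flags, allocated from the most significant bit downward.
const BASE_MORPH_TARGETS: u64 = 1 << 63;
const BASE_TOPOLOGY_SHIFT: u32 = 60; // 3 bits: 60..=62
const BASE_TOPOLOGY_MASK: u64 = 0b111 << BASE_TOPOLOGY_SHIFT;

// Hypothetical "derived" flags, allocated from the least significant bit upward.
const HDR: u64 = 1 << 0;
const MSAA_SHIFT: u32 = 1; // 3 bits: 1..=3, storing log2(sample count)
const MSAA_MASK: u64 = 0b111 << MSAA_SHIFT;

/// Build the part of the key that depends only on the mesh.
fn base_key(topology: u64, morph_targets: bool) -> u64 {
    let mut key = (topology << BASE_TOPOLOGY_SHIFT) & BASE_TOPOLOGY_MASK;
    if morph_targets {
        key |= BASE_MORPH_TARGETS;
    }
    key
}

/// Build the part of the key that depends only on the view.
fn view_key(hdr: bool, msaa_log2: u64) -> u64 {
    let mut key = (msaa_log2 << MSAA_SHIFT) & MSAA_MASK;
    if hdr {
        key |= HDR;
    }
    key
}

/// Combine the two halves. Because they grow from opposite ends,
/// they never share a bit and a single bitwise or is enough.
fn combine(base: u64, view: u64) -> u64 {
    debug_assert_eq!(base & view, 0);
    base | view
}

fn topology_of(key: u64) -> u64 {
    (key & BASE_TOPOLOGY_MASK) >> BASE_TOPOLOGY_SHIFT
}

fn msaa_samples_of(key: u64) -> u64 {
    1 << ((key & MSAA_MASK) >> MSAA_SHIFT)
}
```

The mesh-dependent half can be computed once and cached, so the per-frame work reduces to one `|` per mesh per view.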
Cameron	01649f13e2	Refactor `App` and `SubApp` internals for better separation (#9202 ) # Objective This is a necessary precursor to #9122 (this was split from that PR to reduce the amount of code to review all at once). Moving `!Send` resource ownership to `App` will make it unambiguously `!Send`. `SubApp` must be `Send`, so it can't wrap `App`. ## Solution Refactor `App` and `SubApp` to not have a recursive relationship. Since `SubApp` no longer wraps `App`, once `!Send` resources are moved out of `World` and into `App`, `SubApp` will become unambiguously `Send`. There could be less code duplication between `App` and `SubApp`, but that would break `App` method chaining. ## Changelog - `SubApp` no longer wraps `App`. - `App` fields are no longer publicly accessible. - `App` can no longer be converted into a `SubApp`. - Various methods now return references to a `SubApp` instead of an `App`. ## Migration Guide - To construct a sub-app, use `SubApp::new()`. `App` can no longer convert into `SubApp`. - If you implemented a trait for `App`, you may want to implement it for `SubApp` as well. - If you're accessing `app.world` directly, you now have to use `app.world()` and `app.world_mut()`. - `App::sub_app` now returns `&SubApp`. - `App::sub_app_mut` now returns `&mut SubApp`. - `App::get_sub_app` now returns `Option<&SubApp>.` - `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`	2024-03-31 03:16:10 +00:00
Patrick Walton	4dadebd9c4	Improve performance by binning together opaque items instead of sorting them. (#12453 ) Today, we sort all entities added to all phases, even the phases that don't strictly need sorting, such as the opaque and shadow phases. This results in a performance loss because our `PhaseItem`s are rather large in memory, so sorting is slow. Additionally, determining the boundaries of batches is an O(n) process. This commit makes Bevy instead place applicable phase items into bins keyed by bin keys, which have the invariant that everything in the same bin is potentially batchable. This makes determining batch boundaries O(1), because everything in the same bin can be batched. Instead of sorting each entity, we now sort only the bin keys. This drops the sorting time to near-zero on workloads with few bins like `many_cubes --no-frustum-culling`. Memory usage is improved too, with batch boundaries and dynamic indices now implicit instead of explicit. The improved memory usage results in a significant win even on unbatchable workloads like `many_cubes --no-frustum-culling --vary-material-data-per-instance`, presumably due to cache effects. Not all phases can be binned; some, such as transparent and transmissive phases, must still be sorted. To handle this, this commit splits `PhaseItem` into `BinnedPhaseItem` and `SortedPhaseItem`. Most of the logic that today deals with `PhaseItem`s has been moved to `SortedPhaseItem`. `BinnedPhaseItem` has the new logic. Frame time results (in ms/frame) are as follows: \| Benchmark \| `binning` \| `main` \| Speedup \| \| ------------------------ \| --------- \| ------- \| ------- \| \| `many_cubes -nfc -vpi` \| 232.179 \| 312.123 \| 34.43% \| \| `many_cubes -nfc` \| 25.874 \| 30.117 \| 16.40% \| \| `many_foxes` \| 3.276 \| 3.515 \| 7.30% \| (`-nfc` is short for `--no-frustum-culling`; `-vpi` is short for `--vary-per-instance`.) --- ## Changelog ### Changed * Render phases have been split into binned and sorted phases. 
Binned phases, such as the common opaque phase, achieve improved CPU performance by avoiding the sorting step. ## Migration Guide - `PhaseItem` has been split into `BinnedPhaseItem` and `SortedPhaseItem`. If your code has custom `PhaseItem`s, you will need to migrate them to one of these two types. `SortedPhaseItem` requires the fewest code changes, but you may want to pick `BinnedPhaseItem` if your phase doesn't require sorting, as that enables higher performance. ## Tracy graphs `many-cubes --no-frustum-culling`, `main` branch: <img width="1064" alt="Screenshot 2024-03-12 180037" src="https://github.com/bevyengine/bevy/assets/157897/e1180ce8-8e89-46d2-85e3-f59f72109a55"> `many-cubes --no-frustum-culling`, this branch: <img width="1064" alt="Screenshot 2024-03-12 180011" src="https://github.com/bevyengine/bevy/assets/157897/0899f036-6075-44c5-a972-44d95895f46c"> You can see that `batch_and_prepare_binned_render_phase` is a much smaller fraction of the time. Zooming in on that function, with yellow being this branch and red being `main`, we see: <img width="1064" alt="Screenshot 2024-03-12 175832" src="https://github.com/bevyengine/bevy/assets/157897/0dfc8d3f-49f4-496e-8825-a66e64d356d0"> The binning happens in `queue_material_meshes`. Again with yellow being this branch and red being `main`: <img width="1064" alt="Screenshot 2024-03-12 175755" src="https://github.com/bevyengine/bevy/assets/157897/b9b20dc1-11c8-400c-a6cc-1c2e09c1bb96"> We can see that there is a small regression in `queue_material_meshes` performance, but it's not nearly enough to outweigh the large gains in `batch_and_prepare_binned_render_phase`. --------- Co-authored-by: James Liu <contact@jamessliu.com>	2024-03-30 02:55:02 +00:00
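The binning idea from #12453 can be sketched with an ordered map. This is an illustration only: `BinKey`, its fields, and `BinnedPhase` are hypothetical, and a `BTreeMap` is swapped in here to keep keys ordered, which is not necessarily how Bevy stores its bins.

```rust
use std::collections::BTreeMap;

/// Stand-in for an ECS entity id (assumption; not Bevy's `Entity`).
type Entity = u32;

/// Hypothetical bin key: items that share a key are assumed to be batchable.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct BinKey {
    pipeline: u32,
    material: u32,
}

/// Hypothetical binned phase: entities grouped by key instead of sorted.
#[derive(Default)]
struct BinnedPhase {
    bins: BTreeMap<BinKey, Vec<Entity>>,
}

impl BinnedPhase {
    /// Adding an item is a map lookup plus a push; no per-item sort happens.
    fn add(&mut self, key: BinKey, entity: Entity) {
        self.bins.entry(key).or_default().push(entity);
    }

    /// One batch per bin. Only the keys are ordered, so the cost of ordering
    /// scales with the number of bins rather than the number of entities.
    fn batches(&self) -> Vec<(BinKey, &[Entity])> {
        self.bins.iter().map(|(k, v)| (*k, v.as_slice())).collect()
    }
}
```

Batch boundaries fall out of the structure for free: each bin is one batch, so nothing has to scan adjacent items to find where a batch ends.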
Patrick Walton	5b746d2b19	Pack the `StandardMaterialKey` into a single scalar instead of a structure. (#12783 ) This commit changes the `StandardMaterialKey` to be based on a set of bitflags instead of a structure. We hash it every frame for every mesh, and `#[derive(Hash)]` doesn't generate particularly efficient code for large structures full of small types. Packing it into a single `u64` therefore results in a roughly 10% speedup in `queue_material_meshes` on `many_cubes --no-frustum-culling`. ![Screenshot 2024-03-29 075124](https://github.com/bevyengine/bevy/assets/157897/78afcab6-b616-489b-8243-da9a117f606c)	2024-03-29 18:34:27 +00:00
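Packing a multi-field key into one integer, as #12783 describes for `StandardMaterialKey`, might look roughly like the sketch below. The field names, bit widths, and layout are invented for illustration and do not reflect the real key.

```rust
/// Hypothetical unpacked key: several small fields.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct MaterialKeyFields {
    normal_map: bool,
    double_sided: bool,
    cull_mode: u8,   // assumed to fit in 2 bits (0..=3)
    depth_bias: u16, // assumed to fit in 16 bits
}

// Assumed bit layout of the packed key.
const NORMAL_MAP: u64 = 1 << 0;
const DOUBLE_SIDED: u64 = 1 << 1;
const CULL_MODE_SHIFT: u32 = 2;
const CULL_MODE_MASK: u64 = 0b11 << CULL_MODE_SHIFT;
const DEPTH_BIAS_SHIFT: u32 = 4;
const DEPTH_BIAS_MASK: u64 = 0xFFFF << DEPTH_BIAS_SHIFT;

/// Pack all fields into one `u64`, so hashing or comparing the key
/// touches a single scalar instead of each field in turn.
fn pack(f: MaterialKeyFields) -> u64 {
    let mut bits = 0u64;
    if f.normal_map {
        bits |= NORMAL_MAP;
    }
    if f.double_sided {
        bits |= DOUBLE_SIDED;
    }
    bits |= ((f.cull_mode as u64) << CULL_MODE_SHIFT) & CULL_MODE_MASK;
    bits |= ((f.depth_bias as u64) << DEPTH_BIAS_SHIFT) & DEPTH_BIAS_MASK;
    bits
}

/// Recover the individual fields from the packed representation.
fn unpack(bits: u64) -> MaterialKeyFields {
    MaterialKeyFields {
        normal_map: bits & NORMAL_MAP != 0,
        double_sided: bits & DOUBLE_SIDED != 0,
        cull_mode: ((bits & CULL_MODE_MASK) >> CULL_MODE_SHIFT) as u8,
        depth_bias: ((bits & DEPTH_BIAS_MASK) >> DEPTH_BIAS_SHIFT) as u16,
    }
}
```

Two keys are equal exactly when their packed values are equal, provided every field stays within its assumed bit width.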
James Liu	e62a01f403	Make PersistentGpuBufferable a safe trait (#12744 ) # Objective Fixes #12727. All parts that `PersistentGpuBuffer` interacts with should be 100% safe both on the CPU and the GPU: `Queue::write_buffer_with` zeroes out the slice being written to and when uploading to the GPU, and all slice writes are bounds checked on the CPU side. ## Solution Make `PersistentGpuBufferable` a safe trait. Enforce its correct implementation via assertions. Re-enable `forbid(unsafe_code)` on `bevy_pbr`.	2024-03-29 13:14:34 +00:00
Jacques Schutte	4508077297	Move FloatOrd into bevy_math (#12732 ) # Objective - Fixes #12712 ## Solution - Move the `float_ord.rs` file to `bevy_math` - Change any `bevy_utils::FloatOrd` statements to `bevy_math::FloatOrd` --- ## Changelog - Moved `FloatOrd` from `bevy_utils` to `bevy_math` ## Migration Guide - References to `bevy_utils::FloatOrd` should be changed to `bevy_math::FloatOrd`	2024-03-27 18:30:11 +00:00

635 commits