bevy/crates/bevy_pbr/src/render/morph.rs

100 lines
2.9 KiB
Rust
Raw Normal View History

Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
use std::{iter, mem};
use bevy_ecs::prelude::*;
use bevy_render::{
Automatic batching/instancing of draw commands (#9685) # Objective - Implement the foundations of automatic batching/instancing of draw commands as the next step from #89 - NOTE: More performance improvements will come when more data is managed and bound in ways that do not require rebinding such as mesh, material, and texture data. ## Solution - The core idea for batching of draw commands is to check whether any of the information that has to be passed when encoding a draw command changes between two things that are being drawn according to the sorted render phase order. These should be things like the pipeline, bind groups and their dynamic offsets, index/vertex buffers, and so on. - The following assumptions have been made: - Only entities with prepared assets (pipelines, materials, meshes) are queued to phases - View bindings are constant across a phase for a given draw function as phases are per-view - `batch_and_prepare_render_phase` is the only system that performs this batching and has sole responsibility for preparing the per-object data. As such the mesh binding and dynamic offsets are assumed to only vary as a result of the `batch_and_prepare_render_phase` system, e.g. due to having to split data across separate uniform bindings within the same buffer due to the maximum uniform buffer binding size. - Implement `GpuArrayBuffer` for `Mesh2dUniform` to store Mesh2dUniform in arrays in GPU buffers rather than each one being at a dynamic offset in a uniform buffer. This is the same optimisation that was made for 3D not long ago. - Change batch size for a range in `PhaseItem`, adding API for getting or mutating the range. This is more flexible than a size as the length of the range can be used in place of the size, but the start and end can be otherwise whatever is needed. - Add an optional mesh bind group dynamic offset to `PhaseItem`. This avoids having to do a massive table move just to insert `GpuArrayBufferIndex` components. ## Benchmarks All tests have been run on an M1 Max on AC power. `bevymark` and `many_cubes` were modified to use 1920x1080 with a scale factor of 1. I run a script that runs a separate Tracy capture process, and then runs the bevy example with `--features bevy_ci_testing,trace_tracy` and `CI_TESTING_CONFIG=../benchmark.ron` with the contents of `../benchmark.ron`: ```rust ( exit_after: Some(1500) ) ``` ...in order to run each test for 1500 frames. The recent changes to `many_cubes` and `bevymark` added reproducible random number generation so that with the same settings, the same rng will occur. They also added benchmark modes that use a fixed delta time for animations. Combined this means that the same frames should be rendered both on main and on the branch. The graphs compare main (yellow) to this PR (red). ### 3D Mesh `many_cubes --benchmark` <img width="1411" alt="Screenshot 2023-09-03 at 23 42 10" src="https://github.com/bevyengine/bevy/assets/302146/2088716a-c918-486c-8129-090b26fd2bc4"> The mesh and material are the same for all instances. This is basically the best case for the initial batching implementation as it results in 1 draw for the ~11.7k visible meshes. It gives a ~30% reduction in median frame time. The 1000th frame is identical using the flip tool: ![flip many_cubes-main-mesh3d many_cubes-batching-mesh3d 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/2511f37a-6df8-481a-932f-706ca4de7643) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4615 seconds ``` ### 3D Mesh `many_cubes --benchmark --material-texture-count 10` <img width="1404" alt="Screenshot 2023-09-03 at 23 45 18" src="https://github.com/bevyengine/bevy/assets/302146/5ee9c447-5bd2-45c6-9706-ac5ff8916daf"> This run uses 10 different materials by varying their textures. The materials are randomly selected, and there is no sorting by material bind group for opaque 3D so any batching is 'random'. The PR produces a ~5% reduction in median frame time. If we were to sort the opaque phase by the material bind group, then this should be a lot faster. This produces about 10.5k draws for the 11.7k visible entities. This makes sense as randomly selecting from 10 materials gives a chance that two adjacent entities randomly select the same material and can be batched. The 1000th frame is identical in flip: ![flip many_cubes-main-mesh3d-mtc10 many_cubes-batching-mesh3d-mtc10 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/2b3a8614-9466-4ed8-b50c-d4aa71615dbb) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4537 seconds ``` ### 3D Mesh `many_cubes --benchmark --vary-per-instance` <img width="1394" alt="Screenshot 2023-09-03 at 23 48 44" src="https://github.com/bevyengine/bevy/assets/302146/f02a816b-a444-4c18-a96a-63b5436f3b7f"> This run varies the material data per instance by randomly-generating its colour. This is the worst case for batching and that it performs about the same as `main` is a good thing as it demonstrates that the batching has minimal overhead when dealing with ~11k visible mesh entities. The 1000th frame is identical according to flip: ![flip many_cubes-main-mesh3d-vpi many_cubes-batching-mesh3d-vpi 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/ac5f5c14-9bda-4d1a-8219-7577d4aac68c) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4568 seconds ``` ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d` <img width="1412" alt="Screenshot 2023-09-03 at 23 59 56" src="https://github.com/bevyengine/bevy/assets/302146/cb02ae07-237b-4646-ae9f-fda4dafcbad4"> This spawns 160 waves of 1000 quad meshes that are shaded with ColorMaterial. Each wave has a different material so 160 waves currently should result in 160 batches. This results in a 50% reduction in median frame time. Capturing a screenshot of the 1000th frame main vs PR gives: ![flip bevymark-main-mesh2d bevymark-batching-mesh2d 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/80102728-1217-4059-87af-14d05044df40) ``` Mean: 0.001222 Weighted median: 0.750432 1st weighted quartile: 0.453494 3rd weighted quartile: 0.969758 Min: 0.000000 Max: 0.990296 Evaluation time: 0.4255 seconds ``` So they seem to produce the same results. I also double-checked the number of draws. `main` does 160000 draws, and the PR does 160, as expected. ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d --material-texture-count 10` <img width="1392" alt="Screenshot 2023-09-04 at 00 09 22" src="https://github.com/bevyengine/bevy/assets/302146/4358da2e-ce32-4134-82df-3ab74c40849c"> This generates 10 textures and generates materials for each of those and then selects one material per wave. The median frame time is reduced by 50%. Similar to the plain run above, this produces 160 draws on the PR and 160000 on `main` and the 1000th frame is identical (ignoring the fps counter text overlay). ![flip bevymark-main-mesh2d-mtc10 bevymark-batching-mesh2d-mtc10 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/ebed2822-dce7-426a-858b-b77dc45b986f) ``` Mean: 0.002877 Weighted median: 0.964980 1st weighted quartile: 0.668871 3rd weighted quartile: 0.982749 Min: 0.000000 Max: 0.992377 Evaluation time: 0.4301 seconds ``` ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d --vary-per-instance` <img width="1396" alt="Screenshot 2023-09-04 at 00 13 53" src="https://github.com/bevyengine/bevy/assets/302146/b2198b18-3439-47ad-919a-cdabe190facb"> This creates unique materials per instance by randomly-generating the material's colour. This is the worst case for 2D batching. Somehow, this PR manages a 7% reduction in median frame time. Both main and this PR issue 160000 draws. The 1000th frame is the same: ![flip bevymark-main-mesh2d-vpi bevymark-batching-mesh2d-vpi 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/a2ec471c-f576-4a36-a23b-b24b22578b97) ``` Mean: 0.001214 Weighted median: 0.937499 1st weighted quartile: 0.635467 3rd weighted quartile: 0.979085 Min: 0.000000 Max: 0.988971 Evaluation time: 0.4462 seconds ``` ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite` <img width="1396" alt="Screenshot 2023-09-04 at 12 21 12" src="https://github.com/bevyengine/bevy/assets/302146/8b31e915-d6be-4cac-abf5-c6a4da9c3d43"> This just spawns 160 waves of 1000 sprites. There should be and is no notable difference between main and the PR. ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite --material-texture-count 10` <img width="1389" alt="Screenshot 2023-09-04 at 12 36 08" src="https://github.com/bevyengine/bevy/assets/302146/45fe8d6d-c901-4062-a349-3693dd044413"> This spawns the sprites selecting a texture at random per instance from the 10 generated textures. This has no significant change vs main and shouldn't. ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite --vary-per-instance` <img width="1401" alt="Screenshot 2023-09-04 at 12 29 52" src="https://github.com/bevyengine/bevy/assets/302146/762c5c60-352e-471f-8dbe-bbf10e24ebd6"> This sets the sprite colour as being unique per instance. This can still all be drawn using one batch. There should be no difference but the PR produces median frame times that are 4% higher. Investigation showed no clear sources of cost, rather a mix of give and take that should not happen. It seems like noise in the results. ### Summary | Benchmark | % change in median frame time | | ------------- | ------------- | | many_cubes | 🟩 -30% | | many_cubes 10 materials | 🟩 -5% | | many_cubes unique materials | 🟩 ~0% | | bevymark mesh2d | 🟩 -50% | | bevymark mesh2d 10 materials | 🟩 -50% | | bevymark mesh2d unique materials | 🟩 -7% | | bevymark sprite | 🟥 2% | | bevymark sprite 10 materials | 🟥 0.6% | | bevymark sprite unique materials | 🟥 4.1% | --- ## Changelog - Added: 2D and 3D mesh entities that share the same mesh and material (same textures, same data) are now batched into the same draw command for better performance. --------- Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com> Co-authored-by: Nicola Papale <nico@nicopap.ch>
2023-09-21 22:12:34 +00:00
batching::NoAutomaticBatching,
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
mesh::morph::{MeshMorphWeights, MAX_MORPH_WEIGHTS},
render_resource::{BufferUsages, BufferVec},
renderer::{RenderDevice, RenderQueue},
Split `ComputedVisibility` into two components to allow for accurate change detection and speed up visibility propagation (#9497) # Objective Fix #8267. Fixes half of #7840. The `ComputedVisibility` component contains two flags: hierarchy visibility, and view visibility (whether its visible to any cameras). Due to the modular and open-ended way that view visibility is computed, it triggers change detection every single frame, even when the value does not change. Since hierarchy visibility is stored in the same component as view visibility, this means that change detection for inherited visibility is completely broken. At the company I work for, this has become a real issue. We are using change detection to only re-render scenes when necessary. The broken state of change detection for computed visibility means that we have to to rely on the non-inherited `Visibility` component for now. This is workable in the early stages of our project, but since we will inevitably want to use the hierarchy, we will have to either: 1. Roll our own solution for computed visibility. 2. Fix the issue for everyone. ## Solution Split the `ComputedVisibility` component into two: `InheritedVisibilty` and `ViewVisibility`. This allows change detection to behave properly for `InheritedVisibility`. View visiblity is still erratic, although it is less useful to be able to detect changes for this flavor of visibility. Overall, this actually simplifies the API. Since the visibility system consists of self-explaining components, it is much easier to document the behavior and usage. This approach is more modular and "ECS-like" -- one could strip out the `ViewVisibility` component entirely if it's not needed, and rely only on inherited visibility. --- ## Changelog - `ComputedVisibility` has been removed in favor of: `InheritedVisibility` and `ViewVisiblity`. ## Migration Guide The `ComputedVisibilty` component has been split into `InheritedVisiblity` and `ViewVisibility`. Replace any usages of `ComputedVisibility::is_visible_in_hierarchy` with `InheritedVisibility::get`, and replace `ComputedVisibility::is_visible_in_view` with `ViewVisibility::get`. ```rust // Before: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, computed_visibility: ComputedVisibility::default(), }); // After: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, inherited_visibility: InheritedVisibility::default(), view_visibility: ViewVisibility::default(), }); ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_hierarchy() { // After: fn my_system(q: Query<&InheritedVisibility>) { for inherited_visibility in &q { if inherited_visibility.get() { ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_view() { // After: fn my_system(q: Query<&ViewVisibility>) { for view_visibility in &q { if view_visibility.get() { ``` ```rust // Before: fn my_system(mut q: Query<&mut ComputedVisibilty>) { for vis in &mut q { vis.set_visible_in_view(); // After: fn my_system(mut q: Query<&mut ViewVisibility>) { for view_visibility in &mut q { view_visibility.set(); ``` --------- Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-09-01 13:00:18 +00:00
view::ViewVisibility,
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
Extract,
};
use bytemuck::Pod;
#[derive(Component)]
pub struct MorphIndex {
pub(super) index: u32,
}
#[derive(Resource)]
pub struct MorphUniform {
pub buffer: BufferVec<f32>,
}
impl Default for MorphUniform {
fn default() -> Self {
Self {
buffer: BufferVec::new(BufferUsages::UNIFORM),
}
}
}
pub fn prepare_morphs(
device: Res<RenderDevice>,
queue: Res<RenderQueue>,
mut uniform: ResMut<MorphUniform>,
) {
if uniform.buffer.is_empty() {
return;
}
let buffer = &mut uniform.buffer;
buffer.reserve(buffer.len(), &device);
buffer.write_buffer(&device, &queue);
}
const fn can_align(step: usize, target: usize) -> bool {
step % target == 0 || target % step == 0
}
const WGPU_MIN_ALIGN: usize = 256;
/// Align a [`BufferVec`] to `N` bytes by padding the end with `T::default()` values.
fn add_to_alignment<T: Pod + Default>(buffer: &mut BufferVec<T>) {
let n = WGPU_MIN_ALIGN;
let t_size = mem::size_of::<T>();
if !can_align(n, t_size) {
// This panic is stripped at compile time, due to n, t_size and can_align being const
panic!(
"BufferVec should contain only types with a size multiple or divisible by {n}, \
{} has a size of {t_size}, which is neither multiple or divisible by {n}",
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
std::any::type_name::<T>()
);
}
let buffer_size = buffer.len();
let byte_size = t_size * buffer_size;
let bytes_over_n = byte_size % n;
if bytes_over_n == 0 {
return;
}
let bytes_to_add = n - bytes_over_n;
let ts_to_add = bytes_to_add / t_size;
buffer.extend(iter::repeat_with(T::default).take(ts_to_add));
}
pub fn extract_morphs(
mut commands: Commands,
mut previous_len: Local<usize>,
mut uniform: ResMut<MorphUniform>,
Split `ComputedVisibility` into two components to allow for accurate change detection and speed up visibility propagation (#9497) # Objective Fix #8267. Fixes half of #7840. The `ComputedVisibility` component contains two flags: hierarchy visibility, and view visibility (whether its visible to any cameras). Due to the modular and open-ended way that view visibility is computed, it triggers change detection every single frame, even when the value does not change. Since hierarchy visibility is stored in the same component as view visibility, this means that change detection for inherited visibility is completely broken. At the company I work for, this has become a real issue. We are using change detection to only re-render scenes when necessary. The broken state of change detection for computed visibility means that we have to to rely on the non-inherited `Visibility` component for now. This is workable in the early stages of our project, but since we will inevitably want to use the hierarchy, we will have to either: 1. Roll our own solution for computed visibility. 2. Fix the issue for everyone. ## Solution Split the `ComputedVisibility` component into two: `InheritedVisibilty` and `ViewVisibility`. This allows change detection to behave properly for `InheritedVisibility`. View visiblity is still erratic, although it is less useful to be able to detect changes for this flavor of visibility. Overall, this actually simplifies the API. Since the visibility system consists of self-explaining components, it is much easier to document the behavior and usage. This approach is more modular and "ECS-like" -- one could strip out the `ViewVisibility` component entirely if it's not needed, and rely only on inherited visibility. --- ## Changelog - `ComputedVisibility` has been removed in favor of: `InheritedVisibility` and `ViewVisiblity`. ## Migration Guide The `ComputedVisibilty` component has been split into `InheritedVisiblity` and `ViewVisibility`. Replace any usages of `ComputedVisibility::is_visible_in_hierarchy` with `InheritedVisibility::get`, and replace `ComputedVisibility::is_visible_in_view` with `ViewVisibility::get`. ```rust // Before: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, computed_visibility: ComputedVisibility::default(), }); // After: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, inherited_visibility: InheritedVisibility::default(), view_visibility: ViewVisibility::default(), }); ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_hierarchy() { // After: fn my_system(q: Query<&InheritedVisibility>) { for inherited_visibility in &q { if inherited_visibility.get() { ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_view() { // After: fn my_system(q: Query<&ViewVisibility>) { for view_visibility in &q { if view_visibility.get() { ``` ```rust // Before: fn my_system(mut q: Query<&mut ComputedVisibilty>) { for vis in &mut q { vis.set_visible_in_view(); // After: fn my_system(mut q: Query<&mut ViewVisibility>) { for view_visibility in &mut q { view_visibility.set(); ``` --------- Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-09-01 13:00:18 +00:00
query: Extract<Query<(Entity, &ViewVisibility, &MeshMorphWeights)>>,
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
) {
uniform.buffer.clear();
let mut values = Vec::with_capacity(*previous_len);
Split `ComputedVisibility` into two components to allow for accurate change detection and speed up visibility propagation (#9497) # Objective Fix #8267. Fixes half of #7840. The `ComputedVisibility` component contains two flags: hierarchy visibility, and view visibility (whether its visible to any cameras). Due to the modular and open-ended way that view visibility is computed, it triggers change detection every single frame, even when the value does not change. Since hierarchy visibility is stored in the same component as view visibility, this means that change detection for inherited visibility is completely broken. At the company I work for, this has become a real issue. We are using change detection to only re-render scenes when necessary. The broken state of change detection for computed visibility means that we have to to rely on the non-inherited `Visibility` component for now. This is workable in the early stages of our project, but since we will inevitably want to use the hierarchy, we will have to either: 1. Roll our own solution for computed visibility. 2. Fix the issue for everyone. ## Solution Split the `ComputedVisibility` component into two: `InheritedVisibilty` and `ViewVisibility`. This allows change detection to behave properly for `InheritedVisibility`. View visiblity is still erratic, although it is less useful to be able to detect changes for this flavor of visibility. Overall, this actually simplifies the API. Since the visibility system consists of self-explaining components, it is much easier to document the behavior and usage. This approach is more modular and "ECS-like" -- one could strip out the `ViewVisibility` component entirely if it's not needed, and rely only on inherited visibility. --- ## Changelog - `ComputedVisibility` has been removed in favor of: `InheritedVisibility` and `ViewVisiblity`. ## Migration Guide The `ComputedVisibilty` component has been split into `InheritedVisiblity` and `ViewVisibility`. Replace any usages of `ComputedVisibility::is_visible_in_hierarchy` with `InheritedVisibility::get`, and replace `ComputedVisibility::is_visible_in_view` with `ViewVisibility::get`. ```rust // Before: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, computed_visibility: ComputedVisibility::default(), }); // After: commands.spawn(VisibilityBundle { visibility: Visibility::Inherited, inherited_visibility: InheritedVisibility::default(), view_visibility: ViewVisibility::default(), }); ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_hierarchy() { // After: fn my_system(q: Query<&InheritedVisibility>) { for inherited_visibility in &q { if inherited_visibility.get() { ``` ```rust // Before: fn my_system(q: Query<&ComputedVisibilty>) { for vis in &q { if vis.is_visible_in_view() { // After: fn my_system(q: Query<&ViewVisibility>) { for view_visibility in &q { if view_visibility.get() { ``` ```rust // Before: fn my_system(mut q: Query<&mut ComputedVisibilty>) { for vis in &mut q { vis.set_visible_in_view(); // After: fn my_system(mut q: Query<&mut ViewVisibility>) { for view_visibility in &mut q { view_visibility.set(); ``` --------- Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-09-01 13:00:18 +00:00
for (entity, view_visibility, morph_weights) in &query {
if !view_visibility.get() {
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
continue;
}
let start = uniform.buffer.len();
let weights = morph_weights.weights();
let legal_weights = weights.iter().take(MAX_MORPH_WEIGHTS).copied();
uniform.buffer.extend(legal_weights);
add_to_alignment::<f32>(&mut uniform.buffer);
let index = (start * mem::size_of::<f32>()) as u32;
Automatic batching/instancing of draw commands (#9685) # Objective - Implement the foundations of automatic batching/instancing of draw commands as the next step from #89 - NOTE: More performance improvements will come when more data is managed and bound in ways that do not require rebinding such as mesh, material, and texture data. ## Solution - The core idea for batching of draw commands is to check whether any of the information that has to be passed when encoding a draw command changes between two things that are being drawn according to the sorted render phase order. These should be things like the pipeline, bind groups and their dynamic offsets, index/vertex buffers, and so on. - The following assumptions have been made: - Only entities with prepared assets (pipelines, materials, meshes) are queued to phases - View bindings are constant across a phase for a given draw function as phases are per-view - `batch_and_prepare_render_phase` is the only system that performs this batching and has sole responsibility for preparing the per-object data. As such the mesh binding and dynamic offsets are assumed to only vary as a result of the `batch_and_prepare_render_phase` system, e.g. due to having to split data across separate uniform bindings within the same buffer due to the maximum uniform buffer binding size. - Implement `GpuArrayBuffer` for `Mesh2dUniform` to store Mesh2dUniform in arrays in GPU buffers rather than each one being at a dynamic offset in a uniform buffer. This is the same optimisation that was made for 3D not long ago. - Change batch size for a range in `PhaseItem`, adding API for getting or mutating the range. This is more flexible than a size as the length of the range can be used in place of the size, but the start and end can be otherwise whatever is needed. - Add an optional mesh bind group dynamic offset to `PhaseItem`. This avoids having to do a massive table move just to insert `GpuArrayBufferIndex` components. ## Benchmarks All tests have been run on an M1 Max on AC power. `bevymark` and `many_cubes` were modified to use 1920x1080 with a scale factor of 1. I run a script that runs a separate Tracy capture process, and then runs the bevy example with `--features bevy_ci_testing,trace_tracy` and `CI_TESTING_CONFIG=../benchmark.ron` with the contents of `../benchmark.ron`: ```rust ( exit_after: Some(1500) ) ``` ...in order to run each test for 1500 frames. The recent changes to `many_cubes` and `bevymark` added reproducible random number generation so that with the same settings, the same rng will occur. They also added benchmark modes that use a fixed delta time for animations. Combined this means that the same frames should be rendered both on main and on the branch. The graphs compare main (yellow) to this PR (red). ### 3D Mesh `many_cubes --benchmark` <img width="1411" alt="Screenshot 2023-09-03 at 23 42 10" src="https://github.com/bevyengine/bevy/assets/302146/2088716a-c918-486c-8129-090b26fd2bc4"> The mesh and material are the same for all instances. This is basically the best case for the initial batching implementation as it results in 1 draw for the ~11.7k visible meshes. It gives a ~30% reduction in median frame time. The 1000th frame is identical using the flip tool: ![flip many_cubes-main-mesh3d many_cubes-batching-mesh3d 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/2511f37a-6df8-481a-932f-706ca4de7643) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4615 seconds ``` ### 3D Mesh `many_cubes --benchmark --material-texture-count 10` <img width="1404" alt="Screenshot 2023-09-03 at 23 45 18" src="https://github.com/bevyengine/bevy/assets/302146/5ee9c447-5bd2-45c6-9706-ac5ff8916daf"> This run uses 10 different materials by varying their textures. The materials are randomly selected, and there is no sorting by material bind group for opaque 3D so any batching is 'random'. The PR produces a ~5% reduction in median frame time. If we were to sort the opaque phase by the material bind group, then this should be a lot faster. This produces about 10.5k draws for the 11.7k visible entities. This makes sense as randomly selecting from 10 materials gives a chance that two adjacent entities randomly select the same material and can be batched. The 1000th frame is identical in flip: ![flip many_cubes-main-mesh3d-mtc10 many_cubes-batching-mesh3d-mtc10 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/2b3a8614-9466-4ed8-b50c-d4aa71615dbb) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4537 seconds ``` ### 3D Mesh `many_cubes --benchmark --vary-per-instance` <img width="1394" alt="Screenshot 2023-09-03 at 23 48 44" src="https://github.com/bevyengine/bevy/assets/302146/f02a816b-a444-4c18-a96a-63b5436f3b7f"> This run varies the material data per instance by randomly-generating its colour. This is the worst case for batching and that it performs about the same as `main` is a good thing as it demonstrates that the batching has minimal overhead when dealing with ~11k visible mesh entities. The 1000th frame is identical according to flip: ![flip many_cubes-main-mesh3d-vpi many_cubes-batching-mesh3d-vpi 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/ac5f5c14-9bda-4d1a-8219-7577d4aac68c) ``` Mean: 0.000000 Weighted median: 0.000000 1st weighted quartile: 0.000000 3rd weighted quartile: 0.000000 Min: 0.000000 Max: 0.000000 Evaluation time: 0.4568 seconds ``` ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d` <img width="1412" alt="Screenshot 2023-09-03 at 23 59 56" src="https://github.com/bevyengine/bevy/assets/302146/cb02ae07-237b-4646-ae9f-fda4dafcbad4"> This spawns 160 waves of 1000 quad meshes that are shaded with ColorMaterial. Each wave has a different material so 160 waves currently should result in 160 batches. This results in a 50% reduction in median frame time. Capturing a screenshot of the 1000th frame main vs PR gives: ![flip bevymark-main-mesh2d bevymark-batching-mesh2d 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/80102728-1217-4059-87af-14d05044df40) ``` Mean: 0.001222 Weighted median: 0.750432 1st weighted quartile: 0.453494 3rd weighted quartile: 0.969758 Min: 0.000000 Max: 0.990296 Evaluation time: 0.4255 seconds ``` So they seem to produce the same results. I also double-checked the number of draws. `main` does 160000 draws, and the PR does 160, as expected. ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d --material-texture-count 10` <img width="1392" alt="Screenshot 2023-09-04 at 00 09 22" src="https://github.com/bevyengine/bevy/assets/302146/4358da2e-ce32-4134-82df-3ab74c40849c"> This generates 10 textures and generates materials for each of those and then selects one material per wave. The median frame time is reduced by 50%. Similar to the plain run above, this produces 160 draws on the PR and 160000 on `main` and the 1000th frame is identical (ignoring the fps counter text overlay). ![flip bevymark-main-mesh2d-mtc10 bevymark-batching-mesh2d-mtc10 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/ebed2822-dce7-426a-858b-b77dc45b986f) ``` Mean: 0.002877 Weighted median: 0.964980 1st weighted quartile: 0.668871 3rd weighted quartile: 0.982749 Min: 0.000000 Max: 0.992377 Evaluation time: 0.4301 seconds ``` ### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d --vary-per-instance` <img width="1396" alt="Screenshot 2023-09-04 at 00 13 53" src="https://github.com/bevyengine/bevy/assets/302146/b2198b18-3439-47ad-919a-cdabe190facb"> This creates unique materials per instance by randomly-generating the material's colour. This is the worst case for 2D batching. Somehow, this PR manages a 7% reduction in median frame time. Both main and this PR issue 160000 draws. The 1000th frame is the same: ![flip bevymark-main-mesh2d-vpi bevymark-batching-mesh2d-vpi 67ppd ldr](https://github.com/bevyengine/bevy/assets/302146/a2ec471c-f576-4a36-a23b-b24b22578b97) ``` Mean: 0.001214 Weighted median: 0.937499 1st weighted quartile: 0.635467 3rd weighted quartile: 0.979085 Min: 0.000000 Max: 0.988971 Evaluation time: 0.4462 seconds ``` ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite` <img width="1396" alt="Screenshot 2023-09-04 at 12 21 12" src="https://github.com/bevyengine/bevy/assets/302146/8b31e915-d6be-4cac-abf5-c6a4da9c3d43"> This just spawns 160 waves of 1000 sprites. There should be and is no notable difference between main and the PR. ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite --material-texture-count 10` <img width="1389" alt="Screenshot 2023-09-04 at 12 36 08" src="https://github.com/bevyengine/bevy/assets/302146/45fe8d6d-c901-4062-a349-3693dd044413"> This spawns the sprites selecting a texture at random per instance from the 10 generated textures. This has no significant change vs main and shouldn't. ### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite --vary-per-instance` <img width="1401" alt="Screenshot 2023-09-04 at 12 29 52" src="https://github.com/bevyengine/bevy/assets/302146/762c5c60-352e-471f-8dbe-bbf10e24ebd6"> This sets the sprite colour as being unique per instance. This can still all be drawn using one batch. There should be no difference but the PR produces median frame times that are 4% higher. Investigation showed no clear sources of cost, rather a mix of give and take that should not happen. It seems like noise in the results. ### Summary | Benchmark | % change in median frame time | | ------------- | ------------- | | many_cubes | 🟩 -30% | | many_cubes 10 materials | 🟩 -5% | | many_cubes unique materials | 🟩 ~0% | | bevymark mesh2d | 🟩 -50% | | bevymark mesh2d 10 materials | 🟩 -50% | | bevymark mesh2d unique materials | 🟩 -7% | | bevymark sprite | 🟥 2% | | bevymark sprite 10 materials | 🟥 0.6% | | bevymark sprite unique materials | 🟥 4.1% | --- ## Changelog - Added: 2D and 3D mesh entities that share the same mesh and material (same textures, same data) are now batched into the same draw command for better performance. --------- Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com> Co-authored-by: Nicola Papale <nico@nicopap.ch>
2023-09-21 22:12:34 +00:00
// NOTE: Because morph targets require per-morph target texture bindings, they cannot
// currently be batched.
values.push((entity, (MorphIndex { index }, NoAutomaticBatching)));
Add morph targets (#8158) # Objective - Add morph targets to `bevy_pbr` (closes #5756) & load them from glTF - Supersedes #3722 - Fixes #6814 [Morph targets][1] (also known as shape interpolation, shape keys, or blend shapes) allow animating individual vertices with fine grained controls. This is typically used for facial expressions. By specifying multiple poses as vertex offset, and providing a set of weight of each pose, it is possible to define surprisingly realistic transitions between poses. Blending between multiple poses also allow composition. Morph targets are part of the [gltf standard][2] and are a feature of Unity and Unreal, and babylone.js, it is only natural to implement them in bevy. ## Solution This implementation of morph targets uses a 3d texture where each pixel is a component of an animated attribute. Each layer is a different target. We use a 2d texture for each target, because the number of attribute×components×animated vertices is expected to always exceed the maximum pixel row size limit of webGL2. It copies fairly closely the way skinning is implemented on the CPU side, while on the GPU side, the shader morph target implementation is a relatively trivial detail. We add an optional `morph_texture` to the `Mesh` struct. The `morph_texture` is built through a method that accepts an iterator over attribute buffers. The `MorphWeights` component, user-accessible, controls the blend of poses used by mesh instances (so that multiple copy of the same mesh may have different weights), all the weights are uploaded to a uniform buffer of 256 `f32`. We limit to 16 poses per mesh, and a total of 256 poses. More literature: * Old babylone.js implementation (vertex attribute-based): https://www.eternalcoding.com/dev-log-1-morph-targets/ * Babylone.js implementation (similar to ours): https://www.youtube.com/watch?v=LBPRmGgU0PE * GPU gems 3: https://developer.nvidia.com/gpugems/gpugems3/part-i-geometry/chapter-3-directx-10-blend-shapes-breaking-limits * Development discord thread https://discord.com/channels/691052431525675048/1083325980615114772 https://user-images.githubusercontent.com/26321040/231181046-3bca2ab2-d4d9-472e-8098-639f1871ce2e.mp4 https://github.com/bevyengine/bevy/assets/26321040/d2a0c544-0ef8-45cf-9f99-8c3792f5a258 ## Acknowledgements * Thanks to `storytold` for sponsoring the feature * Thanks to `superdump` and `james7132` for guidance and help figuring out stuff ## Future work - Handling of less and more attributes (eg: animated uv, animated arbitrary attributes) - Dynamic pose allocation (so that zero-weighted poses aren't uploaded to GPU for example, enables much more total poses) - Better animation API, see #8357 ---- ## Changelog - Add morph targets to bevy meshes - Support up to 64 poses per mesh of individually up to 116508 vertices, animation currently strictly limited to the position, normal and tangent attributes. - Load a morph target using `Mesh::set_morph_targets` - Add `VisitMorphTargets` and `VisitMorphAttributes` traits to `bevy_render`, this allows defining morph targets (a fairly complex and nested data structure) through iterators (ie: single copy instead of passing around buffers), see documentation of those traits for details - Add `MorphWeights` component exported by `bevy_render` - `MorphWeights` control mesh's morph target weights, blending between various poses defined as morph targets. - `MorphWeights` are directly inherited by direct children (single level of hierarchy) of an entity. This allows controlling several mesh primitives through a unique entity _as per GLTF spec_. - Add `MorphTargetNames` component, naming each indices of loaded morph targets. - Load morph targets weights and buffers in `bevy_gltf` - handle morph targets animations in `bevy_animation` (previously, it was a `warn!` log) - Add the `MorphStressTest.gltf` asset for morph targets testing, taken from the glTF samples repo, CC0. - Add morph target manipulation to `scene_viewer` - Separate the animation code in `scene_viewer` from the rest of the code, reducing `#[cfg(feature)]` noise - Add the `morph_targets.rs` example to show off how to manipulate morph targets, loading `MorpStressTest.gltf` ## Migration Guide - (very specialized, unlikely to be touched by 3rd parties) - `MeshPipeline` now has a single `mesh_layouts` field rather than separate `mesh_layout` and `skinned_mesh_layout` fields. You should handle all possible mesh bind group layouts in your implementation - You should also handle properly the new `MORPH_TARGETS` shader def and mesh pipeline key. A new function is exposed to make this easier: `setup_moprh_and_skinning_defs` - The `MeshBindGroup` is now `MeshBindGroups`, cached bind groups are now accessed through the `get` method. [1]: https://en.wikipedia.org/wiki/Morph_target_animation [2]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#morph-targets --------- Co-authored-by: François <mockersf@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2023-06-22 20:00:01 +00:00
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}