Mirrors/bevy

mirror of https://github.com/bevyengine/bevy synced 2024-12-25 12:33:07 +00:00

Author	SHA1	Message	Date
Wybe Westra	abf12f3b3b	Fixed several missing links in docs. (#8117 ) Links in the api docs are nice. I noticed that there were several places where structs / functions and other things were referenced in the docs, but weren't linked. I added the links where possible / logical. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: François <mockersf@gmail.com>	2023-04-23 17:28:36 +00:00
Nile	9db70da96f	Add screenshot api (#7163 ) Fixes https://github.com/bevyengine/bevy/issues/1207 # Objective Right now, it's impossible to capture a screenshot of the entire window without forking bevy. This is because - The swapchain texture never has the COPY_SRC usage - It can't be accessed without taking ownership of it - Taking ownership of it breaks a lot of stuff ## Solution - Introduce a dedicated api for taking a screenshot of a given bevy window, and guarantee this screenshot will always match up with what gets put on the screen. --- ## Changelog - Added the `ScreenshotManager` resource with two functions, `take_screenshot` and `save_screenshot_to_disk`	2023-04-19 21:28:42 +00:00
IceSentry	2c21d423fd	Make render graph slots optional for most cases (#8109 ) # Objective - Currently, the render graph slots are only used to pass the view_entity around. This introduces significant boilerplate for very little value. Instead of using slots for this, make the view_entity part of the `RenderGraphContext`. This also means we won't need to have `IN_VIEW` on every node and and we'll be able to use the default impl of `Node::input()`. ## Solution - Add `view_entity: Option<Entity>` to the `RenderGraphContext` - Update all nodes to use this instead of entity slot input --- ## Changelog - Add optional `view_entity` to `RenderGraphContext` ## Migration Guide You can now get the view_entity directly from the `RenderGraphContext`. When implementing the Node: ```rust // 0.10 struct FooNode; impl FooNode { const IN_VIEW: &'static str = "view"; } impl Node for FooNode { fn input(&self) -> Vec<SlotInfo> { vec![SlotInfo::new(Self::IN_VIEW, SlotType::Entity)] } fn run( &self, graph: &mut RenderGraphContext, // ... ) -> Result<(), NodeRunError> { let view_entity = graph.get_input_entity(Self::IN_VIEW)?; // ... Ok(()) } } // 0.11 struct FooNode; impl Node for FooNode { fn run( &self, graph: &mut RenderGraphContext, // ... ) -> Result<(), NodeRunError> { let view_entity = graph.view_entity(); // ... Ok(()) } } ``` When adding the node to the graph, you don't need to specify a slot_edge for the view_entity. ```rust // 0.10 let mut graph = RenderGraph::default(); graph.add_node(FooNode::NAME, node); let input_node_id = draw_2d_graph.set_input(vec![SlotInfo::new( graph::input::VIEW_ENTITY, SlotType::Entity, )]); graph.add_slot_edge( input_node_id, graph::input::VIEW_ENTITY, FooNode::NAME, FooNode::IN_VIEW, ); // add_node_edge ... // 0.11 let mut graph = RenderGraph::default(); graph.add_node(FooNode::NAME, node); // add_node_edge ... ``` ## Notes This PR paired with #8007 will help reduce a lot of annoying boilerplate with the render nodes. Depending on which one gets merged first. It will require a bit of clean up work to make both compatible. I tagged this as a breaking change, because using the old system to get the view_entity will break things because it's not a node input slot anymore. ## Notes for reviewers A lot of the diffs are just removing the slots in every nodes and graph creation. The important part is mostly in the graph_runner/CameraDriverNode.	2023-03-21 20:11:13 +00:00
Elabajaba	bfd1d4b0a7	Wgpu 0.15 (#7356 ) # Objective Update Bevy to wgpu 0.15. ## Changelog - Update to wgpu 0.15, wgpu-hal 0.15.1, and naga 0.11 - Users can now use the [DirectX Shader Compiler](https://github.com/microsoft/DirectXShaderCompiler) (DXC) on Windows with DX12 for faster shader compilation and ShaderModel 6.0+ support (requires `dxcompiler.dll` and `dxil.dll`, which are included in DXC downloads from [here](https://github.com/microsoft/DirectXShaderCompiler/releases/latest)) ## Migration Guide ### WGSL Top-Level `let` is now `const` All top level constants are now declared with `const`, catching up with the wgsl spec. `let` is no longer allowed at the global scope, only within functions. ```diff -let SOME_CONSTANT = 12.0; +const SOME_CONSTANT = 12.0; ``` #### `TextureDescriptor` and `SurfaceConfiguration` now requires a `view_formats` field The new `view_formats` field in the `TextureDescriptor` is used to specify a list of formats the texture can be re-interpreted to in a texture view. Currently only changing srgb-ness is allowed (ex. `Rgba8Unorm` <=> `Rgba8UnormSrgb`). You should set `view_formats` to `&[]` (empty) unless you have a specific reason not to. #### The DirectX Shader Compiler (DXC) is now supported on DX12 DXC is now the default shader compiler when using the DX12 backend. DXC is Microsoft's replacement for their legacy FXC compiler, and is faster, less buggy, and allows for modern shader features to be used (ShaderModel 6.0+). DXC requires `dxcompiler.dll` and `dxil.dll` to be available, otherwise it will log a warning and fall back to FXC. You can get `dxcompiler.dll` and `dxil.dll` by downloading the latest release from [Microsoft's DirectXShaderCompiler github repo](https://github.com/microsoft/DirectXShaderCompiler/releases/latest) and copying them into your project's root directory. These must be included when you distribute your Bevy game/app/etc if you plan on supporting the DX12 backend and are using DXC. `WgpuSettings` now has a `dx12_shader_compiler` field which can be used to choose between either FXC or DXC (if you pass None for the paths for DXC, it will check for the .dlls in the working directory).	2023-01-29 20:27:30 +00:00
James Liu	a85b740f24	Support recording multiple CommandBuffers in RenderContext (#7248 ) # Objective `RenderContext`, the core abstraction for running the render graph, currently only supports recording one `CommandBuffer` across the entire render graph. This means the entire buffer must be recorded sequentially, usually via the render graph itself. This prevents parallelization and forces users to only encode their commands in the render graph. ## Solution Allow `RenderContext` to store a `Vec<CommandBuffer>` that it progressively appends to. By default, the context will not have a command encoder, but will create one as soon as either `begin_tracked_render_pass` or the `command_encoder` accesor is first called. `RenderContext::add_command_buffer` allows users to interrupt the current command encoder, flush it to the vec, append a user-provided `CommandBuffer` and reset the command encoder to start a new buffer. Users or the render graph will call `RenderContext::finish` to retrieve the series of buffers for submitting to the queue. This allows users to encode their own `CommandBuffer`s outside of the render graph, potentially in different threads, and store them in components or resources. Ideally, in the future, the core pipeline passes can run in `RenderStage::Render` systems and end up saving the completed command buffers to either `Commands` or a field in `RenderPhase`. ## Alternatives The alternative is to use to use wgpu's `RenderBundle`s, which can achieve similar results; however it's not universally available (no OpenGL, WebGL, and DX11). --- ## Changelog Added: `RenderContext::new` Added: `RenderContext::add_command_buffer` Added: `RenderContext::finish` Changed: `RenderContext::render_device` is now private. Use the accessor `RenderContext::render_device()` instead. Changed: `RenderContext::command_encoder` is now private. Use the accessor `RenderContext::command_encoder()` instead. Changed: `RenderContext` now supports adding external `CommandBuffer`s for inclusion into the render graphs. These buffers can be encoded outside of the render graph (i.e. in a system). ## Migration Guide `RenderContext`'s fields are now private. Use the accessors on `RenderContext` instead, and construct it with `RenderContext::new`.	2023-01-22 00:21:55 +00:00
Jakob Hellermann	02637b609e	fix clippy (#7302 ) # Objective - `cargo clippy` should work (except for clippy::type_complexity) ## Solution - fix new clippy lints	2023-01-20 14:25:25 +00:00
James Liu	bef9bc1844	Reduce branching in TrackedRenderPass (#7053 ) # Objective Speed up the render phase for rendering. ## Solution - Follow up #6988 and make the internals of atomic IDs `NonZeroU32`. This niches the `Option`s of the IDs in draw state, which reduces the size and branching behavior when evaluating for equality. - Require `&RenderDevice` to get the device's `Limits` when initializing a `TrackedRenderPass` to preallocate the bind groups and vertex buffer state in `DrawState`, this removes the branch on needing to resize those `Vec`s. ## Performance This produces a similar speed up akin to that of #6885. This shows an approximate 6% speed up in `main_opaque_pass_3d` on `many_foxes` (408.79 us -> 388us). This should be orthogonal to the gains seen there. ![image](https://user-images.githubusercontent.com/3137680/209906239-e430f026-63c2-4b95-957e-a2045b810d79.png) --- ## Changelog Added: `RenderContext::begin_tracked_render_pass`. Changed: `TrackedRenderPass` now requires a `&RenderDevice` on construction. Removed: `bevy_render::render_phase::DrawState`. It was not usable in any form outside of `bevy_render`. ## Migration Guide TODO	2023-01-09 19:24:56 +00:00
Jinlei Li	09c64ffe9f	Remove redundant bitwise OR `TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES` (#7033 ) # Objective `TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES` was already included in `adapter.features()` on non-wasm target, and since it is the default value for `WgpuSettings.features`, the subsequent code will also combine into this feature: `b6066c30b6/crates/bevy_render/src/renderer/mod.rs (L155-L156)`	2022-12-27 16:27:55 +00:00
Torstein Grindvik	daa57fe489	Add try_* to add_slot_edge, add_node_edge (#6720 ) # Objective `add_node_edge` and `add_slot_edge` are fallible methods, but are always used with `.unwrap()`. `input_node` is often unwrapped as well. This points to having an infallible behaviour as default, with an alternative fallible variant if needed. Improves readability and ergonomics. ## Solution - Change `add_node_edge` and `add_slot_edge` to panic on error. - Change `input_node` to panic on `None`. - Add `try_add_node_edge` and `try_add_slot_edge` in case fallible methods are needed. - Add `get_input_node` to still be able to get an `Option`. --- ## Changelog ### Added - `try_add_node_edge` - `try_add_slot_edge` - `get_input_node` ### Changed - `add_node_edge` is now infallible (panics on error) - `add_slot_edge` is now infallible (panics on error) - `input_node` now panics on `None` ## Migration Guide Remove `.unwrap()` from `add_node_edge` and `add_slot_edge`. For cases where the error was handled, use `try_add_node_edge` and `try_add_slot_edge` instead. Remove `.unwrap()` from `input_node`. For cases where the option was handled, use `get_input_node` instead. Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com>	2022-11-21 21:58:39 +00:00
robtfm	2cd0bd7575	improve compile time by type-erasing wgpu structs (#5950 ) # Objective structs containing wgpu types take a long time to compile. this is particularly bad for generics containing the wgpu structs (like the depth pipeline builder with `#[derive(SystemParam)]` i've been working on). we can avoid that by boxing and type-erasing in the bevy `render_resource` wrappers. type system magic is not a strength of mine so i guess there will be a cleaner way to achieve this, happy to take feedback or for it to be taken as a proof of concept if someone else wants to do a better job. ## Solution - add macros to box and type-erase in debug mode - leave current impl for release mode timings: <html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40"> <head> <meta name=ProgId content=Excel.Sheet> <meta name=Generator content="Microsoft Excel 15"> <link id=Main-File rel=Main-File href="file:///C:/Users/robfm/AppData/Local/Temp/msohtmlclip1/01/clip.htm"> <link rel=File-List href="file:///C:/Users/robfm/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml"> <!--table {mso-displayed-decimal-separator:"\."; mso-displayed-thousand-separator:"\,";} @page {margin:.75in .7in .75in .7in; mso-header-margin:.3in; mso-footer-margin:.3in;} tr {mso-height-source:auto;} col {mso-width-source:auto;} br {mso-data-placement:same-cell;} td {padding-top:1px; padding-right:1px; padding-left:1px; mso-ignore:padding; color:black; font-size:11.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri, sans-serif; mso-font-charset:0; mso-number-format:General; text-align:general; vertical-align:bottom; border:none; mso-background-source:auto; mso-pattern:auto; mso-protection:locked visible; white-space:nowrap; mso-rotate:0;} .xl65 {mso-number-format:0%;} .xl66 {vertical-align:middle; white-space:normal;} .xl67 {vertical-align:middle;} --> </head> <body link="#0563C1" vlink="#954F72"> current \| \| \| -- \| -- \| -- \| -- \| Total time: \| 64.9s \| \| bevy_pbr v0.9.0-dev \| 19.2s \| \| bevy_render v0.9.0-dev \| 17.0s \| \| bevy_sprite v0.9.0-dev \| 15.1s \| \| DepthPipelineBuilder \| 18.7s \| \| \| \| with type-erasing \| \| \| diff \| Total time: \| 49.0s \| -24% \| bevy_render v0.9.0-dev \| 12.0s \| -38% \| bevy_pbr v0.9.0-dev \| 8.7s \| -49% \| bevy_sprite v0.9.0-dev \| 6.1s \| -60% \| DepthPipelineBuilder \| 1.2s \| -94% </body> </html> the depth pipeline builder is a binary with body: ```rust use std::{marker::PhantomData, hash::Hash}; use bevy::{prelude::*, ecs::system::SystemParam, pbr::{RenderMaterials, MaterialPipeline, ShadowPipeline}, render::{renderer::RenderDevice, render_resource::{SpecializedMeshPipelines, PipelineCache}, render_asset::RenderAssets}}; fn main() { println!("Hello, world p!\n"); } #[derive(SystemParam)] pub struct DepthPipelineBuilder<'w, 's, M: Material> where M::Data: Eq + Hash + Clone, { render_device: Res<'w, RenderDevice>, material_pipeline: Res<'w, MaterialPipeline<M>>, material_pipelines: ResMut<'w, SpecializedMeshPipelines<MaterialPipeline<M>>>, shadow_pipeline: Res<'w, ShadowPipeline>, pipeline_cache: ResMut<'w, PipelineCache>, render_meshes: Res<'w, RenderAssets<Mesh>>, render_materials: Res<'w, RenderMaterials<M>>, msaa: Res<'w, Msaa>, #[system_param(ignore)] _p: PhantomData<&'s M>, } ```	2022-11-18 22:04:23 +00:00
Jakob Hellermann	e71c4d2802	fix nightly clippy warnings (#6395 ) # Objective - fix new clippy lints before they get stable and break CI ## Solution - run `clippy --fix` to auto-fix machine-applicable lints - silence `clippy::should_implement_trait` for `fn HandleId::default<T: Asset>` ## Changes - always prefer `format!("{inline}")` over `format!("{}", not_inline)` - prefer `Box::default` (or `Box::<T>::default` if necessary) over `Box::new(T::default())`	2022-10-28 21:03:01 +00:00
Carter Anderson	4d3d3c869e	Support arbitrary RenderTarget texture formats (#6380 ) # Objective Currently, Bevy only supports rendering to the current "surface texture format". This means that "render to texture" scenarios must use the exact format the primary window's surface uses, or Bevy will crash. This is even harder than it used to be now that we detect preferred surface formats at runtime instead of using hard coded BevyDefault values. ## Solution 1. Look up and store each window surface's texture format alongside other extracted window information 2. Specialize the upscaling pass on the current `RenderTarget`'s texture format, now that we can cheaply correlate render targets to their current texture format 3. Remove the old `SurfaceTextureFormat` and `AvailableTextureFormats`: these are now redundant with the information stored on each extracted window, and probably should not have been globals in the first place (as in theory each surface could have a different format). This means you can now use any texture format you want when rendering to a texture! For example, changing the `render_to_texture` example to use `R16Float` now doesn't crash / properly only stores the red component: ![image](https://user-images.githubusercontent.com/2694663/198140125-c606dd0e-6fdf-4544-b93d-dbbd10dbadd2.png)	2022-10-26 23:12:12 +00:00
Jakob Hellermann	838b318863	separate tonemapping and upscaling passes (#3425 ) Attempt to make features like bloom https://github.com/bevyengine/bevy/pull/2876 easier to implement. This PR: - Moves the tonemapping from `pbr.wgsl` into a separate pass - also add a separate upscaling pass after the tonemapping which writes to the swap chain (enables resolution-independant rendering and post-processing after tonemapping) - adds a `hdr` bool to the camera which controls whether the pbr and sprite shaders render into a `Rgba16Float` texture Open questions: - ~should the 2d graph work the same as the 3d one?~ it is the same now - ~The current solution is a bit inflexible because while you can add a post processing pass that writes to e.g. the `hdr_texture`, you can't write to a separate `user_postprocess_texture` while reading the `hdr_texture` and tell the tone mapping pass to read from the `user_postprocess_texture` instead. If the tonemapping and upscaling render graph nodes were to take in a `TextureView` instead of the view entity this would almost work, but the bind groups for their respective input textures are already created in the `Queue` render stage in the hardcoded order.~ solved by creating bind groups in render node New render graph: ![render_graph](https://user-images.githubusercontent.com/22177966/147767249-57dd4229-cfab-4ec5-9bf3-dc76dccf8e8b.png) <details> <summary>Before</summary> ![render_graph_old](https://user-images.githubusercontent.com/22177966/147284579-c895fdbd-4028-41cf-914c-e1ffef60e44e.png) </details> Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2022-10-26 20:13:59 +00:00
Lena Milizé	5878a62c3f	Link to `linux_dependencies.md` in the panic message when failing to detect a GPU (#6261 ) As suggested in #6104, it would be nice to link directly to `linux_dependencies.md` file in the panic message when running on Linux. And when not compiling for Linux, we fall back to the old message. Signed-off-by: Lena Milizé <me@lvmn.org> # Objective Resolves #6104. ## Solution Add link to `linux_dependencies.md` when compiling for Linux, and fall back to the old one when not.	2022-10-17 14:01:52 +00:00
François	13dcdba05f	use bevy default texture format if the surface is not yet available (#6233 ) # Objective - Fix #6231 ## Solution - In case no supported format is found, try to use Bevy default instead of panicking	2022-10-11 12:32:03 +00:00
VitalyR	f5322cd757	get proper texture format after the renderer is initialized, fix #3897 (#5413 ) # Objective There is no Srgb support on some GPU and display protocols with `winit` (for example, Nvidia's GPUs with Wayland). Thus `TextureFormat::bevy_default()` which returns `Rgba8UnormSrgb` or `Bgra8UnormSrgb` will cause panics on such platforms. This patch will resolve this problem. Fix https://github.com/bevyengine/bevy/issues/3897. ## Solution Make `initialize_renderer` expose `wgpu::Adapter` and `first_available_texture_format`, use the `first_available_texture_format` by default. ## Changelog * Fixed https://github.com/bevyengine/bevy/issues/3897.	2022-10-10 16:10:05 +00:00
ira	992681b59b	Make `Resource` trait opt-in, requiring `#[derive(Resource)]` V2 (#5577 ) This PR description is an edited copy of #5007, written by @alice-i-cecile. # Objective Follow-up to https://github.com/bevyengine/bevy/pull/2254. The `Resource` trait currently has a blanket implementation for all types that meet its bounds. While ergonomic, this results in several drawbacks: * it is possible to make confusing, silent mistakes such as inserting a function pointer (Foo) rather than a value (Foo::Bar) as a resource * it is challenging to discover if a type is intended to be used as a resource * we cannot later add customization options (see the [RFC](https://github.com/bevyengine/rfcs/blob/main/rfcs/27-derive-component.md) for the equivalent choice for Component). * dependencies can use the same Rust type as a resource in invisibly conflicting ways * raw Rust types used as resources cannot preserve privacy appropriately, as anyone able to access that type can read and write to internal values * we cannot capture a definitive list of possible resources to display to users in an editor ## Notes to reviewers * Review this commit-by-commit; there's effectively no back-tracking and there's a lot of churn in some of these commits. ira: My commits are not as well organized :') * I've relaxed the bound on Local to Send + Sync + 'static: I don't think these concerns apply there, so this can keep things simple. Storing e.g. a u32 in a Local is fine, because there's a variable name attached explaining what it does. * I think this is a bad place for the Resource trait to live, but I've left it in place to make reviewing easier. IMO that's best tackled with https://github.com/bevyengine/bevy/issues/4981. ## Changelog `Resource` is no longer automatically implemented for all matching types. Instead, use the new `#[derive(Resource)]` macro. ## Migration Guide Add `#[derive(Resource)]` to all types you are using as a resource. If you are using a third party type as a resource, wrap it in a tuple struct to bypass orphan rules. Consider deriving `Deref` and `DerefMut` to improve ergonomics. `ClearColor` no longer implements `Component`. Using `ClearColor` as a component in 0.8 did nothing. Use the `ClearColorConfig` in the `Camera3d` and `Camera2d` components instead. Co-authored-by: Alice <alice.i.cecile@gmail.com> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: devil-ira <justthecooldude@gmail.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2022-08-08 21:36:35 +00:00
Rob Parrett	cfee0e882e	Fix various typos (#5417 ) ## Objective - Fix some typos ## Solution - Fix em. - My favorite was `maxizimed`	2022-07-21 20:46:54 +00:00
François	814f8d1635	update wgpu to 0.13 (#5168 ) # Objective - Update wgpu to 0.13 - ~~Wait, is wgpu 0.13 released? No, but I had most of the changes already ready since playing with webgpu~~ well it has been released now - Also update parking_lot to 0.12 and naga to 0.9 ## Solution - Update syntax for wgsl shaders https://github.com/gfx-rs/wgpu/blob/master/CHANGELOG.md#wgsl-syntax - Add a few options, remove some references: https://github.com/gfx-rs/wgpu/blob/master/CHANGELOG.md#other-breaking-changes - fragment inputs should now exactly match vertex outputs for locations, so I added exports for those to be able to reuse them https://github.com/gfx-rs/wgpu/pull/2704	2022-07-14 21:17:16 +00:00
Mike	a1d3f1b3b4	Update time by sending frame instant through a channel (#4744 ) # Objective - The time update is currently done in the wrong part of the schedule. For a single frame the current order of things is update input, update time (First stage), other stages, render stage (frame presentation). So when we update the time it includes the input processing of the current frame and the frame presentation of the previous frame. This is a problem when vsync is on. When input processing takes a longer amount of time for a frame, the vsync wait time gets shorter. So when these are not paired correctly we can potentially have a long input processing time added to the normal vsync wait time in the previous frame. This leads to inaccurate frame time reporting and more variance of the time than actually exists. For more details of why this is an issue see the linked issue below. - Helps with https://github.com/bevyengine/bevy/issues/4669 - Supercedes https://github.com/bevyengine/bevy/pull/4728 and https://github.com/bevyengine/bevy/pull/4735. This PR should be less controversial than those because it doesn't add to the API surface. ## Solution - The most accurate frame time would come from hardware. We currently don't have access to that for multiple reasons, so the next best thing we can do is measure the frame time as close to frame presentation as possible. This PR gets the Instant::now() for the time immediately after frame presentation in the render system and then sends that time to the app world through a channel. - implements suggestion from @aevyrie from here https://github.com/bevyengine/bevy/pull/4728#discussion_r872010606 ## Statistics ![image](https://user-images.githubusercontent.com/2180432/168410265-f249f66e-ea9d-45d1-b3d8-7207a7bc536c.png) --- ## Changelog - Make frame time reporting more accurate. ## Migration Guide `time.delta()` now reports zero for 2 frames on startup instead of 1 frame.	2022-07-11 23:19:00 +00:00
dataphract	b47291264b	diagnostics: meaningful error when graph node has wrong number of inputs (#4924 ) # Objective Currently, providing the wrong number of inputs to a render graph node triggers this assertion: ``` thread 'main' panicked at 'assertion failed: `(left == right)` left: `1`, right: `2`', /[redacted]/bevy/crates/bevy_render/src/renderer/graph_runner.rs:164:13 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` This does not provide the user any context. ## Solution Add a new `RenderGraphRunnerError` variant to handle this case. The new message looks like this: ``` ERROR bevy_render::renderer: Error running render graph: ERROR bevy_render::renderer: > node (name: 'Some("outline_pass")') has 2 input slots, but was provided 1 values ``` --- ## Changelog ### Changed `RenderGraphRunnerError` now has a new variant, `MismatchedInputCount`. ## Migration Guide Exhaustive matches on `RenderGraphRunnerError` will need to add a branch to handle the new `MismatchedInputCount` variant.	2022-06-06 15:47:52 +00:00
Félix Lescaudey de Maneville	f000c2b951	Clippy improvements (#4665 ) # Objective Follow up to my previous MR #3718 to add new clippy warnings to bevy: - [x] [~~option_if_let_else~~](https://rust-lang.github.io/rust-clippy/master/#option_if_let_else) (reverted) - [x] [redundant_else](https://rust-lang.github.io/rust-clippy/master/#redundant_else) - [x] [match_same_arms](https://rust-lang.github.io/rust-clippy/master/#match_same_arms) - [x] [semicolon_if_nothing_returned](https://rust-lang.github.io/rust-clippy/master/#semicolon_if_nothing_returned) - [x] [explicit_iter_loop](https://rust-lang.github.io/rust-clippy/master/#explicit_iter_loop) - [x] [map_flatten](https://rust-lang.github.io/rust-clippy/master/#map_flatten) There is one commit per clippy warning, and the matching flags are added to the CI execution. To test the CI execution you may run `cargo run -p ci -- clippy` at the root. I choose the add the flags in the `ci` tool crate to avoid having them in every `lib.rs` but I guess it could become an issue with suprise warnings coming up after a commit/push Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2022-05-31 01:38:07 +00:00
Jakob Hellermann	d3b2439c82	fix tracy frame marker placement (after preseting all windows) (#4731 ) # Objective The frame marker event was emitted in the loop of presenting all the windows. This would mark the frame as finished multiple times if more than one window is used. ## Solution Move the frame marker to after the `for`-loop, so that it gets executed only once.	2022-05-12 18:39:20 +00:00
Jakob Hellermann	c12ee81822	bevy_app: add tracing event with `tracy.frame_mark` (#4320 ) Currently `tracy` interprets the entire trace as one frame because the marker for frames isn't being recorded. ~~When an event with `tracy.trace_marker=true` is recorded, `tracing-tracy` will mark the frame as finished: <`aa0b96b2ae/tracing-tracy/src/lib.rs (L240)`>~~ ~~Unfortunately this leads to~~ ```rs INFO bevy_app:frame: bevy_app::app: finished frame tracy.frame_mark=true ``` ~~being printed every frame (we can't use DEBUG because bevy_log sets `max_release_level_info`.~~ Instead of emitting an event that gets logged every frame, we can depend on tracy-client itself and call `finish_continuous_frame!();`	2022-04-08 22:50:23 +00:00
Robert Swain	c5963b4fd5	Use storage buffers for clustered forward point lights (#3989 ) # Objective - Make use of storage buffers, where they are available, for clustered forward bindings to support far more point lights in a scene - Fixes #3605 - Based on top of #4079 This branch on an M1 Max can keep 60fps with about 2150 point lights of radius 1m in the Sponza scene where I've been testing. The bottleneck is mostly assigning lights to clusters which grows faster than linearly (I think 1000 lights was about 1.5ms and 5000 was 7.5ms). I have seen papers and presentations leveraging compute shaders that can get this up to over 1 million. That said, I think any further optimisations should probably be done in a separate PR. ## Solution - Add `RenderDevice` to the `Material` and `SpecializedMaterial` trait `::key()` functions to allow setting flags on the keys depending on feature/limit availability - Make `GpuPointLights` and `ViewClusterBuffers` into enums containing `UniformVec` and `StorageBuffer` variants. Implement the necessary API on them to make usage the same for both cases, and the only difference is at initialisation time. - Appropriate shader defs in the shader code to handle the two cases ## Context on some decisions / open questions - I'm using `max_storage_buffers_per_shader_stage >= 3` as a check to see if storage buffers are supported. I was thinking about diving into 'binding resource management' but it feels like we don't have enough use cases to understand the problem yet, and it is mostly a separate concern to this PR, so I think it should be handled separately. - Should `ViewClusterBuffers` and `ViewClusterBindings` be merged, duplicating the count variables into the enum variants? Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2022-04-07 16:16:35 +00:00
Carter Anderson	de677dbfc9	Use more ergonomic span syntax (#4246 ) Tracing added support for "inline span entering", which cuts down on a lot of complexity: ```rust let span = info_span!("my_span").entered(); ``` This adapts our code to use this pattern where possible, and updates our docs to recommend it. This produces equivalent tracing behavior. Here is a side by side profile of "before" and "after" these changes. ![image](https://user-images.githubusercontent.com/2694663/158912137-b0aa6dc8-c603-425f-880f-6ccf5ad1b7ef.png)	2022-03-18 04:19:21 +00:00
Robert Swain	0529f633f9	KTX2/DDS/.basis compressed texture support (#3884 ) # Objective - Support compressed textures including 'universal' formats (ETC1S, UASTC) and transcoding of them to - Support `.dds`, `.ktx2`, and `.basis` files ## Solution - Fixes https://github.com/bevyengine/bevy/issues/3608 Look there for more details. - Note that the functionality is all enabled through non-default features. If it is desirable to enable some by default, I can do that. - The `basis-universal` crate, used for `.basis` file support and for transcoding, is built on bindings against a C++ library. It's not feasible to rewrite in Rust in a short amount of time. There are no Rust alternatives of which I am aware and it's specialised code. In its current state it doesn't support the wasm target, but I don't know for sure. However, it is possible to build the upstream C++ library with emscripten, so there is perhaps a way to add support for web too with some shenanigans. - There's no support for transcoding from BasisLZ/ETC1S in KTX2 files as it was quite non-trivial to implement and didn't feel important given people could use `.basis` files for ETC1S.	2022-03-15 22:26:46 +00:00
dataphract	cba9bcc7ba	improve error messages for render graph runner (#3930 ) # Objective Currently, errors in the render graph runner are exposed via a `Result::unwrap()` panic message, which dumps the debug representation of the error. ## Solution This PR updates `render_system` to log the chain of errors, followed by an explicit panic: ``` ERROR bevy_render::renderer: Error running render graph: ERROR bevy_render::renderer: > encountered an error when running a sub-graph ERROR bevy_render::renderer: > tried to pass inputs to sub-graph "outline_graph", which has no input slots thread 'main' panicked at 'Error running render graph: encountered an error when running a sub-graph', /[redacted]/bevy/crates/bevy_render/src/renderer/mod.rs:44:9 ``` Some errors' `Display` impls (via `thiserror`) have also been updated to provide more detail about the cause of the error.	2022-03-07 09:09:24 +00:00
Alice Cecile	557ab9897a	Make get_resource (and friends) infallible (#4047 ) # Objective - In the large majority of cases, users were calling `.unwrap()` immediately after `.get_resource`. - Attempting to add more helpful error messages here resulted in endless manual boilerplate (see #3899 and the linked PRs). ## Solution - Add an infallible variant named `.resource` and so on. - Use these infallible variants over `.get_resource().unwrap()` across the code base. ## Notes I did not provide equivalent methods on `WorldCell`, in favor of removing it entirely in #3939. ## Migration Guide Infallible variants of `.get_resource` have been added that implicitly panic, rather than needing to be unwrapped. Replace `world.get_resource::<Foo>().unwrap()` with `world.resource::<Foo>()`. ## Impact - `.unwrap` search results before: 1084 - `.unwrap` search results after: 942 - internal `unwrap_or_else` calls added: 4 - trivial unwrap calls removed from tests and code: 146 - uses of the new `try_get_resource` API: 11 - percentage of the time the unwrapping API was used internally: 93%	2022-02-27 22:37:18 +00:00
Robert Swain	936468aa1e	bevy_render: Use RenderDevice to get limits/features and expose AdapterInfo (#3931 ) # Objective - `WgpuOptions` is mutated to be updated with the actual device limits and features, but this information is readily available to both the main and render worlds through the `RenderDevice` which has .limits() and .features() methods - Information about the adapter in terms of its name, the backend in use, etc were not being exposed but have clear use cases for being used to take decisions about what rendering code to use. For example, if something works well on AMD GPUs but poorly on Intel GPUs. Or perhaps something works well in Vulkan but poorly in DX12. ## Solution - Stop mutating `WgpuOptions `and don't insert the updated values into the main and render worlds - Return `AdapterInfo` from `initialize_renderer` and insert it into the main and render worlds - Use `RenderDevice` limits in the lighting code that was using `WgpuOptions.limits`. - Renamed `WgpuOptions` to `WgpuSettings`	2022-02-16 21:17:37 +00:00
danieleades	d8974e7c3d	small and mostly pointless refactoring (#2934 ) What is says on the tin. This has got more to do with making `clippy` slightly more quiet than it does with changing anything that might greatly impact readability or performance. that said, deriving `Default` for a couple of structs is a nice easy win	2022-02-13 22:33:55 +00:00
Robert Swain	803e8cdf80	bevy_render: Support overriding wgpu features and limits (#3912 ) # Objective - Support overriding wgpu features and limits that were calculated from default values or queried from the adapter/backend. - Fixes #3686 ## Solution - Add `disabled_features: Option<wgpu::Features>` to `WgpuOptions` - Add `constrained_limits: Option<wgpu::Limits>` to `WgpuOptions` - After maybe obtaining updated features and limits from the adapter/backend in the case of `WgpuOptionsPriority::Functionality`, enable the `WgpuOptions` `features`, disable the `disabled_features`, and constrain the `limits` by `constrained_limits`. - Note that constraining the limits means for `wgpu::Limits` members named `max_.` we take the minimum of that which was configured/queried for the backend/adapter and the specified constrained limit value. This means the configured/queried value is used if the constrained limit is larger as that is as much as the device/API supports, or the constrained limit value is used if it is smaller as we are imposing an artificial constraint. For members named `min_.` we take the maximum instead. For example, a minimum stride might be 256 but we set constrained limit value of 1024, then 1024 is the more conservative value. If the constrained limit value were 16, then 256 would be the more conservative.	2022-02-13 04:24:52 +00:00
Robert Swain	33ef5b5039	bevy_render: Only auto-disable mappable primary buffers for discrete GPUs (#3803 ) # Objective - While it is not safe to enable mappable primary buffers for all GPUs, it should be preferred for integrated GPUs where an integrated GPU is one that is sharing system memory. ## Solution - Auto-disable mappable primary buffers only for discrete GPUs. If the GPU is integrated and mappable primary buffers are supported, use them.	2022-01-31 01:22:17 +00:00
Robert Swain	ef823d369f	bevy_render: Do not automatically enable MAPPABLE_PRIMARY_BUFFERS (#3698 ) # Objective - When using `WgpuOptionsPriority::Functionality`, which is the default, wgpu::Features::MAPPABLE_PRIMARY_BUFFERS would be automatically enabled. This feature can and does have a significant negative impact on performance for discrete GPUs where resizable bar is not supported, which is a common case. As such, this feature should not be automatically enabled. - Fixes the performance regression part of https://github.com/bevyengine/bevy/issues/3686 and at least some, if not all cases of https://github.com/bevyengine/bevy/issues/3687 ## Solution - When using `WgpuOptionsPriority::Functionality`, use the adapter-supported features, enable `TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES` and disable `MAPPABLE_PRIMARY_BUFFERS`	2022-01-17 22:03:14 +00:00
Michael Dorst	e56685370b	Fix `doc_markdown` lints in `bevy_render` (#3479 ) #3457 adds the `doc_markdown` clippy lint, which checks doc comments to make sure code identifiers are escaped with backticks. This causes a lot of lint errors, so this is one of a number of PR's that will fix those lint errors one crate at a time. This PR fixes lints in the `bevy_render` crate.	2022-01-09 11:09:46 +00:00
Robert Swain	b9c623e4f3	Configurable wgpu features/limits priority (#3452 ) # Objective - Allow the user to specify the priority when configuring wgpu features/limits and by default use the maximum capabilities of the chosen adapter. ## Solution - Add a `WgpuOptionsPriority` enum with `Compatibility`, `Functionality` and `WebGL2` options. - Add a `priority: WgpuOptionsPriority` member to `WgpuOptions`. - When initialising the renderer, if `WgpuOptions::priority == WgpuOptionsPriority::Functionality`, query the adapter for the available features and limits, use them when creating a device, and update `WgpuOptions` with those values. If `Compatibility` use the behaviour as before this PR. If `WebGL2` then use the WebGL2 downlevel limits as used when when building for wasm, for convenience of testing WebGL2 limits without having to build for wasm. - Add an environment variable `WGPU_OPTIONS_PRIO` that takes `compatibility`, `functionality`, `webgl2`. - Default to `WgpuOptionsPriority::Functionality`. - Insert updated `WgpuOptions` into render app world as well. This is useful for applying the limits when rendering, such as limiting the directional light shadow map texture to 2048x2048 when using WebGL2 downlevel limits but not on wasm. - Reduced `draw_state` logs from `debug` to `trace` and added `debug` level logs for the wgpu features and limits. Use `RUST_LOG=bevy_render=debug` to see the output.	2022-01-04 20:08:12 +00:00
François	6c479649bf	enable Webgl2 optimisation in pbr under feature (#3291 ) # Objective - 3d examples fail to run in webgl2 because of unsupported texture formats or texture too large ## Solution - switch to supported formats if a feature is enabled. I choose a feature instead of a build target to not conflict with a potential webgpu support Very inspired by `6813b2edc5`, and need #3290 to work. I named the feature `webgl2`, but it's only needed if one want to use PBR in webgl2. Examples using only 2D already work. Co-authored-by: François <8672791+mockersf@users.noreply.github.com>	2021-12-22 20:59:48 +00:00
François	6a0008f3d3	Fix doc warnings (#3339 ) # Objective - There are a few warnings when building Bevy docs for dead links - CI seems to not catch those warnings when it should ## Solution - Enable doc CI on all Bevy workspace - Fix warnings - Also noticed plugin GilrsPlugin was not added anymore when feature was enabled First commit to check that CI would actually fail with it: https://github.com/bevyengine/bevy/runs/4532652688?check_suite_focus=true Co-authored-by: François <8672791+mockersf@users.noreply.github.com>	2021-12-18 00:09:23 +00:00
Carter Anderson	ffecb05a0a	Replace old renderer with new renderer (#3312 ) This makes the [New Bevy Renderer](#2535) the default (and only) renderer. The new renderer isn't _quite_ ready for the final release yet, but I want as many people as possible to start testing it so we can identify bugs and address feedback prior to release. The examples are all ported over and operational with a few exceptions: * I removed a good portion of the examples in the `shader` folder. We still have some work to do in order to make these examples possible / ergonomic / worthwhile: #3120 and "high level shader material plugins" are the big ones. This is a temporary measure. * Temporarily removed the multiple_windows example: doing this properly in the new renderer will require the upcoming "render targets" changes. Same goes for the render_to_texture example. * Removed z_sort_debug: entity visibility sort info is no longer available in app logic. we could do this on the "render app" side, but i dont consider it a priority.	2021-12-14 03:58:23 +00:00
Carter Anderson	43e8a156fb	Upgrade to wgpu 0.11 (#2933 ) Upgrades both the old and new renderer to wgpu 0.11 (and naga 0.7). This builds on @zicklag's work here #2556. Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2021-10-08 19:55:24 +00:00
Jonathan Behrens	4b1d47da99	Enable downcasting of `RenderContext` (#2240 ) Related to https://github.com/bevyengine/bevy/discussions/2210. This may make it possible to have external `wgpu` libraries work with `bevy`.	2021-05-24 19:38:33 +00:00
Yoh Deadfall	653c10371e	Use bevy_reflect as path in case of no direct references (#1875 ) Fixes #1844 Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2021-05-19 19:03:36 +00:00
Paweł Grabarz	189df30a83	use bytemuck crate instead of Byteable trait (#2183 ) This gets rid of multiple unsafe blocks that we had to maintain ourselves, and instead depends on library that's commonly used and supported by the ecosystem. We also get support for glam types for free. There is still some things to clear up with the `Bytes` trait, but that is a bit more substantial change and can be done separately. Also there are already separate efforts to use `crevice` crate, so I've just added that as a TODO.	2021-05-17 22:29:10 +00:00
Nathan Ward	b07db8462f	Bevy derives handling generics in impl definitions. (#2044 ) Fixes #2037 (and then some) Problem: - `TypeUuid`, `RenderResource`, and `Bytes` derive macros did not properly handle generic structs. Solution: - Rework the derive macro implementations to handle the generics.	2021-05-01 02:57:20 +00:00
TehPers	e0b52079da	Implement RenderResource for Box<T> (#1893 ) Allows render resources to move data to the heap by boxing them. I did this as a workaround to #1892, but it seems like it'd be useful regardless. If not, feel free to close this PR.	2021-04-14 23:58:25 +00:00
TehPers	deb9f23667	Implement Byteable and RenderResource for [T; N] (#1872 ) Implements `Byteable` and `RenderResource` for any array containing `Byteable` elements. This allows `RenderResources` to be implemented on structs with arbitrarily-sized arrays, among other things: ```rust #[derive(RenderResources, TypeUuid)] #[uuid = "2733ff34-8f95-459f-bf04-3274e686ac5f"] struct Foo { buffer: [i32; 256], } ```	2021-04-14 22:20:25 +00:00
MinerSebas	0fce6f0406	Override size_hint for all Iterators and add ExactSizeIterator where applicable (#1734 ) After #1697 I looked at all other Iterators from Bevy and added overrides for `size_hint` where it wasn't done. Also implemented `ExactSizeIterator` where applicable.	2021-04-13 01:28:14 +00:00
Carter Anderson	b17f8a4bce	format comments (#1612 ) Uses the new unstable comment formatting features added to rustfmt.toml.	2021-03-11 00:27:30 +00:00
Carter Anderson	3a2a68852c	Bevy ECS V2 (#1525 ) # Bevy ECS V2 This is a rewrite of Bevy ECS (basically everything but the new executor/schedule, which are already awesome). The overall goal was to improve the performance and versatility of Bevy ECS. Here is a quick bulleted list of changes before we dive into the details: * Complete World rewrite * Multiple component storage types: * Tables: fast cache friendly iteration, slower add/removes (previously called Archetypes) * Sparse Sets: fast add/remove, slower iteration * Stateful Queries (caches query results for faster iteration. fragmented iteration is _fast_ now) * Stateful System Params (caches expensive operations. inspired by @DJMcNab's work in #1364) * Configurable System Params (users can set configuration when they construct their systems. once again inspired by @DJMcNab's work) * Archetypes are now "just metadata", component storage is separate * Archetype Graph (for faster archetype changes) * Component Metadata * Configure component storage type * Retrieve information about component size/type/name/layout/send-ness/etc * Components are uniquely identified by a densely packed ComponentId * TypeIds are now totally optional (which should make implementing scripting easier) * Super fast "for_each" query iterators * Merged Resources into World. Resources are now just a special type of component * EntityRef/EntityMut builder apis (more efficient and more ergonomic) * Fast bitset-backed `Access<T>` replaces old hashmap-based approach everywhere * Query conflicts are determined by component access instead of archetype component access (to avoid random failures at runtime) * With/Without are still taken into account for conflicts, so this should still be comfy to use * Much simpler `IntoSystem` impl * Significantly reduced the amount of hashing throughout the ecs in favor of Sparse Sets (indexed by densely packed ArchetypeId, ComponentId, BundleId, and TableId) * Safety Improvements * Entity reservation uses a normal world reference instead of unsafe transmute * QuerySets no longer transmute lifetimes * Made traits "unsafe" where relevant * More thorough safety docs * WorldCell * Exposes safe mutable access to multiple resources at a time in a World * Replaced "catch all" `System::update_archetypes(world: &World)` with `System::new_archetype(archetype: &Archetype)` * Simpler Bundle implementation * Replaced slow "remove_bundle_one_by_one" used as fallback for Commands::remove_bundle with fast "remove_bundle_intersection" * Removed `Mut<T>` query impl. it is better to only support one way: `&mut T` * Removed with() from `Flags<T>` in favor of `Option<Flags<T>>`, which allows querying for flags to be "filtered" by default * Components now have is_send property (currently only resources support non-send) * More granular module organization * New `RemovedComponents<T>` SystemParam that replaces `query.removed::<T>()` * `world.resource_scope()` for mutable access to resources and world at the same time * WorldQuery and QueryFilter traits unified. FilterFetch trait added to enable "short circuit" filtering. Auto impled for cases that don't need it * Significantly slimmed down SystemState in favor of individual SystemParam state * System Commands changed from `commands: &mut Commands` back to `mut commands: Commands` (to allow Commands to have a World reference) Fixes #1320 ## `World` Rewrite This is a from-scratch rewrite of `World` that fills the niche that `hecs` used to. Yes, this means Bevy ECS is no longer a "fork" of hecs. We're going out our own! (the only shared code between the projects is the entity id allocator, which is already basically ideal) A huge shout out to @SanderMertens (author of [flecs](https://github.com/SanderMertens/flecs)) for sharing some great ideas with me (specifically hybrid ecs storage and archetype graphs). He also helped advise on a number of implementation details. ## Component Storage (The Problem) Two ECS storage paradigms have gained a lot of traction over the years: * Archetypal ECS: * Stores components in "tables" with static schemas. Each "column" stores components of a given type. Each "row" is an entity. * Each "archetype" has its own table. Adding/removing an entity's component changes the archetype. * Enables super-fast Query iteration due to its cache-friendly data layout * Comes at the cost of more expensive add/remove operations for an Entity's components, because all components need to be copied to the new archetype's "table" * Sparse Set ECS: * Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids) * Query iteration is slower than Archetypal ECS because each entity's component could be at any position in the sparse set. This "random access" pattern isn't cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array. * Adding/removing components is a cheap, constant time operation Bevy ECS V1, hecs, legion, flec, and Unity DOTS are all "archetypal ecs-es". I personally think "archetypal" storage is a good default for game engines. An entity's archetype doesn't need to change frequently in general, and it creates "fast by default" query iteration (which is a much more common operation). It is also "self optimizing". Users don't need to think about optimizing component layouts for iteration performance. It "just works" without any extra boilerplate. Shipyard and EnTT are "sparse set ecs-es". They employ "packing" as a way to work around the "suboptimal by default" iteration performance for specific sets of components. This helps, but I didn't think this was a good choice for a general purpose engine like Bevy because: 1. "packs" conflict with each other. If bevy decides to internally pack the Transform and GlobalTransform components, users are then blocked if they want to pack some custom component with Transform. 2. users need to take manual action to optimize Developers selecting an ECS framework are stuck with a hard choice. Select an "archetypal" framework with "fast iteration everywhere" but without the ability to cheaply add/remove components, or select a "sparse set" framework to cheaply add/remove components but with slower iteration performance. ## Hybrid Component Storage (The Solution) In Bevy ECS V2, we get to have our cake and eat it too. It now has _both_ of the component storage types above (and more can be added later if needed): * Tables (aka "archetypal" storage) * The default storage. If you don't configure anything, this is what you get * Fast iteration by default * Slower add/remove operations * Sparse Sets * Opt-in * Slower iteration * Faster add/remove operations These storage types complement each other perfectly. By default Query iteration is fast. If developers know that they want to add/remove a component at high frequencies, they can set the storage to "sparse set": ```rust world.register_component( ComponentDescriptor:🆕:<MyComponent>(StorageType::SparseSet) ).unwrap(); ``` ## Archetypes Archetypes are now "just metadata" ... they no longer store components directly. They do store: * The `ComponentId`s of each of the Archetype's components (and that component's storage type) * Archetypes are uniquely defined by their component layouts * For example: entities with "table" components `[A, B, C]` _and_ "sparse set" components `[D, E]` will always be in the same archetype. * The `TableId` associated with the archetype * For now each archetype has exactly one table (which can have no components), * There is a 1->Many relationship from Tables->Archetypes. A given table could have any number of archetype components stored in it: * Ex: an entity with "table storage" components `[A, B, C]` and "sparse set" components `[D, E]` will share the same `[A, B, C]` table as an entity with `[A, B, C]` table component and `[F]` sparse set components. * This 1->Many relationship is how we preserve fast "cache friendly" iteration performance when possible (more on this later) * A list of entities that are in the archetype and the row id of the table they are in * ArchetypeComponentIds * unique densely packed identifiers for (ArchetypeId, ComponentId) pairs * used by the schedule executor for cheap system access control * "Archetype Graph Edges" (see the next section) ## The "Archetype Graph" Archetype changes in Bevy (and a number of other archetypal ecs-es) have historically been expensive to compute. First, you need to allocate a new vector of the entity's current component ids, add or remove components based on the operation performed, sort it (to ensure it is order-independent), then hash it to find the archetype (if it exists). And thats all before we get to the _already_ expensive full copy of all components to the new table storage. The solution is to build a "graph" of archetypes to cache these results. @SanderMertens first exposed me to the idea (and he got it from @gjroelofs, who came up with it). They propose adding directed edges between archetypes for add/remove component operations. If `ComponentId`s are densely packed, you can use sparse sets to cheaply jump between archetypes. Bevy takes this one step further by using add/remove `Bundle` edges instead of `Component` edges. Bevy encourages the use of `Bundles` to group add/remove operations. This is largely for "clearer game logic" reasons, but it also helps cut down on the number of archetype changes required. `Bundles` now also have densely-packed `BundleId`s. This allows us to use a _single_ edge for each bundle operation (rather than needing to traverse N edges ... one for each component). Single component operations are also bundles, so this is strictly an improvement over a "component only" graph. As a result, an operation that used to be _heavy_ (both for allocations and compute) is now two dirt-cheap array lookups and zero allocations. ## Stateful Queries World queries are now stateful. This allows us to: 1. Cache archetype (and table) matches * This resolves another issue with (naive) archetypal ECS: query performance getting worse as the number of archetypes goes up (and fragmentation occurs). 2. Cache Fetch and Filter state * The expensive parts of fetch/filter operations (such as hashing the TypeId to find the ComponentId) now only happen once when the Query is first constructed 3. Incrementally build up state * When new archetypes are added, we only process the new archetypes (no need to rebuild state for old archetypes) As a result, the direct `World` query api now looks like this: ```rust let mut query = world.query::<(&A, &mut B)>(); for (a, mut b) in query.iter_mut(&mut world) { } ``` Requiring `World` to generate stateful queries (rather than letting the `QueryState` type be constructed separately) allows us to ensure that _all_ queries are properly initialized (and the relevant world state, such as ComponentIds). This enables QueryState to remove branches from its operations that check for initialization status (and also enables query.iter() to take an immutable world reference because it doesn't need to initialize anything in world). However in systems, this is a non-breaking change. State management is done internally by the relevant SystemParam. ## Stateful SystemParams Like Queries, `SystemParams` now also cache state. For example, `Query` system params store the "stateful query" state mentioned above. Commands store their internal `CommandQueue`. This means you can now safely use as many separate `Commands` parameters in your system as you want. `Local<T>` system params store their `T` value in their state (instead of in Resources). SystemParam state also enabled a significant slim-down of SystemState. It is much nicer to look at now. Per-SystemParam state naturally insulates us from an "aliased mut" class of errors we have hit in the past (ex: using multiple `Commands` system params). (credit goes to @DJMcNab for the initial idea and draft pr here #1364) ## Configurable SystemParams @DJMcNab also had the great idea to make SystemParams configurable. This allows users to provide some initial configuration / values for system parameters (when possible). Most SystemParams have no config (the config type is `()`), but the `Local<T>` param now supports user-provided parameters: ```rust fn foo(value: Local<usize>) { } app.add_system(foo.system().config(\|c\| c.0 = Some(10))); ``` ## Uber Fast "for_each" Query Iterators Developers now have the choice to use a fast "for_each" iterator, which yields ~1.5-3x iteration speed improvements for "fragmented iteration", and minor ~1.2x iteration speed improvements for unfragmented iteration. ```rust fn system(query: Query<(&A, &mut B)>) { // you now have the option to do this for a speed boost query.for_each_mut(\|(a, mut b)\| { }); // however normal iterators are still available for (a, mut b) in query.iter_mut() { } } ``` I think in most cases we should continue to encourage "normal" iterators as they are more flexible and more "rust idiomatic". But when that extra "oomf" is needed, it makes sense to use `for_each`. We should also consider using `for_each` for internal bevy systems to give our users a nice speed boost (but that should be a separate pr). ## Component Metadata `World` now has a `Components` collection, which is accessible via `world.components()`. This stores mappings from `ComponentId` to `ComponentInfo`, as well as `TypeId` to `ComponentId` mappings (where relevant). `ComponentInfo` stores information about the component, such as ComponentId, TypeId, memory layout, send-ness (currently limited to resources), and storage type. ## Significantly Cheaper `Access<T>` We used to use `TypeAccess<TypeId>` to manage read/write component/archetype-component access. This was expensive because TypeIds must be hashed and compared individually. The parallel executor got around this by "condensing" type ids into bitset-backed access types. This worked, but it had to be re-generated from the `TypeAccess<TypeId>`sources every time archetypes changed. This pr removes TypeAccess in favor of faster bitset access everywhere. We can do this thanks to the move to densely packed `ComponentId`s and `ArchetypeComponentId`s. ## Merged Resources into World Resources had a lot of redundant functionality with Components. They stored typed data, they had access control, they had unique ids, they were queryable via SystemParams, etc. In fact the _only_ major difference between them was that they were unique (and didn't correlate to an entity). Separate resources also had the downside of requiring a separate set of access controls, which meant the parallel executor needed to compare more bitsets per system and manage more state. I initially got the "separate resources" idea from `legion`. I think that design was motivated by the fact that it made the direct world query/resource lifetime interactions more manageable. It certainly made our lives easier when using Resources alongside hecs/bevy_ecs. However we already have a construct for safely and ergonomically managing in-world lifetimes: systems (which use `Access<T>` internally). This pr merges Resources into World: ```rust world.insert_resource(1); world.insert_resource(2.0); let a = world.get_resource::<i32>().unwrap(); let mut b = world.get_resource_mut::<f64>().unwrap(); b = 3.0; ``` Resources are now just a special kind of component. They have their own ComponentIds (and their own resource TypeId->ComponentId scope, so they don't conflict wit components of the same type). They are stored in a special "resource archetype", which stores components inside the archetype using a new `unique_components` sparse set (note that this sparse set could later be used to implement Tags). This allows us to keep the code size small by reusing existing datastructures (namely Column, Archetype, ComponentFlags, and ComponentInfo). This allows us the executor to use a single `Access<ArchetypeComponentId>` per system. It should also make scripting language integration easier. _But_ this merge did create problems for people directly interacting with `World`. What if you need mutable access to multiple resources at the same time? `world.get_resource_mut()` borrows World mutably! ## WorldCell WorldCell applies the `Access<ArchetypeComponentId>` concept to direct world access: ```rust let world_cell = world.cell(); let a = world_cell.get_resource_mut::<i32>().unwrap(); let b = world_cell.get_resource_mut::<f64>().unwrap(); ``` This adds cheap runtime checks (a sparse set lookup of `ArchetypeComponentId` and a counter) to ensure that world accesses do not conflict with each other. Each operation returns a `WorldBorrow<'w, T>` or `WorldBorrowMut<'w, T>` wrapper type, which will release the relevant ArchetypeComponentId resources when dropped. World caches the access sparse set (and only one cell can exist at a time), so `world.cell()` is a cheap operation. WorldCell does _not_ use atomic operations. It is non-send, does a mutable borrow of world to prevent other accesses, and uses a simple `Rc<RefCell<ArchetypeComponentAccess>>` wrapper in each WorldBorrow pointer. The api is currently limited to resource access, but it can and should be extended to queries / entity component access. ## Resource Scopes WorldCell does not yet support component queries, and even when it does there are sometimes legitimate reasons to want a mutable world ref _and_ a mutable resource ref (ex: bevy_render and bevy_scene both need this). In these cases we could always drop down to the unsafe `world.get_resource_unchecked_mut()`, but that is not ideal! Instead developers can use a "resource scope" ```rust world.resource_scope(\|world: &mut World, a: &mut A\| { }) ``` This temporarily removes the `A` resource from `World`, provides mutable pointers to both, and re-adds A to World when finished. Thanks to the move to ComponentIds/sparse sets, this is a cheap operation. If multiple resources are required, scopes can be nested. We could also consider adding a "resource tuple" to the api if this pattern becomes common and the boilerplate gets nasty. ## Query Conflicts Use ComponentId Instead of ArchetypeComponentId For safety reasons, systems cannot contain queries that conflict with each other without wrapping them in a QuerySet. On bevy `main`, we use ArchetypeComponentIds to determine conflicts. This is nice because it can take into account filters: ```rust // these queries will never conflict due to their filters fn filter_system(a: Query<&mut A, With<B>>, b: Query<&mut B, Without<B>>) { } ``` But it also has a significant downside: ```rust // these queries will not conflict _until_ an entity with A, B, and C is spawned fn maybe_conflicts_system(a: Query<(&mut A, &C)>, b: Query<(&mut A, &B)>) { } ``` The system above will panic at runtime if an entity with A, B, and C is spawned. This makes it hard to trust that your game logic will run without crashing. In this pr, I switched to using `ComponentId` instead. This _is_ more constraining. `maybe_conflicts_system` will now always fail, but it will do it consistently at startup. Naively, it would also _disallow_ `filter_system`, which would be a significant downgrade in usability. Bevy has a number of internal systems that rely on disjoint queries and I expect it to be a common pattern in userspace. To resolve this, I added a new `FilteredAccess<T>` type, which wraps `Access<T>` and adds with/without filters. If two `FilteredAccess` have with/without values that prove they are disjoint, they will no longer conflict. ## EntityRef / EntityMut World entity operations on `main` require that the user passes in an `entity` id to each operation: ```rust let entity = world.spawn((A, )); // create a new entity with A world.get::<A>(entity); world.insert(entity, (B, C)); world.insert_one(entity, D); ``` This means that each operation needs to look up the entity location / verify its validity. The initial spawn operation also requires a Bundle as input. This can be awkward when no components are required (or one component is required). These operations have been replaced by `EntityRef` and `EntityMut`, which are "builder-style" wrappers around world that provide read and read/write operations on a single, pre-validated entity: ```rust // spawn now takes no inputs and returns an EntityMut let entity = world.spawn() .insert(A) // insert a single component into the entity .insert_bundle((B, C)) // insert a bundle of components into the entity .id() // id returns the Entity id // Returns EntityMut (or panics if the entity does not exist) world.entity_mut(entity) .insert(D) .insert_bundle(SomeBundle::default()); { // returns EntityRef (or panics if the entity does not exist) let d = world.entity(entity) .get::<D>() // gets the D component .unwrap(); // world.get still exists for ergonomics let d = world.get::<D>(entity).unwrap(); } // These variants return Options if you want to check existence instead of panicing world.get_entity_mut(entity) .unwrap() .insert(E); if let Some(entity_ref) = world.get_entity(entity) { let d = entity_ref.get::<D>().unwrap(); } ``` This _does not_ affect the current Commands api or terminology. I think that should be a separate conversation as that is a much larger breaking change. ## Safety Improvements Entity reservation in Commands uses a normal world borrow instead of an unsafe transmute * QuerySets no longer transmutes lifetimes * Made traits "unsafe" when implementing a trait incorrectly could cause unsafety * More thorough safety docs ## RemovedComponents SystemParam The old approach to querying removed components: `query.removed:<T>()` was confusing because it had no connection to the query itself. I replaced it with the following, which is both clearer and allows us to cache the ComponentId mapping in the SystemParamState: ```rust fn system(removed: RemovedComponents<T>) { for entity in removed.iter() { } } ``` ## Simpler Bundle implementation Bundles are no longer responsible for sorting (or deduping) TypeInfo. They are just a simple ordered list of component types / data. This makes the implementation smaller and opens the door to an easy "nested bundle" implementation in the future (which i might even add in this pr). Duplicate detection is now done once per bundle type by World the first time a bundle is used. ## Unified WorldQuery and QueryFilter types (don't worry they are still separate type _parameters_ in Queries .. this is a non-breaking change) WorldQuery and QueryFilter were already basically identical apis. With the addition of `FetchState` and more storage-specific fetch methods, the overlap was even clearer (and the redundancy more painful). QueryFilters are now just `F: WorldQuery where F::Fetch: FilterFetch`. FilterFetch requires `Fetch<Item = bool>` and adds new "short circuit" variants of fetch methods. This enables a filter tuple like `(With<A>, Without<B>, Changed<C>)` to stop evaluating the filter after the first mismatch is encountered. FilterFetch is automatically implemented for `Fetch` implementations that return bool. This forces fetch implementations that return things like `(bool, bool, bool)` (such as the filter above) to manually implement FilterFetch and decide whether or not to short-circuit. ## More Granular Modules World no longer globs all of the internal modules together. It now exports `core`, `system`, and `schedule` separately. I'm also considering exporting `core` submodules directly as that is still pretty "glob-ey" and unorganized (feedback welcome here). ## Remaining Draft Work (to be done in this pr) * ~~panic on conflicting WorldQuery fetches (&A, &mut A)~~ * ~~bevy `main` and hecs both currently allow this, but we should protect against it if possible~~ * ~~batch_iter / par_iter (currently stubbed out)~~ * ~~ChangedRes~~ * ~~I skipped this while we sort out #1313. This pr should be adapted to account for whatever we land on there~~. * ~~The `Archetypes` and `Tables` collections use hashes of sorted lists of component ids to uniquely identify each archetype/table. This hash is then used as the key in a HashMap to look up the relevant ArchetypeId or TableId. (which doesn't handle hash collisions properly)~~ * ~~It is currently unsafe to generate a Query from "World A", then use it on "World B" (despite the api claiming it is safe). We should probably close this gap. This could be done by adding a randomly generated WorldId to each world, then storing that id in each Query. They could then be compared to each other on each `query.do_thing(&world)` operation. This _does_ add an extra branch to each query operation, so I'm open to other suggestions if people have them.~~ * ~~Nested Bundles (if i find time)~~ ## Potential Future Work * Expand WorldCell to support queries. * Consider not allocating in the empty archetype on `world.spawn()` * ex: return something like EntityMutUninit, which turns into EntityMut after an `insert` or `insert_bundle` op * this actually regressed performance last time i tried it, but in theory it should be faster * Optimize SparseSet::insert (see `PERF` comment on insert) * Replace SparseArray `Option<T>` with T::MAX to cut down on branching * would enable cheaper get_unchecked() operations * upstream fixedbitset optimizations * fixedbitset could be allocation free for small block counts (store blocks in a SmallVec) * fixedbitset could have a const constructor * Consider implementing Tags (archetype-specific by-value data that affects archetype identity) * ex: ArchetypeA could have `[A, B, C]` table components and `[D(1)]` "tag" component. ArchetypeB could have `[A, B, C]` table components and a `[D(2)]` tag component. The archetypes are different, despite both having D tags because the value inside D is different. * this could potentially build on top of the `archetype.unique_components` added in this pr for resource storage. * Consider reverting `all_tuples` proc macro in favor of the old `macro_rules` implementation * all_tuples is more flexible and produces cleaner documentation (the macro_rules version produces weird type parameter orders due to parser constraints) * but unfortunately all_tuples also appears to make Rust Analyzer sad/slow when working inside of `bevy_ecs` (does not affect user code) * Consider "resource queries" and/or "mixed resource and entity component queries" as an alternative to WorldCell * this is basically just "systems" so maybe it's not worth it * Add more world ops * `world.clear()` * `world.reserve<T: Bundle>(count: usize)` * Try using the old archetype allocation strategy (allocate new memory on resize and copy everything over). I expect this to improve batch insertion performance at the cost of unbatched performance. But thats just a guess. I'm not an allocation perf pro :) * Adapt Commands apis for consistency with new World apis ## Benchmarks key: * `bevy_old`: bevy `main` branch * `bevy`: this branch * `_foreach`: uses an optimized for_each iterator * ` _sparse`: uses sparse set storage (if unspecified assume table storage) * `_system`: runs inside a system (if unspecified assume test happens via direct world ops) ### Simple Insert (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245573-9c3ce100-7795-11eb-9003-bfd41cd5c51f.png) ### Simpler Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245795-ffc70e80-7795-11eb-92fb-3ffad09aabf7.png) ### Fragment Iter (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109245849-0fdeee00-7796-11eb-8d25-eb6b7a682c48.png) ### Sparse Fragmented Iter Iterate a query that matches 5 entities from a single matching archetype, but there are 100 unmatching archetypes ![image](https://user-images.githubusercontent.com/2694663/109245916-2b49f900-7796-11eb-9a8f-ed89c203f940.png) ### Schedule (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246428-1fab0200-7797-11eb-8841-1b2161e90fa4.png) ### Add Remove Component (from ecs_bench_suite) ![image](https://user-images.githubusercontent.com/2694663/109246492-39e4e000-7797-11eb-8985-2706bd0495ab.png) ### Add Remove Component Big Same as the test above, but each entity has 5 "large" matrix components and 1 "large" matrix component is added and removed ![image](https://user-images.githubusercontent.com/2694663/109246517-449f7500-7797-11eb-835e-28b6790daeaa.png) ### Get Component Looks up a single component value a large number of times ![image](https://user-images.githubusercontent.com/2694663/109246129-87ad1880-7796-11eb-9fcb-c38012aa7c70.png)	2021-03-05 07:54:35 +00:00
Nathan Stocks	13b602ee3f	Xtask CI (#1387 ) This PR is easiest to review commit by commit. Followup on https://github.com/bevyengine/bevy/pull/1309#issuecomment-767310084 - [x] Switch from a bash script to an xtask rust workspace member. - Results in ~30s longer CI due to compilation of the xtask itself - Enables Bevy contributors on any platform to run `cargo ci` to run linting -- if the default available Rust is the same version as on CI, then the command should give an identical result. - [x] Use the xtask from official CI so there's only one place to update. - [x] Bonus: Run clippy on the _entire_ workspace (existing CI setup was missing the `--workspace` flag - [x] Clean up newly-exposed clippy errors ~#1388 builds on this to clean up newly discovered clippy errors -- I thought it might be nicer as a separate PR.~ Nope, merged it into this one so CI would pass. Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2021-02-22 08:42:19 +00:00

1 2 3

110 commits