Mirrors/bevy

mirror of https://github.com/bevyengine/bevy synced 2024-12-20 18:13:10 +00:00

Author	SHA1	Message	Date
Gino Valente	d21c7a1911	bevy_reflect: Function Overloading (Generic & Variadic Functions) (#15074 ) # Objective Currently function reflection requires users to manually monomorphize their generic functions. For example: ```rust fn add<T: Add<Output=T>>(a: T, b: T) -> T { a + b } // We have to specify the type of `T`: let reflect_add = add::<i32>.into_function(); ``` This PR doesn't aim to solve that problem—this is just a limitation in Rust. However, it also means that reflected functions can only ever work for a single monomorphization. If we wanted to support other types for `T`, we'd have to create a separate function for each one: ```rust let reflect_add_i32 = add::<i32>.into_function(); let reflect_add_u32 = add::<u32>.into_function(); let reflect_add_f32 = add::<f32>.into_function(); // ... ``` So in addition to requiring manual monomorphization, we also lose the benefit of having a single function handle multiple argument types. If a user wanted to create a small modding script that utilized function reflection, they'd have to either: - Store all sets of supported monomorphizations and require users to call the correct one - Write out some logic to find the correct function based on the given arguments While the first option would work, it wouldn't be very ergonomic. The second option is better, but it adds additional complexity to the user's logic—complexity that `bevy_reflect` could instead take on. ## Solution Introduce [function overloading](https://en.wikipedia.org/wiki/Function_overloading). A `DynamicFunction` can now be overloaded with other `DynamicFunction`s. We can rewrite the above code like so: ```rust let reflect_add = add::<i32> .into_function() .with_overload(add::<u32>) .with_overload(add::<f32>); ``` When invoked, the `DynamicFunction` will attempt to find a matching overload for the given set of arguments. 
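The lookup described above, where a call picks whichever overload matches the given arguments, can be sketched with the standard library alone. This is a minimal illustration and not the `bevy_reflect` implementation: `Overloaded`, `with_binary`, `Args`, and `Erased` are hypothetical names, and keying overloads by a list of `TypeId`s is an assumption made for the sketch.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Type-erased argument list (hypothetical stand-in for an argument list type).
type Args = Vec<Box<dyn Any>>;
/// Type-erased callable stored for each overload.
type Erased = Box<dyn Fn(Args) -> Box<dyn Any>>;

/// Toy overloaded function: one erased callable per argument signature.
#[derive(Default)]
struct Overloaded {
    overloads: HashMap<Vec<TypeId>, Erased>,
}

impl Overloaded {
    /// Registers a two-argument overload for one concrete `T`.
    fn with_binary<T: Any>(mut self, f: fn(T, T) -> T) -> Self {
        let signature = vec![TypeId::of::<T>(), TypeId::of::<T>()];
        let erased: Erased = Box::new(move |mut args: Args| {
            // Arguments are popped in reverse order. The downcasts cannot
            // fail because `call` only dispatches on an exact signature match.
            let b = *args.pop().unwrap().downcast::<T>().unwrap();
            let a = *args.pop().unwrap().downcast::<T>().unwrap();
            Box::new(f(a, b)) as Box<dyn Any>
        });
        self.overloads.insert(signature, erased);
        self
    }

    /// Finds the overload whose signature equals the runtime argument types.
    /// Returns `None` when no overload matches.
    fn call(&self, args: Args) -> Option<Box<dyn Any>> {
        let signature: Vec<TypeId> = args.iter().map(|arg| (**arg).type_id()).collect();
        self.overloads.get(&signature).map(|f| f(args))
    }
}

/// The generic function from the text above.
fn add<T: std::ops::Add<Output = T>>(a: T, b: T) -> T {
    a + b
}
```

Building `Overloaded::default().with_binary::<i32>(add).with_binary::<f32>(add)` and calling it with two boxed `i32` values dispatches to the `i32` monomorphization, while two `u32` values yield `None` because no such overload was registered.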
And while I went into this PR only looking to improve generic function reflection, I accidentally added support for variadic functions as well (hence why I use the broader term "overload" over "generic"). ```rust // Supports 1 to 4 arguments let multiply_all = (|a: i32| a) .into_function() .with_overload(|a: i32, b: i32| a * b) .with_overload(|a: i32, b: i32, c: i32| a * b * c) .with_overload(|a: i32, b: i32, c: i32, d: i32| a * b * c * d); ``` This is simply an added bonus to this particular implementation. ~~Full variadic support (i.e. allowing for an indefinite number of arguments) will be added in a later PR.~~ I actually decided to limit the maximum number of arguments to 63 to supplement faster lookups, a reduced memory footprint, and faster cloning. ### Alternatives & Rationale I explored a few options for handling generic functions. This PR is the one I feel the most confident in, but I feel I should mention the others and why I ultimately didn't move forward with them. #### Adding `GenericDynamicFunction` TL;DR: Adding a distinct `GenericDynamicFunction` type unnecessarily splits and complicates the API. <details> <summary>Details</summary> My initial explorations involved a dedicated `GenericDynamicFunction` to contain and handle the mappings. This was initially started back when `DynamicFunction` was distinct from `DynamicClosure`. My goal was to not prevent us from being able to somehow make `DynamicFunction` implement `Copy`. But once we reverted back to a single `DynamicFunction`, that became a non-issue. But that aside, the real problem was that it created a split in the API. If I'm using a third-party library that uses function reflection, I have to know whether to request a `DynamicFunction` or a `GenericDynamicFunction`. I might not even know ahead of time which one I want. It might need to be determined at runtime. And if I'm creating a library, I might want a type to contain both `DynamicFunction` and `GenericDynamicFunction`. 
This might not be possible if, for example, I need to store the function in a `HashMap`. The other concern is with `IntoFunction`. Right now `DynamicFunction` trivially implements `IntoFunction` since it can just return itself. But what should `GenericDynamicFunction` do? It could return itself wrapped into a `DynamicFunction`, but then the API for `DynamicFunction` would have to account for this. So then what was the point of having a separate `GenericDynamicFunction` anyways? And even apart from `IntoFunction`, there's nothing stopping someone from manually creating a generic `DynamicFunction` through lying about its `FunctionInfo` and wrapping a `GenericDynamicFunction`. That being said, this is probably the "best" alternative if we added a `Function` trait and stored functions as `Box<dyn Function>`. However, I'm not convinced we gain much from this. Sure, we could keep the API for `DynamicFunction` the same, but consumers of `Function` will need to account for `GenericDynamicFunction` regardless (e.g. handling multiple `FunctionInfo`, a ranged argument count, etc.). And for all cases, except where using `DynamicFunction` directly, you end up treating them all like `GenericDynamicFunction`. Right now, if we did go with `GenericDynamicFunction`, the only major benefit we'd gain would be saving 24 bytes. If memory ever does become an issue here, we could swap over. But I think for the time being it's better for us to pursue a clearer mental model and end-user ergonomics through unification. </details> ##### Using the `FunctionRegistry` TL;DR: Having overloads only exist in the `FunctionRegistry` unnecessarily splits and complicates the API. <details> <summary>Details</summary> Another idea was to store the overloads in the `FunctionRegistry`. Users would then just call functions directly through the registry (i.e. `registry.call("my_func", my_args)`). I didn't go with this option because of how it specifically relies on the functions being registered. 
You'd not only always need access to the registry, but you'd need to ensure that the functions you want to call are even registered. It also means you can't just store a generic `DynamicFunction` on a type. Instead, you'll need to store the function's name and use that to look up the function in the registry—even if it's only ever used by that type. Doing so also removes all the benefits of `DynamicFunction`, such as the ability to pass it to functions accepting `IntoFunction`, modify it if needed, and so on. Like `GenericDynamicFunction` this introduces a split in the ecosystem: you either store `DynamicFunction`, store a string to look up the function, or force `DynamicFunction` to wrap your generic function anyways. Or worse yet: have `DynamicFunction` wrap the lookup function using `FunctionRegistryArc`. </details> #### Generic `ArgInfo` TL;DR: Allowing `ArgInfo` and `ReturnInfo` to store the generic information introduces a footgun when interpreting `FunctionInfo`. <details> <summary>Details</summary> Regardless of how we represent a generic function, one thing is clear: we need to be able to represent the information for such a function. This PR does so by introducing a `FunctionInfoType` enum to wrap one or more `FunctionInfo` values. Originally, I didn't do this. I had `ArgInfo` and `ReturnInfo` allow for generic types. This allowed us to have a single `FunctionInfo` to represent our function, but then I realized that it actually lies about our function. If we have two `ArgInfo` that both allow for either `i32` or `u32`, what does this tell us about our function? It turns out: nothing! We can't know whether our function takes `(i32, i32)`, `(u32, u32)`, `(i32, u32)`, or `(u32, i32)`. It therefore makes more sense to just represent a function with multiple `FunctionInfo` since that's really what it's made up of. 
</details> #### Flatten `FunctionInfo` TL;DR: Flattening removes additional per-overload information some users may desire and prevents us from adding more information in the future. <details> <summary>Details</summary> Why don't we just flatten multiple `FunctionInfo` into just one that can contain multiple signatures? This is something we could do, but I decided against it for a few reasons: - The only thing we'd be able to get rid of for each signature would be the `name`. While not enough to not do it, it doesn't really suggest we have to either. - Some consumers may want access to the names of the functions that make up the overloaded function. For example, to track a bug where an undesirable function is being added as an overload. Or to more easily locate the original function of an overload. - We may eventually allow for more information to be stored on `FunctionInfo`. For example, we may allow for documentation to be stored like we do for `TypeInfo`. Consumers of this documentation may want access to the documentation of each overload as they may provide documentation specific to that overload. </details> ## Testing This PR adds lots of tests and benchmarks, and also adds to the example. To run the tests: ``` cargo test --package bevy_reflect --all-features ``` To run the benchmarks: ``` cargo bench --bench reflect_function --all-features ``` To run the example: ``` cargo run --package bevy --example function_reflection --all-features ``` ### Benchmarks One of my goals with this PR was to leave the typical case of non-overloaded functions largely unaffected by the changes introduced in this PR. 
~~And while the static size of `DynamicFunction` has increased by 17% (from 136 to 160 bytes), the performance has generally stayed the same~~ The static size of `DynamicFunction` has decreased from 136 to 112 bytes, while calling performance has generally stayed the same:

| | `main` | 7d293ab | `252f3897d` |
|-------------------------------------|--------|---------|-----------|
| `into/function` | 37 ns | 46 ns | 142 ns |
| `with_overload/01_simple_overload` | - | 149 ns | 268 ns |
| `with_overload/01_complex_overload` | - | 332 ns | 431 ns |
| `with_overload/10_simple_overload` | - | 1266 ns | 2618 ns |
| `with_overload/10_complex_overload` | - | 2544 ns | 4170 ns |
| `call/function` | 57 ns | 58 ns | 61 ns |
| `call/01_simple_overload` | - | 255 ns | 242 ns |
| `call/01_complex_overload` | - | 595 ns | 431 ns |
| `call/10_simple_overload` | - | 740 ns | 699 ns |
| `call/10_complex_overload` | - | 1824 ns | 1618 ns |

For the overloaded function tests, the leading number indicates how many overloads there are: `01` indicates 1 overload, `10` indicates 10 overloads. The `complex` cases have 10 unique generic types and 10 arguments, compared to the `simple` 1 generic type and 2 arguments. I aimed to prioritize the performance of calling the functions over creating them, hence creation speed tends to be a bit slower. There may be other optimizations we can look into but that's probably best saved for a future PR. The important bit is that the standard ~~`into/function`~~ and `call/function` benchmarks show minimal regressions. Since the latest changes, `into/function` does have some regressions, but again the priority was `call/function`. We can probably optimize `into/function` if needed in the future. --- ## Showcase Function reflection now supports [function overloading](https://en.wikipedia.org/wiki/Function_overloading)! 
This can be used to simulate generic functions: ```rust fn add<T: Add<Output=T>>(a: T, b: T) -> T { a + b } let reflect_add = add::<i32> .into_function() .with_overload(add::<u32>) .with_overload(add::<f32>); let args = ArgList::default().push_owned(25_i32).push_owned(75_i32); let result = reflect_add.call(args).unwrap().unwrap_owned(); assert_eq!(result.try_take::<i32>().unwrap(), 100); let args = ArgList::default().push_owned(25.0_f32).push_owned(75.0_f32); let result = reflect_add.call(args).unwrap().unwrap_owned(); assert_eq!(result.try_take::<f32>().unwrap(), 100.0); ``` You can also simulate variadic functions: ```rust #[derive(Reflect, PartialEq, Debug)] struct Player { name: Option<String>, health: u32, } // Creates a `Player` with one of the following: // - No name and 100 health // - A name and 100 health // - No name and custom health // - A name and custom health let create_player = (|| Player { name: None, health: 100, }) .into_function() .with_overload(|name: String| Player { name: Some(name), health: 100, }) .with_overload(|health: u32| Player { name: None, health }) .with_overload(|name: String, health: u32| Player { name: Some(name), health, }); let args = ArgList::default() .push_owned(String::from("Urist")) .push_owned(55_u32); let player = create_player .call(args) .unwrap() .unwrap_owned() .try_take::<Player>() .unwrap(); assert_eq!( player, Player { name: Some(String::from("Urist")), health: 55 } ); ```	2024-12-10 01:51:47 +00:00
Zachary Harrold	a35811d088	Add Immutable `Component` Support (#16372 ) # Objective - Fixes #16208 ## Solution - Added an associated type to `Component`, `Mutability`, which flags whether a component is mutable or immutable. If `Mutability = Mutable`, the component is mutable. If `Mutability = Immutable`, the component is immutable. - Updated `derive_component` to default to mutable unless an `#[component(immutable)]` attribute is added. - Updated `ReflectComponent` to check if a component is mutable and, if not, panic when attempting to mutate. ## Testing - CI - `immutable_components` example. --- ## Showcase Users can now mark a component as `#[component(immutable)]` to prevent safe mutation of a component while it is attached to an entity: ```rust #[derive(Component)] #[component(immutable)] struct Foo { // ... } ``` This prevents creating an exclusive reference to the component while it is attached to an entity. This is particularly powerful when combined with component hooks, as you can now fully track a component's value, ensuring whatever invariants you desire are upheld. Previously, this would be done by making a component private and manually creating a `QueryData` implementation which only permitted read access. <details> <summary>Using immutable components as an index</summary> ```rust /// This is an example of a component like [`Name`](bevy::prelude::Name), but immutable. #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Component)] #[component( immutable, on_insert = on_insert_name, on_replace = on_replace_name, )] pub struct Name(pub &'static str); /// This index allows for O(1) lookups of an [`Entity`] by its [`Name`]. 
#[derive(Resource, Default)] struct NameIndex { name_to_entity: HashMap<Name, Entity>, } impl NameIndex { fn get_entity(&self, name: &'static str) -> Option<Entity> { self.name_to_entity.get(&Name(name)).copied() } } fn on_insert_name(mut world: DeferredWorld<'_>, entity: Entity, _component: ComponentId) { let Some(&name) = world.entity(entity).get::<Name>() else { unreachable!() }; let Some(mut index) = world.get_resource_mut::<NameIndex>() else { return; }; index.name_to_entity.insert(name, entity); } fn on_replace_name(mut world: DeferredWorld<'_>, entity: Entity, _component: ComponentId) { let Some(&name) = world.entity(entity).get::<Name>() else { unreachable!() }; let Some(mut index) = world.get_resource_mut::<NameIndex>() else { return; }; index.name_to_entity.remove(&name); } // Set up our name index world.init_resource::<NameIndex>(); // Spawn some entities! let alyssa = world.spawn(Name("Alyssa")).id(); let javier = world.spawn(Name("Javier")).id(); // Check our index let index = world.resource::<NameIndex>(); assert_eq!(index.get_entity("Alyssa"), Some(alyssa)); assert_eq!(index.get_entity("Javier"), Some(javier)); // Changing the name of an entity is also fully captured by our index world.entity_mut(javier).insert(Name("Steven")); // Javier changed their name to Steven let steven = javier; // Check our index let index = world.resource::<NameIndex>(); assert_eq!(index.get_entity("Javier"), None); assert_eq!(index.get_entity("Steven"), Some(steven)); ``` </details> Additionally, users can use `Component<Mutability = ...>` in trait bounds to enforce that a component _is_ mutable or _is_ immutable. When using `Component` as a trait bound without specifying `Mutability`, any component is applicable. However, methods which only work on mutable or immutable components are unavailable, since the compiler must be pessimistic about the type. ## Migration Guide - When implementing `Component` manually, you must now provide a type for `Mutability`. 
The type `Mutable` provides equivalent behaviour to earlier versions of `Component`: ```rust impl Component for Foo { type Mutability = Mutable; // ... } ``` - When working with generic components, you may need to specify that your generic parameter implements `Component<Mutability = Mutable>` rather than `Component` if you require mutable access to said component. - The entity entry API has had to have some changes made to minimise friction when working with immutable components. Methods which previously returned a `Mut<T>` will now typically return an `OccupiedEntry<T>` instead, requiring you to add an `into_mut()` to get the `Mut<T>` item again. ## Draft Release Notes Components can now be made immutable while stored within the ECS. Components are the fundamental unit of data within an ECS, and Bevy provides a number of ways to work with them that align with Rust's rules around ownership and borrowing. One part of this is hooks, which allow for defining custom behavior at key points in a component's lifecycle, such as addition and removal. However, there is currently no way to respond to _mutation_ of a component using hooks. The reasons for this are quite technical, but to summarize, their addition poses a significant challenge to Bevy's core promises around performance. Without mutation hooks, it's relatively trivial to modify a component in such a way that breaks invariants it intends to uphold. For example, you can use `core::mem::swap` to swap the components of two entities, bypassing the insertion and removal hooks. This means the only way to react to this modification is via change detection in a system, which then begs the question of what happens _between_ that alteration and the next run of that system? Alternatively, you could make your component private to prevent mutation, but now you need to provide commands and a custom `QueryData` implementation to allow users to interact with your component at all. 
Immutable components solve this problem by preventing the creation of an exclusive reference to the component entirely. Without an exclusive reference, the only way to modify an immutable component is via removal or replacement, which is fully captured by component hooks. To make a component immutable, simply add `#[component(immutable)]`: ```rust #[derive(Component)] #[component(immutable)] struct Foo { // ... } ``` When implementing `Component` manually, there is an associated type `Mutability` which controls this behavior: ```rust impl Component for Foo { type Mutability = Mutable; // ... } ``` Note that this means when working with generic components, you may need to specify that a component is mutable to gain access to certain methods: ```rust // Before fn bar<C: Component>() { // ... } // After fn bar<C: Component<Mutability = Mutable>>() { // ... } ``` With this new tool, creating index components, or caching data on an entity should be more user friendly, allowing libraries to provide APIs relying on components and hooks to uphold their invariants. ## Notes - ~~I've done my best to implement this feature, but I'm not happy with how reflection has turned out. If any reflection SMEs know a way to improve this situation I'd greatly appreciate it.~~ There is an outstanding issue around the fallibility of mutable methods on `ReflectComponent`, but the DX is largely unchanged from `main` now. - I've attempted to prevent all safe mutable access to a component that does not implement `Component<Mutability = Mutable>`, but there may still be some methods I have missed. Please indicate so and I will address them, as they are bugs. - Unsafe is an escape hatch I am _not_ attempting to prevent. Whatever you do with unsafe is between you and your compiler. - I am marking this PR as ready, but I suspect it will undergo fairly major revisions based on SME feedback. - I've marked this PR as _Uncontroversial_ based on the feature, not the implementation. 
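The way an associated `Mutability` type can gate access through trait bounds, as described in the migration guide and release notes above, can be sketched without Bevy. This is a toy model under stated assumptions: the `Component` trait here has only the associated type, and `read` and `get_mut` are hypothetical free functions standing in for the real accessor methods.

```rust
/// Marker types standing in for the `Mutable` and `Immutable` markers.
struct Mutable;
struct Immutable;

/// Toy version of the trait: only the associated type matters here.
trait Component {
    type Mutability;
}

struct Health(u32);
impl Component for Health {
    type Mutability = Mutable;
}

struct Name(&'static str);
impl Component for Name {
    type Mutability = Immutable;
}

/// Shared access is available for every component.
fn read<C: Component>(component: &C) -> &C {
    component
}

/// Exclusive access is only offered when `Mutability = Mutable`.
fn get_mut<C: Component<Mutability = Mutable>>(component: &mut C) -> &mut C {
    component
}
```

With these definitions, `get_mut(&mut Health(100))` compiles, while `get_mut(&mut Name("Alyssa"))` is rejected at compile time because `Name::Mutability` is `Immutable`. A function bounded only by `C: Component` cannot call `get_mut` either, which mirrors the note that generic code may need the stricter `Component<Mutability = Mutable>` bound.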
--------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: Benjamin Brienen <benjamin.brienen@outlook.com> Co-authored-by: Gino Valente <49806985+MrGVSV@users.noreply.github.com> Co-authored-by: Nuutti Kotivuori <naked@iki.fi>	2024-12-05 14:27:48 +00:00
Joona Aalto	3ada15ee1c	Add more Glam types and constructors to prelude (#16261 ) # Objective Glam has some common and useful types and helpers that are not in the prelude of `bevy_math`. This includes shorthand constructors like `vec3`, or even `Vec3A`, the aligned version of `Vec3`. ```rust // The "normal" way to create a 3D vector let vec = Vec3::new(2.0, 1.0, -3.0); // Shorthand version let vec = vec3(2.0, 1.0, -3.0); ``` ## Solution Add the following types and methods to the prelude: - `vec2`, `vec3`, `vec3a`, `vec4` - `uvec2`, `uvec3`, `uvec4` - `ivec2`, `ivec3`, `ivec4` - `bvec2`, `bvec3`, `bvec3a`, `bvec4`, `bvec4a` - `mat2`, `mat3`, `mat3a`, `mat4` - `quat` (not sure if anyone uses this, but for consistency) - `Vec3A` - `BVec3A`, `BVec4A` - `Mat3A` I did not add the u16, i16, or f64 variants like `dvec2`, since there are currently no existing types like those in the prelude. The shorthand constructors are currently used a lot in some places in Bevy, and not at all in others. In a follow-up, we might want to consider if we have a preference for the shorthand, and make a PR to change the codebase to use it more consistently.	2024-11-11 18:47:16 +00:00
Benjamin Brienen	4df8b1998e	Allow or fix dead code in `benches` (#16282 ) # Objective Fixes #15806 ## Solution Fix an undeclared module and expect `dead_code`. ## Testing Run this command and see no `dead_code` warnings. `cargo +nightly check --benches --target-dir ../target --manifest-path ./benches/Cargo.toml`	2024-11-07 22:19:07 +00:00
Aevyrie	54b323ec80	Mesh picking fixes (#16110 ) # Objective - Mesh picking is noisy when a non triangle list is used - Mesh picking runs even when users don't need it - Resolve #16065 ## Solution - Don't add the mesh picking plugin by default - Remove error spam	2024-10-27 19:03:48 +00:00
MiniaczQ	e5e44888c6	Validate param benchmarks (#15885 ) # Objective Benchmark overhead of validation for: - `DynSystemParam`, - `ParamSet`, - combinator systems. Needed for #15606 ## Solution As noted in the objective, I've added 3 benchmarks, each of which makes heavy use of the specific functionality. I benchmark at the level of schedules, rather than individual `validate_param` calls, so we get a better idea of how changes to the code affect related side effects such as memory lookups. ## Testing ``` param/combinator_system/8_piped_systems time: [1.7560 µs 1.7865 µs 1.8180 µs] change: [+4.5244% +6.7955% +9.1413%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe param/combinator_system/8_dyn_params_system time: [89.354 ns 89.790 ns 90.300 ns] change: [+0.6751% +1.6825% +2.6842%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 6 (6.00%) high mild 3 (3.00%) high severe param/combinator_system/8_variant_param_set_system time: [88.295 ns 89.202 ns 90.208 ns] change: [+0.1320% +1.0060% +1.8482%] (p = 0.02 < 0.05) Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild ``` These results come from 2 back-to-back runs of the benchmarks. There is quite a lot of noise, so feedback on reducing it would be welcome.	2024-10-15 02:38:22 +00:00
Pablo Reinhardt	d96a9d15f6	Migrate from `Query::single` and friends to `Single` (#15872 ) # Objective - closes #15866 ## Solution - Simply migrate where possible. ## Testing - Expect that CI will do most of the work. The examples are another way of testing this, as most of the work is in that area. --- ## Notes For now, this PR doesn't migrate `QueryState::single` and friends, as that looks like a separate issue. So, for example, `QueryBuilder`s that used single, or `World::query` calls that used single, weren't migrated. If there is an easy way to migrate those, please let me know. Most of the uses of `Query::single` were removed. The only other uses that I found were in tests of said methods, so they will probably be removed when we remove `Query::single`.	2024-10-13 20:32:06 +00:00
JaySpruce	3d6b24880e	Add `insert_batch` and variations (#15702 ) # Objective `insert_or_spawn_batch` exists, but a version for just inserting doesn't - Closes #2693 - Closes #8384 - Adopts/supersedes #8600 ## Solution Add `insert_batch`, along with the most common `insert` variations: - `World::insert_batch` - `World::insert_batch_if_new` - `World::try_insert_batch` - `World::try_insert_batch_if_new` - `Commands::insert_batch` - `Commands::insert_batch_if_new` - `Commands::try_insert_batch` - `Commands::try_insert_batch_if_new` ## Testing Added tests, and added a benchmark for `insert_batch`. Performance is slightly better than `insert_or_spawn_batch` when only inserting: ![Code_HPnUN0QeWe](https://github.com/user-attachments/assets/53091e4f-6518-43f4-a63f-ae57d5470c66) <details> <summary>old benchmark</summary> This was before reworking it to remove the `UnsafeWorldCell`: ![Code_QhXJb8sjlJ](https://github.com/user-attachments/assets/1061e2a7-a521-48e1-a799-1b6b8d1c0b93) </details> --- ## Showcase Usage is the same as `insert_or_spawn_batch`: ``` use bevy_ecs::{entity::Entity, world::World, component::Component}; #[derive(Component)] struct A(&'static str); #[derive(Component, PartialEq, Debug)] struct B(f32); let mut world = World::new(); let entity_a = world.spawn_empty().id(); let entity_b = world.spawn_empty().id(); world.insert_batch([ (entity_a, (A("a"), B(0.0))), (entity_b, (A("b"), B(1.0))), ]); assert_eq!(world.get::<B>(entity_a), Some(&B(0.0))); ```	2024-10-13 18:14:16 +00:00
Joona Aalto	0e30b68b20	Add mesh picking backend and `MeshRayCast` system parameter (#15800 ) # Objective Closes #15545. `bevy_picking` supports UI and sprite picking, but not mesh picking. Being able to pick meshes would be extremely useful for various games, tools, and our own examples, as well as scene editors and inspectors. So, we need a mesh picking backend! Luckily, [`bevy_mod_picking`](https://github.com/aevyrie/bevy_mod_picking) (which `bevy_picking` is based on) by @aevyrie already has a [backend for it](`74f0c3c0fb/backends/bevy_picking_raycast/src/lib.rs`) using [`bevy_mod_raycast`](https://github.com/aevyrie/bevy_mod_raycast). As a side product of adding mesh picking, we also get support for performing ray casts on meshes! ## Solution Upstream a large chunk of the immediate-mode ray casting functionality from `bevy_mod_raycast`, and add a mesh picking backend based on `bevy_mod_picking`. Huge thanks to @aevyrie who did all the hard work on these incredible crates! All meshes are pickable by default. Picking can be disabled for individual entities by adding `PickingBehavior::IGNORE`, like normal. Or, if you want mesh picking to be entirely opt-in, you can set `MeshPickingBackendSettings::require_markers` to `true` and add a `RayCastPickable` component to the desired camera and target entities. You can also use the new `MeshRayCast` system parameter to cast rays into the world manually: ```rust fn ray_cast_system(mut ray_cast: MeshRayCast, foo_query: Query<(), With<Foo>>) { let ray = Ray3d::new(Vec3::ZERO, Dir3::X); // Only ray cast against entities with the `Foo` component. let filter = |entity| foo_query.contains(entity); // Never early-exit. Note that you can change behavior per-entity. let early_exit_test = |_entity| false; // Ignore the visibility of entities. This allows ray casting hidden entities. 
let visibility = RayCastVisibility::Any; let settings = RayCastSettings::default() .with_filter(&filter) .with_early_exit_test(&early_exit_test) .with_visibility(visibility); // Cast the ray with the settings, returning a list of intersections. let hits = ray_cast.cast_ray(ray, &settings); } ``` This is largely a direct port, but I did make several changes to match our APIs better, remove things we don't need or that I think are unnecessary, and do some general improvements to code quality and documentation. ### Changes Relative to `bevy_mod_raycast` and `bevy_mod_picking` - Every `Raycast` and "raycast" has been renamed to `RayCast` and "ray cast" (similar reasoning as the "Naming" section in #15724) - `Raycast` system param has been renamed to `MeshRayCast` to avoid naming conflicts and to be explicit that it is not for colliders - `RaycastBackend` has been renamed to `MeshPickingBackend` - `RayCastVisibility` variants are now `Any`, `Visible`, and `VisibleInView` instead of `Ignore`, `MustBeVisible`, and `MustBeVisibleAndInView` - `NoBackfaceCulling` has been renamed to `RayCastBackfaces`, to avoid implying that it affects the rendering of backfaces for meshes (it doesn't) - `SimplifiedMesh` and `RayCastBackfaces` live near other ray casting API types, not in their own 10 LoC module - All intersection logic and types are in the same `intersections` module, not split across several modules - Some intersection types have been renamed to be clearer and more consistent - `IntersectionData` -> `RayMeshHit` - `RayHit` -> `RayTriangleHit` - General documentation and code quality improvements ### Removed / Not Ported - Removed unused ray helpers and types, like `PrimitiveIntersection` - Removed getters on intersection types, and made their properties public - There is no `2d` feature, and `Raycast::mesh_query` and `Raycast::mesh2d_query` have been merged into `MeshRayCast::mesh_query`, which handles both 2D and 3D - I assume this existed previously because 
`Mesh2dHandle` used to be in `bevy_sprite`. Now both the 2D and 3D mesh are in `bevy_render`. - There is no `debug` feature or ray debug rendering - There is no deferred API (`RaycastSource`) - There is no `CursorRayPlugin` (the picking backend handles this) ### Note for Reviewers In case it's helpful, the [first commit](`281638ef10`) here is essentially a one-to-one port. The rest of the commits are primarily refactoring and cleaning things up in the ways listed earlier, as well as changes to the module structure. It may also be useful to compare the original [picking backend](`74f0c3c0fb/backends/bevy_picking_raycast/src/lib.rs`) and [`bevy_mod_raycast`](https://github.com/aevyrie/bevy_mod_raycast) to this PR. Feel free to mention if there are any changes that I should revert or something I should not include in this PR. ## Testing I tested mesh picking and relevant components in some examples, for both 2D and 3D meshes, and added a new `mesh_picking` example. I also ~~stole~~ ported over the [ray-mesh intersection benchmark](`dbc5ef32fe/benches/ray_mesh_intersection.rs`) from `bevy_mod_raycast`. --- ## Showcase Below is a version of the `2d_shapes` example modified to demonstrate 2D mesh picking. This is not included in this PR. https://github.com/user-attachments/assets/7742528c-8630-4c00-bacd-81576ac432bf And below is the new `mesh_picking` example: https://github.com/user-attachments/assets/b65c7a5a-fa3a-4c2d-8bbd-e7a2c772986e There is also a really cool new `mesh_ray_cast` example ported over from `bevy_mod_raycast`: https://github.com/user-attachments/assets/3c5eb6c0-bd94-4fb0-bec6-8a85668a06c9 --------- Co-authored-by: Aevyrie <aevyrie@gmail.com> Co-authored-by: Trent <2771466+tbillington@users.noreply.github.com> Co-authored-by: François Mockers <mockersf@gmail.com>	2024-10-13 17:24:19 +00:00
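For readers unfamiliar with what a ray cast against a mesh boils down to, the core step is a ray-triangle intersection test applied to each triangle. The sketch below uses the well-known Möller–Trumbore algorithm with plain arrays. The commit message above does not state which algorithm the ported code uses, so treat the choice of algorithm, and the names `ray_triangle` and `V3`, as assumptions made for illustration.

```rust
/// Plain 3D vector, used here instead of a math library type.
type V3 = [f32; 3];

fn sub(a: V3, b: V3) -> V3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn dot(a: V3, b: V3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: V3, b: V3) -> V3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Möller–Trumbore test. Returns the distance along `dir` to the hit point,
/// or `None` if the ray misses. Both triangle faces are tested, so there is
/// no backface culling in this sketch.
fn ray_triangle(origin: V3, dir: V3, tri: [V3; 3]) -> Option<f32> {
    let edge1 = sub(tri[1], tri[0]);
    let edge2 = sub(tri[2], tri[0]);
    let p = cross(dir, edge2);
    let det = dot(edge1, p);
    // A determinant near zero means the ray is parallel to the triangle.
    if det.abs() < 1e-7 {
        return None;
    }
    let inv_det = 1.0 / det;
    let to_origin = sub(origin, tri[0]);
    // First barycentric coordinate of the hit point.
    let u = dot(to_origin, p) * inv_det;
    if !(0.0..=1.0).contains(&u) {
        return None;
    }
    let q = cross(to_origin, edge1);
    // Second barycentric coordinate of the hit point.
    let v = dot(dir, q) * inv_det;
    if v < 0.0 || u + v > 1.0 {
        return None;
    }
    // Distance along the ray. Hits behind the origin are rejected.
    let t = dot(edge2, q) * inv_det;
    (t >= 0.0).then_some(t)
}
```

A mesh ray cast would run a test like this over the triangles of a mesh (after transforming the ray into the mesh's local space) and keep the nearest hit. Early-exit and visibility filtering, as exposed by `RayCastSettings` in the commit above, are layered on top of that loop.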
Christian Hughes	219b5930f1	Rename `App/World::observe` to `add_observer`, `EntityWorldMut::observe_entity` to `observe`. (#15754 ) # Objective - Closes #15752 Calling the functions `App::observe` and `World::observe` doesn't make sense because you're not "observing" the `App` or `World`, you're adding an observer that listens for an event that occurs within the `World`. We should rename them to better fit this. ## Solution Renames: - `App::observe` -> `App::add_observer` - `World::observe` -> `World::add_observer` - `Commands::observe` -> `Commands::add_observer` - `EntityWorldMut::observe_entity` -> `EntityWorldMut::observe` (Note this isn't a breaking change as the original rename was introduced earlier this cycle.) ## Testing Reusing current tests.	2024-10-09 15:39:29 +00:00
Trashtalk217	d1bd46d45e	Deprecate `get_or_spawn` (#15652 ) # Objective After merging retained rendering world #15320, we now have a good way of creating a link between worlds (HIYAA intensifies). This means that `get_or_spawn` is no longer necessary for that function. Entity should be opaque as the warning above `get_or_spawn` says. This is also part of #15459. I'm deprecating `get_or_spawn_batch` in a different PR in order to keep the PR small in size. ## Solution Deprecate `get_or_spawn` and replace it with `get_entity` in most contexts. If it's possible to query `&RenderEntity`, then the entity is synced and `render_entity.id()` is initialized in the render world. ## Migration Guide If you are given an `Entity` and you want to do something with it, use `Commands.entity(...)` or `World.entity(...)`. If instead you want to spawn something use `Commands.spawn(...)` or `World.spawn(...)`. If you are not sure if an entity exists, you can always use `get_entity` and match on the `Option<...>` that is returned. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2024-10-07 16:08:22 +00:00
Kristoffer Søholm	336c23c1aa	Rename observe to observe_entity on EntityWorldMut (#15616 ) # Objective The current observers have some unfortunate footguns where you can end up confused about what is actually being observed. For apps you can chain observe like `app.observe(..).observe(..)` which works like you would expect, but if you try the same with world the first `observe()` will return the `EntityWorldMut` for the created observer, and the second `observe()` will only observe on the observer entity. It took several hours for multiple people on discord to figure this out, which is not a great experience. ## Solution Rename `observe` on entities to `observe_entity`. It's slightly more verbose when you know you have an entity, but it feels right to me that observers for specific things have more specific naming, and it prevents this issue completely. Another possible solution would be to unify `observe` on `App` and `World` to have the same kind of return type, but I'm not sure exactly what that would look like. ## Testing Simple name change, so only concern is docs really. --- ## Migration Guide The `observe()` method on entities has been renamed to `observe_entity()` to prevent confusion about what is being observed in some cases.	2024-10-03 17:05:26 +00:00
rudderbucky	2da8d17a44	Add try_despawn methods to World/Commands (#15480 ) # Objective Fixes #14511. `despawn` allows you to remove entities from the world. However, if the entity does not exist, it emits a warning. This may not be intended behavior for many users who have use cases where they need to call `despawn` regardless of whether the entity actually exists (see the issue), or don't care in general if the entity already doesn't exist. (Also trying to gauge interest on if this feature makes sense, I'd personally love to have it, but I could see arguments that this might be a footgun. Just trying to help here 😄 If there's no contention I could also implement this for `despawn_recursive` and `despawn_descendants` in the same PR) ## Solution Add `try_despawn`, `try_despawn_recursive` and `try_despawn_descendants`. Modify `World::despawn_with_caller` to also take in a `log_warning` boolean argument, which is then considered when logging the warning. Set `log_warning` to `true` in the case of `despawn`, and `false` in the case of `try_despawn`. ## Testing Ran `cargo run -p ci` on macOS, it seemed fine.	2024-10-03 16:21:05 +00:00
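The `despawn` / `try_despawn` split described in #15480 can be illustrated with a minimal std-only sketch. This is not Bevy's implementation: `ToyWorld`, `despawn_inner`, and the `warnings` vector are hypothetical stand-ins, and warnings are collected in a `Vec` instead of being logged so the difference is observable.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a world: it only tracks which entity ids are alive.
#[derive(Default)]
pub struct ToyWorld {
    alive: HashSet<u32>,
    /// Warnings are stored here instead of being logged (an assumption of this sketch).
    pub warnings: Vec<String>,
}

impl ToyWorld {
    pub fn spawn(&mut self, id: u32) {
        self.alive.insert(id);
    }

    /// Shared code path: the boolean decides whether a missing entity produces a warning.
    fn despawn_inner(&mut self, id: u32, log_warning: bool) -> bool {
        if self.alive.remove(&id) {
            return true;
        }
        if log_warning {
            self.warnings
                .push(format!("could not despawn entity {id}: it does not exist"));
        }
        false
    }

    /// Warns when the entity does not exist.
    pub fn despawn(&mut self, id: u32) -> bool {
        self.despawn_inner(id, true)
    }

    /// Silently returns `false` when the entity does not exist.
    pub fn try_despawn(&mut self, id: u32) -> bool {
        self.despawn_inner(id, false)
    }
}
```

Both public methods share one code path, so the only behavioral difference is whether a missing entity is reported.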
rudderbucky	5e81154e9c	Despawn and despawn_recursive benchmarks (#15610 ) # Objective Add despawn and despawn_recursive benchmarks in a similar vein to the spawn benchmark. ## Testing Ran `cargo bench` from `benches` and it compiled fine. On my machine: ``` despawn_world/1_entities time: [3.1495 ns 3.1574 ns 3.1652 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe despawn_world/10_entities time: [28.629 ns 28.674 ns 28.720 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe despawn_world/100_entities time: [286.95 ns 287.41 ns 287.90 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild despawn_world/1000_entities time: [2.8739 µs 2.9001 µs 2.9355 µs] Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) high mild 6 (6.00%) high severe despawn_world/10000_entities time: [28.535 µs 28.617 µs 28.698 µs] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe despawn_world_recursive/1_entities time: [5.2270 ns 5.2507 ns 5.2907 ns] Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low mild 6 (6.00%) high mild 4 (4.00%) high severe despawn_world_recursive/10_entities time: [57.495 ns 57.590 ns 57.691 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild despawn_world_recursive/100_entities time: [514.43 ns 518.91 ns 526.88 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe despawn_world_recursive/1000_entities time: [5.0362 µs 5.0463 µs 5.0578 µs] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe despawn_world_recursive/10000_entities time: [51.159 µs 51.603 µs 52.215 µs] Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) high mild 6 (6.00%) high severe ```	2024-10-03 14:59:37 +00:00
Tim	461305b3d7	Revert "Have EntityCommands methods consume self for easier chaining" (#15523 ) As discussed in #15521 - Partial revert of #14897, reverting the change to the methods to consume `self` - The `insert_if` method is kept The migration guide of #14897 should be removed Closes #15521 --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2024-10-02 12:47:26 +00:00
Zachary Harrold	d70595b667	Add `core` and `alloc` over `std` Lints (#15281 ) # Objective - Fixes #6370 - Closes #6581 ## Solution - Added the following lints to the workspace: - `std_instead_of_core` - `std_instead_of_alloc` - `alloc_instead_of_core` - Used `cargo +nightly fmt` with [item level use formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Item%5C%3A) to split all `use` statements into single items. - Used `cargo clippy --workspace --all-targets --all-features --fix --allow-dirty` to _attempt_ to resolve the new linting issues, and intervened where the lint was unable to resolve the issue automatically (usually due to needing an `extern crate alloc;` statement in a crate root). - Manually removed certain uses of `std` where negative feature gating prevented `--all-features` from finding the offending uses. - Used `cargo +nightly fmt` with [crate level use formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Crate%5C%3A) to re-merge all `use` statements matching Bevy's previous styling. - Manually fixed cases where the `fmt` tool could not re-merge `use` statements due to conditional compilation attributes. ## Testing - Ran CI locally ## Migration Guide The MSRV is now 1.81. Please update to this version or higher. ## Notes - This is a _massive_ change to try and push through, which is why I've outlined the semi-automatic steps I used to create this PR, in case this fails and someone else tries again in the future. - Making this change has no impact on user code, but does mean Bevy contributors will be warned to use `core` and `alloc` instead of `std` where possible. - This lint is a critical first step towards investigating `no_std` options for Bevy. --------- Co-authored-by: François Mockers <francois.mockers@vleue.com>	2024-09-27 00:59:59 +00:00
Benjamin Brienen	27bea6abf7	Bubbling observers traversal should use query data (#15385 ) # Objective Fixes #14331 ## Solution - Make `Traversal` a subtrait of `ReadOnlyQueryData` - Update implementations and usages ## Testing - Updated unit tests ## Migration Guide Update implementations of `Traversal`. --------- Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>	2024-09-23 18:08:36 +00:00
Gino Valente	59c0521690	bevy_reflect: Add `Function` trait (#15205 ) # Objective While #13152 added function reflection, it didn't really make functions reflectable. Instead, it made it so that they can be called with reflected arguments and return reflected data. But functions themselves cannot be reflected. In other words, we can't go from `DynamicFunction` to `dyn PartialReflect`. ## Solution Allow `DynamicFunction` to actually be reflected. This PR adds the `Function` reflection subtrait (and corresponding `ReflectRef`, `ReflectKind`, etc.). With this new trait, we're able to implement `PartialReflect` on `DynamicFunction`. ### Implementors `Function` is currently only implemented for `DynamicFunction<'static>`. This is because we can't implement it generically over all functions—even those that implement `IntoFunction`. What about `DynamicFunctionMut`? Well, this PR does not implement `Function` for `DynamicFunctionMut`. The reasons for this are a little complicated, but it boils down to mutability. `DynamicFunctionMut` requires `&mut self` to be invoked since it wraps a `FnMut`. However, we can't really model this well with `Function`. And if we make `DynamicFunctionMut` wrap its internal `FnMut` in a `Mutex` to allow for `&self` invocations, then we run into either concurrency issues or recursion issues (or, in the worst case, both). So for the time-being, we won't implement `Function` for `DynamicFunctionMut`. It will be better to evaluate it on its own. And we may even consider the possibility of removing it altogether if it adds too much complexity to the crate. ### Dynamic vs Concrete One of the issues with `DynamicFunction` is the fact that it's both a dynamic representation (like `DynamicStruct` or `DynamicList`) and the only way to represent a function. Because of this, it's in a weird middle ground where we can't easily implement full-on `Reflect`. That would require `Typed`, but what static `TypeInfo` could it provide? 
Just that it's a `DynamicFunction`? None of the other dynamic types implement `Typed`. However, by not implementing `Reflect`, we lose the ability to downcast back to our `DynamicFunction`. Our only option is to call `Function::clone_dynamic`, which clones the data rather than simply downcasting. This works in favor of the `PartialReflect::try_apply` implementation since it would have to clone anyways, but is definitely not ideal. This is also the reason I had to add `Debug` as a supertrait on `Function`. For now, this PR chooses not to implement `Reflect` for `DynamicFunction`. We may want to explore this in a followup PR (or even this one if people feel strongly that it's strictly required). The same is true for `FromReflect`. We may decide to add an implementation there as well, but it's likely out-of-scope of this PR. ## Testing You can test locally by running: ``` cargo test --package bevy_reflect --all-features ``` --- ## Showcase You can now pass around a `DynamicFunction` as a `dyn PartialReflect`! This also means you can use it as a field on a reflected type without having to ignore it (though you do need to opt out of `FromReflect`). ```rust #[derive(Reflect)] #[reflect(from_reflect = false)] struct ClickEvent { callback: DynamicFunction<'static>, } let event: Box<dyn Struct> = Box::new(ClickEvent { callback: (\|\| println!("Clicked!")).into_function(), }); // We can access our `DynamicFunction` as a `dyn PartialReflect` let callback: &dyn PartialReflect = event.field("callback").unwrap(); // And access function-related methods via the new `Function` trait let ReflectRef::Function(callback) = callback.reflect_ref() else { unreachable!() }; // Including calling the function callback.reflect_call(ArgList::new()).unwrap(); // Prints: Clicked! ```	2024-09-22 14:19:12 +00:00
Benjamin Brienen	1b8c1c1242	simplify std::mem references (#15315 ) # Objective - Fixes #15314 ## Solution - Remove unnecessary usings and simplify references to those functions. ## Testing CI	2024-09-19 21:28:16 +00:00
Benjamin Brienen	b45d83ebda	Rename Add to Queue for methods with deferred semantics (#15234 ) # Objective - Fixes #15106 ## Solution - Trivial refactor to rename the method. The duplicate method `push` was removed as well. This will simplify the API and make the semantics clearer. `Add` implies that the action happens immediately, whereas in reality, the command is queued to be run eventually. - `ChildBuilder::add_command` has similarly been renamed to `queue_command`. ## Testing Unit tests should suffice for this simple refactor. --- ## Migration Guide - `Commands::add` and `Commands::push` have been replaced with `Commands::queue`. - `ChildBuilder::add_command` has been renamed to `ChildBuilder::queue_command`.	2024-09-17 00:17:49 +00:00
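To show why "queue" describes the semantics in #15234 better than "add", here is a small std-only sketch of a deferred command buffer. `ToyCommands` and its `apply` method are hypothetical names for illustration and do not mirror Bevy's actual `Commands` internals; the "world" is just a `Vec<i32>`.

```rust
/// A deferred command: a boxed closure that mutates the state when it is finally run.
type Command = Box<dyn FnOnce(&mut Vec<i32>)>;

/// Hypothetical command buffer; nothing runs until `apply` is called.
#[derive(Default)]
pub struct ToyCommands {
    queue: Vec<Command>,
}

impl ToyCommands {
    /// Records the command for later; the state is not touched here.
    pub fn queue(&mut self, command: impl FnOnce(&mut Vec<i32>) + 'static) -> &mut Self {
        self.queue.push(Box::new(command));
        self
    }

    /// Runs all queued commands in insertion order and empties the queue.
    pub fn apply(&mut self, state: &mut Vec<i32>) {
        for command in self.queue.drain(..) {
            command(state);
        }
    }
}
```

Calling `queue` twice and then inspecting the state shows it unchanged until `apply` runs, which is the behavior the old `add` name obscured.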
Adam	9bda913e36	Remove redundant information and optimize dynamic allocations in `Table` (#12929 ) # Objective - fix #12853 - Make `Table::allocate` faster ## Solution The PR consists of multiple steps: 1) For the component data: create a new data-structure that's similar to `BlobVec` but doesn't store `len` & `capacity` inside of it: "BlobArray" (name suggestions welcome) 2) For the `Tick` data: create a new data-structure that's similar to `ThinSlicePtr` but supports dynamic reallocation: "ThinArrayPtr" (name suggestions welcome) 3) Create a new data-structure that's very similar to `Column` that doesn't store `len` & `capacity` inside of it: "ThinColumn" 4) Adjust the `Table` implementation to use `ThinColumn` instead of `Column` The result is that only one set of `len` & `capacity` is stored in `Table`, in `Table::entities` ### Notes Regarding Performance Apart from shaving off some excess memory in `Table`, the changes have also brought noteworthy performance improvements: The previous implementation relied on `Vec::reserve` & `BlobVec::reserve`, but that redundantly repeated the same if statement (`capacity` == `len`) for every column. Now that check can be made once at the `Table` level because the capacity and length of all the columns are synchronized, saving N branches per allocation. The result is a respectable performance improvement for every `Table::reserve` (and subsequently `Table::allocate`) call. 
I'm hesitant to give exact numbers because I don't have a lot of experience in profiling and benchmarking, but these are the results I got so far: `add_remove_big/table` benchmark after the implementation: ![after_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/b667da29-1212-4020-8bb0-ec0f15bb5f8a) `add_remove_big/table` benchmark in main branch (measured in comparison to the implementation): ![main_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/41abb92f-3112-4e01-b935-99696eb2fe58) `add_remove_very_big/table` benchmark after the implementation: ![after_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/f268a155-295b-4f55-ab02-f8a9dcc64fc2) `add_remove_very_big/table` benchmark in main branch (measured in comparison to the implementation): ![main_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/78b4e3a6-b255-47c9-baee-1a24c25b9aea) cc @james7132 to verify --- ## Changelog - New data-structure that's similar to `BlobVec` but doesn't store `len` & `capacity` inside of it: `BlobArray` - New data-structure that's similar to `ThinSlicePtr` but supports dynamic allocation:`ThinArrayPtr` - New data-structure that's very similar to `Column` that doesn't store `len` & `capacity` inside of it: `ThinColumn` - Adjust the `Table` implementation to use `ThinColumn` instead of `Column` - New benchmark: `add_remove_very_big` to benchmark the performance of spawning a lot of entities with a lot of components (15) each ## Migration Guide `Table` now uses `ThinColumn` instead of `Column`. That means that methods that previously returned `Column`, will now return `ThinColumn` instead. `ThinColumn` has a much more limited and low-level API, but you can still achieve the same things in `ThinColumn` as you did in `Column`. For example, instead of calling `Column::get_added_tick`, you'd call `ThinColumn::get_added_ticks_slice` and index it to get the specific added tick. 
--------- Co-authored-by: James Liu <contact@jamessliu.com>	2024-09-16 22:52:05 +00:00
re0312	739007f148	Opportunistically use dense iter for archetypal iteration in Par_iter (#14673 ) # Objective - Follow-up of #14049: we could use it on our parallel iterator. This PR also unifies the function used in both regular and parallel iteration. ## Performance ![image](https://github.com/user-attachments/assets/cba700bc-169c-4b58-b504-823bdca8ec05) No performance regression for regular iteration. 3.5x faster in hybrid parallel iteration; this number is far greater than the benefit obtained in regular iteration (~1.81) because mutable iteration over contiguous memory can effectively reduce the cost of maintaining core cache coherence.	2024-09-03 23:41:10 +00:00
Shane	484721be80	Have EntityCommands methods consume self for easier chaining (#14897 ) # Objective Fixes #14883 ## Solution Pretty simple update to `EntityCommands` methods to consume `self` and return it rather than taking `&mut self`. The things probably worth noting: * I added `#[allow(clippy::should_implement_trait)]` to the `add` method because it causes a linting conflict with `std::ops::Add`. * `despawn` and `log_components` now return `Self`. I'm not sure if that's exactly the desired behavior so I'm happy to adjust if that seems wrong. ## Testing Tested with `cargo run -p ci`. I think that should be sufficient to call things good. ## Migration Guide The most likely migration needed is changing code from this: ``` let mut entity = commands.get_or_spawn(entity); if depth_prepass { entity.insert(DepthPrepass); } if normal_prepass { entity.insert(NormalPrepass); } if motion_vector_prepass { entity.insert(MotionVectorPrepass); } if deferred_prepass { entity.insert(DeferredPrepass); } ``` to this: ``` let mut entity = commands.get_or_spawn(entity); if depth_prepass { entity = entity.insert(DepthPrepass); } if normal_prepass { entity = entity.insert(NormalPrepass); } if motion_vector_prepass { entity = entity.insert(MotionVectorPrepass); } if deferred_prepass { entity.insert(DeferredPrepass); } ``` as can be seen in several of the example code updates here. There will probably also be instances where mutable `EntityCommands` vars no longer need to be mutable.	2024-08-26 18:24:59 +00:00
EdJoPaTo	938d810766	Apply unused_qualifications lint (#14828 ) # Objective Fixes #14782 ## Solution Enable the lint and fix all resulting hints (`--fix`). Also tried to figure out the false-positive (see review comment). Maybe split this PR up into multiple parts where only the last one enables the lint, so some can already be merged, resulting in fewer files touched and less potential for merge conflicts? Currently, there are some cases where it might be easier to read the code with the qualifier, so perhaps remove the import of it and adapt its cases? In the current stage it's just a plain adoption of the suggestions in order to have a base to discuss. ## Testing `cargo clippy` and `cargo run -p ci` are happy.	2024-08-21 12:29:33 +00:00
Gino Valente	2b4180ca8f	bevy_reflect: Function reflection terminology refactor (#14813 ) # Objective One of the changes in #14704 made `DynamicFunction` effectively the same as `DynamicClosure<'static>`. This change meant that the de facto function type would likely be `DynamicClosure<'static>` instead of the intended `DynamicFunction`, since the former is much more flexible. We _could_ explore ways of making `DynamicFunction` implement `Copy` using some unsafe code, but it likely wouldn't be worth it. And users would likely still reach for the convenience of `DynamicClosure<'static>` over the copy-ability of `DynamicFunction`. The goal of this PR is to fix this confusion between the two types. ## Solution Firstly, the `DynamicFunction` type was removed. Again, it was no different than `DynamicClosure<'static>` so it wasn't a huge deal to remove. Secondly, `DynamicClosure<'env>` and `DynamicClosureMut<'env>` were renamed to `DynamicFunction<'env>` and `DynamicFunctionMut<'env>`, respectively. Yes, we still ultimately kept the naming of `DynamicFunction`, but changed its behavior to that of `DynamicClosure<'env>`. We need a term to refer to both functions and closures, and "function" was the best option. [Originally](https://discord.com/channels/691052431525675048/1002362493634629796/1274091992162242710), I was going to go with "callable" as the replacement term to encompass both functions and closures (e.g. `DynamicCallable<'env>`). However, it was [suggested](https://discord.com/channels/691052431525675048/1002362493634629796/1274653581777047625) by @SkiFire13 that the simpler "function" term could be used instead. While "callable" is perhaps the better umbrella term (being truly ambiguous over functions and closures), "function" is more familiar, used more often, easier to discover, and is subjectively just "better-sounding". 
## Testing Most changes are purely swapping type names or updating documentation, but you can verify everything still works by running the following command: ``` cargo test --package bevy_reflect ```	2024-08-19 21:52:36 +00:00
re0312	3bd039e821	Skip empty archetype/table (#14749 ) # Objective - As sander commented on Discord [link](https://discord.com/channels/691052431525675048/749335865876021248/1273414144091230228), ![image](https://github.com/user-attachments/assets/62f2b6f3-1aaf-49d9-bafa-bf62b83b10be) ## Performance ![image](https://github.com/user-attachments/assets/11122940-1547-42ae-9576-0e1a93fd9f5f) --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: Giacomo Stevanato <giaco.stevanato@gmail.com>	2024-08-15 14:07:20 +00:00
Mike	3d460e98ec	Fix CI bench compile check (#14728 ) # Objective - Fixes #14723 ## Solution - add the manifest path to the cargo command ## Testing - ran `cargo run -p ci -- bench-check` locally	2024-08-14 13:23:00 +00:00
Matty	4ace888e4b	Fix broken bezier curve benchmark (#14677 ) # Objective Apparently #14382 broke this, but it's not a part of CI, so it wasn't found until earlier today. ## Solution Update the benchmark like we updated the examples. ## Testing Running `cargo bench` actually works now.	2024-08-12 16:10:11 +00:00
Gino Valente	91fa4bb649	bevy_reflect: Function reflection benchmarks (#14647 ) # Objective It would be good to have benchmarks handy for function reflection as it continues to be worked on. ## Solution Add some basic benchmarks for function reflection. ## Testing To test locally, run the following in the `benches` directory: ``` cargo bench --bench reflect_function ``` ## Results Here are a couple of the results (M1 Max MacBook Pro): <img width="936" alt="Results of benching calling functions vs closures via reflection. Closures average about 40ns, while functions average about 55ns" src="https://github.com/user-attachments/assets/b9a6c585-5fbe-43db-9a7b-f57dbd3815e3"> <img width="936" alt="Results of benching converting functions vs closures into their dynamic representations. Closures average about 34ns, while functions average about 37ns" src="https://github.com/user-attachments/assets/4614560a-7192-4c1e-9ade-7bc5a4ca68e3"> Currently, it seems `DynamicClosure` is just a bit more performant. This is likely due to the fact that `DynamicFunction` stores its function object in an `Arc` instead of a `Box` so that it can be `Send + Sync` (and also `Clone`). We'll likely need to do the same for `DynamicClosure` so I suspect these results to change in the near future.	2024-08-11 03:02:06 +00:00
Periwink	ec4cf024f8	Add a ComponentIndex and update QueryState creation/update to use it (#13460 ) # Objective To implement relations we will need to add a `ComponentIndex`, which is a map from a Component to the list of archetypes that contain this component. One of the reasons is that with fragmenting relations the number of archetypes will explode, so it will become inefficient to create and update the query caches by iterating through the list of all archetypes. In this PR, we introduce the `ComponentIndex`, and we update the `QueryState` to make use of it: - if a query has at least 1 required component (i.e. something other than `()`, `Entity` or `Option<>`, etc.): for each of the required components we find the list of archetypes that contain it (using the ComponentIndex). Then, we select the smallest list among these. This gives a small subset of archetypes to iterate through compared with iterating through all new archetypes - if it doesn't, then we keep using the current approach of iterating through all new archetypes # Implementation - This breaks query iteration order, in the sense that we are not guaranteed anymore to return results in the order in which the archetypes were created. I think this should be fine because this wasn't an explicit bevy guarantee so users should not be relying on this. I updated a bunch of unit tests that were failing because of this. - I had an issue with the borrow checker because iterating the list of potential archetypes requires access to `&state.component_access`, which was conflicting with the calls to ``` if state.new_archetype_internal(archetype) { state.update_archetype_component_access(archetype, access); } ``` which need a mutable access to the state. The solution I chose was to introduce a `QueryStateView` which is a temporary view into the `QueryState` which enables a "split-borrows" kind of approach. 
It is described in detail in this blog post: https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/ # Test The unit tests pass. Benchmark results: ``` ❯ critcmp main pr group main pr ----- ---- -- iter_fragmented/base 1.00 342.2±25.45ns ? ?/sec 1.02 347.5±16.24ns ? ?/sec iter_fragmented/foreach 1.04 165.4±11.29ns ? ?/sec 1.00 159.5±4.27ns ? ?/sec iter_fragmented/foreach_wide 1.03 3.3±0.04µs ? ?/sec 1.00 3.2±0.06µs ? ?/sec iter_fragmented/wide 1.03 3.1±0.06µs ? ?/sec 1.00 3.0±0.08µs ? ?/sec iter_fragmented_sparse/base 1.00 6.5±0.14ns ? ?/sec 1.02 6.6±0.08ns ? ?/sec iter_fragmented_sparse/foreach 1.00 6.3±0.08ns ? ?/sec 1.04 6.6±0.08ns ? ?/sec iter_fragmented_sparse/foreach_wide 1.00 43.8±0.15ns ? ?/sec 1.02 44.6±0.53ns ? ?/sec iter_fragmented_sparse/wide 1.00 29.8±0.44ns ? ?/sec 1.00 29.8±0.26ns ? ?/sec iter_simple/base 1.00 8.2±0.10µs ? ?/sec 1.00 8.2±0.09µs ? ?/sec iter_simple/foreach 1.00 3.8±0.02µs ? ?/sec 1.02 3.9±0.03µs ? ?/sec iter_simple/foreach_sparse_set 1.00 19.0±0.26µs ? ?/sec 1.01 19.3±0.16µs ? ?/sec iter_simple/foreach_wide 1.00 17.8±0.24µs ? ?/sec 1.00 17.9±0.31µs ? ?/sec iter_simple/foreach_wide_sparse_set 1.06 95.6±6.23µs ? ?/sec 1.00 90.6±0.59µs ? ?/sec iter_simple/sparse_set 1.00 19.3±1.63µs ? ?/sec 1.01 19.5±0.29µs ? ?/sec iter_simple/system 1.00 8.1±0.10µs ? ?/sec 1.00 8.1±0.09µs ? ?/sec iter_simple/wide 1.05 37.7±2.53µs ? ?/sec 1.00 35.8±0.57µs ? ?/sec iter_simple/wide_sparse_set 1.00 95.7±1.62µs ? ?/sec 1.00 95.9±0.76µs ? ?/sec par_iter_simple/with_0_fragment 1.04 35.0±2.51µs ? ?/sec 1.00 33.7±0.49µs ? ?/sec par_iter_simple/with_1000_fragment 1.00 50.4±2.52µs ? ?/sec 1.01 51.0±3.84µs ? ?/sec par_iter_simple/with_100_fragment 1.02 40.3±2.23µs ? ?/sec 1.00 39.5±1.32µs ? ?/sec par_iter_simple/with_10_fragment 1.14 38.8±7.79µs ? ?/sec 1.00 34.0±0.78µs ? ?/sec ```	2024-08-06 00:57:15 +00:00
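The candidate-selection idea from #13460 (look up the archetype list for each required component, then iterate the shortest one) can be sketched with std collections only. This is a sketch under stated assumptions: `ToyComponentIndex`, the `u32` component ids, and the `usize` archetype ids are hypothetical, and the real `ComponentIndex` layout in `bevy_ecs` is not reproduced here.

```rust
use std::collections::HashMap;

/// Hypothetical index: component id -> ids of the archetypes that contain it.
#[derive(Default)]
pub struct ToyComponentIndex {
    by_component: HashMap<u32, Vec<usize>>,
}

impl ToyComponentIndex {
    /// Records that `archetype` contains every component in `components`.
    pub fn register_archetype(&mut self, archetype: usize, components: &[u32]) {
        for &component in components {
            self.by_component.entry(component).or_default().push(archetype);
        }
    }

    /// Returns the shortest archetype list among the required components.
    /// `None` means there are no required components, so the caller has to
    /// fall back to scanning every archetype.
    pub fn candidates(&self, required: &[u32]) -> Option<&[usize]> {
        required
            .iter()
            .map(|component| {
                self.by_component
                    .get(component)
                    .map(Vec::as_slice)
                    .unwrap_or(&[])
            })
            .min_by_key(|list| list.len())
    }
}
```

The returned list is only a superset of the matches: each candidate still has to be checked against the query's other components and filters.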
re0312	8235daaea0	Opportunistically use dense iteration for archetypal iteration (#14049 ) # Objective - Currently, Bevy employs sparse iteration if any of the target components in the query are stored in a sparse set. This may lead to increased cache misses in some cases, potentially impacting performance. - Partially fixes #12381 ## Solution - Use dense iteration when an archetype and its table have the same entity count. - To avoid introducing complicated unsafe noise, this PR only implements it for `for_each`-style iteration. - Added a benchmark to test performance for hybrid iteration. ## Performance ![image](https://github.com/bevyengine/bevy/assets/45868716/5cce13cf-6ff2-4861-9576-e75edc63bd46) Nearly 2x win in specific scenarios, and no performance degradation in other test cases. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>	2024-08-02 21:18:15 +00:00
re0312	e5bf59d712	Recalibrated observe benchmark (#14381 ) # Objective - The event propagation benchmark is largely derived from bevy_eventlistener. However, it doesn't accurately reflect the performance of the Bevy side, as our event bubble propagation is based on observers. ## Solution - Added several new benchmarks that focus on observers themselves rather than event bubbling	2024-07-18 18:25:33 +00:00
Miles Silberling-Cook	ed2b8e0f35	Minimal Bubbling Observers (#13991 ) # Objective Add basic bubbling to observers, modeled off `bevy_eventlistener`. ## Solution - Introduce a new `Traversal` trait for components which point to other entities. - Provide a default `TraverseNone: Traversal` component which cannot be constructed. - Implement `Traversal` for `Parent`. - The `Event` trait now has an associated `Traversal` which defaults to `TraverseNone`. - Added a field `bubbling: &mut bool` to `Trigger` which can be used to instruct the runner to bubble the event to the entity specified by the event's traversal type. - Added an associated constant `SHOULD_BUBBLE` to `Event` which configures the default bubbling state. - Added logic to wire this all up correctly. Introducing the new associated information directly on `Event` (instead of a new `BubblingEvent` trait) lets us dispatch both bubbling and non-bubbling events through the same api. ## Testing I have added several unit tests to cover the common bugs I identified during development. Running the unit tests should be enough to validate correctness. The changes affect unsafe portions of the code, but should not change any of the safety assertions. ## Changelog Observers can now bubble up the entity hierarchy! To create a bubbling event, change your `#[derive(Event)]` to something like the following: ```rust #[derive(Component)] struct MyEvent; impl Event for MyEvent { type Traverse = Parent; // This event will propagate up from child to parent. const AUTO_PROPAGATE: bool = true; // This event will propagate by default. } ``` You can dispatch a bubbling event using the normal `world.trigger_targets(MyEvent, entity)`. Halting an event mid-bubble can be done using `trigger.propagate(false)`. Events with `AUTO_PROPAGATE = false` will not propagate by default, but you can enable it using `trigger.propagate(true)`. If there are multiple observers attached to a target, they will all be triggered by bubbling. 
They all share a bubbling state, which can be accessed mutably using `trigger.propagation_mut()` (`trigger.propagate` is just sugar for this). You can choose to implement `Traversal` for your own types, if you want to bubble along a different structure than provided by `bevy_hierarchy`. Implementers must be careful never to produce loops, because this will cause bevy to hang. ## Migration Guide + Manual implementations of `Event` should add associated type `Traverse = TraverseNone` and associated constant `AUTO_PROPAGATE = false`; + `Trigger::new` has new field `propagation: &mut Propagation` which provides the bubbling state. + `ObserverRunner` now takes the same `&mut Propagation` as a final parameter. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2024-07-15 13:39:41 +00:00
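The bubbling rules described in #13991 (start at the target, run observers, then follow the traversal to the next entity while propagation is enabled) can be sketched without Bevy. Everything here is a hypothetical illustration: `bubble`, the `HashMap` standing in for the `Parent` relation, and the closure standing in for observers. Keeping one propagation flag for the whole walk is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Walks from `target` up the `parents` map, calling `observer` on each entity.
/// The observer receives the shared propagation flag and may flip it.
/// Returns the entities visited, in order.
///
/// As the PR warns for real traversals, a cycle in `parents` would make this loop forever.
pub fn bubble(
    parents: &HashMap<u32, u32>,
    target: u32,
    auto_propagate: bool,
    mut observer: impl FnMut(u32, &mut bool),
) -> Vec<u32> {
    let mut visited = Vec::new();
    // A single flag for the whole walk, initialised from the event's default.
    let mut propagate = auto_propagate;
    let mut current = Some(target);
    while let Some(entity) = current {
        observer(entity, &mut propagate);
        visited.push(entity);
        current = if propagate {
            parents.get(&entity).copied()
        } else {
            None
        };
    }
    visited
}
```

With `auto_propagate = false` only the target is visited unless the observer sets the flag to `true`, mirroring the `trigger.propagate(true)` opt-in described above.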
re0312	f0bdce7425	Fair Change Detection Benchmarking (#11173 ) # Objective - #4972 introduced a benchmark to measure change detection performance - However, it uses `iter_batch`, which causes a lot of overhead from cloning data into each routine closure (it feels like a bug in `iter_batch`), and it constructs a new query in every iteration. This overhead masks the change detection throughput we want to measure: the benchmark ends up dominated by data cloning and allocation costs. ## Solution - Use iter_batch_ref to reduce the benchmark overhead - Use a cached query to better reflect real-world usage scenarios. - Add more benchmarks --- ## Changelog	2024-06-26 12:46:41 +00:00
charlotte	4c3b7679ec	#12502 Remove limit on RenderLayers. (#13317 ) # Objective Remove the limit of `RenderLayer` by using a growable mask using `SmallVec`. Changes adopted from @UkoeHB's initial PR here https://github.com/bevyengine/bevy/pull/12502 that contained additional changes related to propagating render layers. ## Solution The main thing needed to unblock this is removing `RenderLayers` from our shader code. This primarily affects `DirectionalLight`. We are now computing a `skip` field on the CPU that is then used to skip the light in the shader. ## Testing Checked a variety of examples and did a quick benchmark on `many_cubes`. There were some existing problems identified during the development of the original PR (see: https://discord.com/channels/691052431525675048/1220477928605749340/1221190112939872347). This PR shouldn't change any existing behavior besides removing the layer limit (sans the comment in migration about `all` layers no longer being possible). --- ## Changelog Removed the limit on `RenderLayers` by using a growable bitset that only allocates when layers greater than 64 are used. ## Migration Guide - `RenderLayers::all()` no longer exists. Entities expecting to be visible on all layers, e.g. lights, should compute the active layers that are in use. --------- Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>	2024-05-16 16:15:47 +00:00
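To make the "growable bitset that only allocates when layers greater than 64 are used" idea concrete, here is a rough standard-library sketch. It is not the `RenderLayers` implementation: the commit uses `SmallVec`, whereas this hypothetical `LayerMask` keeps the first 64 layers in an inline `u64` and spills further words into a `Vec<u64>`, which stays unallocated until it is first resized.

```rust
/// Hypothetical growable layer mask (not Bevy's `RenderLayers`).
/// Layers 0..64 live inline; higher layers spill into `rest`.
#[derive(Debug, Clone, Default)]
struct LayerMask {
    first: u64,
    rest: Vec<u64>,
}

impl LayerMask {
    /// Returns the mask with `layer` added, growing `rest` on demand.
    fn with(mut self, layer: usize) -> Self {
        let (word, bit) = (layer / 64, layer % 64);
        if word == 0 {
            self.first |= 1u64 << bit;
        } else {
            // `rest[0]` holds layers 64..128, `rest[1]` holds 128..192, ...
            if self.rest.len() < word {
                self.rest.resize(word, 0);
            }
            self.rest[word - 1] |= 1u64 << bit;
        }
        self
    }

    /// True if the two masks share at least one layer.
    fn intersects(&self, other: &Self) -> bool {
        (self.first & other.first) != 0
            || self
                .rest
                .iter()
                .zip(&other.rest)
                .any(|(a, b)| (a & b) != 0)
    }
}
```

A mask that only ever uses low-numbered layers never touches the heap in this sketch, while `LayerMask::default().with(100)` grows `rest` to a single word.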
re0312	4ca8cf5d66	Cluster small table/archetype into single Task in parallel iteration (#12846 ) # Objective - Fix #7303 - Bevy would spawn a lot of tasks in parallel iteration when a query matches one large storage and many small storages, which significantly increases scheduling overhead. ## Solution - Collect small storages into one task.	2024-04-04 07:09:26 +00:00
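The clustering idea can be sketched as a plain function over storage sizes. This is an illustration under assumptions, not the code from the PR: `cluster_storages` and its `batch_size` threshold are hypothetical, each returned inner vector stands for one task, and a large storage is kept as a single task here even though a real scheduler might split it further.

```rust
/// Groups storage indices into tasks. Storages with at least
/// `batch_size` rows get a task of their own; smaller, non-empty
/// storages are accumulated until together they reach `batch_size`.
fn cluster_storages(sizes: &[usize], batch_size: usize) -> Vec<Vec<usize>> {
    let mut tasks = Vec::new();
    let mut small = Vec::new();
    let mut small_total = 0;

    for (index, &len) in sizes.iter().enumerate() {
        if len >= batch_size {
            // Large storage: worth a dedicated task.
            tasks.push(vec![index]);
        } else if len > 0 {
            // Small storage: add it to the pending cluster.
            small.push(index);
            small_total += len;
            if small_total >= batch_size {
                tasks.push(std::mem::take(&mut small));
                small_total = 0;
            }
        }
    }
    // Flush whatever small storages are left over.
    if !small.is_empty() {
        tasks.push(small);
    }
    tasks
}
```

For sizes `[1000, 3, 4, 5, 2000, 1]` and a batch size of 10, this produces four tasks instead of six, because storages 1 to 3 share one task.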
Cameron	01649f13e2	Refactor `App` and `SubApp` internals for better separation (#9202 ) # Objective This is a necessary precursor to #9122 (this was split from that PR to reduce the amount of code to review all at once). Moving `!Send` resource ownership to `App` will make it unambiguously `!Send`. `SubApp` must be `Send`, so it can't wrap `App`. ## Solution Refactor `App` and `SubApp` to not have a recursive relationship. Since `SubApp` no longer wraps `App`, once `!Send` resources are moved out of `World` and into `App`, `SubApp` will become unambiguously `Send`. There could be less code duplication between `App` and `SubApp`, but that would break `App` method chaining. ## Changelog - `SubApp` no longer wraps `App`. - `App` fields are no longer publicly accessible. - `App` can no longer be converted into a `SubApp`. - Various methods now return references to a `SubApp` instead of an `App`. ## Migration Guide - To construct a sub-app, use `SubApp::new()`. `App` can no longer convert into `SubApp`. - If you implemented a trait for `App`, you may want to implement it for `SubApp` as well. - If you're accessing `app.world` directly, you now have to use `app.world()` and `app.world_mut()`. - `App::sub_app` now returns `&SubApp`. - `App::sub_app_mut` now returns `&mut SubApp`. - `App::get_sub_app` now returns `Option<&SubApp>`. - `App::get_sub_app_mut` now returns `Option<&mut SubApp>`.	2024-03-31 03:16:10 +00:00
W. Black	622f9a35b6	Torus benchmark (#12781 ) # Objective - Primitive meshing is suboptimal - Improve primitive meshing ## Solution - Add primitive meshing benchmark - Allows measuring future improvements --- First of a few PRs to refactor and improve primitive meshing.	2024-03-29 20:30:26 +00:00
targrub	13cbb9cf10	Move commands module into bevy::ecs::world (#12234 ) # Objective Fixes https://github.com/bevyengine/bevy/issues/11628 ## Migration Guide `Command` and `CommandQueue` have migrated from `bevy_ecs::system` to `bevy_ecs::world`, so `use bevy_ecs::world::{Command, CommandQueue};` when necessary.	2024-03-02 23:13:45 +00:00
James Liu	bc82749012	Remove APIs deprecated in 0.13 (#11974 ) # Objective We deprecated quite a few APIs in 0.13. 0.13 has shipped already. It should be OK to remove them in 0.14's release. Fixes #4059. Fixes #9011. ## Solution Remove them.	2024-02-19 19:04:47 +00:00
David M. Lary	0dccfb5788	Stepping disabled performance fix (#11959 ) # Objective * Fixes #11932 (performance impact when stepping is disabled) ## Solution The `Option<FixedBitSet>` argument added to `ScheduleExecutor::run()` in #8453 caused a measurable performance impact even when stepping is disabled. This can be seen by the benchmark of running `Schedule::run()` on an empty schedule in a tight loop (https://github.com/bevyengine/bevy/issues/11932#issuecomment-1950970236). I was able to get the same performance results as on 0.12.1 by changing the argument of `ScheduleExecutor::run()` from `Option<FixedBitSet>` to `Option<&FixedBitSet>`. The downside of this change is that `Schedule::run()` now takes about 6% longer (3.7319 ms vs 3.9855ns) when stepping is enabled. --- ## Changelog * Change `ScheduleExecutor::run()` `_skipped_systems` from `Option<FixedBitSet>` to `Option<&FixedBitSet>` * Added a few benchmarks to measure `Schedule::run()` performance with various executors	2024-02-19 17:02:14 +00:00
Doonv	1c67e020f7	Move `EntityHash` related types into `bevy_ecs` (#11498 ) # Objective Reduce the size of `bevy_utils` (https://github.com/bevyengine/bevy/issues/11478) ## Solution Move `EntityHash` related types into `bevy_ecs`. This also allows us access to `Entity`, which means we no longer need `EntityHashMap`'s first generic argument. --- ## Changelog - Moved `bevy::utils::{EntityHash, EntityHasher, EntityHashMap, EntityHashSet}` into `bevy::ecs::entity::hash` . - Removed `EntityHashMap`'s first generic argument. It is now hardcoded to always be `Entity`. ## Migration Guide - Uses of `bevy::utils::{EntityHash, EntityHasher, EntityHashMap, EntityHashSet}` now have to be imported from `bevy::ecs::entity::hash`. - Uses of `EntityHashMap` no longer have to specify the first generic parameter. It is now hardcoded to always be `Entity`.	2024-02-12 15:02:24 +00:00
BD103	3d2d61d063	Use batch spawn in benchmarks (#11611 ) # Objective - The benchmarks for `bevy_ecs`' `iter_simple` group use `for` loops instead of `World::spawn_batch`. - There's a TODO comment that says to batch spawn them. ## Solution - Replace the `for` loops with `World::spawn_batch`.	2024-02-01 19:23:09 +00:00
Natalie Bonnibel Baker	b257fffef8	Change Entity::generation from u32 to NonZeroU32 for niche optimization (#9907 ) # Objective - Implements change described in https://github.com/bevyengine/bevy/issues/3022 - Goal is to allow Entity to benefit from niche optimization, especially in the case of Option<Entity> to reduce memory overhead with structures with empty slots ## Discussion - First PR attempt: https://github.com/bevyengine/bevy/pull/3029 - Discord: https://discord.com/channels/691052431525675048/1154573759752183808/1154573764240093224 ## Solution - Change `Entity::generation` from u32 to NonZeroU32 to allow for niche optimization. - The reason for changing generation rather than index is so that the costs are only encountered on Entity free, instead of on Entity alloc - There was some concern with generations being used, due to there being some desire to introduce flags. This was more to do with the original retirement approach, however, in reality even if generations were reduced to 24-bits, we would still have 16 million generations available before wrapping and current ideas indicate that we would be using closer to 4-bits for flags. - Additionally, another concern was the representation of relationships where NonZeroU32 prevents us using the full address space, talking with Joy it seems unlikely to be an issue. The majority of the time these entity references will be low-index entries (ie. `ChildOf`, `Owes`), these will be able to be fast lookups, and the remainder of the range can use slower lookups to map to the address space. - It has the additional benefit of being less visible to most users, since generation is only ever really set through `from_bits` type methods. - `EntityMeta` was changed to match - On free, generation now explicitly wraps: - Originally, generation would panic in debug mode and wrap in release mode due to using regular ops. 
- The first attempt at this PR changed the behavior to "retire" slots and remove them from use when generations overflowed. This change was controversial, and likely needs a proper RFC/discussion. - Wrapping matches current release behaviour, and should therefore be less controversial. - Wrapping also more easily migrates to the retirement approach, as users likely to exhaust the exorbitant supply of generations will code defensively against aliasing and that defensive code is less likely to break than code assuming that generations don't wrap. - We use some unsafe code here when wrapping generations, to avoid branch on NonZeroU32 construction. It's guaranteed safe due to how we perform wrapping and it results in significantly smaller ASM code. - https://godbolt.org/z/6b6hj8PrM ## Migration - Previous `bevy_scene` serializations have a high likelihood of being broken, as they contain 0th generation entities. ## Current Issues - `Entities::reserve_generations` and `EntityMapper` wrap now, even in debug - although they technically did in release mode already so this probably isn't a huge issue. It just depends if we need to change anything here? --------- Co-authored-by: Natalie Baker <natalie.baker@advancednavigation.com>	2024-01-08 23:03:00 +00:00
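The niche optimization and the wrapping behaviour discussed in this message can be checked with the standard library alone. The sketch below is illustrative only: `EntityLike` and `next_generation` are hypothetical names, and the wrap uses a safe fallback to `NonZeroU32::MIN` rather than the branch-free `unsafe` construction the PR describes.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical stand-in for `Entity`: because `generation` can never
/// be zero, the compiler can use that bit pattern to represent `None`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EntityLike {
    index: u32,
    generation: NonZeroU32,
}

/// Bumps a generation on free, wrapping past `u32::MAX` back to 1
/// (zero is skipped because it is not a valid `NonZeroU32`).
fn next_generation(generation: NonZeroU32) -> NonZeroU32 {
    let bumped = generation.get().wrapping_add(1);
    // Safe version with a branch; the PR avoids this branch with `unsafe`.
    NonZeroU32::new(bumped).unwrap_or(NonZeroU32::MIN)
}

/// Returns the sizes of `EntityLike` and `Option<EntityLike>` in bytes,
/// which match when the niche is used.
fn entity_sizes() -> (usize, usize) {
    (size_of::<EntityLike>(), size_of::<Option<EntityLike>>())
}
```

`Option<NonZeroU32>` is guaranteed to be the same size as `u32`; for the struct, current compilers also reuse the niche in practice, so `entity_sizes()` reports the same size for both types.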
James Liu	2148518758	Override QueryIter::fold to port Query::for_each perf gains to select Iterator combinators (#6773 ) # Objective After #6547, `Query::for_each` has been capable of automatic vectorization on certain queries, which shows notable (>50%) CPU time improvements for iteration. However, `Query::for_each` isn't idiomatic Rust, and lacks the flexibility of iterator combinators. Ideally, `Query::iter` and friends should be able to achieve the same results. However, this does seem to be blocked upstream (rust-lang/rust#104914) by Rust's loop optimizations. ## Solution This is an intermediate solution and refactor. This moves the `Query::for_each` implementation onto the `Iterator::fold` implementation for `QueryIter` instead. This should result in the same automatic vectorization optimization on all `Iterator` functions that internally use fold, including `Iterator::for_each`, `Iterator::count`, etc. With this, it should close the gap between the two completely. Internally, this PR changes `Query::for_each` to use `query.iter().for_each(..)` instead of the duplicated implementation. Separately, the duplicate implementations of internal iteration (i.e. `Query::par_for_each`) now use portions of the current `Query::for_each` implementation factored out into their own functions. This also massively cleans up our internal fragmentation of internal iteration options, deduplicating the iteration code used in `for_each` and `par_iter().for_each()`. --- ## Changelog Changed: `Query::for_each`, `Query::for_each_mut`, `Query::for_each`, and `Query::for_each_mut` have been moved to `QueryIter`'s `Iterator::for_each` implementation, and still retain their performance improvements over normal iteration. These APIs are deprecated in 0.13 and will be removed in 0.14. --------- Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2023-12-01 09:09:55 +00:00
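The technique of overriding `Iterator::fold` so that combinators built on it use internal iteration can be shown on a toy iterator. This is a sketch, not `QueryIter`: `TableIter` is a hypothetical iterator over a slice of "tables" (plain `Vec<i32>`s), with `next` doing element-at-a-time stepping and `fold` using simple nested loops that are easier for the compiler to optimize.

```rust
/// Hypothetical iterator over several "tables" of values.
struct TableIter<'a> {
    tables: std::slice::Iter<'a, Vec<i32>>,
    current: std::slice::Iter<'a, i32>,
}

impl<'a> TableIter<'a> {
    fn new(tables: &'a [Vec<i32>]) -> Self {
        let empty: &'a [i32] = &[];
        Self { tables: tables.iter(), current: empty.iter() }
    }
}

impl<'a> Iterator for TableIter<'a> {
    type Item = &'a i32;

    /// External iteration: has to check for a table switch on every call.
    fn next(&mut self) -> Option<Self::Item> {
        loop {
            if let Some(item) = self.current.next() {
                return Some(item);
            }
            self.current = self.tables.next()?.iter();
        }
    }

    /// Internal iteration: tight nested loops, one per table.
    /// Default methods such as `for_each` and `count` call this.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        let mut acc = init;
        // Finish the table that `next` may have partially consumed.
        for item in self.current {
            acc = f(acc, item);
        }
        for table in self.tables {
            for item in table {
                acc = f(acc, item);
            }
        }
        acc
    }
}
```

Calling `TableIter::new(&tables).for_each(..)` or `.count()` then goes through the overridden `fold`, while a plain `for` loop still uses `next`.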
Zachary Harrold	52c11f6986	Ran `cargo fmt` on `benches` crate (#10758 ) # Objective - Format `benches` crate to match current Rust standards. ## Solution - Ran `cargo fmt` in the `benches` crate. ## Notes I accidentally came across this when working on the `Drop` implementation for `CommandQueue` and rather embarrassingly let it sneak into my PR there. I think it makes sense to ensure this crate is also well formatted to avoid it in the future.	2023-11-27 18:41:35 +00:00
scottmcm	713b1d8fa4	Optimize `Entity::eq` (#10519 ) (This is my first PR here, so I've probably missed some things. Please let me know what else I should do to help you as a reviewer!) # Objective Due to https://github.com/rust-lang/rust/issues/117800, the `derive`'d `PartialEq::eq` on `Entity` isn't as good as it could be. Since that's used in hashtable lookup, let's improve it. ## Solution The derived `PartialEq::eq` short-circuits if the generation doesn't match. However, having a branch there is sub-optimal, especially on 64-bit systems like x64 that could just load the whole `Entity` in one load anyway. Due to complications around `poison` in LLVM and the exact details of what unsafe code is allowed to do with references in Rust (https://github.com/rust-lang/unsafe-code-guidelines/issues/346), LLVM isn't allowed to completely remove the short-circuiting. `&Entity` is marked `dereferenceable(8)` so LLVM knows it's allowed to load all 8 bytes -- and does so -- but it has to assume that the `index` might be undef/poison if the `generation` doesn't match, and thus while it finds a way to do it without needing a branch, it has to do something slightly more complicated than optimal to combine the results. (LLVM is allowed to change non-short-circuiting code to use branches, but not the other way around.) Here's a link showing the codegen today: <https://rust.godbolt.org/z/9WzjxrY7c> ```rust #[no_mangle] pub fn demo_eq_ref(a: &Entity, b: &Entity) -> bool { a == b } ``` ends up generating the following assembly: ```asm demo_eq_ref: movq xmm0, qword ptr [rdi] movq xmm1, qword ptr [rsi] pcmpeqd xmm1, xmm0 pshufd xmm0, xmm1, 80 movmskpd eax, xmm0 cmp eax, 3 sete al ret ``` (It's usually not this bad in real uses after inlining and LTO, but it makes a strong demo.) 
This PR manually implements `PartialEq::eq` without short-circuiting, and because that tells LLVM that neither the generations nor the index can be poison, it doesn't need to be so careful and can generate the "just compare the two 64-bit values" code you'd have probably already expected: ```asm demo_eq_ref: mov rax, qword ptr [rsi] cmp qword ptr [rdi], rax sete al ret ``` Since this doesn't change the representation of `Entity`, if it's instead passed by value, then each `Entity` is two `u32` registers, and the old and the new code do exactly the same thing. (Other approaches, like changing `Entity` to be `[u32; 2]` or `u64`, affect this case.) This should hopefully merge easily with changes like https://github.com/bevyengine/bevy/pull/9907 that also want to change `Entity`. ## Benchmarks I'm not super-confident that I got my machine fully consistent for benchmarking, but whether I run the old or the new one first I get reasonably consistent results. Here's a fairly typical example of the benchmarks I added in this PR: ![image](https://github.com/bevyengine/bevy/assets/18526288/24226308-4616-4082-b0ff-88fc06285ef1) Building the sets seems to be basically the same. It's usually reported as noise, but sometimes I see a few percent slower or faster. But lookup hits in particular -- since a hit checks that the key is equal -- consistently shows around 10% improvement. `cargo run --example many_cubes --features bevy/trace_tracy --release -- --benchmark` showed as slightly faster with this change, though if I had to bet I'd probably say it's more noise than meaningful (but at least it's not worse either): ![image](https://github.com/bevyengine/bevy/assets/18526288/58bb8c96-9c45-487f-a5ab-544bbfe9fba0) This is my first PR here -- and my first time running Tracy -- so please let me know what else I should run, or run things on your own more reliable machines to double-check. 
--- ## Changelog (probably not worth including) Changed: micro-optimized `Entity::eq` to help LLVM slightly. ## Migration Guide (I really hope nobody was using this on uninitialized entities where sufficiently tortured `unsafe` code could technically notice that this has changed.)	2023-11-14 02:06:21 +00:00
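A non-short-circuiting equality of the kind this message describes can be written with the non-lazy `&` operator on `bool`. The sketch below is only one possible way to do it and is not taken from the PR diff: `EntityLike` is a hypothetical two-field struct standing in for `Entity`.

```rust
/// Hypothetical two-field stand-in for `Entity`.
#[derive(Clone, Copy, Debug)]
struct EntityLike {
    generation: u32,
    index: u32,
}

impl PartialEq for EntityLike {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        // `&` on `bool` always evaluates both comparisons, unlike `&&`,
        // so both fields are read unconditionally and the optimizer is
        // free to merge the two 32-bit compares into one 64-bit compare.
        (self.generation == other.generation) & (self.index == other.index)
    }
}

impl Eq for EntityLike {}
```

The observable result is identical to the derived implementation; only the evaluation order guarantees handed to the optimizer differ.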
Pixelstorm	faa1b57de5	Global TaskPool API improvements (#10008 ) # Objective Reduce code duplication and improve APIs of Bevy's [global taskpools](https://github.com/bevyengine/bevy/blob/main/crates/bevy_tasks/src/usages.rs). ## Solution - As all three of the global taskpools have identical implementations and only differ in their identifiers, this PR moves the implementation into a macro to reduce code duplication. - The `init` method is renamed to `get_or_init` to more accurately reflect what it really does. - Add a new `try_get` method that just returns `None` when the pool is uninitialized, to complement the other getter methods. - Minor documentation improvements to accompany the above changes. --- ## Changelog - Added a new `try_get` method to the global TaskPools - The global TaskPools' `init` method has been renamed to `get_or_init` for clarity - Documentation improvements ## Migration Guide - Uses of `ComputeTaskPool::init`, `AsyncComputeTaskPool::init` and `IoTaskPool::init` should be changed to `::get_or_init`.	2023-10-23 20:48:48 +00:00
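The `get_or_init` / `try_get` split described in this changelog maps naturally onto `std::sync::OnceLock`. The following is a minimal sketch, assuming a `OnceLock`-backed global: `Pool`, its `threads` field, and the `POOL` static are hypothetical and only illustrate the three getters, not the `bevy_tasks` macro itself.

```rust
use std::sync::OnceLock;

/// Hypothetical global pool type.
struct Pool {
    threads: usize,
}

static POOL: OnceLock<Pool> = OnceLock::new();

impl Pool {
    /// Returns the global pool, creating it with `f` on first use.
    /// Later calls ignore `f`, hence "get or init" rather than "init".
    fn get_or_init(f: impl FnOnce() -> Pool) -> &'static Pool {
        POOL.get_or_init(f)
    }

    /// Returns `None` instead of panicking when the pool is uninitialized.
    fn try_get() -> Option<&'static Pool> {
        POOL.get()
    }

    /// Panicking getter, for call sites that require an initialized pool.
    fn get() -> &'static Pool {
        POOL.get().expect("the pool has not been initialized yet")
    }
}
```

A second `get_or_init` call with a different closure returns the pool created by the first call, which is the behaviour the rename is meant to make obvious.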
Nicola Papale	47409c8a72	Add inline(never) to bench systems (#9824 ) # Objective It is difficult to inspect the generated assembly of benchmark systems using a tool such as `cargo-asm`. ## Solution Mark the related functions as `#[inline(never)]`. This way, you can pass the module name as an argument to `cargo-asm` to get the generated assembly for the given function. As a side effect, it may also make benchmarks a bit more predictable and useful, since it prevents inlining in places where no inlining could possibly take place in Bevy. ### Measurements Following the recommendations in <https://easyperf.net/blog/2019/08/02/Perf-measurement-environment-on-Linux>, I 1. Put my CPU in "AMD ECO" mode, which surprisingly is the equivalent of disabling turboboost, giving more consistent performance 2. Disabled all hyperthreading cores using `echo 0 > /sys/devices/system/cpu/cpu{11,12…}/online` 3. Set the scaling governor to `performance` 4. Manually disabled AMD boost with `echo 0 > /sys/devices/system/cpu/cpufreq/boost` 5. Set the nice level of the criterion benchmark using `cargo bench … & sudo renice -n -5 -p $! ; fg` 6. Ran no programs other than the benchmarks (outside of system daemons and the X11 server) With this setup, running the same benchmarks multiple times on `main` gives me a lot of "regression" and "improvement" messages, which is absurd given that no code changed. On this branch, there are still some spurious performance change detections, but they are much less frequent. This only accounts for `iter_simple` and `iter_frag` benchmarks of course.	2023-10-02 12:52:18 +00:00
Nicola Papale	359e6c718d	Use single threaded executor for archetype benches (#9835 ) # Objective `no_archetype` benchmark group results were very noisy. ## Solution Use the `SingleThreaded` executor. On my machine, this makes the `no_archetype` bench group 20 to 30 times faster, meaning that most of the runtime was spent in the multithreaded scheduler. In other words, the benchmark was not testing system archetype updates, but the overhead of multithreaded scheduling. With this change, the benchmark results are more meaningful. The `add_archetypes` function is also simplified.	2023-09-18 16:06:42 +00:00

114 commits