Commit graph

4899 commits

Author SHA1 Message Date
Marco Buono
1099a4243f Improve camera initial position, limit distance 2023-10-08 20:36:07 -03:00
Marco Buono
3b6a8cad15 Consolidate controls and display in the same UI element 2023-10-08 20:22:20 -03:00
Marco Buono
7231b0c6ff Apply feedback to demo 2023-10-08 20:09:10 -03:00
Marco Buono
f8db1a6d4b Use De Morgan's laws to make #[cfg()] more readable 2023-10-08 19:49:47 -03:00
Marco Buono
402518fd57 Improve comment about switch statement 2023-10-08 19:31:48 -03:00
Marco Buono
02960236b5 Additional improvements and PR feedback on comments 2023-10-08 19:04:49 -03:00
Marco Buono
94e155765d Update name left behind by previous refactor 2023-10-08 18:54:12 -03:00
Marco Buono
1d68885680 Apply feedback to comments 2023-10-08 18:53:23 -03:00
Marco Buono
30eec56e81 Use SCREEN_SPACE_SPECULAR_TRANSMISSION_* name (kinda verbose?) 2023-10-08 18:52:22 -03:00
Marco Buono
14e3753751 Simplify blur taps code to use a nested match expression 2023-10-08 18:47:37 -03:00
Marco Buono
5c4f00b120 Rename the blur taps shader def 2023-10-08 18:40:53 -03:00
Marco Buono
e4d133a420 Apply additional renames for consistency 2023-10-08 18:39:08 -03:00
Marco Buono
5fe0258e79 Apply additional feedback on names and comments 2023-10-08 18:22:16 -03:00
Marco Buono
3925745870 Use full channel names instead o R/G/B/A 2023-10-08 18:22:10 -03:00
Marco Buono
fd58c77f82 Apply PR feedback to documentation 2023-10-08 18:21:48 -03:00
Marco Buono
ca62069537 Fix spiral scaling calculation to not leave a hole 2023-09-26 03:01:45 -03:00
Marco Buono
1e4e4f7043 Add a fast codepath for fetching background with zero roughness 2023-09-26 03:01:18 -03:00
Marco Buono
3dfe796b34 Fix indentation of comments 2023-09-26 02:41:14 -03:00
Marco Buono
f2b34dc741 Add missing use statement 2023-09-26 02:38:42 -03:00
Marco Buono
0b51e120d9 Add transmissive quality controls 2023-09-26 02:25:44 -03:00
Marco Buono
e2b017fae9 Add TransmissiveQuality enum to control number of taps 2023-09-26 02:23:12 -03:00
Marco Buono
d4cc7b8ff7 Inline spiral values, use a fixed number of taps for now (for loop unrolling) 2023-09-26 01:29:34 -03:00
Marco Buono
6558dbb0da Remove outdated TODO comment 2023-09-26 01:00:55 -03:00
Marco Buono
e973e17660 Rename transmission to specular_transmission 2023-09-26 01:00:25 -03:00
Marco Buono
15ebf76341 Add comment on disabled MSAA 2023-09-24 17:00:26 -03:00
Marco Buono
8a17bb7f27 Make pbr_transmission_textures description shorter, update docs 2023-09-24 16:56:14 -03:00
Marco Buono
20bced86ef Use default font in example 2023-09-24 16:45:21 -03:00
Marco Buono
46715e6458 Rename pixel_mesh to pixel_checkboard, improve comments 2023-09-24 16:42:54 -03:00
Marco Buono
f02bba3684 Flip horizontal camera controls 2023-09-24 16:34:07 -03:00
Marco Buono
1edab1145c Tweak camera controls, move roughness to E/R 2023-09-24 16:32:50 -03:00
Marco Buono
45a37eb7d7 Use view_z to scale specular transmission blur 2023-09-24 16:26:26 -03:00
Marco Buono
2da45423ae Remove mentions of texture packing 2023-09-24 14:47:56 -03:00
Marco Buono
0adc5da0b6 Add a material flag to enable attenuation, removing need for non-portable infinity use 2023-09-24 14:38:09 -03:00
Marco Buono
e3487e30c8 Add missing min_filter 2023-09-24 14:16:43 -03:00
Marco Buono
12c2d43f7f Improve/fix comments based on PR feedback 2023-09-24 14:16:24 -03:00
Marco Buono
4c0b6a18b7 Make Transparent3d phase work again with batching changes 2023-09-24 12:15:32 -03:00
Marco Buono
c157a25e08 Merge branch 'main' into transmission 2023-09-24 12:12:03 -03:00
James Liu
8ace2ff9e3
Only run event systems if they have tangible work to do (#7728)
# Objective
Scheduling low cost systems has significant overhead due to task pool
contention and the extra machinery to schedule and run them. Event
update systems are the prime example of a low cost system, requiring a
guaranteed O(1) operation, and there are a *lot* of them.

## Solution
Add a run condition to every event system so they only run when there is
an event in either of it's two internal Vecs.

---

## Changelog
Changed: Event update systems will not run if there are no events to
process.

## Migration Guide
`Events<T>::update_system` has been split off from the the type and can
be found at `bevy_ecs::event::event_update_system`.

---------

Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2023-09-24 00:16:33 +00:00
Robert Swain
22dfa9ee96
skybox.wgsl: Fix precision issues (#9909)
# Objective

- Fixes #9707 

## Solution

- At large translations (a few thousand units), the precision of
calculating the ray direction from the fragment world position and
camera world position seems to break down. Sampling the cubemap only
needs the ray direction. As such we can use the view space fragment
position, normalise it, rotate it to world space, and use that.

---

## Changelog

- Fixed: Jittery skybox at large translations.
2023-09-23 22:11:59 +00:00
François
b416d181a7
don't create windows on winit StartCause::Init event (#9684)
# Objective

- https://github.com/bevyengine/bevy/pull/7609 broke Android support

```
8721  8770 I event crates/bevy_winit/src/system.rs:55:  Creating new window "App" (0v0)
8721  8769 I RustStdoutStderr: thread '<unnamed>' panicked at 'Cannot get the native window, it's null and will always be null before Event::Resumed and after Event::Suspended. Make sure you only call this function between those events.', winit-0.28.6/src/platform_impl/android/mod.rs:1058:13
```
## Solution

- Don't create windows on `StartCause::Init` as it's too early
2023-09-23 06:28:49 +00:00
iiYese
0181d40d83
Add as_slice to parent (#9871)
# Objective
- Make it possible to write APIs that require a type or homogenous
storage for both `Children` & `Parent` that is agnostic to edge
direction.

## Solution
- Add a way to get the `Entity` from `Parent` as a slice.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
2023-09-22 06:27:58 +00:00
Robert Swain
5c884c5a15
Automatic batching/instancing of draw commands (#9685)
# Objective

- Implement the foundations of automatic batching/instancing of draw
commands as the next step from #89
- NOTE: More performance improvements will come when more data is
managed and bound in ways that do not require rebinding such as mesh,
material, and texture data.

## Solution

- The core idea for batching of draw commands is to check whether any of
the information that has to be passed when encoding a draw command
changes between two things that are being drawn according to the sorted
render phase order. These should be things like the pipeline, bind
groups and their dynamic offsets, index/vertex buffers, and so on.
  - The following assumptions have been made:
- Only entities with prepared assets (pipelines, materials, meshes) are
queued to phases
- View bindings are constant across a phase for a given draw function as
phases are per-view
- `batch_and_prepare_render_phase` is the only system that performs this
batching and has sole responsibility for preparing the per-object data.
As such the mesh binding and dynamic offsets are assumed to only vary as
a result of the `batch_and_prepare_render_phase` system, e.g. due to
having to split data across separate uniform bindings within the same
buffer due to the maximum uniform buffer binding size.
- Implement `GpuArrayBuffer` for `Mesh2dUniform` to store Mesh2dUniform
in arrays in GPU buffers rather than each one being at a dynamic offset
in a uniform buffer. This is the same optimisation that was made for 3D
not long ago.
- Change batch size for a range in `PhaseItem`, adding API for getting
or mutating the range. This is more flexible than a size as the length
of the range can be used in place of the size, but the start and end can
be otherwise whatever is needed.
- Add an optional mesh bind group dynamic offset to `PhaseItem`. This
avoids having to do a massive table move just to insert
`GpuArrayBufferIndex` components.

## Benchmarks

All tests have been run on an M1 Max on AC power. `bevymark` and
`many_cubes` were modified to use 1920x1080 with a scale factor of 1. I
run a script that runs a separate Tracy capture process, and then runs
the bevy example with `--features bevy_ci_testing,trace_tracy` and
`CI_TESTING_CONFIG=../benchmark.ron` with the contents of
`../benchmark.ron`:
```rust
(
    exit_after: Some(1500)
)
```
...in order to run each test for 1500 frames.

The recent changes to `many_cubes` and `bevymark` added reproducible
random number generation so that with the same settings, the same rng
will occur. They also added benchmark modes that use a fixed delta time
for animations. Combined this means that the same frames should be
rendered both on main and on the branch.

The graphs compare main (yellow) to this PR (red).

### 3D Mesh `many_cubes --benchmark`

<img width="1411" alt="Screenshot 2023-09-03 at 23 42 10"
src="https://github.com/bevyengine/bevy/assets/302146/2088716a-c918-486c-8129-090b26fd2bc4">
The mesh and material are the same for all instances. This is basically
the best case for the initial batching implementation as it results in 1
draw for the ~11.7k visible meshes. It gives a ~30% reduction in median
frame time.

The 1000th frame is identical using the flip tool:

![flip many_cubes-main-mesh3d many_cubes-batching-mesh3d 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/2511f37a-6df8-481a-932f-706ca4de7643)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4615 seconds
```

### 3D Mesh `many_cubes --benchmark --material-texture-count 10`

<img width="1404" alt="Screenshot 2023-09-03 at 23 45 18"
src="https://github.com/bevyengine/bevy/assets/302146/5ee9c447-5bd2-45c6-9706-ac5ff8916daf">
This run uses 10 different materials by varying their textures. The
materials are randomly selected, and there is no sorting by material
bind group for opaque 3D so any batching is 'random'. The PR produces a
~5% reduction in median frame time. If we were to sort the opaque phase
by the material bind group, then this should be a lot faster. This
produces about 10.5k draws for the 11.7k visible entities. This makes
sense as randomly selecting from 10 materials gives a chance that two
adjacent entities randomly select the same material and can be batched.

The 1000th frame is identical in flip:

![flip many_cubes-main-mesh3d-mtc10 many_cubes-batching-mesh3d-mtc10
67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/2b3a8614-9466-4ed8-b50c-d4aa71615dbb)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4537 seconds
```

### 3D Mesh `many_cubes --benchmark --vary-per-instance`

<img width="1394" alt="Screenshot 2023-09-03 at 23 48 44"
src="https://github.com/bevyengine/bevy/assets/302146/f02a816b-a444-4c18-a96a-63b5436f3b7f">
This run varies the material data per instance by randomly-generating
its colour. This is the worst case for batching and that it performs
about the same as `main` is a good thing as it demonstrates that the
batching has minimal overhead when dealing with ~11k visible mesh
entities.

The 1000th frame is identical according to flip:

![flip many_cubes-main-mesh3d-vpi many_cubes-batching-mesh3d-vpi 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/ac5f5c14-9bda-4d1a-8219-7577d4aac68c)

```
     Mean: 0.000000
     Weighted median: 0.000000
     1st weighted quartile: 0.000000
     3rd weighted quartile: 0.000000
     Min: 0.000000
     Max: 0.000000
     Evaluation time: 0.4568 seconds
```

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d`

<img width="1412" alt="Screenshot 2023-09-03 at 23 59 56"
src="https://github.com/bevyengine/bevy/assets/302146/cb02ae07-237b-4646-ae9f-fda4dafcbad4">
This spawns 160 waves of 1000 quad meshes that are shaded with
ColorMaterial. Each wave has a different material so 160 waves currently
should result in 160 batches. This results in a 50% reduction in median
frame time.

Capturing a screenshot of the 1000th frame main vs PR gives:

![flip bevymark-main-mesh2d bevymark-batching-mesh2d 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/80102728-1217-4059-87af-14d05044df40)

```
     Mean: 0.001222
     Weighted median: 0.750432
     1st weighted quartile: 0.453494
     3rd weighted quartile: 0.969758
     Min: 0.000000
     Max: 0.990296
     Evaluation time: 0.4255 seconds
```

So they seem to produce the same results. I also double-checked the
number of draws. `main` does 160000 draws, and the PR does 160, as
expected.

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d --material-texture-count 10`

<img width="1392" alt="Screenshot 2023-09-04 at 00 09 22"
src="https://github.com/bevyengine/bevy/assets/302146/4358da2e-ce32-4134-82df-3ab74c40849c">
This generates 10 textures and generates materials for each of those and
then selects one material per wave. The median frame time is reduced by
50%. Similar to the plain run above, this produces 160 draws on the PR
and 160000 on `main` and the 1000th frame is identical (ignoring the fps
counter text overlay).

![flip bevymark-main-mesh2d-mtc10 bevymark-batching-mesh2d-mtc10 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/ebed2822-dce7-426a-858b-b77dc45b986f)

```
     Mean: 0.002877
     Weighted median: 0.964980
     1st weighted quartile: 0.668871
     3rd weighted quartile: 0.982749
     Min: 0.000000
     Max: 0.992377
     Evaluation time: 0.4301 seconds
```

### 2D Mesh `bevymark --benchmark --waves 160 --per-wave 1000 --mode
mesh2d --vary-per-instance`

<img width="1396" alt="Screenshot 2023-09-04 at 00 13 53"
src="https://github.com/bevyengine/bevy/assets/302146/b2198b18-3439-47ad-919a-cdabe190facb">
This creates unique materials per instance by randomly-generating the
material's colour. This is the worst case for 2D batching. Somehow, this
PR manages a 7% reduction in median frame time. Both main and this PR
issue 160000 draws.

The 1000th frame is the same:

![flip bevymark-main-mesh2d-vpi bevymark-batching-mesh2d-vpi 67ppd
ldr](https://github.com/bevyengine/bevy/assets/302146/a2ec471c-f576-4a36-a23b-b24b22578b97)

```
     Mean: 0.001214
     Weighted median: 0.937499
     1st weighted quartile: 0.635467
     3rd weighted quartile: 0.979085
     Min: 0.000000
     Max: 0.988971
     Evaluation time: 0.4462 seconds
```

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite`

<img width="1396" alt="Screenshot 2023-09-04 at 12 21 12"
src="https://github.com/bevyengine/bevy/assets/302146/8b31e915-d6be-4cac-abf5-c6a4da9c3d43">
This just spawns 160 waves of 1000 sprites. There should be and is no
notable difference between main and the PR.

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite --material-texture-count 10`

<img width="1389" alt="Screenshot 2023-09-04 at 12 36 08"
src="https://github.com/bevyengine/bevy/assets/302146/45fe8d6d-c901-4062-a349-3693dd044413">
This spawns the sprites selecting a texture at random per instance from
the 10 generated textures. This has no significant change vs main and
shouldn't.

### 2D Sprite `bevymark --benchmark --waves 160 --per-wave 1000 --mode
sprite --vary-per-instance`

<img width="1401" alt="Screenshot 2023-09-04 at 12 29 52"
src="https://github.com/bevyengine/bevy/assets/302146/762c5c60-352e-471f-8dbe-bbf10e24ebd6">
This sets the sprite colour as being unique per instance. This can still
all be drawn using one batch. There should be no difference but the PR
produces median frame times that are 4% higher. Investigation showed no
clear sources of cost, rather a mix of give and take that should not
happen. It seems like noise in the results.

### Summary

| Benchmark  | % change in median frame time |
| ------------- | ------------- |
| many_cubes  | 🟩 -30%  |
| many_cubes 10 materials  | 🟩 -5%  |
| many_cubes unique materials  | 🟩 ~0%  |
| bevymark mesh2d  | 🟩 -50%  |
| bevymark mesh2d 10 materials  | 🟩 -50%  |
| bevymark mesh2d unique materials  | 🟩 -7%  |
| bevymark sprite  | 🟥 2%  |
| bevymark sprite 10 materials  | 🟥 0.6%  |
| bevymark sprite unique materials  | 🟥 4.1%  |

---

## Changelog

- Added: 2D and 3D mesh entities that share the same mesh and material
(same textures, same data) are now batched into the same draw command
for better performance.

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
Co-authored-by: Nicola Papale <nico@nicopap.ch>
2023-09-21 22:12:34 +00:00
Joseph
e60249e59d
Improve codegen for world validation (#9464)
# Objective

Improve code-gen for `QueryState::validate_world` and
`SystemState::validate_world`.

## Solution

* Move panics into separate, non-inlined functions, to reduce the code
size of the outer methods.
* Mark the panicking functions with `#[cold]` to help the compiler
optimize for the happy path.
* Mark the functions with `#[track_caller]` to make debugging easier.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2023-09-21 20:57:06 +00:00
Rob Parrett
bdb063497d
Use radsort for Transparent2d PhaseItem sorting (#9882)
# Objective

Fix a performance regression in the "[bevy vs
pixi](https://github.com/SUPERCILEX/bevy-vs-pixi)" benchmark.

This benchmark seems to have a slightly pathological distribution of `z`
values -- Sprites are spawned with a random `z` value with a child
sprite at `f32::EPSILON` relative to the parent.

See discussion here:
https://github.com/bevyengine/bevy/issues/8100#issuecomment-1726978633

## Solution

Use `radsort` for sorting `Transparent2d` `PhaseItem`s.

Use random `z` values in bevymark to stress the phase sort. Add an
`--ordered-z` option to `bevymark` that uses the old behavior.

## Benchmarks

mac m1 max

| benchmark | fps before | fps after | diff |
| - | - | - | - |
| bevymark --waves 120 --per-wave 1000 --random-z | 42.16 | 47.06 | 🟩
+11.6% |
| bevymark --waves 120 --per-wave 1000 | 52.50 | 52.29 | 🟥 -0.4% |
| bevymark --waves 120 --per-wave 1000 --mode mesh2d --random-z | 9.64 |
10.24 | 🟩 +6.2% |
| bevymark --waves 120 --per-wave 1000 --mode mesh2d | 15.83 | 15.59 | 🟥
-1.5% |
| bevy-vs-pixi | 39.71 | 59.88 | 🟩 +50.1% |

## Discussion

It's possible that `TransparentUi` should also change. We could probably
use `slice::sort_unstable_by_key` with the current sort key though, as
its items are always sorted and unique. I'd prefer to follow up later to
look into that.

Here's a survey of sorts used by other `PhaseItem`s

#### slice::sort_by_key
`Transparent2d`, `TransparentUi`

#### radsort
`Opaque3d`, `AlphaMask3d`, `Transparent3d`, `Opaque3dPrepass`,
`AlphaMask3dPrepass`, `Shadow`

I also tried `slice::sort_unstable_by_key` with a compound sort key
including `Entity`, but it didn't seem as promising and I didn't test it
as thoroughly.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Robert Swain <robert.swain@gmail.com>
2023-09-21 17:53:20 +00:00
Marco Buono
a5ba6754bd Mirror loading logic from other textures 2023-09-21 01:08:30 -03:00
Marco Buono
d3d73ec5cf Fix TemporalJitter import 2023-09-21 01:03:21 -03:00
Marco Buono
420b6d6117 Add Transmissive3d entities to prepare_mesh_uniforms() 2023-09-21 00:22:44 -03:00
Marco Buono
9189f49832 Add missing instance index 2023-09-21 00:21:02 -03:00
Marco Buono
9608094585 Merge branch 'main' into transmission 2023-09-21 00:20:38 -03:00
James Liu
1116207f7d
Remove dependecies from bevy_tasks' README (#9881)
# Objective
Noticed that bevy_tasks' README mentions its dependency tree, which is
very outdated at this point.

## Solution
Remove it.
2023-09-20 22:34:28 +00:00