This changes the way we print things to first construct a mapping from
events to the children and uses that mapping to actually print things.
It should not change the actual output that we produce.
The new approach two benefits:
* It avoids a potential quadratic behavior of the previous approach.
For instance, for a vector of N elements:
```
[Message{level: (N - 1)}, ..., Message{level: 1}, Message{level: 0}]
```
we would first do a linear scan to find entry with level 0, then
another scan to find one with level 1, etc.
* It makes it much easier to improve the output in the future, because
we now pre-compute the children for each entry and can easily take
that into account when printing.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Previously `ra_prof` wouldn't actually print the unaccounted time in
some cases.
We would print, for instance, this:
```
5ms - foo
2ms - bar
```
instead of:
```
5ms - foo
2ms - bar
3ms - ???
```
The fix is to properly handle the case when an entry has 0 children
instead of using the `last` variable.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
In particular:
- Use strict inequality for comparisons, since that's what the filter
syntax supports.
- Convert to millis for comparisons, since that's the unit used both
for the filter and when printing.
Now something like `RA_PROFILE='*>0'` will only print things that took
at least 1ms (when rounded to millis).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
This wasn't a right decision in the first place, the feature flag was
broken in the last rustfmt release, and syntax highlighting of imports
is more important anyway
Now, one can use `let _p = ra_prof::cpu_profiler()` to capture profile
of a block of code.
This is not an out of the box experience, as that relies on gperfools
See the docs on https://github.com/AtheMathmo/cpuprofiler for more!