Commit graph

503 commits

Author SHA1 Message Date
Till
d58daf9665
Update sentry reporting (#3305)
This hopefully reduces the garbage we currently produce.
(Using [GlitchTip](https://glitchtip.com/) on my personal instance, this
seems to look better)
2024-01-24 19:24:04 +01:00
Till
8e4dc6b4ae
Optimize PrevEventIDs when getting thousands of backwards extremeties (#3308)
Changes how many `PrevEventIDs` we send to other servers when
backfilling, capped to 100 events.

Unsure about how representative this benchmark is..
```
goos: linux
goarch: amd64
pkg: github.com/matrix-org/dendrite/roomserver/api
cpu: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
                            │    old.txt     │               new.txt               │
                            │     sec/op     │   sec/op     vs base                │
PrevEventIDs/Original1-8         264.9n ± 5%   237.4n ± 7%  -10.36% (p=0.000 n=10)
PrevEventIDs/Original10-8        3.101µ ± 4%   1.590µ ± 2%  -48.72% (p=0.000 n=10)
PrevEventIDs/Original100-8       44.32µ ± 2%   12.80µ ± 4%  -71.11% (p=0.000 n=10)
PrevEventIDs/Original500-8     263.835µ ± 4%   7.907µ ± 4%  -97.00% (p=0.000 n=10)
PrevEventIDs/Original1000-8    578.798µ ± 2%   7.620µ ± 2%  -98.68% (p=0.000 n=10)
PrevEventIDs/Original2000-8   1272.039µ ± 2%   8.241µ ± 9%  -99.35% (p=0.000 n=10)
geomean                          43.81µ        3.659µ       -91.65%

                            │    old.txt     │               new.txt                │
                            │      B/op      │     B/op      vs base                │
PrevEventIDs/Original1-8          72.00 ± 0%     48.00 ± 0%  -33.33% (p=0.000 n=10)
PrevEventIDs/Original10-8        1512.0 ± 0%     500.0 ± 0%  -66.93% (p=0.000 n=10)
PrevEventIDs/Original100-8     11.977Ki ± 0%   7.023Ki ± 0%  -41.36% (p=0.000 n=10)
PrevEventIDs/Original500-8     67.227Ki ± 0%   7.023Ki ± 0%  -89.55% (p=0.000 n=10)
PrevEventIDs/Original1000-8   163.227Ki ± 0%   7.023Ki ± 0%  -95.70% (p=0.000 n=10)
PrevEventIDs/Original2000-8   347.227Ki ± 0%   7.023Ki ± 0%  -97.98% (p=0.000 n=10)
geomean                         12.96Ki        1.954Ki       -84.92%

                            │   old.txt   │              new.txt               │
                            │  allocs/op  │ allocs/op   vs base                │
PrevEventIDs/Original1-8       2.000 ± 0%   1.000 ± 0%  -50.00% (p=0.000 n=10)
PrevEventIDs/Original10-8      6.000 ± 0%   2.000 ± 0%  -66.67% (p=0.000 n=10)
PrevEventIDs/Original100-8     9.000 ± 0%   3.000 ± 0%  -66.67% (p=0.000 n=10)
PrevEventIDs/Original500-8    12.000 ± 0%   3.000 ± 0%  -75.00% (p=0.000 n=10)
PrevEventIDs/Original1000-8   14.000 ± 0%   3.000 ± 0%  -78.57% (p=0.000 n=10)
PrevEventIDs/Original2000-8   16.000 ± 0%   3.000 ± 0%  -81.25% (p=0.000 n=10)
geomean                        8.137        2.335       -71.31%
```
2024-01-20 22:26:57 +01:00
Till
edd02ec468
Fix panic if unable to assign a state key NID (#3294) 2023-12-30 18:34:36 +01:00
Till
f93d1c4790
Use AckExplicitPolicy instead of AckAllPolicy (#3288)
Fixes https://github.com/matrix-org/dendrite/issues/3240 and potentially
a root cause for state resets.

While testing, I've had added some more debug logging:
```
time="2023-12-16T18:13:11.319458084Z" level=warning msg="already processed event" event_id="$qFYMl_F2vb1N0yxmvlFAMhqhGhLKq4kA-o_YCQKH7tQ" kind=KindNew times=2
time="2023-12-16T18:13:14.537389126Z" level=warning msg="already processed event" event_id="$EU-LTsKErT6Mt1k12-p_3xOHfiLaK6gtwVDlZ35lSuo" kind=KindNew times=5
time="2023-12-16T18:13:16.789551206Z" level=warning msg="already processed event" event_id="$dIPuAfTL5x0VyG873LKPslQeljCSxFT1WKxUtjIMUGE" kind=KindNew times=5
time="2023-12-16T18:13:17.383838767Z" level=warning msg="already processed event" event_id="$7noSZiCkzerpkz_UBO3iatpRnaOiPx-3IXc0GPDQVGE" kind=KindNew times=2
time="2023-12-16T18:13:22.091946597Z" level=warning msg="already processed event" event_id="$3Lvo3Wbi2ol9-nNbQ93N-E2MuGQCJZo5397KkFH-W6E" kind=KindNew times=1
time="2023-12-16T18:13:23.026417446Z" level=warning msg="already processed event" event_id="$lj1xS46zsLBCChhKOLJEG-bu7z-_pq9i_Y2DUIjzGy4" kind=KindNew times=4
```

So we did receive the same event over and over again. Given they are
`KindNew`, we don't short circuit if we already processed them, which
potentially caused the state to be calculated with a now wrong state
snapshot.

Also fixes the back pressure metric. We now correctly increment the
counter once we sent the message to NATS and decrement it once we
actually processed an event.
2023-12-19 08:25:47 +01:00
Till
b8f91485b4
Update ACLs when received as outliers (#3008)
This should fix #3004 by making sure we also update our in-memory ACLs
after joining a new room.
Also makes use of more caching in `GetStateEvent`

Bonus: Adds some tests, as I was about to use `GetBulkStateContent`, but
turns out that `GetStateEvent` is basically doing the same, just that it
only gets the `eventTypeNID`/`eventStateKeyNID` once and not for every
call.
2023-11-22 15:38:04 +01:00
Till
7863a405a5
Use IsBlacklistedOrBackingOff to determine if we should try to fetch devices (#3254)
Use `IsBlacklistedOrBackingOff` from the federation API to check if we
should fetch devices.

To reduce back pressure, we now only queue retrying servers if there's
space in the channel.
2023-11-09 08:43:27 +01:00
Till
699f5ca8c1
More rows.Close() and rows.Err() (#3262)
Looks like we missed some `rows.Close()`

Even though `rows.Err()` is mostly not necessary, we should be more
consistent in the DB layer.

[skip ci]
2023-11-09 08:42:33 +01:00
Till
5f872f4a82
Fix panic in QueryNextRoomHierarchyPage (#3253)
Sentry reported the following panic:
```
time="2023-11-01T01:33:56.220583478Z" level=error msg="Request panicked!
goroutine 43763845 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x5e
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.Protect.func3.1()
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:98 +0x13e
panic({0x15b5540?, 0x2453560?})
	runtime/panic.go:914 +0x21f
github.com/matrix-org/dendrite/internal/httputil.MakeAuthAPI.func1.1()
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:91 +0x4a
panic({0x15b5540?, 0x2453560?})
	runtime/panic.go:914 +0x21f
github.com/matrix-org/dendrite/roomserver/internal/query.(*Queryer).QueryNextRoomHierarchyPage(0x413185?, {0x1a576e0, 0xc0436705a0}, {{{0xc01e5fd260, 0x1f}, {0xc01e5fd261, 0x12}, {0xc01e5fd274, 0xb}}, {0xc145cb5200, ...}, ...}, ...)
	github.com/matrix-org/dendrite/roomserver/internal/query/query_room_hierarchy.go:116 +0xbfe
github.com/matrix-org/dendrite/clientapi/routing.QueryRoomHierarchy(0xc0be13b200, 0xc144e65dd0, {0xc01e5fd260?, 0x6?}, {0x7faf140639c8, 0xc00059af20}, 0xc08adca000?)
	github.com/matrix-org/dendrite/clientapi/routing/room_hierarchy.go:141 +0x68b
github.com/matrix-org/dendrite/clientapi/routing.Setup.func35(0xc03e7d5c20?, 0x17c3a57?)
	github.com/matrix-org/dendrite/clientapi/routing/routing.go:534 +0xbe
github.com/matrix-org/dendrite/internal/httputil.MakeAuthAPI.func1(0xc0bd097300)
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:108 +0x5ed
github.com/matrix-org/util.(*jsonRequestHandlerWrapper).OnIncomingRequest(0xc0bd097200?, 0xc13b7d6fc0?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:79 +0x19
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.func2({0x1a54880, 0xc138f28b60}, 0xc0bd097200?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:141 +0xaa
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.Protect.func3({0x1a54880?, 0xc138f28b60?}, 0x17c01d9?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:103 +0x63
net/http.HandlerFunc.ServeHTTP(...)
	net/http/server.go:2136
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.func1({0x1a54880?, 0xc138f28b60?}, 0xc0bd097100)
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:191 +0x411
net/http.HandlerFunc.ServeHTTP(0xc0bd097000?, {0x1a54880?, 0xc138f28b60?}, 0xbe1348905308878e?)
	net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000000, {0x1a54880, 0xc138f28b60}, 0xc0bd096f00)
	github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1c5
github.com/matrix-org/dendrite/setup/base.SetupAndServeHTTP.(*Handler).Handle.(*Handler).handle.func5({0x1a54880, 0xc138f28b60}, 0xc0bd096e00)
	github.com/getsentry/sentry-go@v0.14.0/http/sentryhttp.go:103 +0x298
net/http.HandlerFunc.ServeHTTP(0xc0bd096a00?, {0x1a54880?, 0xc138f28b60?}, 0x7fae6812f5d0?)
	net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000a80, {0x1a54880, 0xc138f28b60}, 0xc0bd096900)
	github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1c5
net/http.serverHandler.ServeHTTP({0xc02884c4e0?}, {0x1a54880?, 0xc138f28b60?}, 0x6?)
	net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc1926922d0, {0x1a576e0, 0xc024a6ec90})
	net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 16979
	net/http/server.go:3086 +0x5cb
" context=missing panic="runtime error: invalid memory address or nil pointer dereference"
```

[skip ci]
2023-11-08 14:22:02 +01:00
Till
da7bca0224
Some tweaks for the device list updater (#3251)
This makes the following changes:
- Adds two new metrics observing the usage of the `DeviceListUpdater`
workers
- Makes the number of workers configurable
- Adds a 30s timeout for DB requests when receiving a device list update
over federation
2023-10-31 16:39:45 +01:00
Till
4fa8512d57
Check event is not rejected (#3243)
Companion PR to https://github.com/matrix-org/gomatrixserverlib/pull/421
2023-10-25 09:47:21 +02:00
Till
8b3adaf244
Fix state resets (#3231)
Needs https://github.com/matrix-org/gomatrixserverlib/pull/419

May fix: https://github.com/matrix-org/dendrite/issues/2508,
https://github.com/matrix-org/dendrite/issues/1760
2023-10-23 15:17:21 +02:00
Till
f02d998253
Remove the creator field when upgrading to v11 (#3210)
Minor oversight
2023-09-28 07:36:57 +02:00
Till
05a8f1ede3
Support for room version v11 (#3204)
Fixes #3203
2023-09-27 08:27:08 +02:00
devonh
16d922de70
Complement fixes for pseudoIDs (#3206) 2023-09-26 17:44:49 +00:00
devonh
8245b24100
Update gmsl to use new validated RoomID on PDUs (#3200)
GMSL returns a `spec.RoomID` when calling `PDU.RoomID()`
2023-09-15 14:39:06 +00:00
Sam Wedgwood
9b5be6b9c5
[pseudoIDs] More pseudo ID fixes - Part 2 (#3181)
Fixes include:
- Translating state keys that contain user IDs to their respective room
keys for both querying and sending state events
- **NOTE**: there may be design discussion needed on what should happen
when sender keys cannot be found for users
- A simple fix for kicking guests from rooms properly
- Logic for boundary history visibilities was slightly off (I'm
surprised this only manifested in pseudo ID room versions)

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-24 16:43:51 +01:00
Sam Wedgwood
9a12420428
[pseudoID] More pseudo ID fixes (#3167)
Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-15 12:37:04 +01:00
Sam Wedgwood
35804f8493
Add config key for default room version (#3171)
This PR adds a config key `room_server.default_config_key` to set the
default room version for the room server.

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-08 14:20:05 +01:00
Sam Wedgwood
c7193e24d0
Use *spec.SenderID for QuerySenderIDForUser (#3164)
There are cases where a dendrite instance is unaware of a pseudo ID for
a user, the user is not a member of that room. To represent this case,
we currently use the 'zero' value, which is often not checked and so
causes errors later down the line. To make this case more explict, and
to be consistent with `QueryUserIDForSender`, this PR changes this to
use a pointer (and `nil` to mean no sender ID).

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-02 11:12:14 +01:00
Sam Wedgwood
af13fa1c75
[pseudoIDs] Fixes for room alias tests (#3159)
Some (deceptively) simple fixes for some bugs that caused room alias
tests to fail (sytext `tests/30rooms/05aliases.pl`). Each commit has
details about what it fixes.

Sytest results:

- Sytest before (79d4a0e):
https://gist.github.com/swedgwood/972ac4ef93edd130d3db0930703d6c82
- Sytest after (4b09bed):
https://gist.github.com/swedgwood/504b00ac4ee892acb757b7fac55fa28a

Room aliases go from `8/15` to `15/15`, but looks like these fixes also
managed to fix about `4` other tests, which is a nice bonus :)

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-07-31 14:39:41 +01:00
Till Faelligen
79d4a0e399
Restore old behaviour of PurgeRoom 2023-07-26 09:09:04 +02:00
devonh
c809e95335
Fix event federation with pseudoID rooms (#3156) 2023-07-21 16:08:40 +00:00
Sam Wedgwood
9582827493
de-MSC-ifying space summaries (MSC2946) (#3134)
- This PR moves and refactors the
[code](https://github.com/matrix-org/dendrite/blob/main/setup/mscs/msc2946/msc2946.go)
for
[MSC2946](https://github.com/matrix-org/matrix-spec-proposals/pull/2946)
('Space Summaries') to integrate it into the rest of the codebase.
- Means space summaries are no longer hidden behind an MSC flag
- Solves #3096

Signed-off-by: Sam Wedgwood <sam@wedgwood.dev>
2023-07-20 15:06:05 +01:00
Till
297479ea49
Use pointer when passing the connection manager around (#3152)
As otherwise existing connections aren't reused.
2023-07-19 13:37:04 +02:00
Till
5267cc0f54
Optimise getting local members and membership counts (#3150)
The previous version was getting **ALL** membership events (as
`ClientEvents`, so going through `NewEventFromTrustedJSONWithID`) for a
given room.
Now we are querying only locally joined users as `ClientEvents`, which
should **significantly** reduce allocations.

Take for example a large room with 2k membership events, but only 1
local user - avoiding 1999 `NewEventFromTrustedJSONWithID` calls just to
calculate the `roomSize` which we can also query by other means.

This is also getting called for every `OutputRoomEvent` in the userAPI.

Benchmark with 1 local user and 100 remote users.
```
pkg: github.com/matrix-org/dendrite/userapi/consumers
cpu: 12th Gen Intel(R) Core(TM) i5-12500H
                    │   old.txt   │               new.txt               │
                    │   sec/op    │   sec/op     vs base                │
LocalRoomMembers-16   375.9µ ± 7%   327.6µ ± 6%  -12.85% (p=0.000 n=10)

                    │    old.txt    │               new.txt                │
                    │     B/op      │     B/op      vs base                │
LocalRoomMembers-16   79.426Ki ± 0%   8.507Ki ± 0%  -89.29% (p=0.000 n=10)

                    │   old.txt   │              new.txt               │
                    │  allocs/op  │ allocs/op   vs base                │
LocalRoomMembers-16   1015.0 ± 0%   277.0 ± 0%  -72.71% (p=0.000 n=10)
```
2023-07-13 14:19:08 +02:00
Till
eb9e90379d
Add event size checks similar to Synapse (#3140)
Companion to https://github.com/matrix-org/gomatrixserverlib/pull/400
This tries to mimic the logic found in Synapse, as dropping events can
break rooms (and we may end up in endless loops..)
2023-07-07 20:37:23 +02:00
devonh
d507c5fc95
Add pseudoID compatibility to Invites (#3126) 2023-07-06 15:15:24 +00:00
Till Faelligen
5a87c703fa
Fix metrics.. 2023-07-05 12:34:53 +02:00
Till
4c3a526e1b
Fix adding state events to the database (#3133)
When we're adding state to the database, we check which eventNIDs are
already in a block, if we already have that eventNID, we remove it from
the list. In its current form we would skip over eventNIDs in the case
we already found a match (we're decrementing `i` twice)
My theory is, that when we later get the state blocks, we are receiving
"too many" eventNIDs (well, yea, we stored too many), which may or may
not can result in state resets when comparing different state snapshots.
(e.g. when adding state we stored a eventNID by accident because we
skipped it, later we add more state and are not adding it because we
don't skip it)
2023-07-04 17:15:44 +02:00
Till Faelligen
939ee325f8
Actually use the parameter 2023-06-29 18:02:11 +02:00
Till
23cd7877a1
Add MXIDMapping for pseudoID rooms (#3112)
Add `MXIDMapping` on membership events when
creating/joining rooms.
2023-06-28 20:29:49 +02:00
Till
a734b112c6
Fix backfilling (#3117)
This should fix two issues with backfilling:
1. right after creating and joining a room over federation, we are doing
a `/backfill` request, which would return redacted events, because the
`authEvents` are empty. Even though the spec states that, in the absence
of a history visibility event, it should be handled as `shared`.
2. `gomatrixserverlib: unsupported room version ''` - because, well, we
were never setting the `roomInfo` field..
2023-06-20 16:52:29 +02:00
Devon Hudson
8cf6c381e2
Fix senderID/key conversion unit tests 2023-06-14 17:11:27 +01:00
Devon Hudson
3f4df25b31
Add missing dep 2023-06-14 17:04:19 +01:00
Devon Hudson
5aaa539e3e
Fix senderID/key conversions 2023-06-14 16:42:09 +01:00
devonh
e4665979bf
Merge SenderID & Per Room User Key work (#3109) 2023-06-14 14:23:46 +00:00
Till
7a2e325d10
Add AssignRoomNID to pre-assign roomNIDs (#3111) 2023-06-13 16:28:41 +02:00
Till
2c87972a3a
Create user room key if needed (#3108) 2023-06-13 14:19:31 +02:00
devonh
77d9e4e93d
Cleanup remaining statekey usage for senderIDs (#3106) 2023-06-12 11:19:25 +00:00
Till
832ccc32f6
Add initial support for storing user room keys (#3098) 2023-06-12 12:45:42 +02:00
devonh
8ea1a11105
Use SenderID Type (#3105) 2023-06-07 17:14:35 +00:00
devonh
7a1fd7f512
PDU Sender split (#3100)
Initial cut of splitting PDU Sender into SenderID & looking up UserID where required.
2023-06-06 20:55:18 +00:00
Till
725ff5567d
Make StrictValidityChecking a function (#3092)
Companion PR to https://github.com/matrix-org/gomatrixserverlib/pull/388
2023-06-06 15:16:55 +02:00
Till
d11da6ec7c
Fix newly found linter issues (#3099)
Fixes the issues found in
https://github.com/matrix-org/dendrite/actions/runs/5155539352/jobs/9285342056#step:5:22.
Only naked returns in longer functions.
2023-06-02 15:48:04 +02:00
devonh
ea6b368ad4
Move Invite logic to GMSL (#3086)
This is both the federation receiving & sending side logic (which were
previously entangeld in a single function)
2023-05-31 16:33:49 +00:00
devonh
cbdc601f1b
Move CreateRoom logic to Roomserver (#3093)
Move create room logic over to roomserver.
2023-05-31 15:27:08 +00:00
Till
61341aca50
Add tests for the UpDropEventReferenceSHAPrevEvents migration (#3087)
... as they could fail if there are duplicate events in
`roomserver_previous_events`.
This fixes the migration by trying to combine the `event_nids` if
possible (same room) as mentioned by @kegsay in
https://github.com/matrix-org/dendrite/pull/3083#discussion_r1195508963
2023-05-30 18:05:48 +02:00
Till
f956a8c1d9
Docs restructure (#2953)
Needs to be merged into `gh-pages` later on.
2023-05-30 10:02:53 +02:00
Till
11b557097c
Drop reference_sha column (#3083)
Companion PR to https://github.com/matrix-org/gomatrixserverlib/pull/383
2023-05-24 12:14:42 +02:00
devonh
2eae8dc489
Move SendJoin logic to GMSL (#3084)
Moves the core matrix logic for handling the send_join endpoint over to
gmsl.
2023-05-19 16:27:01 +00:00