Commit graph

38 commits

Author SHA1 Message Date
Neil Alexander
8a82f10046
Allow more time for device list updates (#2749)
This updates the device list updater so that it has a context
per-request, rather than a global 30 seconds for the entire server. This
could mean that talking to a slow remote server or requesting a lot of
user IDs was pretty much guaranteed to fail.

It also uses the process context to allow correct cancellation when
Dendrite wants to shut down cleanly.
2022-09-30 09:41:16 +01:00
Till
2cfcfddecc
Add a SigningKeyUpdate producer (#2697)
This adds a new stream for signing key updates, this should ensure we
don't lose any updates over federation.
2022-09-07 11:45:12 +02:00
Brian Meek
704cc5c9f5
Race in keyserver intialization (#2619)
Signed-off-by: Brian Meek <brian@hntlabs.com>
2022-08-29 09:10:42 +02:00
Neil Alexander
7120eb6bc9
Add InputDeviceListUpdate to the keyserver, remove old input API (#2536)
* Add `InputDeviceListUpdate` to the keyserver, remove old input API

* Fix copyright

* Log more information when a device list update fails
2022-06-15 14:27:07 +01:00
kegsay
6de29c1cd2
bugfix: E2EE device keys could sometimes not be sent to remote servers (#2466)
* Fix flakey sytest 'Local device key changes get to remote servers'

* Debug logs

* Remove internal/test and use /test only

Remove a lot of ancient code too.

* Use FederationRoomserverAPI in more places

* Use more interfaces in federationapi; begin adding regression test

* Linting

* Add regression test

* Unbreak tests

* ALL THE LOGS

* Fix a race condition which could cause events to not be sent to servers

If a new room event which rewrites state arrives, we remove all joined hosts
then re-calculate them. This wasn't done in a transaction so for a brief period
we would have no joined hosts. During this interim, key change events which arrive
would not be sent to destination servers. This would sporadically fail on sytest.

* Unbreak new tests

* Linting
2022-05-17 13:23:35 +01:00
Neil Alexander
09d754cfbf
One NATS instance per BaseDendrite (#2438)
* One NATS instance per `BaseDendrite`

* Fix roomserver
2022-05-09 14:15:24 +01:00
Neil Alexander
4ad5f9c982
Global database connection pool (for monolith mode) (#2411)
* Allow monolith components to share a single database pool

* Don't yell about missing connection strings

* Rename field

* Setup tweaks

* Fix panic

* Improve configuration checks

* Update config

* Fix lint errors

* Update comments
2022-05-03 16:35:06 +01:00
Neil Alexander
98a5e410d7
Per-room consumers (#2293)
* Roomserver input refactoring — again!

* Ensure the actor runs again

* Preserve consumer after unsubscribe

* Another sprinkling of magic

* Rename `TopicFor` to `Prefixed`

* Recreate the stream if the config is bad

* Check streams too

* Prefix subjects, preserve inboxes

* Recreate if subjects wrong

* Remove stream subject

* Reconstruct properly

* Fix mutex unlock

* Comments

* Fix tests

* Don't drop events

* Review comments

* Separate `queueInputRoomEvents` function

* Re-jig control flow a bit
2022-03-23 10:20:18 +00:00
Neil Alexander
9572f5ed19
Wait for safe shutdown of NATS Server (#2289) 2022-03-21 10:32:34 +00:00
Neil Alexander
e30aa38fb0
Stream tweaks, use same codepath for sync vs async input room events, wait for error response via NATS messages (#2283) 2022-03-16 14:21:11 +00:00
S7evinK
2771d93748
Remove OutputKeyChangeEvent consumer on keyserver (#2160)
* Remove keyserver consumer

* Remove keyserver from eduserver

* Directly upload device keys without eduserver

* Add passing tests
2022-02-08 18:13:38 +01:00
S7evinK
9de7efa0b0
Remove sarama/saramajetstream dependencies (#2138)
* Remove dependency on saramajetstream & sarama

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Remove internal.ContinualConsumer from federationapi

* Remove internal.ContinualConsumer from syncapi

* Remove internal.ContinualConsumer from keyserver

* Move to new Prepare function

* Remove saramajetstream & sarama dependency

* Delete unneeded file

* Remove duplicate import

* Log error instead of silently irgnoring it

* Move `OffsetNewest` and `OffsetOldest` into keyserver types, change them to be more sane values

* Fix comments

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-02-04 13:08:13 +00:00
kegsay
2c581377a5
Remodel how device list change IDs are created (#2098)
* Remodel how device list change IDs are created

Previously we made them using the offset Kafka supplied.
We don't run Kafka anymore, so now we make the SQL table assign
the change ID via an AUTOINCREMENTing ID. Redesign the
`keyserver_key_changes` table to have `UNIQUE(user_id)` so we
don't accumulate key changes forevermore, we now have at most 1
row per user which contains the highest change ID.

This needs a SQL migration.

* Ensure we bump the change ID on sqlite

* Actually read the DeviceChangeID not the Offset in synapi

* Add SQL migrations

* Prepare after migration; fixup dendrite-upgrade-test logging

* Use higher version numbers; fix sqlite query to increment better

* Default 0 on postgres

* fixup postgres migration on fresh dendrite instances
2022-01-21 09:56:06 +00:00
S7evinK
161f145176
Add NATS JetStream support (#1866)
* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283f.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283f.

commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-05 17:44:49 +00:00
Neil Alexander
ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00
Neil Alexander
ff21675c5b
Cross-signing fixes, notifications via sync, federation (#1974)
* Initial work on signing key update EDUs

* Fix build

* Produce/consume EDUs

* Producer logging

* Only produce key change notifications for local users

* Better naming

* Try to notify sync

* Enable feature

* Use key change topic

* Don't bother verifying signatures, validate key lengths if we can, notifier fixes

* Copyright notices

* Remove tests from whitelist until matrix-org/sytest#1117

* Some review comment fixes

* Update to matrix-org/gomatrixserverlib@f9416ac

* Remove unneeded parameter
2021-08-17 13:44:30 +01:00
Neil Alexander
eb0efa4636
Cross-signing groundwork (#1953)
* Cross-signing groundwork

* Update to matrix-org/gomatrixserverlib#274

* Fix gobind builds, which stops unit tests in CI from yelling

* Some changes from review comments

* Fix build by passing in UIA

* Update to matrix-org/gomatrixserverlib@bec8d22

* Process master/self-signing keys from devices call

* nolint

* Enum-ify the key type in the database

* Process self-signing key too

* Fix sanity check in device list updater

* Fix check

* Fix sytest, hopefully

* Fix build
2021-08-04 17:56:29 +01:00
Neil Alexander
b5aa7ca3ab
Top-level setup package (#1605)
* Move config, setup, mscs into "setup" top-level folder

* oops, forgot the EDU server

* Add setup

* goimports
2020-12-02 17:41:00 +00:00
Neil Alexander
49abe359e6
Start Kafka connections for each component that needs them (#1527)
* Start Kafka connection for each component that needs one

* Fix roomserver unit tests

* Rename to naffkaInstance (@Kegsay review comment)

* Fix import cycle
2020-10-15 13:27:13 +01:00
Neil Alexander
04bc09f591
Defer keyserver and federationsender wakeups to give HTTP listeners time to start (#1389) 2020-09-03 21:17:55 +01:00
Kegsay
6d6bb75137
Add FederationClient interface to federationsender (#1284)
* Add FederationClient interface to federationsender

- Use a shim struct in HTTP mode to keep the same API as `FederationClient`.
- Use `federationsender` instead of `FederationClient` in `keyserver`.

* Pointers not values

* Review comments

* Fix unit tests

* Rejig backoff

* Unbreak test

* Remove debug logs

* Review comments and linting
2020-08-20 17:03:07 +01:00
Neil Alexander
52eeeb1627
Prefix-defined Kafka topics (#1254)
* Prefix-defined Kafka topics

* Fix current state server test
2020-08-10 15:18:37 +01:00
Neil Alexander
4b09f445c9
Configuration format v1 (#1230)
* Initial pass at refactoring config (not finished)

* Don't forget current state and EDU servers

* More shifting around

* Update server key API tests

* Fix roomserver test

* Fix more tests

* Further tweaks

* Fix current state server test (sort of)

* Maybe fix appservices

* Fix client API test

* Include database connection string in database options

* Fix sync API build

* Update config test

* Fix unit tests

* Fix federation sender build

* Fix gobind build

* Set Listen address for all services in HTTP monolith mode

* Validate config, reinstate appservice derived in directory, tweaks

* Tweak federation API test

* Set MaxOpenConnections/MaxIdleConnections to previous values

* Update generate-config
2020-08-10 14:18:04 +01:00
Kegsay
32a4565b55
Add device list updater which manages updating remote device lists (#1242)
* Add device list updater which manages updating remote device lists

- Doesn't persist stale lists to the database yet
- Doesn't have tests yet

* Mark device lists as fresh when we persist
2020-08-06 17:48:10 +01:00
Kegsay
642f9cb964
Process inbound device list updates from federation (#1240)
* Add InputDeviceListUpdate

* Unbreak unit tests

* Process inbound device list updates from federation

- Persist the keys in the keyserver and produce key changes
- Does not currently fetch keys from the remote server if the prev IDs are missing

* Linting
2020-08-05 13:41:16 +01:00
Kegsay
a7e67e65a8
Notify clients when devices are deleted (#1233)
* Recheck device lists when join/leave events come in

* Add PerformDeviceDeletion

* Notify clients when devices are deleted

* Unbreak things

* Remove debug logging
2020-07-30 18:00:56 +01:00
Kegsay
adf7b59294
Persist partition|offset|user_id in the keyserver (#1226)
* Persist partition|offset|user_id in the keyserver

Required for a query API which will be used by the syncapi which
will be called when a `/sync` request comes in which will return
a list of user IDs of people who have changed their device keys
between two tokens.

* Add tests and fix maxOffset bug

* s/offset/log_offset/g because 'offset' is a reserved word in postgres
2020-07-28 17:38:30 +01:00
Kegsay
98f2f09bb4
keyserver: produce key change events (#1218)
* Produce kafka events when keys are added

* Consume key changes in syncapi with TODO markers for handling them and catching up

* unbreak tests

* Linting
2020-07-23 16:41:36 +01:00
Kegsay
541a23f712
Handle inbound federation E2E key queries/claims (#1215)
* Handle inbound /keys/claim and /keys/query requests

* Add display names to device key responses

* Linting
2020-07-22 17:04:57 +01:00
Kegsay
470933789b
Perform outbound federation hits for querying/claiming E2E keys (#1212)
* Perform outbound federation hits for querying/claiming E2E keys

Untested currently because we need the receiving end to work
before sytest will be happy.

* Linting
2020-07-21 17:46:47 +01:00
Kegsay
f5e7e7513c
Implement /keys/query locally (#1204)
* Implement /keys/query locally

* Fix sqlite tests and close rows
2020-07-15 18:40:41 +01:00
Kegsay
9dd2ed7f65
Implement key uploads (#1202)
* Add storage layer for postgres/sqlite

* Return OTK counts when inserting new keys

* Hook up the key DB and make a test pass

* Convert postgres queries to be sqlite queries

* Blacklist test due to requiring rejected events

* Unbreak tests

* Update blacklist
2020-07-15 12:02:34 +01:00
Kegsay
396219ef53
Add boilerplate for key server APIs (#1196)
Also add a README which outilnes how things will work.
2020-07-13 16:02:35 +01:00
Kegsay
9c77022513
Make userapi responsible for checking access tokens (#1133)
* Make userapi responsible for checking access tokens

There's still plenty of dependencies on account/device DBs, but this
is a start. This is a breaking change as it adds a required config
value `listen.user_api`.

* Cleanup

* Review comments and test fix
2020-06-16 14:10:55 +01:00
Kegsay
4f171c56a8
Split out SetupFooComponent (#1106)
* Split out adding HTTP routes from making internal APIs for clarity

* Split out more components

* Split out more things

* Finish converting

* internal mux for internal routes
2020-06-08 15:51:07 +01:00
Neil Alexander
fe82e1f725
Separate muxes for public and internal APIs (#1056)
* Separate muxes for public and internal APIs

* Update client-api-proxy and federation-api-proxy so they don't add /api to the path

* Tidy up

* Consistent HTTP setup

* Set up prefixes properly
2020-05-22 11:43:17 +01:00
Kegsay
24d8df664c
Fix #897 and shuffle directory around (#1054)
* Fix #897 and shuffle directory around

* Update find-lint

* goimports

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-05-21 14:40:13 +01:00
Neil Alexander
8adc128225
Keyserver skeleton (#1032)
* Keyserver skeleton

* Indentation
2020-05-14 14:05:14 +01:00