Update sonic.py

Sonic's buffer accepts 20,000 bytes, not 20,000 Unicode characters. Because the chunking here counts Unicode characters, sending 20,000 characters can overflow Sonic's buffer.
A UTF-8 character can take up to 6 bytes in the original encoding design (4 under the current standard, RFC 3629), so sending fewer than 20,000 / 6 characters, rounded down, should be OK.
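
The arithmetic above can be checked with a small sketch. It is not the code in sonic.py: `chunk_by_chars`, `SONIC_BUFFER_BYTES`, and `MAX_BYTES_PER_CHAR` are hypothetical names introduced for illustration, and the 20,000-byte buffer and 6-byte worst case are taken from the commit message rather than measured.

```python
from typing import List

SONIC_BUFFER_BYTES = 20000    # buffer size as stated in the commit message
MAX_BYTES_PER_CHAR = 6        # worst case assumed by the commit message
MAX_SONIC_TEXT_LENGTH = 2000  # new value introduced by this commit


def chunk_by_chars(text: str, size: int = MAX_SONIC_TEXT_LENGTH) -> List[str]:
    """Split text into pieces of at most `size` Unicode characters (hypothetical helper)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Largest character count per chunk that cannot exceed the buffer
# under the 6-bytes-per-character assumption.
safe_chars = SONIC_BUFFER_BYTES // MAX_BYTES_PER_CHAR

# A worst-case-ish input: every character is multi-byte in UTF-8.
text = "€" * 20000  # '€' (U+20AC) encodes to 3 bytes

# Old behaviour: one chunk of 20,000 characters.
old_bytes = len(text[:20000].encode("utf-8"))

# New behaviour: chunks of at most 2,000 characters.
new_chunks = chunk_by_chars(text)
largest_new_bytes = max(len(chunk.encode("utf-8")) for chunk in new_chunks)

print(safe_chars, old_bytes, largest_new_bytes)
```

With this input the old 20,000-character chunk encodes to well over 20,000 bytes, while each 2,000-character chunk stays under the limit. The chosen value of 2,000 is more conservative than the computed bound, which leaves headroom for the rest of the command sent to Sonic.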
jdcaballerov 2021-01-20 14:51:46 -05:00 committed by GitHub
parent d8f6d4d517
commit 14df0cbb7c

@@ -5,7 +5,7 @@ from sonic import IngestClient, SearchClient
 from archivebox.util import enforce_types
 from archivebox.config import SEARCH_BACKEND_HOST_NAME, SEARCH_BACKEND_PORT, SEARCH_BACKEND_PASSWORD, SONIC_BUCKET, SONIC_COLLECTION
-MAX_SONIC_TEXT_LENGTH = 20000
+MAX_SONIC_TEXT_LENGTH = 2000
 @enforce_types
 def index(snapshot_id: str, texts: List[str]):