Commit graph

25 commits

Author SHA1 Message Date
jdcaballerov
14df0cbb7c
Update sonic.py
Sonic's buffer accepts 20,000 bytes, not 20,000 Unicode characters. Because the chunking here is done on Unicode characters, sending 20,000 characters can overflow Sonic's buffer.
A UTF-8 character can take up to 6 bytes under the original encoding definition (4 under the current RFC 3629 one), so sending fewer than 20,000 / 6 characters per chunk, rounded down, should be OK.
2021-01-20 14:51:46 -05:00
Nick Sweeting
326fe69eea
fix lint error 2020-12-12 12:35:32 -05:00
jdcaballerov
9b6afa36a3
Update archivebox/search/backends/ripgrep.py
Co-authored-by: Nick Sweeting <git@sweeting.me>
2020-12-12 08:36:08 -05:00
jdcaballerov
aa53f4f088
Update archivebox/search/backends/ripgrep.py
Co-authored-by: Nick Sweeting <git@sweeting.me>
2020-12-12 08:36:01 -05:00
jdcaballerov
24d4c44624 Add ripgrep configs 2020-12-12 07:36:31 -05:00
Cristian
e82161a768 refactor: Remove setup_django from search 2020-12-11 16:43:48 -05:00
Nick Sweeting
e90cf05141 fix lint errors 2020-12-11 16:51:11 +02:00
Cristian
9aac09a5e1 feat: Patch setup_django so we can use an inmemory db in specific commands 2020-12-08 18:42:25 -05:00
Cristian
8d22ebf988 feat: Remove walrus operator (we still need to support python3.7) 2020-12-06 12:23:02 -05:00
jdcaballerov
172197ae01 refactor: Remove if LENGTH and use text chunker for every input 2020-12-06 01:14:39 +02:00
jdcaballerov
5a6b814c79 Add exception handling for indexable content reader 2020-12-06 01:14:38 +02:00
JDC
15fbd81480 Change MAX_SONIC_TEXT_LENGTH 2020-12-06 01:14:38 +02:00
JDC
db9c2edccc Add log print for url indexing 2020-12-06 01:14:38 +02:00
JDC
0acf479b70 Partition long strings in chunks for sonic 2020-12-06 01:14:38 +02:00
JDC
caf4660ac8 Add indexing to update command and utilities 2020-12-06 01:14:37 +02:00
JDC
23a9beb4e0 Add ignored extensions in ripgrep search 2020-12-06 01:13:39 +02:00
JDC
95382b3812 Add ripgrep rg search backend and set as default 2020-12-06 01:13:39 +02:00
JDC
4eeedae815 Exception handling for indexing and searching 2020-12-06 01:13:39 +02:00
JDC
fb67d6684c fix: Return empty QuerySet instead of list 2020-12-06 01:12:47 +02:00
JDC
823df34080 Use QuerySets for search backend API instead of pks 2020-12-06 01:12:47 +02:00
JDC
f383648ffc Use a generator for snapshot flush from index 2020-12-06 01:12:47 +02:00
JDC
47daa038eb Implement flush for search backend after remove command 2020-12-06 01:12:47 +02:00
JDC
c2c01af3ad Add config for search backend 2020-12-06 01:12:47 +02:00
JDC
5f6673c72c Implement backend architecture for search engines 2020-12-06 01:12:46 +02:00
JDC
b1f70b2197 Initial implementation 2020-12-06 01:12:45 +02:00
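
The sizing argument in commit 14df0cbb7c (chunk on characters, but budget for bytes) can be illustrated with a minimal sketch. This is not the code from sonic.py: the names SONIC_BUFFER_BYTES, MAX_BYTES_PER_CHAR, and chunk_text are hypothetical, and the 20,000-byte limit and 6-byte worst case are taken from the commit message as-is.

```python
# Hypothetical sketch of the chunk-size reasoning in commit 14df0cbb7c.
# The constants come from the commit message; the names are illustrative only.
SONIC_BUFFER_BYTES = 20_000   # buffer size stated in the commit message
MAX_BYTES_PER_CHAR = 6        # commit's worst case; RFC 3629 UTF-8 needs at most 4

# Largest character count whose worst-case encoding still fits in the buffer.
MAX_CHARS_PER_CHUNK = SONIC_BUFFER_BYTES // MAX_BYTES_PER_CHAR


def chunk_text(text, chunk_chars=MAX_CHARS_PER_CHUNK):
    """Yield consecutive slices of `text`, each at most `chunk_chars` characters."""
    for start in range(0, len(text), chunk_chars):
        yield text[start:start + chunk_chars]


if __name__ == "__main__":
    sample = "é" * 10_000 + "😀" * 5_000  # 2-byte and 4-byte characters
    for chunk in chunk_text(sample):
        # Every chunk fits the byte budget even though slicing is per character.
        assert len(chunk.encode("utf-8")) <= SONIC_BUFFER_BYTES
```

Dividing by the worst-case width is conservative: it wastes buffer space on mostly-ASCII text but avoids measuring encoded length per chunk. The sketch also ignores any bytes the rest of a command might occupy in the same buffer.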