Commit graph

4 commits

Author SHA1 Message Date
ahrav
68f28a0e34
Filter unique detectors by keywords in chunk (#1711)
* pre filter detectors that include the keywords in the chunk.

* Optimize the engine to prevent iterating over all detectors.

* use sync.Map for concurrent access.

* lint.

* use correct verify.

* allow versioned detectors.

* Break apart Start.

* cleanup.

* Update benchmark.

* add comment.

* remove Engine prefix.

* update comments.

* use regular map.

* delete the pool.

* remove old code.

* refactor ahocorasickcore into own file.

* update comments

* move structs to ahocorasickcore

* update comments

* fix

* address comments

* exported some methods and constructor since they will need to be used by the enterprise pipeline as well

* remove extra log
2023-10-23 08:02:01 -07:00
Miccah
fb76eaf17b
Use heuristic to choose the most likely UTF-16 decoded string (#1381)
* Use heuristic to choose the most likely UTF-16 decoded string

* Assume ASCII and include valid BE and LE bytes

* Remove unused code

* Assume ASCII and return nil when not utf16

---------

Co-authored-by: bill-rich <bill.rich@gmail.com>
2023-06-13 17:00:40 -07:00
ahrav
abdff53d5d
optimize utf-8 decoder (#1275)
* optimize utf-8 decoder.

* remove string conversion.
2023-04-20 16:52:34 -07:00
Bill Rich
d3b24fa592
Replace plain decoder with utf8 (#922)
2022-11-15 09:36:01 -08:00