Ignoring HTML Tags in Typesense Document Search
TLDR Shouvik asked how to keep HTML tags from being matched in Typesense searches. Kishore Nallan and Ricardo suggested storing the HTML in non-searchable fields. Kishore Nallan proposed adding a flag to skip HTML at indexing time, Shouvik agreed, and an issue was opened on GitHub to track it.
May 01, 2021 (33 months ago)
Shouvik
02:05 PM
Kishore Nallan
02:06 PM
Shouvik
02:12 PM
Shouvik
02:13 PM
Shouvik
05:45 PM
May 02, 2021 (33 months ago)
Ricardo
06:16 AM
https://typesense.org/docs/0.20.0/api/collections.html#with-pre-defined-schema
"Your documents can contain other fields not mentioned in the collection's schema - they will be stored on disk but not indexed in memory."
That said, your `query_by` will define what gets searched on.
Kishore Nallan
09:53 AM
Shouvik
01:29 PM
Shouvik
01:30 PM
Kishore Nallan
01:47 PM
Shouvik
01:49 PM
Shouvik
02:03 PM
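A minimal sketch of the workaround Ricardo describes above: keep the raw HTML in a field that is left out of the schema (so it is stored but not indexed), and search a tag-free copy through `query_by`. The collection name, the field names `body_html` and `body_text`, and the `strip_html` helper are all hypothetical, and the tag stripping uses Python's standard `html.parser` rather than anything built into Typesense. Nothing here talks to a server; it only builds the payloads.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the text of `html` with tags removed and whitespace collapsed.

    Text nodes are joined with a space, so inline markup inside a single
    word (e.g. "<b>H</b>ello") would be split; adjust if that matters.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Hypothetical schema: only `title` and `body_text` are declared, so only
# they are indexed. `body_html` is deliberately absent from the schema.
schema = {
    "name": "articles",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "body_text", "type": "string"},
    ],
}


def build_document(doc_id, title, body_html):
    """Keep the raw HTML for display and add a tag-free copy for searching."""
    return {
        "id": doc_id,
        "title": title,
        "body_html": body_html,              # not in schema: stored, not indexed
        "body_text": strip_html(body_html),  # in schema: indexed and searchable
    }


# query_by limits matching to the tag-free fields.
search_params = {"q": "search", "query_by": "title,body_text"}

doc = build_document("1", "Hello", "<p>Fast <b>typo-tolerant</b> search</p>")
print(doc["body_text"])
```

With the official Python client, these dictionaries would be passed to `client.collections.create(schema)`, `client.collections['articles'].documents.create(doc)` and `client.collections['articles'].documents.search(search_params)`. The cost of this approach is that the text is stored twice, once with markup and once without.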
Similar Threads
Using Highlights in typesense-go
Oliver was worried about using highlights involving HTML tags in `typesense-go`, as they mix trusted and untrusted content. Jason advised sanitizing HTML before ingesting data and using arbitrary strings as highlighters.
Issue with escapeHTML and Search Highlighting
Digamber is having trouble with search highlighting not working when escapeHTML is set to false. Kishore Nallan and Jason try to help, but the issue remains unresolved.
Discussing Typesense Search Highlighting Capabilities
Jack enquires about getting highlight data to include all fields of an object in Typesense. Jason clarifies that only the fields specified in `query_by` will be returned, which resolves the issue for Jack.
Resolving HTML Content Search Issues
Ramy encountered issues with HTML content search within tags. Jason initially suggested adding special characters to the `token_separators` config but later recommended storing plain text of the HTML content. Ramy appreciated the advice. Ed also weighed in.
Adjusting Text Match Score Calculation in TypeSense
Johannes wanted to modify the Text Match Score calculation in Typesense to improve search results. With advice from Jason and Kishore Nallan, several solutions were proposed, including creating a GitHub issue, trying different parameters, and updating Docker to a new version to resolve the matter.