new_in_town
08/11/2024, 2:38 PM
symbols_to_index
token_separators
2. On field level:
stem
- As far as I understand, it makes no sense to use both stemming and embeddings/LLM. Is that correct?
3. HTML Content
In a field definition like this:
{
  "name": "embedding",
  "type": "float[]",
  "embed": {
    "from": [
      "title",
      "content"
    ],
    "model_config": {
      "model_name": "ts/e5-large-v2"
    }
  }
}
should I remove HTML tags from the "title" and "content" fields?
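If the tags do turn out to need removing, one option is to strip them on the client before indexing. The following is only a minimal sketch under that assumption, using Python's standard library; `strip_html` and `_TextExtractor` are hypothetical helper names, not part of Typesense, and the sample document is made up for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(value: str) -> str:
    """Return `value` with tags removed and runs of whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()
    # Text nodes are joined with a space, so inline tags inside a word
    # ("<b>H</b>ello") would yield "H ello"; text inside <script>/<style>
    # is also kept. Both are limitations of this sketch.
    return " ".join(" ".join(parser.parts).split())


# Made-up sample document with the same field names as the schema above.
doc = {"title": "<h1>Hello</h1>", "content": "<p>Plain &amp; <b>simple</b> text</p>"}
clean = {key: strip_html(text) for key, text in doc.items()}
```

The cleaned `title` and `content` values would then be what gets sent for indexing, so that both the keyword index and the embedding are built from the same plain text.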
4. Highlighting
I am doing hybrid search, and on the client side I set this:
'query_by': 'title, content, embedding, organization.name',
'vector_query': 'embedding:([], alpha: 0.19, distance_threshold:0.25)',
As I understand it:
highlight snippets are generated only for keyword matches.
If a document is found by semantic search, there is no highlight.
Is that correct?
new_in_town
08/11/2024, 5:21 PM
Kishore Nallan
08/12/2024, 7:51 AM
new_in_town
08/12/2024, 10:08 AM
The transformed query is used for both keyword search and embedding.
Thanks, Kishore! And stopwords and synonyms?
Kishore Nallan
08/12/2024, 10:08 AM
new_in_town
08/12/2024, 10:13 AM