I’m using hybrid search, and when I search for "mo...
# community-help
a
I’m using hybrid search, and when I search for "modern night dress" I only receive one result. However, when I search for "night dress modern", I get more than ten results. Could someone explain how this discrepancy might occur?
f
Are you using Typesense Cloud or self-hosting?
a
self hosting
f
Could you post a small reproducible example similar to this? https://gist.github.com/jasonbosco/7c3432713216c378472f13e72246f46b
a
### Run Typesense via Docker ########################################
set -x

export TYPESENSE_API_KEY=xyz
export TYPESENSE_HOST=http://localhost:8108

docker stop typesense-repro 2>/dev/null
docker rm typesense-repro 2>/dev/null
rm -rf "$(pwd)"/typesense-data-dir-repro
mkdir "$(pwd)"/typesense-data-dir-repro

# Start Typesense in a Docker container
docker run -d -p 8108:8108 --name typesense-repro \
            -v"$(pwd)"/typesense-data-dir-repro:/data \
            typesense/typesense:29.0.rc31 \
            --data-dir /data \
            --api-key=$TYPESENSE_API_KEY \
            --enable-cors

# Wait till typesense is ready.
until curl -s -o /dev/null -w "%{http_code}" "$TYPESENSE_HOST/health" -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" | grep -q "200"; do
  sleep 2
done

curl -s "$TYPESENSE_HOST/debug" \
       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" | jq


curl -s "$TYPESENSE_HOST/collections" \
       -X POST \
       -H "Content-Type: application/json" \
       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
       -d '
          {
             "name": "products",
             "fields": [
               {"name": "title", "type": "string" },
               {"name": "tags", "type": "string[]" },
               {"name": "product_types", "type": "string[]" },
               {"name" : "embedding","type" : "float[]","embed": {"from": ["title","tags","product_types"],"model_config": {"model_name": "ts/all-MiniLM-L12-v2"}}}
             ]
           }' | jq

curl -s "$TYPESENSE_HOST/collections/products/documents/import?action=create" \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: text/plain" \
  -X POST \
  -d $'{"id": "1","title": "Bow Print Nightie","tags": ["bow print","nightie","short sleeve","t-shirt dress","casual dress","navy dress","womens nightwear","sleepwear","night dress","knee length dress"],"product_types":["nightie","dress","sleepwear"]}
{"id": "2","title": "Star Print Nightie","tags": ["star print","nightgown","night dress","sleepwear","short sleeve","casual nightie","star pattern","women\'s nightwear","lightweight nightgown"],"product_types": ["nightgown","sleepwear","nightdress"]}
{"id":"3","title":"Star Nightie","tags":["star nightie","star dress","night dress","nightgown","sleepwear","womens nightie","casual dress","t-shirt dress","star print dress","white dress","brown star","short sleeve nightie"],"product_types": ["nightie","dress","sleepwear"]}
{"id":"4","title":"Bobble LED Lamp","tags":["LED lamp","bobble lamp","white lamp","modern lamp","decorative lamp","table lamp","night light","bedroom lamp","living room lamp","accent lighting","textured lamp","unique lamp","bulb lamp","small lamp"],"product_types":["Table Lamp","LED Lamp","Accent Lamp","Night Light"]}' | jq



curl -s "$TYPESENSE_HOST/multi_search" \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -d '
          {
            "searches": [
              {
                "collection": "products",
                "q": "modern night dress",
                "query_by": "title,tags,product_types,embedding",
                "include_fields":"title",
                "vector_query":"embedding:([], distance_threshold: 0.22)"
              }
            ]
          }'  | jq

docker stop typesense-repro
docker rm typesense-repro

### Documentation ######################################################################################
# Visit the API reference section: https://typesense.org/docs/28.0/api/collections.html
# Click on the "Shell" tab under each API resource's docs, to get shell commands for other API endpoints
Please try searching with the keywords 'modern night dress' and 'night dress modern'.
f
This has to do with the distance threshold. You have set it to 0.22, which is too strict to ever produce a semantic match (the lowest distance I've seen is around 0.40), so the only matches are coming from keyword search. The reason you're only getting one match vs. 3 has to do with how Typesense drops tokens. By default, tokens are dropped from the right: if there's no exact match for "night dress modern", Typesense will first drop the "modern" token and re-search on "night dress", which has matches. You can change that behavior with the drop_tokens_mode search parameter.
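To make the suggestion above concrete, here is a minimal sketch that reruns the repro search with an explicit drop_tokens_mode. It assumes the repro container and the products collection from the script above are still running. The helper name build_payload and the relaxed distance_threshold of 0.5 are illustrative assumptions, not values from this thread. With the default right_to_left mode, "modern night dress" would fall back to "modern night", which among the four sample documents likely matches only the lamp. With left_to_right, "modern" is dropped first and the fallback becomes "night dress".

```shell
# Build a multi_search payload for a query and a drop_tokens_mode value.
# build_payload is a hypothetical helper; distance_threshold 0.5 is an
# illustrative assumption, looser than the 0.22 used in the repro.
build_payload() {
  q=$1
  mode=$2
  cat <<EOF
{
  "searches": [
    {
      "collection": "products",
      "q": "$q",
      "query_by": "title,tags,product_types,embedding",
      "include_fields": "title",
      "drop_tokens_mode": "$mode",
      "vector_query": "embedding:([], distance_threshold: 0.5)"
    }
  ]
}
EOF
}

# Only call the server when the repro environment variables are set.
if [ -n "${TYPESENSE_HOST:-}" ]; then
  for mode in right_to_left left_to_right; do
    echo "drop_tokens_mode=$mode"
    build_payload "modern night dress" "$mode" |
      curl -s "$TYPESENSE_HOST/multi_search" \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -d @- | jq '.results[0].hits[].document.title'
  done
fi
```

Comparing the titles printed for the two modes shows which fallback query each mode ends up running. Typesense also accepts both_sides:3 as a mode if you want tokens dropped from either end.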
a
Hi @Fanis Tharropoulos, thank you for the clarification.