I am trying to get documents with exact matches (typesense #community-help)

Rohan Bin Khokon

08/25/2025, 1:43 PM

I am trying to get documents with exact matches. Here is my code:

self.typesense_client.collections[self.default_collection_name].documents.search({
    'q': '*',
    'filter_by': 'url_without_anchor:=`' + url +'`'
})

Here is the schema:

{
  "name": "url_without_anchor",
  "type": "string",
  "facet": true,
  "index": true,
  "store": true
}

The problem is that when I search for this URL:

/en/products/transducers/inertial-sensors/inertial-measurement-units--imu-/3dm-cv5-imu

it returns:

• /en/products/transducers/inertial-sensors/inertial-measurement-units--imu-/3dm-cv5-imu/p-xxx
• /en/products/transducers/inertial-sensors/inertial-measurement-units--imu-/3dm-cv5-imu/p-yyy

etc. But the search works fine for the following URLs:

• /en/products/transducers/force/c10
• /en/products/instruments/sound-vibration-daq/microphone--calibration/9721-b

For those it returns the exact matching document, not /en/products/transducers/force/c10/p-....

Please help me identify the problem, or suggest a better way to query. If you require more information about my setup, let me know. Thanks in advance.

Jason Bosco

08/26/2025, 2:17 AM

Could you try setting

"token_separators": ["/", ".", "-"]

in the collection schema?
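For reference, a minimal sketch of where that setting sits in a collection schema. The field definition is the one posted above and the separators are the ones Jason suggests; the collection name `pages_demo` and the helper `build_schema` are made up for illustration, and the real collection will have more fields.

```python
def build_schema(collection_name: str) -> dict:
    """Build a collection schema dict with collection-level token separators."""
    return {
        "name": collection_name,  # hypothetical name, not from the thread
        "fields": [
            # Field definition as posted in the thread.
            {
                "name": "url_without_anchor",
                "type": "string",
                "facet": True,
                "index": True,
                "store": True,
            },
        ],
        # Characters Typesense should split tokens on, as suggested above.
        "token_separators": ["/", ".", "-"],
    }


schema = build_schema("pages_demo")

# With a running server and a configured typesense client, the schema
# would be passed to: client.collections.create(schema)
```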

Rohan Bin Khokon

08/26/2025, 7:41 AM

@Jason Bosco token separators are already added.

Jason Bosco

08/26/2025, 2:35 PM

Could you share a set of curl commands like this that replicates the issue with a minimal collection and a few sample documents?

Rohan Bin Khokon

08/29/2025, 8:24 AM

Please find the schema, documents, curl, response in this zip

scrappy and TS.zip

Rohan Bin Khokon

08/29/2025, 8:31 AM

Please let me know anything you find. I am counting on you.

Fanis Tharropoulos

08/29/2025, 8:35 AM

Hey Rohan, The documents file you shared seems to be from a search response. Could you use the export API to export the documents as they are indexed in the collection?
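A rough sketch of the export step with the Python client, assuming an already configured `typesense.Client` is passed in as `client`. The export call returns the documents as JSONL text; the helper names `parse_jsonl` and `export_documents` are made up for illustration.

```python
import json


def parse_jsonl(raw: str) -> list[dict]:
    """Parse JSONL text (one JSON document per line) into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def export_documents(client, collection_name: str) -> list[dict]:
    """Export every document of a collection as it is stored in the index.

    `client` is assumed to be a configured typesense.Client instance.
    """
    raw = client.collections[collection_name].documents.export()
    return parse_jsonl(raw)
```

Writing the raw export straight to a `.jsonl` file works just as well for sharing; parsing is only useful for inspecting the documents locally.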

Rohan Bin Khokon

08/29/2025, 9:08 AM

hbkworld_documents.jsonl

Rohan Bin Khokon

08/29/2025, 9:08 AM

this contains full collection, nothing changed

Fanis Tharropoulos

09/01/2025, 9:30 AM

Could you share how many documents it should be returning?

Fanis Tharropoulos

09/01/2025, 10:02 AM

You need to add symbols_to_index. That's what's causing the issue.
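A minimal sketch of adding `symbols_to_index` to a schema dict before creating the collection. The thread does not preserve which symbol was suggested, so the `"-"` below is only a placeholder, and `with_symbols_to_index` and `pages_demo` are hypothetical names. Whether the setting can be changed on an existing collection or needs a recreate and reindex is worth checking in the Typesense docs for your version.

```python
import copy


def with_symbols_to_index(schema: dict, symbols: list[str]) -> dict:
    """Return a copy of the schema with collection-level symbols_to_index set."""
    updated = copy.deepcopy(schema)  # leave the caller's schema untouched
    updated["symbols_to_index"] = list(symbols)
    return updated


base_schema = {
    "name": "pages_demo",  # hypothetical collection name
    "fields": [{"name": "url_without_anchor", "type": "string", "facet": True}],
}

# "-" is a placeholder; the actual symbol suggested is not in the thread.
candidate_schema = with_symbols_to_index(base_schema, ["-"])
```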

Rohan Bin Khokon

09/01/2025, 10:58 AM

Okay, let me try this and I'll let you know.

Rohan Bin Khokon

09/03/2025, 8:53 AM

Still not working 🙁

Fanis Tharropoulos

09/04/2025, 11:29 AM

Could you provide a script that creates the collection, indexes the data, and runs the search? It worked in my testing.

Rohan Bin Khokon

09/04/2025, 1:20 PM

Can we have a short call?

Fanis Tharropoulos

09/04/2025, 2:38 PM

Due to our bandwidth, we can't provide calls to users in the public Slack community. If you need prioritized support, you can sign up for a support plan on Typesense Cloud: https://cloud.typesense.org/support-plans

Rohan Bin Khokon

09/08/2025, 7:19 AM

I tried all the solutions you mentioned earlier. Still not working. Can you suggest a workaround for this?

Fanis Tharropoulos

09/08/2025, 7:21 AM

Like Jason mentioned, we'd need a reproducible example like this. In my testing, adding to your collection's symbols_to_index parameter fixed the issue. There may be other factors in your current setup that affect this.

Rohan Bin Khokon

09/08/2025, 10:58 AM

Reproduced the issue using the following file. Note: Please check the last two curls

typesense-repro-steps.sh

Fanis Tharropoulos

09/09/2025, 10:51 AM

Hey Rohan, thank you for this. I've identified the issue as being related to the truncation logic that kicks in when a word token is longer than 100 characters during indexing and filtering. We'll mention you when we have a fix.
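To illustrate why truncation could produce this symptom, here is a toy model, not Typesense's actual implementation: if both the indexed token and the filter token are cut to the same fixed length before comparison, two different values that share their first 100 characters compare as equal. The 100-character limit comes from the message above; the function names and sample strings are made up.

```python
MAX_TOKEN_LEN = 100  # limit mentioned in the thread; real logic may differ


def truncate_token(token: str, limit: int = MAX_TOKEN_LEN) -> str:
    """Keep only the first `limit` characters of a token."""
    return token[:limit]


def matches_after_truncation(indexed: str, query: str, limit: int = MAX_TOKEN_LEN) -> bool:
    """Toy exact match that compares tokens only after truncating both."""
    return truncate_token(indexed, limit) == truncate_token(query, limit)


# Hypothetical long value (109 characters) and a longer "child" of it.
long_url = "/catalog/" + "segment-" * 12 + "item"
long_child = long_url + "/p-xxx"

# Short values stay below the limit, so truncation changes nothing.
short_url = "/en/products/transducers/force/c10"
short_child = short_url + "/p-xxx"

false_positive = matches_after_truncation(long_child, long_url)
short_match = matches_after_truncation(short_child, short_url)
```

In this model the long pair matches even though the strings differ, while the short pair does not, which is the same pattern as the long and short URLs reported at the start of the thread.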

Rohan Bin Khokon

09/09/2025, 1:39 PM

@Fanis Tharropoulos Thank you so much, this really comes as a relief.

Rohan Bin Khokon

09/12/2025, 9:25 AM

Hello @Fanis Tharropoulos, do you have an estimate of when I can expect the fix to be available? Our client is dying for this 😵‍💫

Fanis Tharropoulos

09/15/2025, 8:50 AM

It's this PR: https://github.com/typesense/typesense/pull/2549. You can set notifications by subscribing to the updates of the PR.
