I'm having trouble with query_by_weights | Typesense #community-help

Scott Nei

05/06/2025, 7:29 PM

I'm having trouble with query_by_weights. I have 3 code fields (barcode, SKU, manufacturer_number) and I want to search all 3 equally. But even if I use query_by_weights with equal values, I only get results for exact matches on the earlier fields.

Product 1:
• SKU: 12345
• MFR#: [some string]
• Barcode: [some string]

Product 2:
• SKU: [some string]
• MFR#: 12345
• Barcode: [some string]

In that example, if I search for "12345" with "query_by=sku, mfr#, barcode" and "query_by_weights=100,100,100", I still get back just 1 hit, for Product 1. If I make the weights "100,120,100", then I get 1 hit, for Product 2. Based on the documentation, I expected that equal weights would return both items, since they're both an exact match. Is there some other nuance I need to account for, or some bug?

Alan Martini

05/06/2025, 8:50 PM

Hi @Scott Nei! What typesense version are you using?

Willie

05/06/2025, 8:54 PM

I work with Scott. Here is your answer

✅ 1

Alan Martini

05/06/2025, 9:40 PM

@Willie and @Scott Nei, I'm trying to replicate the issue you are having. The following code worked as expected in the version given. Could you please review it and confirm whether it matches your exact situation? Also, can you give me your cluster name? I will check the schema configuration to see if some other field or setting could be interfering. Lastly, if you'd like, you can give me a cURL that reproduces the issue so we can look at it more closely.

Alan Martini

05/06/2025, 9:40 PM

Sem título.sh

Scott Nei

05/07/2025, 7:22 PM

@Alan Martini Here are the cURLs, sanitized for security:
• cURL 1 returns 4 items that matched on a value from the SKU array.
• cURL 2 returns 9 items that matched on a value from the mfrNumber array.
• They both include query_by_weights set to equal values, and only the order of the query_by fields is different.
• I can make cURL 1 return the same 9 documents if I change the weights to 100,100,120,100,100 and emphasize the mfrNumber over the sku.

cURL 1:

curl --location 'https://yuke...-1.a1.typesense.net/collections/products_prod/documents/search?q=42750&query_by=barcodes%2Cskus%2Cmanufacturer.mfrNumbers%2Cdescription%2CsupplierDescriptions&query_by_weights=100%2C100%2C100%2C100%2C100' \
--header 'accept: application/json, text/plain, */*' \
--header 'accept-language: en-US,en;q=0.9' \
--header 'origin: https://app.abc.com' \
--header 'priority: u=1, i' \
--header 'referer: https://app.abc.com/productSearch=42750' \
--header 'sec-ch-ua: "Google Chrome";v="135", "Not-A.Brand";v="8", "Chromium";v="135"' \
--header 'sec-ch-ua-mobile: ?0' \
--header 'sec-ch-ua-platform: "Windows"' \
--header 'sec-fetch-dest: empty' \
--header 'sec-fetch-mode: cors' \
--header 'sec-fetch-site: cross-site' \
--header 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36' \
--header 'x-typesense-api-key: abc'

cURL 2:

curl --location 'https://yuke...-1.a1.typesense.net/collections/products_prod/documents/search?q=42750&query_by=barcodes%2Cmanufacturer.mfrNumbers%2Cskus%2Cdescription%2CsupplierDescriptions&query_by_weights=100%2C100%2C100%2C100%2C100' \
--header 'accept: application/json, text/plain, */*' \
--header 'accept-language: en-US,en;q=0.9' \
--header 'origin: https://app.abc.com' \
--header 'priority: u=1, i' \
--header 'referer: https://app.abc.com/productSearch=42750' \
--header 'sec-ch-ua: "Google Chrome";v="135", "Not-A.Brand";v="8", "Chromium";v="135"' \
--header 'sec-ch-ua-mobile: ?0' \
--header 'sec-ch-ua-platform: "Windows"' \
--header 'sec-fetch-dest: empty' \
--header 'sec-fetch-mode: cors' \
--header 'sec-fetch-site: cross-site' \
--header 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36' \
--header 'x-typesense-api-key: abc'

Alan Martini

05/07/2025, 7:24 PM

Thank you @Scott Nei

Scott Nei

05/08/2025, 2:02 PM

@Alan Martini Could you advise on what you've found today? We are in a Beta launch phase for our production users now, and we plan to fully roll out this new search engine next week. But I don't think we can release with this inconsistency. If there is a discrete bug, we could temporarily get around it by creating a new concatenated field of "codes" so they all get equal weight. Or if there is a parameter we need to adjust, or something we need to do differently to get the behavior we want, we'll adjust that. But as it is, I don't know if there is some broader issue that might stop us from releasing next week.

Alan Martini

05/08/2025, 4:28 PM

Hi @Scott Nei, It looks like the problem isn’t in query_by_weights itself, but rather in query_by. Changing the order of the items in it seems to change the results. The team is actively investigating the root cause. The concatenated approach you mentioned is a good workaround for now!
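A minimal sketch of the concatenated-field workaround discussed here, assuming the documents carry the array fields named in the cURLs above (barcodes, skus, and mfrNumbers nested under manufacturer). The helper name add_codes_field and the sample product are hypothetical; "codes" is the field name Scott suggested.

```python
from typing import Any


def add_codes_field(product: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `product` with a combined `codes` array.

    Hypothetical helper: merges every code-like field into one array so a
    search can target a single field instead of three.
    """
    manufacturer = product.get("manufacturer") or {}
    sources = [
        product.get("barcodes") or [],
        product.get("skus") or [],
        manufacturer.get("mfrNumbers") or [],
    ]
    codes: list[str] = []
    for values in sources:
        for value in values:
            if value not in codes:  # de-duplicate, keeping first-seen order
                codes.append(value)
    return {**product, "codes": codes}


# Made-up sample document, shaped like the fields used in the cURLs.
product = {
    "id": "2",
    "skus": ["A-100"],
    "barcodes": ["0001112223334"],
    "manufacturer": {"mfrNumbers": ["12345", "A-100"]},
}
indexed = add_codes_field(product)
print(indexed["codes"])  # ['0001112223334', 'A-100', '12345']
```

With a field like this in the schema, the search could use query_by=codes for the code lookup, so the order of the individual code fields no longer matters.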

Scott Nei

05/08/2025, 4:57 PM

@Alan Martini My only concern with the concatenated workaround right now is not knowing the root cause. It would resolve this immediate issue, but if the root cause has other side effects we just haven't stumbled on yet, I'm hesitant to add a quick fix for this scenario and launch with the risk of other scenarios revealing themselves. Could you let me know as soon as you have some progress on the root cause, or an expected time frame for it?

Scott Nei

05/09/2025, 11:45 AM

Hi @Alan Martini , just checking in again. Is there progress on root cause, to confirm if a simple concatenation fully works around the issue?

Alan Martini

05/09/2025, 5:32 PM

Hey @Scott Nei, We’ve been trying to reproduce the issue on our test dataset for several hours but haven’t had any luck so far. It seems like it might be something specific to your data, as we haven’t seen similar reports from other users. To get to the bottom of this, we’re now cloning your cluster into a debug environment on our side with enhanced tracing enabled. This will help us narrow down the root cause. Timing-wise, we probably won't be able to get to an actual fix until later next week.

👍 1

Alan Martini

05/09/2025, 9:15 PM

Hi @Scott Nei, We’ve identified a better fix for the issue you’re running into. Setting the max_candidates parameter to […] should yield the expected results. Here’s more info on the max_candidates parameter. It looks like the order of the query_by values is affecting the default value (4) of max_candidates in some way, which we're looking into separately.
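As a rough sketch, the search request from cURL 1 with max_candidates added could be built like this. The value 100 is only an illustrative placeholder, since the value recommended in this message is not preserved in the thread; the host is left out on purpose, and the collection name and fields are copied from the cURLs above.

```python
from urllib.parse import urlencode

# Field order taken from cURL 1 above.
query_by = [
    "barcodes",
    "skus",
    "manufacturer.mfrNumbers",
    "description",
    "supplierDescriptions",
]

params = {
    "q": "42750",
    "query_by": ",".join(query_by),
    "query_by_weights": ",".join(["100"] * len(query_by)),
    # Assumption: 100 is a placeholder, not the value suggested in the thread.
    "max_candidates": 100,
}

# Path only; prepend your own cluster host and send the API key header.
search_path = "/collections/products_prod/documents/search"
search_url = f"{search_path}?{urlencode(params)}"
print(search_url)
```

The resulting path and query string can be appended to the cluster host in the same cURL command, with the x-typesense-api-key header unchanged.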

Scott Nei

05/10/2025, 12:42 PM

@Alan Martini did that technique work for you? I tried it last night in our app by including it in our preset, and there was no difference. I’ll try again with the simplified cURL I shared above.

Alan Martini

05/10/2025, 4:08 PM

Hey @Scott Nei, It did! I will share the test we ran with you, one moment.

Alan Martini

05/10/2025, 4:14 PM

Output:

--------------------------------
Querying '42750' skus with max_candidates
Amount of hits: 77
--------------------------------
Querying '42750' description with max_candidates
Amount of hits: 15
--------------------------------
Querying '42750' skus,description with max_candidates
Amount of hits: 92
--------------------------------
Querying '42750' description,skus with max_candidates
Amount of hits: 92

If you can share the code snippet your app is using, I can help debug it with you.

Sem título

Scott Nei

05/12/2025, 1:23 PM

I tried again this morning and it is working with max_candidates. Maybe I misspelled something, or hit a cache somewhere. But I think we're unblocked for now. Is there a bug issue for this? Should query_by + query_by_weights normally not need this max_candidates parameter to do what we want?

Alan Martini

05/12/2025, 9:45 PM

Hi @Scott Nei, After a deeper look, it turns out that what you observed is actually the expected behavior in Typesense. By default, max_candidates is set to 4, and this value applies across all fields listed in query_by. That means once 4 candidates are found in total, the search stops, and priority is given based on the order of fields in query_by. So when you set the weights to 100,120,100, you were explicitly boosting the second field, which caused Typesense to prioritize it, leading to the results you saw. Given the structure of your dataset, adjusting max_candidates is the right move. The way it's currently working aligns with how the system is designed.
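To make the explanation above concrete, here is a toy model of a candidate cap shared across fields. It is not Typesense's actual implementation, only a simplified reading of this message: fields are walked in query_by order, prefix matches count as candidates, and collection stops once the cap is hit. The function name, field names, and token lists are all made up for illustration.

```python
def collect_candidates(query, field_tokens, query_by, max_candidates):
    """Walk fields in query_by order, collecting prefix matches until the cap."""
    candidates = []  # list of (field, token) pairs
    for field in query_by:
        for token in sorted(field_tokens[field]):
            if token.startswith(query):
                candidates.append((field, token))
                if len(candidates) == max_candidates:
                    return candidates  # cap reached, later fields are never visited
    return candidates


# Made-up indexed tokens per field.
field_tokens = {
    "skus": ["42750", "427501", "427502", "427503"],
    "mfrNumbers": ["42750", "42750-A"],
}

skus_first = collect_candidates("42750", field_tokens, ["skus", "mfrNumbers"], 4)
mfr_first = collect_candidates("42750", field_tokens, ["mfrNumbers", "skus"], 4)
uncapped = collect_candidates("42750", field_tokens, ["skus", "mfrNumbers"], 100)

print(sorted({field for field, _ in skus_first}))  # ['skus']
print(sorted({field for field, _ in mfr_first}))   # ['mfrNumbers', 'skus']
print(len(uncapped))                               # 6
```

In this model, with the cap at 4 the first field can use up every slot, so the result depends on field order; with a larger cap, both fields contribute regardless of order.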

👍 1

thankyou 1
