#community-help

Querying with Not-in in Typesense

TLDR Masahiro inquired about using not-in queries in Typesense. Kishore Nallan explained how to conduct such queries by using the "-" operator in the query string, and assisted Masahiro with issues stemming from a high number of exclusion tokens. The problem was eventually resolved by switching to the multi_search endpoint.

Apr 22, 2021 (34 months ago)
Masahiro
06:31 AM
Is it possible to use a not-in query in Typesense?
In Algolia, it is possible with:
index.search('query', {
  facetFilters: 'Name:-John'
})
Kishore Nallan
06:34 AM
The "-" operator is only supported for query strings at the moment, and not inside filters.
Masahiro
06:36 AM
query_by ?
Kishore Nallan
06:36 AM
Correct
Masahiro
06:41 AM
let documents = [
  {
    'id': '124',
    'company_name': 'Stark Industries',
    'num_employees': 5215,
    'country': 'USA'
  },
  {
    'id': '125',
    'company_name': 'Apple',
    'num_employees': 20000,
    'country': 'USA'
  },
]

Let’s say I have these documents.
Suppose I want to retrieve data by filtering on
company_name !== 'Stark Industries'
How can I achieve this?
Kishore Nallan
06:42 AM
?q=-stark -industries will exclude both those tokens from search results.

Masahiro
06:45 AM
query_by?
Kishore Nallan
06:46 AM
Set query_by to the field you wish to query, which in this example is company_name.
Masahiro
06:49 AM
So, would the query look like this?
let searchParameters = {
  'q': 'company_name',
  'query_by': '?q=-stark -industries',
}
const result = await client.collections('users').documents().search(searchParameters)

Kishore Nallan
06:50 AM
No, like this:

let searchParameters = {
  'q': '-stark -industries',
  'query_by': 'company_name'
}
const result = await client.collections('users').documents().search(searchParameters)
Masahiro
06:52 AM
Working, thanks!
How many parameters does Typesense allow for q? The example above has 2 parameters.
Masahiro
06:53 AM
maximum number of parameters.
Kishore Nallan
06:53 AM
q is a search query, which is really a list of keywords / tokens you are searching for. When a keyword is prefixed with a -, it just tells Typesense to treat it explicitly as an exclusion. You can have as many tokens as you want, but as you add more tokens, things can become slower.

Kishore Nallan
06:54 AM
Look into adjusting the drop_tokens_threshold and typo_tokens_threshold for performance.
Kishore Nallan
06:54 AM
Details are in docs.
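
As a rough illustration of the two parameters mentioned above, the sketch below adds them to the search parameters from earlier in the thread. The helper name withStrictMatching and the value 0 are assumptions made for this example, not values recommended in the thread; check the documentation for your Typesense version before relying on them.

```javascript
// Sketch: tighten matching for a query made up of many exact tokens.
// The helper name and the threshold values are illustrative assumptions.
function withStrictMatching(searchParameters) {
  return {
    ...searchParameters,
    // Per the docs, tokens are dropped only when fewer results than this
    // threshold are found, so 0 should switch that behavior off.
    drop_tokens_threshold: 0,
    // Likewise, extra typo-tolerant passes run only below this threshold.
    typo_tokens_threshold: 0,
  };
}

const strictParameters = withStrictMatching({
  q: '-stark -industries',
  query_by: 'company_name',
});
```

The original object is left untouched, so the same base parameters can be reused with and without the stricter settings when comparing query times.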
Masahiro
06:54 AM
OK! Thank you so much!!!
Kishore Nallan
06:54 AM
👍
Masahiro
07:19 AM
One more question: does the q parameter support string[]?
Kishore Nallan
07:19 AM
Yes, you can query an array of strings.

Masahiro
08:16 AM
const userIdList = [''] // This contains 850 user ids
let searchParameters = {
  'q': `${userIdList}`,
  'query_by': 'userId',
}
try {
  const result = await client.collections('users').documents().search(searchParameters)
  console.log(result['found']);
} catch (e) {
  console.log(e);
}

When I run this code against a Typesense cluster,
these errors are emitted:
Request #1619079225148: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1619079225148: Sleeping for 0.1s and then retrying request...
Request #1619079225148: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1619079225148: Sleeping for 0.1s and then retrying request...
{ Error: socket hang up
    at createHangUpError (_http_client.js:332:15)
    at TLSSocket.socketOnEnd (_http_client.js:435:23)
    at TLSSocket.emit (events.js:203:15)
    at TLSSocket.EventEmitter.emit (domain.js:448:20)
    at endReadableNT (_stream_readable.js:1145:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)
  code: 'ECONNRESET',
Masahiro
08:17 AM
When the ID list had fewer than 850 entries, everything worked fine.
(The number of documents is around 2500.)
I extended 'connectionTimeoutSeconds' to 100, but it did not help.
Masahiro
08:21 AM
Additional info:
When uploading documents, the process suddenly shut down with the same message as above at around 2500 documents (I tried to upload 10,000 docs).
Request #1619079178557: Request to Node 0 failed due to "ECONNABORTED timeout of 2000ms exceeded"
Request #1619079178557: Sleeping for 0.1s and then retrying request...
Request #1619079178557: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1619079178557: Sleeping for 0.1s and then retrying request...
Kishore Nallan
08:23 AM
The query should be a string, but it is sent as a list in your example.
Kishore Nallan
08:25 AM
What I meant is you can query a field whose type is a string[]. The q parameter must be a string. The query_by parameter must also be a comma separated string of field names.
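
To make the distinction concrete, here is a minimal sketch. The tags field, the sample tokens, and the schema are hypothetical and only serve to show a string[] field being searched with plain string parameters.

```javascript
// Hypothetical schema: `tags` is a string[] field (the field name is made up).
const schema = {
  name: 'users',
  fields: [
    { name: 'userId', type: 'string' },
    { name: 'tags', type: 'string[]' },
  ],
};

// Both `q` and `query_by` are plain strings, even when the field being
// searched holds an array of strings.
const fieldsToSearch = ['userId', 'tags'];
const searchParameters = {
  q: 'admin -guest',
  query_by: fieldsToSearch.join(','), // comma-separated field names
};
```

If the field names live in a JavaScript array, joining them with a comma gives the string form that query_by expects.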
Kishore Nallan
08:26 AM
You can try these things locally on your box using a Docker container.
Masahiro
08:28 AM
OK, so when I want to filter values using an array of strings, how can I achieve this?
Masahiro
08:28 AM
Any workarounds for 'q':`${userIdList}`?
Kishore Nallan
08:31 AM
I presume you want to use the exclusion operator also?
Masahiro
08:31 AM
Yes
Kishore Nallan
08:32 AM
Send a space separated list of IDs, like user1 user2 -user3 (here user3 is getting excluded)
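
One way to build such a string from arrays of IDs is sketched below. The function name buildQuery is made up for this example, and the sketch assumes the IDs contain no whitespace and do not themselves begin with a "-".

```javascript
// Build a `q` string from IDs to match and IDs to exclude.
// Assumes IDs contain no whitespace and do not start with "-".
function buildQuery(includeIds, excludeIds) {
  // Prefix each excluded ID with "-" so it is treated as an exclusion token.
  const excluded = excludeIds.map((id) => `-${id}`);
  // Tokens in `q` are separated by single spaces.
  return [...includeIds, ...excluded].join(' ');
}

const q = buildQuery(['user1', 'user2'], ['user3']);
console.log(q); // user1 user2 -user3
```

Joining with a space avoids relying on the default comma-separated output of Array.prototype.toString().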
Masahiro
08:33 AM
OK, I will try
Masahiro
08:47 AM
const ids = list.toString().replace(/,/g, '');

let searchParameters = {
  'q': `${ids}`,
  'query_by': 'userId',
}
try {
  const result = await client.collections('users').documents().search(searchParameters)
  console.log(result['found']);
} catch (e) {
  console.log(e);
}

I changed my code.
After the change, up to 870 user IDs could be excluded.
Masahiro
08:48 AM
However, from 880 user IDs onward, an error was shown:
Request #1619081145364: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1619081145364: Sleeping for 0.1s and then retrying request...
Request #1619081145364: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1619081145364: Sleeping for 0.1s and then retrying request...
{ Error: socket hang up
    at createHangUpError (_http_client.js:332:15)
    at TLSSocket.socketOnEnd (_http_client.js:435:23)
    at TLSSocket.emit (events.js:203:15)
    at TLSSocket.EventEmitter.emit (domain.js:448:20)
    at endReadableNT (_stream_readable.js:1145:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)
  code: 'ECONNRESET',
Masahiro
08:49 AM
output of ids
 -00DNZbtTCITocafJcAQQiwpNoHl1 -00WKEWPCPMhYNpUMWFxFOIDPyCr2 -00ZF115hfFRoUc7jKIfULuzQNWx1 -00bOGtDLzJY7X5KoMw3kTOuz7SG3 -00uuZMTjXiMTdYokCaFJakxqblI3
Kishore Nallan
08:51 AM
Can you try setting num_typos to 0? The q field is not technically meant to be used this way with so many tokens, so what I have suggested might not work for large values. In any case, I think you should run Typesense locally on your machine and check the logs to see if some error shows up when you query this way. Or, the query might just be taking a really long time and timing out because of the number of tokens used.
Masahiro
08:53 AM
not working…
Masahiro
08:53 AM
hmm…
Masahiro
08:54 AM
let searchParameters = {
  'q': `${ids}`,
  'query_by': 'userId',
  'num_typos': '0',
}
Kishore Nallan
08:58 AM
So is the query containing only exclusion tokens?
Kishore Nallan
08:58 AM
Or does it also contain some userIds NOT prefixed by - ?
Masahiro
08:59 AM
Yes, only exclusion tokens.
Kishore Nallan
09:00 AM
How many records are there in the collection being queried?
Masahiro
09:00 AM
around 2500
Masahiro
09:01 AM
I am considering using Typesense for my company’s production app.
So, if upgrading the cluster will solve the problem, I will pay for that.
Kishore Nallan
09:01 AM
Hard to say exactly what's going wrong. I'm not sure if the dataset can be shared, but if it can, email me the dataset (even another sample dataset that exhibits the same problem is fine) along with the exact query used, and perhaps I can find out what's going wrong.
Masahiro
09:02 AM
Or can I use DM on Slack?
Kishore Nallan
09:02 AM
Yeah sure
Kishore Nallan
09:44 AM
For others following this thread, this turned out to be an issue because of the restriction on the total length of the query params in a GET request. Switching to the multi_search endpoint (which uses POST) worked: https://typesense.org/docs/0.19.0/api/documents.html#federated-multi-search
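
A minimal sketch of that switch with typesense-js might look like the following. The collection and field names come from the thread; the helper names buildMultiSearchRequest and countUsersExcluding are made up for this example, and client is assumed to be an already configured typesense-js Client.

```javascript
// Build the body for the multi_search endpoint. Because the body travels in
// a POST request, a long `q` no longer has to fit in the URL query string.
function buildMultiSearchRequest(excludedIds) {
  return {
    searches: [
      {
        collection: 'users',
        // Prefix every ID with "-" and separate the tokens with spaces.
        q: excludedIds.map((id) => `-${id}`).join(' '),
        query_by: 'userId',
        num_typos: 0,
      },
    ],
  };
}

// `client` is assumed to be a typesense-js Client instance.
async function countUsersExcluding(client, excludedIds) {
  const response = await client.multiSearch.perform(
    buildMultiSearchRequest(excludedIds)
  );
  // multi_search returns one result per entry in `searches`.
  return response.results[0].found;
}
```

Keeping the request builder separate from the network call makes it easy to inspect the generated q string before sending it.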
