Typesense Query Suggestions Throttling Mechanism Discussion
TLDR Arad asked about potential abuse of Typesense's query suggestions feature. Jason explained that query uniqueness is determined by a fixed 4-second window per `X-TYPESENSE-USER-ID`, independent of `analytics-flush-interval`. The thread ends with a request to create two GitHub issues: one for limiting how often a query's count can be incremented, and one for excluding specific searches from analytics.
Oct 09, 2023 (1 month ago)
Imagine if a user tried to abuse the system by sending requests for the same query a hundred times in quick succession, would that cause Typesense to increment the `count` of a query?
`analytics-flush-interval` and the aggregation key
We consider unique search terms after a 4 second delay after the last keypress
`X-TYPESENSE-USER-ID` will "collapse" into one, so to speak? And is that collapsing limited to the duration of `analytics-flush-interval`? Meaning that if `analytics-flush-interval` is, like, 5 seconds, if the same user (with the same `X-TYPESENSE-USER-ID`) sends the same query twice, once now, and once 10 seconds from now, that will increase the
`X-TYPESENSE-USER-ID` is used to group keypresses in a search-as-you-type experience.
So for example, if the user types in `term` one letter at a time, the search queries will show up to Typesense as:
Typesense will wait for 4s after the last query `term` for a given `X-TYPESENSE-USER-ID` and collapse the previous searches for
`analytics-flush-interval` is actually independent of the `4s` interval. The flush interval is when the collected analytics logs are analyzed and the aggregation I mentioned above is performed across all users, across all search terms.
But, this is actually not based on analytics-flush-interval like I mentioned earlier (my bad - I edited that out), but based on a fixed 4s window.
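The collapsing behavior described above can be modeled with a small simulation. This is only a toy sketch of the behavior as explained in this thread, not Typesense's actual implementation: the function name `collapse_queries`, the event tuple layout, and the rule "a query is final when the same user sends nothing else within the window" are assumptions made for illustration.

```python
from collections import defaultdict

COLLAPSE_WINDOW_S = 4.0  # fixed 4s window mentioned in the thread


def collapse_queries(events, window=COLLAPSE_WINDOW_S):
    """Collapse search-as-you-type events into final queries (toy model).

    events: iterable of (timestamp_seconds, user_id, query) tuples.
    Returns a list of (user_id, query) pairs, one per query that was not
    followed by another query from the same user within `window` seconds.
    """
    by_user = defaultdict(list)
    for ts, user_id, query in events:
        by_user[user_id].append((ts, query))

    finalized = []
    for user_id, items in by_user.items():
        items.sort()  # order each user's queries by timestamp
        for i, (ts, query) in enumerate(items):
            is_last = i == len(items) - 1
            # Keep the query only if the user then stayed quiet for the window.
            if is_last or items[i + 1][0] - ts >= window:
                finalized.append((user_id, query))
    return finalized


# Hypothetical keystroke stream: "term" typed letter by letter, then the
# same user searching "term" again 10 seconds later.
events = [
    (0.0, "user-1", "t"),
    (0.3, "user-1", "te"),
    (0.6, "user-1", "ter"),
    (0.9, "user-1", "term"),
    (10.0, "user-1", "term"),
]
print(collapse_queries(events))
# [('user-1', 'term'), ('user-1', 'term')]
```

In this model the four keystrokes collapse into a single `term`, while the repeat search 10 seconds later counts as a second occurrence because it falls outside the 4s window.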
So there isn't a throttling mechanism specifically for preventing abuse of this kind. The type of thing I was thinking about was more along the lines of having a limit on the number of times the `count` of a query is incremented within N seconds/minutes/hours. So that, for example, even if there are 1,000 requests for the same query within the span of 3 hours, all that just increments
So I'll probably have to use a custom collection for this that my app populates on its own, according to whatever custom heuristics it may have (given that all the requests to Typesense actually go through our backend, it shouldn't be too difficult to implement this).
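One way a backend could apply such a heuristic before writing to its own suggestions collection is to cap how often a query's count may grow within a time window. The sketch below is a minimal in-memory illustration under stated assumptions: the class `ThrottledQueryCounter` and its parameters are hypothetical, nothing here is a Typesense API, and persisting the counts to a custom collection is left out.

```python
import time
from collections import defaultdict


class ThrottledQueryCounter:
    """Count queries, allowing at most `max_increments` per query per window."""

    def __init__(self, max_increments, window_s, clock=time.monotonic):
        self.max_increments = max_increments
        self.window_s = window_s
        self.clock = clock  # injectable so the behavior can be tested
        self.counts = defaultdict(int)
        self._recent = defaultdict(list)  # query -> timestamps of counted hits

    def record(self, query):
        """Record one search; return True if it incremented the count."""
        now = self.clock()
        # Drop counted hits that have aged out of the window.
        recent = [t for t in self._recent[query] if now - t < self.window_s]
        counted = len(recent) < self.max_increments
        if counted:
            recent.append(now)
            self.counts[query] += 1
        self._recent[query] = recent
        return counted


# Hypothetical usage: at most one increment per query every 3 hours.
counter = ThrottledQueryCounter(max_increments=1, window_s=3 * 3600)
for _ in range(1000):
    counter.record("running shoes")
print(counter.counts["running shoes"])  # 1
```

A per-user variant could key `_recent` on a `(user_id, query)` pair instead, so that one client cannot inflate a query's count while distinct users still contribute.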
One last question: Is there a way to tell Typesense (e.g. via a query string parameter when sending a request to the search endpoint) that it should ignore that particular search in terms of analytics and not store its query in the queries collection?
Typesense doesn't have this... But I think that will be a useful feature to support.
> Is there a way to tell Typesense (e.g. via a query string parameter when sending a request to the search endpoint) that it should ignore that particular search in terms of analytics
This is not possible at the moment, but I was thinking about this myself recently.
Could you create two GitHub issues, so we can track these?
Troubleshooting Typesense API Analytics Query Suggestions
Md was confused about implementing Typesense's Analytics Query Suggestions and experienced issues with collections returning no hits. Assistance from Kishore Nallan eventually led to the identification that analytics had to be enabled. They also discussed tracking duplicate and empty queries, resulting in Md creating a GitHub issue.
Typesense Capabilities and Troubleshooting Queries
A had issues with refinement lists and analytics in Typesense. Jason provided a possible solution and recommended the analytics widget. They clarified import size limits and helped identify a filter issue in A's query. Upgrade options are in Typesense's roadmap.
Fetching All Docs from a Collection in Typesense
Julian asked if all docs could be fetched from a Typesense collection, and Kishore Nallan explained there's a 250 result limit due to performance considerations. Andrew suggested using the export function, explaining their operations and performance.
Issue with Search Duration on Typesense Database
Robert reported a query time delay when adding a `filter_by` constraint in a large Typesense database. Kishore Nallan explained that this happens due to the order of operations and promised to look into the issue further. Robert withdrew his interest in sponsoring the improvement because he was moving on from the project.
Integrating Semantic Search with Typesense
Krish wants to integrate semantic search functionality with Typesense but struggles with the limitations. Kishore Nallan provides resources, clarifications and workarounds for the issues raised.