Threading Problem During Multiple Collection Creation and Batch Insertion in Typesense
TLDR Johan reports that after creating multiple collections and batch-inserting documents into Typesense, searches on one collection return documents that belong to other collections. Kishore Nallan helps troubleshoot the issue and suggests a potential race condition, which is fixed in a later build.
Jul 05, 2022 (16 months ago)
Johan
07:28 AM Does anyone know if there’s a potential threading issue with creating multiple collections and batch inserting documents?
Kishore Nallan
07:30 AM
Johan
07:31 AM
{
"facet_counts": [],
"found": 3,
"hits": [
{
"document": {
"_createdAt": 1643040319,
"_publishedAt": 1654612352,
"_updatedAt": 1654503876,
"id": "8e0c59b8-5d01-5cce-b024-8648da3399d3",
"type": "list"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1654079543,
"_publishedAt": 1654612352,
"_updatedAt": 1654503857,
"id": "a0b26cc5-376a-57c2-b0ad-a3bb57ee59cd",
"type": "list"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1643113413,
"_publishedAt": 1654612352,
"_updatedAt": 1654076197,
"id": "fd7cbcf8-3b52-5567-829f-c0349b33c930",
"type": "page"
},
"highlights": [],
"text_match": 100
}
],
"out_of": 3,
"page": 1,
"request_params": {
"collection_name": "list",
"per_page": 10,
"q": "*"
},
"search_cutoff": false,
"search_time_ms": 0
}
Johan
07:31 AM
Kishore Nallan
07:32 AM `page` and `list` collections have the same schema?
Johan
07:35 AM `list`:
{
"created_at": 1657005437,
"default_sorting_field": "_updatedAt",
"fields": [
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_createdAt",
"optional": false,
"sort": true,
"type": "int64"
},
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_updatedAt",
"optional": false,
"sort": true,
"type": "int64"
},
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_publishedAt",
"optional": true,
"sort": true,
"type": "int64"
},
{
"facet": true,
"index": true,
"infix": false,
"locale": "",
"name": "dataset",
"optional": false,
"sort": false,
"type": "string"
}
],
"name": "list",
"num_documents": 3,
"symbols_to_index": [],
"token_separators": []
}
Johan
07:35 AM `page`:
Johan
07:35 AM
{
"created_at": 1657005437,
"default_sorting_field": "_updatedAt",
"fields": [
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "webContent.metaTitle",
"optional": true,
"sort": false,
"type": "string"
},
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_createdAt",
"optional": false,
"sort": true,
"type": "int64"
},
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_updatedAt",
"optional": false,
"sort": true,
"type": "int64"
},
{
"facet": false,
"index": true,
"infix": false,
"locale": "",
"name": "_publishedAt",
"optional": true,
"sort": true,
"type": "int64"
},
{
"facet": true,
"index": true,
"infix": false,
"locale": "",
"name": "dataset",
"optional": false,
"sort": false,
"type": "string"
}
],
"name": "page",
"num_documents": 23,
"symbols_to_index": [],
"token_separators": []
}
Johan
07:36 AM
Kishore Nallan
07:36 AM `list` collection and see what happens when you index.
Kishore Nallan
07:37 AM `list` collection, then an error will be thrown.
Johan
07:37 AM
Johan
07:41 AM RequestMalformed: Request failed with HTTP code 400 | Server said: Field page is not part of collection schema.
Kishore Nallan
07:41 AM
Johan
07:41 AM
Johan
07:42 AM
Kishore Nallan
07:44 AM `Field X is not part of collection schema.` error message is returned as part of schema change :thinking_face:
Johan
07:44 AM
Kishore Nallan
07:45 AM
Johan
07:47 AM
Kishore Nallan
07:48 AM
Kishore Nallan
07:51 AM
a) Either a client-side error where the code erroneously sent the wrong document type to the collection.
b) Some race condition inside Typesense that sent the document to the wrong collection.
In both cases, if a schema mismatch happens, an error should be thrown. So I'm surprised to see it getting indexed fine now.
Johan
07:51 AM
Kishore Nallan
07:52 AM
Johan
07:52 AM
Johan
08:39 AM
Johan
08:40 AM
Johan
08:40 AM `yarn run-test` multiple times it will start to mix collections in response:
Johan
08:41 AM
curl -H "X-TYPESENSE-API-KEY: xyz" "http://localhost:8108/collections/product/documents/search?q=*&query_by=dataset" | jq
Johan
08:41 AM
{
"facet_counts": [],
"found": 23,
"hits": [
{
"document": {
"_createdAt": 1622619587,
"dataset": "global",
"id": "global:8fd389b3-d63a-5a63-b616-7b3320293100",
"type": "productEntryCategory"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1654261822,
"dataset": "global",
"id": "global:6504ace7-7273-5590-8ee8-b263a302d365",
"type": "product"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1654159883,
"dataset": "global",
"id": "global:403d678a-0b39-5caa-8aa2-e583b3737cdb",
"type": "product"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1622619587,
"dataset": "global",
"id": "global:3b385fef-0611-5178-b300-006889e071bc",
"type": "productEntryCategory"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1622619587,
"dataset": "global",
"id": "global:45fdca58-958c-578a-aafa-a09e110b0af4",
"type": "productEntryCategory"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1654088646,
"dataset": "global",
"id": "global:2fc86262-f5b8-5fa7-8010-240f95dae313",
"type": "product"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1622619587,
"dataset": "global",
"id": "global:c7471607-589b-5d20-90e6-92011d1eb194",
"type": "productEntryCategory"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1628508407,
"dataset": "dataset1",
"id": "dataset1:f82356a0-ca90-5efa-9f4c-6b58d9e35a3f",
"type": "author"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1622620250,
"dataset": "global",
"id": "global:72fa23bf-a5d1-5028-ba2e-801ce8841219",
"type": "productEntryCategory"
},
"highlights": [],
"text_match": 100
},
{
"document": {
"_createdAt": 1653301312,
"dataset": "global",
"id": "global:c44292ef-c1ae-58a9-8df7-af01956f6149",
"type": "product"
},
"highlights": [],
"text_match": 100
}
],
"out_of": 23,
"page": 1,
"request_params": {
"collection_name": "product",
"per_page": 10,
"q": "*"
},
"search_cutoff": false,
"search_time_ms": 0
}
Johan
08:44 AM `collectionNames.forEach(async (name) => {})` instead of `for (const name of collectionNames)`. The `forEach` statement will spawn multiple promises and not wait for the old ones to finish, but the `for` loop will work with async.
Kishore Nallan
09:21 AM
Kishore Nallan
10:06 AM `client` object being shared across all the async functions. Can you try instantiating the client object inside the async function?
Jul 19, 2022 (15 months ago)
Kishore Nallan
03:12 AM
Kishore Nallan
11:18 AM `0.24.0.rc20` build.
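Johan's 08:44 AM observation about `forEach` versus `for...of` can be illustrated with a small sketch. Nothing here calls Typesense: `sleep`, `indexCollection`, `indexConcurrently`, `indexSequentially`, and `demo` are hypothetical names, and `indexCollection` is only a stand-in for "create a collection, then batch-import its documents" that records when each collection's work starts and ends. It assumes, as Johan describes, that the `forEach` variant starts the work for every collection without waiting, while the `for...of` variant finishes one collection before starting the next.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for "create collection + batch insert".
// It does not talk to Typesense; it only logs start and end events.
async function indexCollection(name, log) {
  log.push(`start ${name}`);
  await sleep(10); // simulated network round trip
  log.push(`end ${name}`);
}

// forEach does not wait for promises created inside its callback,
// so every collection is started before any of them has finished.
function indexConcurrently(collectionNames, log) {
  const pending = [];
  collectionNames.forEach((name) => {
    pending.push(indexCollection(name, log)); // started, not awaited
  });
  // Returned only so a caller can tell when all of the work is done.
  return Promise.all(pending);
}

// for...of with await finishes one collection before starting the next.
async function indexSequentially(collectionNames, log) {
  for (const name of collectionNames) {
    await indexCollection(name, log);
  }
}

// Runs both variants and returns the recorded event order for each.
async function demo() {
  const concurrentLog = [];
  await indexConcurrently(["list", "page"], concurrentLog);

  const sequentialLog = [];
  await indexSequentially(["list", "page"], sequentialLog);

  return { concurrentLog, sequentialLog };
}
```

Calling `demo().then(console.log)` shows both collections starting back to back in the `forEach` variant, while the `for...of` variant logs the end of `list` before the start of `page`. Serializing requests this way only changes how the client issues them; the thread attributes the mixed results to a race condition that was addressed in a later build.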