Harrison Burt
01/12/2022, 9:00 PM
Jason Bosco
01/13/2022, 12:53 AM
> When you use the wrong method, Typesense returns a 404, not a 405 Method Not Allowed, which was a nightmare because it makes you think you've put the wrong URL in.
Interesting, I hadn't considered the HTTP code before, and I've certainly been surprised by some 404s I've seen when I used GET instead of POST or vice versa.
Re: JSONL for bulk imports - the reason we use that format is primarily performance, and then to reduce memory consumption during an import. If the input were an array of, say, 1M documents, we would have to JSON parse the entire array before we could start indexing. Whereas if it's in JSONL format, we can JSON parse line-by-line and do a streaming import. JSON parsing is unfortunately a very resource-heavy operation, so the smaller the JSON string, the better the performance. Also, when we do a streaming import like this, we don't have to store the entire string in memory and then parse it all at once. Instead we can parse line-by-line. This avoids any big memory spikes during indexing. Now of course, we eventually index everything in memory, but having to hold the entire JSON-parsed dataset in memory and then loop through it to index it almost doubles memory requirements, which we don't want.
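To make the line-by-line point concrete, here is a minimal Python sketch of the two parsing styles. It is only an illustration, not Typesense's actual implementation: `iter_jsonl` and `index_stream` are hypothetical helper names, the sample documents are made up, and the "indexing" step is reduced to counting.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed document per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            # Only this one line is parsed and held at a time.
            yield json.loads(line)


def index_stream(stream: TextIO) -> int:
    """Hypothetical stand-in for indexing: counts documents as they arrive."""
    count = 0
    for _doc in iter_jsonl(stream):
        count += 1  # a real indexer would add the document to the index here
    return count


# Made-up sample payloads carrying the same two documents.
jsonl_payload = '{"id": "1", "title": "a"}\n{"id": "2", "title": "b"}\n'
array_payload = '[{"id": "1", "title": "a"}, {"id": "2", "title": "b"}]'

# Streaming: documents are parsed and handed over one line at a time.
streamed = index_stream(io.StringIO(jsonl_payload))

# Array: the whole payload must be parsed before the first document is usable.
whole = len(json.loads(array_payload))
```

With the array form, both the raw string and the fully parsed list exist in memory before indexing can begin; with the JSONL form, only the current line does.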
> Personally I think the docs for the bulk import should be next to the single doc upload, because realistically when you first set the system up, you're probably going to be importing in bulk, no?
Great point. Will address this shortly.
> The bulk imports return 200 OK and just ignore an invalid payload,
The HTTP response should contain {success: true} or {success: false, error: X} for every document that was sent in the import. The reason we respond with a 200 is because there might be some documents which were indexed successfully and others that error out, which is what is indicated in the response body. Returning some other error code when a subset of documents errored out and others succeeded felt off, which is why we just return a 200.
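Since the 200 alone says nothing about individual documents, a client has to inspect the body. The following is a rough sketch of how that check might look, assuming the body contains one JSON result per line in the same order as the submitted documents. `summarize_import` is a hypothetical helper and the sample body and error text are invented for illustration.

```python
import json
from typing import List, Tuple


def summarize_import(body: str) -> Tuple[int, List[Tuple[int, str]]]:
    """Return (number of successes, [(document position, error message), ...])."""
    ok = 0
    failures: List[Tuple[int, str]] = []
    lines = (line for line in body.splitlines() if line.strip())
    for index, line in enumerate(lines):
        result = json.loads(line)
        if result.get("success"):
            ok += 1
        else:
            # Assumption: failed entries carry an "error" field, as in the thread.
            failures.append((index, result.get("error", "unknown error")))
    return ok, failures


# Invented response body: the second of three documents failed.
sample_body = (
    '{"success": true}\n'
    '{"success": false, "error": "Bad JSON."}\n'
    '{"success": true}'
)

ok_count, failed = summarize_import(sample_body)
if failed:
    print(f"{ok_count} indexed, {len(failed)} failed: {failed}")
```

Treating a non-empty `failed` list as an error in the calling code is what guards against the "200 OK but nothing was indexed" surprise.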
The 200 is really to indicate that the server processed the whole import. Whether each record went through successfully or not is indicated in the response body.
Harrison Burt
01/13/2022, 8:55 AM
> The HTTP response should contain {success: true} or {success: false, error: X} for every document that was sent in the import. The reason we respond with a 200 is because there might be some documents which were indexed successfully and others that error out, which is what is indicated in the response body. Returning some other error code when a subset of documents errored out and others succeeded felt off, which is why we just return a 200.
> The 200 is really to indicate that the server processed the whole import. Whether each record went through successfully or not is indicated in the response body.
That does make sense, although I think that could do with being made a little more obvious in the docs 😅 It goes on about JSON import via the API, then some version using cat, then something about CSVs, and it only briefly mentions that behaviour, which I spotted when I read through it again just now 😅