Issues with Importing Typesense Collection to Different Server
TLDR Kevin had problems migrating a Typesense collection between Docusaurus sites on different machines. Jason advised them on JSONL format, handling server hosting, and creating a collection schema before importing documents, leading to successful import.



Aug 15, 2023 (1 month ago)
Kevin
12:44 PM
We have successfully integrated Typesense with Docusaurus on localhost. Both Docusaurus and the Typesense server run on the same machine, and the typesense/docsearch-scraper Docker job was run on that machine to scrape the localhost Docusaurus site. We would like to move the collection created by typesense/docsearch-scraper to a Typesense server running in the test environment, but we are having problems. Details in thread.
Kevin
01:51 PM
1.) Exported the localhost Typesense collection.
2.) Changed the Docusaurus URLs in the collection JSON file to those of the Docusaurus site in the test environment.
3.) Created a collection on the Typesense server in the test environment.
4.) Converted the collection JSON file to a JSONL file.
5.) Attempted to import the JSONL file into the newly created collection on the Typesense server in the test environment.
Unfortunately, nothing was imported. Here is a sample of the error messages displayed:
{"code":400,"document":"\"symbology\"","error":"Bad JSON: not a properly formed document.","success":false}
{"code":400,"document":"\"etc\"","error":"Bad JSON: not a properly formed document.","success":false}
{"code":400,"document":"\"etc\"","error":"Bad JSON: not a properly formed document.","success":false}
{"code":400,"document":"\"docs-default-current\"","error":"Bad JSON: not a properly formed document.","success":false}
Would anyone know if this is the correct approach? Typesense is a great tool, but it does not appear to be possible to import a collection by itself, at least via curl.
Or maybe this is a problem with scraped Docusaurus sites? Maybe the exported JSON collection file needs to be modified in some way prior to conversion to JSONL?
Typesense meets our security needs, but we do need to test it thoroughly first.
Thank you all!
NOTE: We would have scraped the test Docusaurus site directly if possible, but it is behind a login and password, and Cloudflare Zero Trust (CF), Google Identity-Aware Proxy (IAP), and Keycloak (KC) are not used.
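Step 2 above (pointing the exported documents at the test site) can be done line by line on a JSONL file, so the file never has to be reformatted by hand. A minimal sketch, assuming one JSON document per line; the function names and both base URLs are hypothetical, and the code does not assume any particular field names:

```python
import json


def rewrite_urls(value, old_base, new_base):
    """Recursively replace old_base with new_base at the start of any string value."""
    if isinstance(value, str):
        if value.startswith(old_base):
            return new_base + value[len(old_base):]
        return value
    if isinstance(value, list):
        return [rewrite_urls(item, old_base, new_base) for item in value]
    if isinstance(value, dict):
        return {key: rewrite_urls(item, old_base, new_base) for key, item in value.items()}
    # Numbers, booleans and null are returned unchanged.
    return value


def rewrite_jsonl(lines, old_base, new_base):
    """Yield one compact JSON document per non-empty input line, with URLs rewritten."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        yield json.dumps(rewrite_urls(doc, old_base, new_base), ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical base URLs, for illustration only.
    sample = ['{"url": "http://localhost:3000/docs/intro#top", "content": "6.5"}']
    for out_line in rewrite_jsonl(sample, "http://localhost:3000", "https://docs.test.example.com"):
        print(out_line)
```

Because each output line is produced by `json.dumps`, every document stays on a single line, which is what the import expects.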
Jason
02:59 PM
Could you share the first two lines from the JSONL file you’re trying to import into the new collection?
head -2 your-exported-documents.jsonl
Aug 16, 2023 (1 month ago)
Kevin
07:06 AM
"6.5"
"6.5"
"default"
{"lvl0":null,"lvl1":null,"lvl2":null,"lvl3":null,"lvl4":null,"lvl5":null,"lvl6":null}
[{"lvl0":null,"lvl1":null,"lvl2":null,"lvl3":null,"lvl4":null,"lvl5":null,"lvl6":null}]
{"lvl0":null,"lvl1":null,"lvl2":null,"lvl3":null,"lvl4":null,"lvl5":null}
{"lvl0":null,"lvl1":null,"lvl2":null,"lvl3":null,"lvl4":null,"lvl5":null}
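Lines such as `"6.5"` and `"default"` in the output above are valid JSON, but they are bare strings rather than document objects, which would be consistent with the `Bad JSON: not a properly formed document` errors. A small pre-flight check can flag such lines before an import is attempted. This is only a sketch; the function name is hypothetical:

```python
import json


def find_bad_lines(lines):
    """Return (line_number, reason) for every line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines are skipped, not reported
        try:
            value = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(value, dict):
            # Valid JSON, but a string, list, number, etc. rather than a document.
            problems.append((number, f"not an object: {type(value).__name__}"))
    return problems


if __name__ == "__main__":
    sample = ['"6.5"', '{"lvl0": null}', '[{"lvl0": null}]']
    for number, reason in find_bad_lines(sample):
        print(f"line {number}: {reason}")
```

To check a real file, pass an open file handle, for example `find_bad_lines(open("your-exported-documents.jsonl", encoding="utf-8"))`.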
Kevin
02:06 PM
Now when I attempt to create a collection through an import, the screen returns the following:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 11.4M 0 24 100 11.4M 1 817k 0:00:14 0:00:14 --:--:-- 253k
{"message": "Not Found"}
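A `Not Found` response on import is consistent with the target collection not existing yet: the import endpoint adds documents to an existing collection, so the schema has to be created first. The sketch below only builds the two HTTP requests in that order and sends nothing. The host, API key, collection name and the catch-all schema are placeholders; in practice the schema would more likely be copied from the source server:

```python
import json
import urllib.parse
import urllib.request


def build_create_collection_request(host, api_key, schema):
    """Build (but do not send) the request that creates a collection from a schema."""
    return urllib.request.Request(
        url=f"{host}/collections",
        data=json.dumps(schema).encode("utf-8"),
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_import_request(host, api_key, collection, jsonl_text):
    """Build (but do not send) the request that imports JSONL into an existing collection."""
    name = urllib.parse.quote(collection, safe="")
    return urllib.request.Request(
        url=f"{host}/collections/{name}/documents/import?action=create",
        data=jsonl_text.encode("utf-8"),
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "text/plain"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values: replace with the real host, key and schema.
    host, api_key = "http://localhost:8108", "xyz"
    schema = {"name": "docs", "fields": [{"name": ".*", "type": "auto"}]}
    create = build_create_collection_request(host, api_key, schema)
    load = build_import_request(host, api_key, "docs", '{"id": "124"}\n')
    print(create.get_method(), create.full_url)
    print(load.get_method(), load.full_url)
```

Each request can then be passed to `urllib.request.urlopen`, sending the create request before the import request.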
Kevin
02:16 PM
11.4M in the screen output corresponds to the size of the JSONL file.
Aug 17, 2023 (1 month ago)
Kevin
02:20 PM
{"code":400,"document":" \"current\"","error":"Bad JSON: not a properly formed document.","success":false}
{"code":400,"document":" ],","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 3: syntax error while parsing value - unexpected ']'; expected '[', '{', or a literal","success":false}
{"code":400,"document":" \"weight\": {","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 11: syntax error while parsing value - unexpected ':'; expected end of input","success":false}
{"code":400,"document":" \"level\": 0,","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 12: syntax error while parsing value - unexpected ':'; expected end of input","success":false}
{"code":400,"document":" \"page_rank\": 0,","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 16: syntax error while parsing value - unexpected ':'; expected end of input","success":false}
{"code":400,"document":" \"position\": 57,","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 15: syntax error while parsing value - unexpected ':'; expected end of input","success":false}
{"code":400,"document":" \"position_descending\": 1","error":"Bad JSON: [json.exception.parse_error.101] parse error at line 1, column 26: syntax error while parsing value - unexpected ':'; expected end of input","success":
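The import response carries one JSON result per submitted line, as in the errors above, so a large import produces a long response. Grouping the failures by error message makes the pattern easier to see. A sketch, assuming each response line is a JSON object with a `success` key and, on failure, an `error` key as shown in this thread (the function name is hypothetical):

```python
import json
from collections import Counter


def summarize_import_response(response_text):
    """Return (number_of_successes, Counter of error messages) for an import response."""
    ok = 0
    errors = Counter()
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            result = json.loads(line)
        except json.JSONDecodeError:
            # For example, a response line that was cut off when copied.
            errors["<unparseable response line>"] += 1
            continue
        if not isinstance(result, dict):
            errors["<unexpected response line>"] += 1
        elif result.get("success"):
            ok += 1
        else:
            errors[result.get("error", "<no error message>")] += 1
    return ok, errors


if __name__ == "__main__":
    sample = (
        '{"success":true}\n'
        '{"code":400,"document":"\\"etc\\"","error":"Bad JSON: not a properly formed document.","success":false}\n'
    )
    ok, errors = summarize_import_response(sample)
    print(ok, dict(errors))
```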
Aug 18, 2023 (1 month ago)
Kevin
07:35 AM
{
"content": "6.5",
"content_camel": "6.5",
"docusaurus_tag": "default",
"hierarchy": {
"lvl0": null,
"lvl1": null,
"lvl2": null,
"lvl3": null,
"lvl4": null,
"lvl5": null,
"lvl6": null
},
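The fragment above is pretty-printed, so a single document spans many lines. Since the import reads each line as its own document, that would explain errors on fragments such as `"weight": {`. A sketch of a converter, assuming the input is either a JSON array of documents or a single document object (the function name is hypothetical):

```python
import json


def to_jsonl(json_text):
    """Convert a JSON array of documents (or one document) to JSONL text."""
    data = json.loads(json_text)
    docs = data if isinstance(data, list) else [data]
    lines = []
    for doc in docs:
        if not isinstance(doc, dict):
            raise ValueError(f"expected a document object, got {type(doc).__name__}")
        # Compact separators keep each document on exactly one line.
        lines.append(json.dumps(doc, ensure_ascii=False, separators=(",", ":")))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    pretty = """[
      {"content": "6.5", "hierarchy": {"lvl0": null}},
      {"content": "default"}
    ]"""
    print(to_jsonl(pretty), end="")
```

The `ValueError` is deliberate: if the conversion would emit anything other than whole documents, it is better to stop than to produce a file of bare values.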
Jason
04:53 PM
{"id": "124", "company_name": "Stark Industries", "num_employees": 5215, "country": "US"}
{"id": "125", "company_name": "Future Technology", "num_employees": 1232, "country": "UK"}
{"id": "126", "company_name": "Random Corp.", "num_employees": 531, "country": "AU"}