Issues Importing Data with the Typesense Python Client
TLDR Mehdi was having trouble importing data with the Typesense Python client. Kishore Nallan released a new Python client version to fix Mehdi's encoding issues.
Jun 30, 2021
Mehdi
01:32 PM I'm new to Typesense.
I have a Python client and I'm trying to import data with this code:
with open(jsonl_path, encoding='utf-8') as jsonl_file:
    client.collections[name].documents.import_(jsonl_file.read(), {'action': 'create'})
It returns:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 3340888-3340891: Body ('تونس') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
When I call
jsonl_file.read().encode('utf-8')
instead, it returns TypeError: Object of type bytes is not JSON serializable
How can I solve this?
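Both errors in the question can be reproduced without a Typesense server, which may help show where they come from. The sketch below is only an illustration: the sample line and variable names are made up, and it assumes (as the error messages suggest) that the request body was being encoded as Latin-1 in one code path and passed through `json.dumps` in the other.

```python
import json

# Hypothetical one-line JSONL sample containing the word from the traceback.
text = '{"country": "تونس"}'

# 1) Encoding a str body as Latin-1 fails for Arabic characters,
#    which is what the UnicodeEncodeError in the question reports.
try:
    text.encode('latin-1')
    latin1_error = None
except UnicodeEncodeError as exc:
    latin1_error = exc

# 2) Encoding the same text as UTF-8 succeeds and yields bytes.
utf8_body = text.encode('utf-8')

# 3) Bytes cannot be passed through json.dumps, which matches the
#    TypeError seen after adding .encode('utf-8').
try:
    json.dumps(utf8_body)
    json_error = None
except TypeError as exc:
    json_error = exc

print(type(latin1_error).__name__)
print(json_error)
```

Neither call site is something the caller controls, so the fix had to come from the client library itself, which is consistent with the new release mentioned in the TLDR.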
Kishore Nallan
01:34 PM
Mehdi
01:43 PM
Kishore Nallan
01:59 PM
Kishore Nallan
02:22 PM
Kishore Nallan
02:29 PM .encode('utf-8')
but it will work now.
Mehdi
02:58 PM
Kishore Nallan
03:44 PM
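For readers landing here with the same error, one possible way to structure the import after upgrading the client is sketched below. This is not taken from the thread: `import_jsonl` is a hypothetical helper name, and the sketch assumes an installed client version whose `documents.import_` accepts a list of dicts as well as a raw JSONL string. The thread does not say which release contains the fix, so check the version you have.

```python
import json


def import_jsonl(client, collection_name, jsonl_path, action='create'):
    """Parse a UTF-8 JSONL file and import it as a list of documents.

    `client` is expected to be an already configured Typesense client;
    this helper (hypothetical, not part of the library) only prepares
    the documents and forwards them to `documents.import_`.
    """
    with open(jsonl_path, encoding='utf-8') as jsonl_file:
        # Skip blank lines so a trailing newline does not break parsing.
        documents = [json.loads(line) for line in jsonl_file if line.strip()]

    # Same call shape as in the question, but with parsed documents
    # instead of the raw file contents.
    return client.collections[collection_name].documents.import_(
        documents, {'action': action}
    )
```

Parsing each line up front also surfaces malformed JSON before anything is sent to the server. For very large files, reading everything into memory may not be practical, and sending the data in smaller chunks is likely a better fit.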
Similar Threads
JSONDecodeError Issue in Typesense Collection Creation with Python
Sai encountered a JSONDecodeError while creating a schema. Kishore Nallan tested the same setup but did not face any issues. Sai will check on their side and get back.
Resolving JSONL File Import Issues in Python
Jon struggled to import a large JSONL file using Python, running into decode errors and size restrictions. Kishore Nallan advised using curl for imports under 10GB and pointed to an update to the Python client that handles large imports better.
Issues with Importing Typesense Collection to Different Server
Kevin had problems migrating a Typesense collection between Docusaurus sites on different machines. Jason advised them on the JSONL format, server hosting, and creating a collection schema before importing documents, which led to a successful import.
Typesense JSON Decoding Issues
Philip faced JSON decoding errors when redeploying Typesense. Jason suggested checking the Typesense logs and trying the request with curl to diagnose the issue. Philip planned to investigate further.
Troubleshooting Write Timeouts in Typesense with Large CSVs
Agustin had issues with Typesense getting write timeouts while loading large CSV files. Kishore Nallan suggested chunking data or converting to JSONL before loading. Through troubleshooting, they identified a possible network problem at AWS and found a workaround.