#community-help

Issue with Inbuilt Model for Creating Embeddings

TLDR Shikhar experiences issues generating embeddings for all documents using an in-built model. Kishore Nallan suggests a re-run or trying to debug the issue, but the problem remains unresolved.

Aug 16, 2023 (3 months ago)
Shikhar
04:15 AM
Does using the built-in model to create embeddings take time for all the documents? I have around 4,000 of them, and some have the embedding while others don't. Can someone help me out?
Kishore Nallan
04:56 AM
👋 Are you saying it takes a long time to create embeddings for your docs using the built-in model?
Shikhar
05:14 AM
Yes, I mean I used auto embedding but I didn't get all the embeddings
Shikhar
05:14 AM
What could be the cause?
Kishore Nallan
05:18 AM
Can you post a full reproducible example? That will help me figure out what's going wrong.
Shikhar
08:38 AM
This is the embedding schema:
[Image: screenshot attached to this message]
Shikhar
08:40 AM
As you can see, some of the tools have the embedding and some do not.
[Image: screenshot attached to this message]
Kishore Nallan
08:57 AM
If you re-run the import, are the values fixed, or is the same set of records still not populated?
Shikhar
09:02 AM
Haven't tried a re-run, but what could be the problem? It should have gone through the process, right?
Kishore Nallan
09:09 AM
Yes, this is unexpected, which is why I'm trying to debug what could be happening here, e.g. whether a particular type of record is failing.
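
One way to check whether a particular type of record is failing is to export the collection as JSONL and split the documents by whether the embedding field is populated. The following is only a rough sketch, not something posted in the thread: the field name `embedding`, the helper name `split_by_embedding`, and the sample documents are all assumptions, since the actual schema was shared only as a screenshot.

```python
import json
from typing import Iterable, List, Tuple


def split_by_embedding(
    jsonl_lines: Iterable[str], field: str = "embedding"
) -> Tuple[List[str], List[str]]:
    """Split exported documents into ids with and without an embedding.

    `field` is an assumed name; replace it with the embedding field
    defined in your own schema.
    """
    with_emb: List[str] = []
    without_emb: List[str] = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        doc = json.loads(line)
        value = doc.get(field)
        # A missing key, null, or empty list all count as "no embedding".
        if isinstance(value, list) and len(value) > 0:
            with_emb.append(doc.get("id"))
        else:
            without_emb.append(doc.get("id"))
    return with_emb, without_emb


if __name__ == "__main__":
    # Hypothetical sample documents standing in for an exported collection.
    sample = [
        '{"id": "1", "name": "tool a", "embedding": [0.1, 0.2]}',
        '{"id": "2", "name": "tool b"}',
        '{"id": "3", "name": "", "embedding": []}',
    ]
    have, missing = split_by_embedding(sample)
    print("with embedding:", have)
    print("without embedding:", missing)
```

Comparing the documents in the "without embedding" group (for example, whether their source text fields are empty or unusually long) may show whether the same kind of record fails each time, and running the check again after a re-import would show whether the same ids are affected.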

