Hi, does Typesense support vector search on nested fields? (Typesense #community-help)


Said

01/25/2025, 11:16 AM

Hi, does Typesense support vector search on nested fields? I'm currently struggling to get a working solution and keep getting this error message: `{'results': [{'code': 400, 'error': 'Field chunks.DenseVec does not have a vector query index.'}]}`. The current schema looks similar to this one:

EXAMPLE_SCHEMA = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "Date", "type": "string", "sort": True},
        {"name": "Document_Summary", "type": "string", "locale": "de", "stem": True},
        {"name": "Dense_Summary_Embedding", "type": "float[]", "num_dim": 3072},
        {
            "name": "chunks",
            "type": "object[]",
            "optional": True,
            "fields": [
                {"name": "Chunk_ID", "type": "string"},
                {"name": "DenseVec", "type": "float[]", "num_dim": 3072},
                {"name": "Chunk", "type": "string", "locale": "de", "stem": True},
            ],
        },
    ],
    "token_separators": [";", ",", ".", ":"],
    "default_sorting_field": "Date",
    "enable_nested_fields": True,
    "symbols_to_index": ["+", "-", "@", "/"],
}

Searching on e.g. `Dense_Summary_Embedding` works fine, but the error appears when I run the search on the chunk embeddings using this code:


typesense_results = self.ts_manager.client.multi_search.perform({
    "searches": [{
        "q": "*",
        "collection": self.collection_name,
        "vector_query": f"chunks.DenseVec:([{','.join(str(v) for v in vector_query)}], k:{max_candidates})",
        "exclude_fields": "Dense_Summary_Embedding, chunks.DenseVec"
    }]
}, {})

Technically one could flatten the entire thing, but then we would have quite a lot of duplication. Any ideas?
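
For reference, the `vector_query` string used in the search above has the shape `field:([v1,v2,...], k:N)`. A minimal sketch of a helper that builds it is shown here; the function name `build_vector_query` is hypothetical and not part of the Typesense client, and the sketch only formats the string, it does not call the server:

```python
from typing import Sequence


def build_vector_query(field: str, vector: Sequence[float], k: int) -> str:
    """Format a vector_query string of the form 'field:([v1,v2,...], k:N)'.

    Hypothetical helper: it only builds the string and sends nothing.
    """
    if not vector:
        raise ValueError("vector must not be empty")
    if k <= 0:
        raise ValueError("k must be a positive integer")
    values = ",".join(str(float(v)) for v in vector)
    return f"{field}:([{values}], k:{k})"


if __name__ == "__main__":
    # The field name is taken from the schema above; the vector is shortened.
    print(build_vector_query("Dense_Summary_Embedding", [0.1, 0.2, 0.3], k=10))
```

Building the string in one place makes it easier to switch the target field (for example from `Dense_Summary_Embedding` to another top-level vector field) without touching the rest of the search call.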

Kishore Nallan

01/25/2025, 11:23 AM

Please post a fully reproducible example using curl using this template so that we can investigate: https://gist.github.com/jasonbosco/7c3432713216c378472f13e72246f46b

Said

01/25/2025, 11:43 AM

Hi there, I just created a rather simple example with the template, and it produces the same error:


export TYPESENSE_API_KEY=xyz
mkdir -p "$(pwd)"/typesense-data

docker run -p 8108:8108 \
  -v "$(pwd)"/typesense-data:/data typesense/typesense:27.1 \
  --data-dir /data \
  --api-key=$TYPESENSE_API_KEY \
  --enable-cors
  
  
curl "http://localhost:8108/collections" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "nested_test",
    "fields": [
      {"name": "id", "type": "string"},
      {"name": "title", "type": "string"},
      {"name": "Summary", "type": "string"},
      {"name": "Dense_Summary_Embedding", "type": "float[]", "num_dim": 5},
      {
        "name": "chunks",
        "type": "object[]",
        "fields": [
          {"name": "text", "type": "string"},
          {"name": "vector", "type": "float[]", "num_dim": 5}
        ]
      }
    ],
    "enable_nested_fields": true
  }'
  
  
curl "http://localhost:8108/collections/nested_test/documents" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "1",
    "title": "Test Document",
    "Summary": "This is an example summary of document 1",
    "Dense_Summary_Embedding": [0.1, 0.2, 0.3, 0.4, 0.5],
    "chunks": [
      {"text": "First chunk", "vector": [0.1, 0.2, 0.3, 0.1, 0.2]},
      {"text": "Second chunk", "vector": [0.4, 0.5, 0.6, 0.4, 0.5]}
    ]
  }'
  
  
curl "http://localhost:8108/multi_search" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "searches": [
      {
        "collection": "nested_test",
        "q": "*",
        "vector_query": "chunks.vector:([0.1, 0.2, 0.3, 0.1, 0.2], k:10)"
      }
    ]
  }'

And the error message: `{"results":[{"code":400,"error":"Field chunks.vector does not have a vector query index."}]}`

Said

01/25/2025, 12:39 PM

It would be great to know why vector search does not work here, since text search does work on the provided example, e.g. here:


curl "http://localhost:8108/multi_search" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "searches": [
      {
        "collection": "nested_test",
        "q": "chunk",
        "query_by": "chunks.text"
      }
    ]
  }'
{"results":[{"facet_counts":[],"found":1,"hits":[{"document":{"Dense_Summary_Embedding":[0.1,0.2,0.3,0.4,0.5],"Summary":"This is an example summary of document 1","chunks":[{"text":"First chunk","vector":[0.1,0.2,0.3,0.1,0.2]},{"text":"Second chunk","vector":[0.4,0.5,0.6,0.4,0.5]}],"id":"1","title":"Test Document"},"highlight":{"chunks":[{"text":{"matched_tokens":["chunk"],"snippet":"First <mark>chunk</mark>"}},{"text":{"matched_tokens":["chunk"],"snippet":"Second <mark>chunk</mark>"}}]},"highlights":[],"text_match":578730123365187705,"text_match_info":{"best_field_score":"1108091338752","best_field_weight":15,"fields_matched":1,"num_tokens_dropped":0,"score":"578730123365187705","tokens_matched":1,"typo_prefix_score":0}}],"out_of":1,"page":1,"request_params":{"collection_name":"nested_test","first_q":"chunk","per_page":10,"q":"chunk"},"search_cutoff":false,"search_time_ms":0}]}

Kishore Nallan

01/25/2025, 1:11 PM

This is wrong:


{
  "name": "chunks",
  "type": "object[]",
  "fields": [
    {
      "name": "text",
      "type": "string"
    },
    {
      "name": "vector",
      "type": "float[]",
      "num_dim": 5
    }
  ]
}

You can't nest fields this way in the schema. You have to use dot notation to refer to nested fields, e.g. `chunks.vector`.

Said

01/25/2025, 1:34 PM

I don't quite understand. How would indexing multiple different chunks work then?


curl "http://localhost:8108/collections" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "nested_test_new2",
    "fields": [
      {"name": "id", "type": "string"},
      {"name": "title", "type": "string"},
      {"name": "Summary", "type": "string"},
      {"name": "Dense_Summary_Embedding", "type": "float[]", "num_dim": 5},
      {"name": "chunks.text", "type": "string"},
      {"name": "chunks.vector", "type": "float[]", "num_dim": 5}
    ],
    "enable_nested_fields": true
  }'
{"created_at":1737811871,"default_sorting_field":"","enable_nested_fields":true,"fields":[{"facet":false,"index":true,"infix":false,"locale":"","name":"title","optional":false,"sort":false,"stem":false,"store":true,"type":"string"},{"facet":false,"index":true,"infix":false,"locale":"","name":"Summary","optional":false,"sort":false,"stem":false,"store":true,"type":"string"},{"facet":false,"hnsw_params":{"M":16,"ef_construction":200},"index":true,"infix":false,"locale":"","name":"Dense_Summary_Embedding","num_dim":5,"optional":false,"sort":false,"stem":false,"store":true,"type":"float[]","vec_dist":"cosine"},{"facet":false,"index":true,"infix":false,"locale":"","name":"chunks.text","optional":false,"sort":false,"stem":false,"store":true,"type":"string"},{"facet":false,"hnsw_params":{"M":16,"ef_construction":200},"index":true,"infix":false,"locale":"","name":"chunks.vector","num_dim":5,"optional":false,"sort":false,"stem":false,"store":true,"type":"float[]","vec_dist":"cosine"}],"name":"nested_test_new2","num_documents":0,"symb

curl "http://localhost:8108/collections/nested_test/documents" -X POST \
  -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "1",
    "title": "Test Document",
    "Summary": "This is an example summary of document 1",
    "Dense_Summary_Embedding": [0.1, 0.2, 0.3, 0.4, 0.5],
    "chunks.text": "First chunk", "chunks.vector": [0.1, 0.2, 0.3, 0.1, 0.2],
    "chunks.text": "Second chunk", "chunks.vector": [0.4, 0.5, 0.6, 0.4, 0.5]
  }'
{"message":"A document with id 1 already exists."}

Could you maybe send a working example?

Said

01/25/2025, 1:35 PM

If document A is indexed by Title and Summary, and I would also like it to hold its content in multiple chunks (chunk one with one embedding, chunk two with another), wouldn't this approach just overwrite the old one, as one can see in the example above?

Kishore Nallan

01/25/2025, 4:12 PM

We don't have a way to index an array of vectors.

Said

01/25/2025, 5:11 PM

Still, thanks for your reply, I kind of thought so. Until that is supported, I guess each chunk, together with its document-level information, will be indexed as its own document. After doing that, the hybrid search works now, even though it creates duplicates at the document level.
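
The workaround described above (one indexed document per chunk, with the document-level fields copied into each) could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names `parent_id`, `chunk_text` and `chunk_vector`, the `<id>_<index>` ID scheme, and both function names are hypothetical and do not come from Typesense or from the thread. The second function collapses the resulting duplicates by keeping the best-ranked hit per original document, assuming hits arrive in ranked order with a `document` key as in the search response shown earlier.

```python
from typing import Any, Dict, Iterable, List


def flatten_chunks(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat document per chunk, copying the document-level fields."""
    # Everything except the ID and the nested chunks is document-level data.
    doc_level = {k: v for k, v in doc.items() if k not in ("id", "chunks")}
    flat_docs = []
    for i, chunk in enumerate(doc.get("chunks", [])):
        flat_docs.append({
            **doc_level,
            "id": f"{doc['id']}_{i}",         # hypothetical ID scheme
            "parent_id": doc["id"],           # hypothetical field for regrouping
            "chunk_text": chunk["text"],      # hypothetical top-level field names
            "chunk_vector": chunk["vector"],
        })
    return flat_docs


def best_hit_per_parent(hits: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first (best-ranked) hit for each parent_id, preserving order."""
    seen = set()
    unique = []
    for hit in hits:
        parent = hit["document"]["parent_id"]
        if parent not in seen:
            seen.add(parent)
            unique.append(hit)
    return unique


if __name__ == "__main__":
    # Same document as in the curl example earlier in the thread.
    doc = {
        "id": "1",
        "title": "Test Document",
        "Summary": "This is an example summary of document 1",
        "Dense_Summary_Embedding": [0.1, 0.2, 0.3, 0.4, 0.5],
        "chunks": [
            {"text": "First chunk", "vector": [0.1, 0.2, 0.3, 0.1, 0.2]},
            {"text": "Second chunk", "vector": [0.4, 0.5, 0.6, 0.4, 0.5]},
        ],
    }
    for flat in flatten_chunks(doc):
        print(flat["id"], flat["parent_id"], flat["chunk_text"])
```

With this layout each indexed document holds a single top-level vector for its chunk, and the cost is the repeated document-level fields that the thread already points out.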
