Hello folks, I am trying to implement Many-to-man...
# community-help
u
Hello folks, I am trying to implement a many-to-many join across my collections, but I am stuck on one use case. I am referring to the example in the Typesense documentation: https://typesense.org/docs/26.0/api/joins.html#many-to-many-relation I want to search the "documents" collection and get all matching documents together with all the users joined to each document. For example, if document-1 is mapped to user-1, user-5 and user-9, and document-1 matches the query, then document-1 in the search results should automatically include the full data of those 3 users from the "users" collection as an array. What should I do to achieve this?
j
CC: @Harpreet Sangar
h
This would help you get all the related users:
{
  "q": "*",
  "collection": "documents",
  "filter_by": "$user_doc_access(id:*)"
}
I forgot to add the include_fields parameter; the query will be similar to the example mentioned in the docs.
{
  "q": "*",
  "collection": "documents",
  "query_by": "title",
  "filter_by": "$user_doc_access(id: *)",
  "include_fields": "$users(*) as user_identifier"
}
the entire data of above 3 users from the "users" collection in an array form automatically.
For this, to be specific:
"include_fields": "$users(*: strategy:nest_array)"
u
If a document has a single user mapped, it returns a single object; it becomes an array of objects if there is more than one user. Is there any way to get the user data as an array of objects irrespective of the number of users?
h
Yes, with strategy:nest_array you'll always get an array.
u
Here is the implementation. For more than one user, it returned an array of objects (see document id 10).
But for a single user, it returned an object (see document id 9).
h
Seems like a bug. I'll check it out.
u
But...
If I do it like this (with the * removed), then it works! 🥲
"include_fields": "$users(: strategy:nest_array)"
h
There should be no difference between these two.
u
It seems like a secret feature! 😬
h
Definitely 😆
Also, could you try with $users(id : strategy:nest_array)?
u
Tried. It behaves the same as with *.
h
Okay, thanks. Looking into this.
u
Until then, should I go with the "without *" option?
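As a stopgap on the client side, the joined field can be normalized so that it is always a list, whatever the server returns. This is only a minimal sketch, assuming the search response has the hits[].document shape seen elsewhere in this thread and that the joined field is named users; as_list and normalize_hits are hypothetical helpers, not part of any Typesense client.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Wrap a single joined object in a list; leave lists untouched."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def normalize_hits(response: Dict[str, Any], field: str = "users") -> Dict[str, Any]:
    """Make `field` a list in every hit's document (mutates `response` in place).

    Assumes the response looks like {"hits": [{"document": {...}}, ...]}.
    """
    for hit in response.get("hits", []):
        document = hit.get("document", {})
        if field in document:
            document[field] = as_list(document[field])
    return response


# Hypothetical response: one document with a single user, one with two.
response = {
    "hits": [
        {"document": {"id": "9", "users": {"id": "1"}}},
        {"document": {"id": "10", "users": [{"id": "1"}, {"id": "5"}]}},
    ]
}
normalize_hits(response)
```

After the call, both documents expose users as a list, so downstream code does not need to branch on the number of mapped users.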
h
How about
"include_fields": "$user_doc_access($users(*: strategy:nest_array) : strategy:nest_array)",
This should work fine.
Even better would be
"include_fields": "$user_doc_access($users(*: strategy:merge) : strategy:nest_array)",
u
A VERY SERIOUS ISSUE WITH THIS:
"include_fields": "$user_doc_access($users(*: strategy:nest_array) : strategy:nest_array)"
It crashed my Docker container! Here are the logs:
I20240502 06:39:58.191576   393 raft_server.h:60] Peer refresh succeeded!
E20240502 06:39:58.582654   107 backward.hpp:4200] Stack trace (most recent call last) in thread 107:
E20240502 06:39:58.582688   107 backward.hpp:4200] #14   Object "", at 0xffffffffffffffff, in 
E20240502 06:39:58.582696   107 backward.hpp:4200] #13   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7a06fcdffa03, in __clone
E20240502 06:39:58.582703   107 backward.hpp:4200] #12   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7a06fcd6eac2, in 
E20240502 06:39:58.582710   107 backward.hpp:4200] #11   Object "/opt/typesense-server", at 0x57a61c205843, in execute_native_thread_routine
E20240502 06:39:58.582715   107 backward.hpp:4200] #10 | Source "include/threadpool.h", line 57, in operator()
E20240502 06:39:58.582720   107 backward.hpp:4200]       Source "/usr/include/c++/10/future", line 1592, in ThreadPool [0x57a6196e5b1c]
E20240502 06:39:58.582726   107 backward.hpp:4200] #9  | Source "/usr/include/c++/10/future", line 1459, in _M_set_result
E20240502 06:39:58.582733   107 backward.hpp:4200]     | Source "/usr/include/c++/10/future", line 412, in call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>()>*, bool*>
E20240502 06:39:58.582739   107 backward.hpp:4200]     | Source "/usr/include/c++/10/mutex", line 729, in __gthread_once
E20240502 06:39:58.582746   107 backward.hpp:4200]       Source "/usr/include/x86_64-linux-gnu/c++/10/bits/gthr-default.h", line 700, in _M_run [0x57a619896a73]
E20240502 06:39:58.582751   107 backward.hpp:4200] #8    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7a06fcd73ee7, in 
E20240502 06:39:58.582757   107 backward.hpp:4200] #7  | Source "/usr/include/c++/10/future", line 572, in operator()
E20240502 06:39:58.582762   107 backward.hpp:4200]       Source "/usr/include/c++/10/bits/std_function.h", line 622, in _M_do_set [0x57a6196e4d32]
E20240502 06:39:58.582768   107 backward.hpp:4200] #6  | Source "/usr/include/c++/10/bits/std_function.h", line 292, in __invoke_r<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<_Fn, _Alloc, _Res(_Args ...)>::_M_run<std::_Bind<HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>()>, std::allocator<int>, void, {}>::<lambda()>, void>&>
E20240502 06:39:58.582774   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 115, in __invoke_impl<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<_Fn, _Alloc, _Res(_Args ...)>::_M_run<std::_Bind<HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>()>, std::allocator<int>, void, {}>::<lambda()>, void>&>
E20240502 06:39:58.582779   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 60, in operator()
E20240502 06:39:58.582785   107 backward.hpp:4200]     | Source "/usr/include/c++/10/future", line 1397, in operator()
E20240502 06:39:58.582790   107 backward.hpp:4200]     | Source "/usr/include/c++/10/future", line 1456, in __invoke_r<void, std::_Bind<HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>()>&>
E20240502 06:39:58.582796   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 110, in __invoke_impl<void, std::_Bind<HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>()>&>
E20240502 06:39:58.582801   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 60, in operator()<>
E20240502 06:39:58.582808   107 backward.hpp:4200]     | Source "/usr/include/c++/10/functional", line 499, in __call<void>
E20240502 06:39:58.582813   107 backward.hpp:4200]     | Source "/usr/include/c++/10/functional", line 416, in __invoke<HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>&>
E20240502 06:39:58.582818   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 95, in __invoke_impl<void, HttpServer::process_request(const std::shared_ptr<http_req>&, const std::shared_ptr<http_res>&, route_path*, const h2o_custom_req_handler_t*, bool)::<lambda()>&>
E20240502 06:39:58.582823   107 backward.hpp:4200]     | Source "/usr/include/c++/10/bits/invoke.h", line 60, in operator()
E20240502 06:39:58.582830   107 backward.hpp:4200]       Source "src/http_server.cpp", line 706, in _M_invoke [0x57a619898993]
E20240502 06:39:58.582836   107 backward.hpp:4200] #5    Source "src/core_api.cpp", line 474, in get_search [0x57a61980b81f]
E20240502 06:39:58.582842   107 backward.hpp:4200] #4    Source "src/collection_manager.cpp", line 1888, in do_search [0x57a6197b7aa2]
E20240502 06:39:58.582849   107 backward.hpp:4200] #3    Source "src/collection.cpp", line 2680, in search [0x57a61973df68]
E20240502 06:39:58.582854   107 backward.hpp:4200] #2    Source "src/collection.cpp", line 5497, in prune_doc [0x57a61972baa2]
E20240502 06:39:58.582860   107 backward.hpp:4200] #1    Source "src/collection.cpp", line 5310, in include_references [0x57a61972a152]
E20240502 06:39:58.582867   107 backward.hpp:4200] #0    Source "src/collection.cpp", line 5173, in prune_ref_doc [0x57a619728d3e]
Segmentation fault (Address not mapped to object [0x28])
E20240502 06:40:02.481997   107 typesense_server.cpp:137] Typesense 26.0 is terminating abruptly.
And my Postman screenshot.
The same thing happened with this one as well:
$user_doc_access($users(*: strategy:merge) : strategy:nest_array)
h
Okay, thanks for testing and reporting the issue. I'll let you know when I've fixed this.
u
Hello, is there any update on this issue?
h
Hey @Urvis, I have other work that takes priority right now. I'll pick this issue up next.
u
Okay
h
I tried to reproduce the issue with:
curl "http://localhost:8108/collections" -X POST -H "Content-Type: application/json" \
       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
         "name": "Repos",
         "fields": [
              { "name": "repo_id", "type": "string" },
              { "name": "repo_stars", "type": "int32" }
         ]
       }'
curl "http://localhost:8108/collections" -X POST -H "Content-Type: application/json" \
       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
         "name": "Links",
         "fields": [
              { "name": "repo_id", "type": "string", "reference": "Repos.repo_id" },
              { "name": "user_id", "type": "string", "reference": "Users.user_id" }
         ]
       }'
curl "http://localhost:8108/collections" -X POST -H "Content-Type: application/json" \
       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
         "name": "Users",
         "fields": [
              { "name": "user_id", "type": "string" },
              { "name": "user_name", "type": "string" }
         ]
       }'

curl "http://localhost:8108/collections/Users/documents/import?action=create" \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -H "Content-Type: text/plain" \
        -X POST \
        -d '{"user_id": "user_a","user_name": "Roshan"}
            {"user_id": "user_b","user_name": "Ruby"}'

curl "http://localhost:8108/collections/Repos/documents/import?action=create" \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -H "Content-Type: text/plain" \
        -X POST \
        -d '{"repo_id": "repo_a","repo_stars": 5215}
            {"repo_id": "repo_b","repo_stars": 2133}'

curl "http://localhost:8108/collections/Links/documents/import?action=create" \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -H "Content-Type: text/plain" \
        -X POST \
        -d '{"user_id": "user_a","repo_id": "repo_a"}
            {"user_id": "user_b","repo_id": "repo_a"}
            {"user_id": "user_a","repo_id": "repo_b"}'

curl 'http://localhost:8108/multi_search' -X POST -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '
  {
  "searches": [
    {
      "collection": "Repos",
      "q": "*",
      "filter_by": "$Links(id:*)",
      "include_fields": "$Users(user_name, strategy:nest_array)",
      "exclude_fields": "$Links(*)"
    }
  ]
}' | jq
I get:
{
  "results": [
    {
      "facet_counts": [],
      "found": 2,
      "hits": [
        {
          "document": {
            "Users": [
              {
                "user_name": "Roshan"
              }
            ],
            "id": "1",
            "repo_id": "repo_b",
            "repo_stars": 2133
          },
          "highlight": {},
          "highlights": []
        },
        {
          "document": {
            "Users": [
              {
                "user_name": "Roshan"
              },
              {
                "user_name": "Ruby"
              }
            ],
            "id": "0",
            "repo_id": "repo_a",
            "repo_stars": 5215
          },
          "highlight": {},
          "highlights": []
        }
      ],
      "out_of": 2,
      "page": 1,
      "request_params": {
        "collection_name": "Repos",
        "first_q": "*",
        "per_page": 10,
        "q": "*"
      },
      "search_cutoff": false,
      "search_time_ms": 1
    }
  ]
}
I just looked closely at your request. You are sending
$users(*: strategy:nest_array)
The fields and the strategy are supposed to be separated by a comma, not a colon.
And I kept copying the same mistake when providing the other examples 🤦🏼‍♂️ Sorry about that.
To reiterate, it should be:
$users(*, strategy:nest_array)
I'll look into why
"include_fields": "$user_doc_access($users(*: strategy:nest_array) : strategy:nest_array)"
crashed. It should have responded with a bad syntax error instead.
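Since the colon-versus-comma slip is easy to make by hand, one option is to build the expression in code and check hand-written ones before sending them. The following is just a sketch of that idea: include_ref and has_colon_separator are hypothetical helpers, and the regular expression only looks for a colon placed directly before the strategy keyword; it is not a full parser of the include_fields syntax.

```python
import re
from typing import Optional


def include_ref(collection: str, fields: str = "*", strategy: Optional[str] = None) -> str:
    """Build a `$collection(fields, strategy:<name>)` expression.

    Fields and strategy are joined with a comma, as described in this thread.
    """
    inner = fields if strategy is None else f"{fields}, strategy:{strategy}"
    return f"${collection}({inner})"


def has_colon_separator(expression: str) -> bool:
    """Return True if a ':' sits directly before `strategy:` (the mistake above)."""
    return re.search(r":\s*strategy\s*:", expression) is not None


correct = include_ref("users", "*", "nest_array")
mistaken = "$users(*: strategy:nest_array)"
```

Here correct holds the comma-separated form, while has_colon_separator(mistaken) flags the colon form so it can be rejected on the client before the request is made.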
u
Thanks for the Help, Harpreet. 🎉
Hi, with reference to this example https://typesense.org/docs/26.0/api/joins.html#many-to-many-relation do we have to make separate API calls to add documents to the "documents" collection and to the "user_doc_access" collection in order to map documents to users? Or is there a way to map documents to users automatically while adding documents?
h
The user_doc_access collection has to be indexed after both the users and documents collections have been indexed (those two can be indexed in parallel).
This is because we map references at index time to speed up the join process during search.
u
So, for example, there is a user record:
{"id": 1}
There is a document record:
{"id": "doc-id-1", "users": [1]}
So the mapping between the two will look like this:
{"user_id": 1, "document_id": "doc-id-1"}
So, to join these two records, I will have to make 3 API calls:
1. Add the user record to the "users" collection
2. Add the document record to the "documents" collection
3. Add the mapping record for the two to "user_doc_access"
Right?
h
Yes, correct.
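The three steps can be sketched as a small planning function that splits a source record into the rows each collection needs, in the order discussed above. This is only an illustration under assumptions: plan_index_calls is a hypothetical helper, it performs no HTTP requests, and the ids are written as strings here.

```python
from typing import Any, Dict, List, Tuple

Call = Tuple[str, Dict[str, Any]]


def plan_index_calls(users: List[Dict[str, Any]], document: Dict[str, Any]) -> List[Call]:
    """Return (collection, row) pairs in a valid indexing order.

    users and documents rows come first; user_doc_access rows come last,
    because references are mapped at index time.
    """
    # The document row itself does not carry the "users" list in this layout.
    doc_row = {key: value for key, value in document.items() if key != "users"}
    calls: List[Call] = [("users", user) for user in users]
    calls.append(("documents", doc_row))
    # One mapping row per (user, document) pair.
    for user_id in document.get("users", []):
        calls.append(
            ("user_doc_access", {"user_id": user_id, "document_id": document["id"]})
        )
    return calls


calls = plan_index_calls(
    users=[{"id": "1"}],
    document={"id": "doc-id-1", "users": ["1"]},
)
```

Each pair in calls can then be sent with whatever client is in use, keeping the user_doc_access rows after the other two collections.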
Alternatively, you could also declare a reference field in the documents collection: the users field can be a reference field.
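A rough sketch of what such a schema might look like, written as a Python dict that could be posted as the collection definition. It is an assumption that an array-typed field (string[]) may carry a reference in the Typesense version in use, and the title field and the users.id target are illustrative choices rather than something confirmed in this thread.

```python
import json

# Hypothetical "documents" schema where the users field references the
# users collection directly, instead of going through user_doc_access.
# Assumption: array-typed reference fields are accepted by your version.
documents_schema = {
    "name": "documents",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "users", "type": "string[]", "reference": "users.id"},
    ],
}

# Serialize to the JSON body that would be sent when creating the collection.
schema_json = json.dumps(documents_schema, indent=2)
```

With this layout, the users collection would still need to be indexed before documents, for the same index-time mapping reason given above.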