#community-help

Equitable Distribution of Suppliers in Ecommerce Platform

TLDR Max has a problem with pagination, distribution, and grouping in an ecommerce use case. Jason suggests using "group_by" but notes that dynamic group sizes are not possible.

Mar 13, 2023
Max
10:29 PM
Hi team, I have an ecommerce use case where we have suppliers selling products. We have many different types of suppliers: some have 90k different products to sell and others have only 10. They usually have products in several categories. The problem of equitable distribution arises when searching products with "multi-supplier queries", that is, queries whose results are expected to include multiple suppliers (e.g. product_price < 50$), which is very often the case on big ecommerce platforms. One important rule for us is that a supplier must not be "promoted" (so to speak) just because it has 150 times more products than the others.
Before migrating to Typesense, here is what we did:
1. We calculate several scores for a product based on metrics such as its rating, its competitiveness, its order growth, the supplier's responsiveness, and so on, and combine them into a global score.
2. This global score is used to sort products, to propose best scored products to clients according to their search queries.
3. For each search query, we find products that can match from every supplier concerned by the query. Then we distribute each product on a 20 products page where each supplier matched can have maximum 3 products displayed per page. All results are sorted by their product_score.
I am currently trying to reproduce this behavior in Typesense with the grouping parameters and the pagination parameters. I use a small dataset of 9946 products for tests.
My problem is that for a global query, I only find 40 documents out of 9946. With pages of 20 products, the third page is empty (big problem). I should have approximately 500 pages available, with maximum 3 products, for each supplier, per page of 20 products.
1. Am I doing something wrong, or is there a parameter that I misunderstood?
2. If the supplier distribution is not equitable, e.g. supplier A has 9900 products in the results and supplier B has 46, do the grouping parameters allow having 10 products each for suppliers A and B on the first 4 pages, then 14 products of supplier A and 6 products of supplier B on the 5th page, and finally only supplier A products on all remaining pages?
3. More generally, this business rule is a common one for an ecommerce platform with multiple suppliers offering their products. Is there a better way to handle an equitable distribution?
I give the supplier repartition of my test dataset and the query screenshot in the thread
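For reference, the pre-Typesense behavior described in step 3 above can be sketched in plain Python. This is only an illustration under assumptions: the field names supplier_id and product_score follow the thread, while the function name paginate_with_cap and the greedy fill order are hypothetical and not the poster's actual implementation.

```python
from collections import defaultdict


def paginate_with_cap(products, page_size=20, cap=3):
    """Split score-sorted products into pages with at most `cap` per supplier.

    `products` is a list of dicts with "supplier_id" and "product_score" keys.
    Products that do not fit on a page are deferred to later pages.
    """
    remaining = sorted(products, key=lambda p: p["product_score"], reverse=True)
    pages = []
    while remaining:
        page, deferred = [], []
        per_supplier = defaultdict(int)
        for product in remaining:
            supplier = product["supplier_id"]
            if len(page) < page_size and per_supplier[supplier] < cap:
                page.append(product)
                per_supplier[supplier] += 1
            else:
                deferred.append(product)  # keeps the score order for later pages
        pages.append(page)
        remaining = deferred
    return pages
```

With a strict cap, a page can come out shorter than page_size once only a few suppliers still have products left.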
Max
10:30 PM
Supplier distribution (supplier_id: nb_product): {16: 110, 28: 248, 590: 36, 709: 36, 737: 500, 1018: 24, 1112: 65, 1268: 376, 1345: 500, 1978: 80, 2058: 288, 2167: 70, 2227: 4, 2250: 20, 2538: 20, 2549: 208, 2617: 500, 2627: 45, 2734: 500, 2770: 500, 3271: 500, 3357: 500, 3372: 45, 3416: 50, 3802: 500, 3904: 107, 4245: 500, 4286: 500, 4323: 73, 4659: 500, 4675: 64, 4696: 164, 4767: 84, 4825: 49, 5244: 500, 5250: 500, 5254: 500, 5258: 500, 5361: 24, 5370: 210}
Max
10:32 PM
Query:
[Image: query screenshot]
Jason
10:35 PM
> My problem is that for a global query, I only find 40 documents out of 9946.
IIRC, the found value in a grouped query counts the number of groups, not the number of documents within each group
Jason
10:35 PM
So if you expand grouped_hits in the response, you’ll see 3 documents per supplier
Jason
10:35 PM
and as long as each supplier has at least one product, they will show in the search results
Jason
10:36 PM
In general, group_by is indeed the parameter you need to build this functionality of only returning X results (group_limit) for a given attribute (in your case supplier_id)
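As a rough sketch of the grouped query discussed above, the parameters could be assembled like this. The field names name and product_price, the helper names, and the hand-written sample response are assumptions for illustration; only group_by, group_limit, per_page, found, and grouped_hits come from the thread.

```python
def grouped_search_params(page, suppliers_per_page=20, group_limit=3):
    """Build search parameters for a query grouped by supplier."""
    return {
        "q": "*",
        "query_by": "name",                # assumed searchable field
        "filter_by": "product_price:<50",  # assumed price field
        "sort_by": "product_score:desc",
        "group_by": "supplier_id",
        "group_limit": group_limit,        # max products returned per supplier
        "per_page": suppliers_per_page,    # counts groups, not products
        "page": page,
    }


def flatten_grouped_hits(response):
    """Collect the documents of every group in a grouped search response."""
    return [
        hit["document"]
        for group in response.get("grouped_hits", [])
        for hit in group["hits"]
    ]


# Hand-written, trimmed sample; a real response carries more fields.
sample_response = {
    "found": 2,  # number of groups (suppliers), not products
    "grouped_hits": [
        {"group_key": ["16"], "hits": [
            {"document": {"id": "a", "supplier_id": 16}},
            {"document": {"id": "b", "supplier_id": 16}},
        ]},
        {"group_key": ["28"], "hits": [
            {"document": {"id": "c", "supplier_id": 28}},
        ]},
    ],
}

documents = flatten_grouped_hits(sample_response)
```

Here found is 2 while three documents are returned, which mirrors the difference between the group count and the product count pointed out above.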
Mar 14, 2023
Max
01:46 PM
Hi Jason thank you for your quick response.
Can you explain why starting at page 15 I do not have any results ?
Max
01:47 PM
Page 14 I have only 1 supplier
Jason
02:06 PM
It sounds like there are only 40 unique suppliers in your dataset. So the pagination is based on the group rather than the individual items within the group
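The arithmetic behind this can be checked with a small sketch. The helper name group_pages is hypothetical, and the per_page value of 3 is an assumption inferred from page 14 holding a single supplier; the queries themselves were only shared as screenshots.

```python
import math


def group_pages(found_groups, per_page):
    """Return (number of pages, groups on the last page) for a grouped query."""
    pages = math.ceil(found_groups / per_page)
    last = found_groups - (pages - 1) * per_page if pages else 0
    return pages, last


print(group_pages(40, 20))  # (2, 20): a third page is empty
print(group_pages(40, 3))   # (14, 1): page 14 has one supplier, page 15 is empty
```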
Max
02:07 PM
Yes I just understood this 5mins ago, my bad
Is it possible to base pagination on the items within a group rather than on the group itself?
Jason
02:08 PM
No it’s not possible, but you can calculate the pages you’re talking about based on the group_limit parameter you’ve set
Max
02:14 PM
Ok!
Do you have any suggestion/use case that would help me with the third and last question in the initial message of this thread ?
Jason
03:16 PM
> More generally, this business rule is a common rule for a ecommerce platform with multiple suppliers proposing their products. Is there a better way to handle an equitable distribution ?
group_by would be the best way to achieve what you’re looking to do
Max
03:18 PM
How should I implement it to fit my use case ? Which is to have X suppliers maximum per page of 20 products for a query
Max
03:18 PM
And without post search processing on the frontend
Jason
03:32 PM
When you group_by, you can set the per_page parameter which controls the total groups per page. So in your example, you’d set per_page to X, and then set group_limit to whatever 20 / X evaluates to
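A minimal sketch of that calculation, assuming integer division when 20 is not a multiple of X (the helper name grouping_for and the max_products key are hypothetical):

```python
def grouping_for(page_size, suppliers_per_page):
    """Derive per_page and group_limit for X suppliers on a page of products."""
    if not 1 <= suppliers_per_page <= page_size:
        raise ValueError("suppliers_per_page must be between 1 and page_size")
    group_limit = page_size // suppliers_per_page
    return {
        "per_page": suppliers_per_page,   # number of groups (suppliers) per page
        "group_limit": group_limit,       # products shown per supplier
        "max_products": suppliers_per_page * group_limit,
    }


print(grouping_for(20, 5))  # {'per_page': 5, 'group_limit': 4, 'max_products': 20}
print(grouping_for(20, 7))  # {'per_page': 7, 'group_limit': 2, 'max_products': 14}
```

When X does not divide 20, the page holds fewer than 20 products, as in the second call.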
Max
04:46 PM
I'm sorry if I wasn't clear but I don't think this works for my use case. Like you said the pagination is based on the groups and not on items per group, so I won't be able to have the same supplier on several pages for equitable partition

For instance, imagine a search query with the following (non realistic) results :
Supplier A : 100 documents
Supplier B : 100 documents
Supplier C : 100 documents
Supplier D : 100 documents
Supplier E : 100 documents

My use case needs the following rule :
Maximum 3 products per supplier on each page (20 products per page)

So the results should follow this repartition:
Page 1 : 4 products for each supplier
Page 2 : 4 products for each supplier
Page 3 : 4 products for each supplier
etc.
Max
04:46 PM
theoretically, the repartition would look something like :
[Image: theoretical repartition]
Max
04:47 PM
This means being able to group by a field per page, and not globally on the grouped by field. Is that possible ?
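To make the requested layout concrete, here is a small sketch that computes how many products each supplier would get on each page when every page is split as evenly as possible among the suppliers that still have products. It is an illustration of the target layout only, not something the search engine does; the function name fair_share_pages and the even-split rule are assumptions.

```python
def fair_share_pages(counts, page_size=20):
    """Return a list of {supplier: products on that page}, one dict per page.

    `counts` maps each supplier to its number of matching products.
    """
    remaining = dict(counts)
    pages = []
    while any(remaining.values()):
        page = {}
        slots = page_size
        active = [s for s, n in remaining.items() if n > 0]
        while slots > 0 and active:
            # Even share of the free slots, at least one product each.
            share = max(1, slots // len(active))
            for supplier in active:
                take = min(share, remaining[supplier], slots)
                if take:
                    page[supplier] = page.get(supplier, 0) + take
                    remaining[supplier] -= take
                    slots -= take
            active = [s for s in active if remaining[s] > 0]
        pages.append(page)
    return pages


five = fair_share_pages({s: 100 for s in "ABCDE"})
two = fair_share_pages({"A": 9900, "B": 46})
```

For the five suppliers with 100 documents each, every page holds 4 products per supplier. For the earlier 9900 versus 46 case, the first four pages hold 10 each, the fifth holds 14 and 6, and the rest hold only supplier A.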
Jason
11:43 PM
This style of dynamic group sizes is not possible. You might be able to do this in two queries:

1. Do the query without a group_by, but facet on the supplier_id, which will give you the total number of suppliers.
2. In the next query, based on the total number of suppliers obtained from query 1, set the value of group_limit dynamically.
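The two steps above might be sketched as follows. This assumes supplier_id is declared as a facet field in the collection schema, that the facet result lists one entry per supplier under facet_counts, and that group_limit is derived as the page size divided by the supplier count; the helper names, the name field, and the sample response are hypothetical.

```python
def supplier_facet_params():
    """Query 1: no group_by, facet on supplier_id to count matching suppliers."""
    return {
        "q": "*",
        "query_by": "name",        # assumed searchable field
        "facet_by": "supplier_id",
        "max_facet_values": 1000,  # high enough to list every supplier
    }


def count_suppliers(response):
    """Count distinct suppliers listed in the supplier_id facet of query 1."""
    for facet in response.get("facet_counts", []):
        if facet.get("field_name") == "supplier_id":
            return len(facet.get("counts", []))
    return 0


def grouped_params(n_suppliers, page=1, page_size=20):
    """Query 2: set group_limit from the supplier count found by query 1."""
    n = max(1, n_suppliers)
    return {
        "q": "*",
        "query_by": "name",
        "group_by": "supplier_id",
        "group_limit": max(1, page_size // n),
        "per_page": min(page_size, n),
        "page": page,
    }


# Hand-written sample of a facet result for five suppliers.
facet_response = {"facet_counts": [{
    "field_name": "supplier_id",
    "counts": [{"value": str(s), "count": 100} for s in (1, 2, 3, 4, 5)],
}]}

second_query = grouped_params(count_suppliers(facet_response))
```

With five suppliers this yields a group_limit of 4 and a per_page of 5. Pagination still advances by groups, so this does not repeat the same suppliers on later pages.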