We've recreated this issue on our non-prod cluster...
# community-help
We've recreated this issue on our non-prod cluster, which is not HA. I've also had some success in reproducing the issue:

1. Create the products collection (`products_prod_1.0`).
2. Create the prices collection (`productPrices_prod_1.0`).
   a. This collection has a reference back to products, with `async_reference = true`.
3. Trigger some incremental updates to the prices collection.
   a. This happens while both collections are empty, which is why the `async_reference` option is needed.
   b. We use a POST to create new documents.
4. Do a bulk update of both collections.
   a. This uses the import option with `action=upsert`. We send groups of 2500 products to the products collection, then the prices for those same products go to the prices collection.
5. During the bulk updates, querying the products collection with a left join to prices begins to fail with this message: "Failed to join on `productPrices_prod_1.0`: No reference field found."
   a. This doesn't happen immediately. It appears to happen when the bulk update process eventually reaches a batch of 2500 products that includes the products I triggered price updates for at the very start.
   b. We haven't been able to reproduce the bulk load failure on this non-prod cluster; instead we get this error when querying the collection.

If I repeat the steps above with collection names that contain only underscores and no dots, things work as expected.

Is it possible that when a POST creates price documents whose referenced product doesn't exist in the products collection yet, the naming convention with dots isn't handled well? It seems like the price is created with a phantom reference to a collection that will never exist, because the referenced collection name is parsed incorrectly. But the search with GET and the bulk import with PUT both seem to handle the referenced collection name with dots OK, so when the product document gets created it lands in the correctly named collection. But the prices still hold a reference to a non-existent document that now doesn't match the real reference?

I'm just trying to make sense of what could explain what we're seeing: bulk import not working on the HA cluster, and search breaking on the single-node cluster, when we have dots in our referenced collection names.

For now I'm going to proceed with collection names that don't use dots, and see if we can find any scenarios where this issue happens again.
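
To illustrate the kind of ambiguity I'm guessing at, here is a minimal Python sketch. It is not Typesense's actual parsing code, and I don't know how the server really splits a reference. It only assumes that a reference is written as `collection.field`, and the field name `product_id` is made up for the example. The two helper functions are hypothetical.

```python
# Hypothetical illustration only: NOT Typesense's real implementation.
# Assumes a reference string has the form "<collection>.<field>".


def split_on_first_dot(reference: str) -> tuple[str, str]:
    """Treat everything before the FIRST dot as the collection name."""
    collection, _, field = reference.partition(".")
    return collection, field


def split_on_last_dot(reference: str) -> tuple[str, str]:
    """Treat everything before the LAST dot as the collection name."""
    collection, _, field = reference.rpartition(".")
    return collection, field


# Collection name with a dot ("product_id" is an assumed field name).
dotted = "products_prod_1.0.product_id"
print(split_on_first_dot(dotted))  # ('products_prod_1', '0.product_id')
print(split_on_last_dot(dotted))   # ('products_prod_1.0', 'product_id')

# Collection name with underscores only: both strategies agree.
plain = "products_prod_1_0.product_id"
print(split_on_first_dot(plain))   # ('products_prod_1_0', 'product_id')
print(split_on_last_dot(plain))    # ('products_prod_1_0', 'product_id')
```

If two code paths on the server disagreed like this, one of them would look for a collection called `products_prod_1`, which never exists. That would be consistent with what we see, but it's only a guess on my part.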