That said, I’m creating embeddings from the images themselves, not from captions.
Is this the best way to find images by a text description? Let’s say I want to find images by searching for “sunset”.
Or should I caption the images and create embeddings from those captions instead?
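
To make the question concrete, the search step in either setup would look roughly like the sketch below. It assumes the query text and the stored items (image embeddings or caption embeddings) live in the same vector space, which is what makes a text query comparable to them at all. The vectors are made-up toy values, and `cosine` and `rank_by_cosine` are hypothetical helpers, not part of any library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, items):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for stored embeddings (assumption: real ones would come
# from an embedding model and have hundreds of dimensions).
stored = {
    "beach_evening.jpg": [0.9, 0.1, 0.2],
    "office_desk.jpg": [0.1, 0.8, 0.3],
    "mountain_dusk.jpg": [0.7, 0.2, 0.4],
}

# Toy stand-in for the embedding of the query text "sunset".
query = [1.0, 0.0, 0.2]

ranking = rank_by_cosine(query, stored)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The ranking code is the same for both options. What differs is what the stored vectors represent: the image itself, or a caption written about it.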