Essentially, for all existing images, you’d generate embeddings with CLIP and index them in Typesense.
Then, when a user uploads a new image, you would generate an embedding for it with CLIP as well, send that vector to Typesense for a nearest-neighbor search, and display the resulting image URLs to the user.
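The flow above can be sketched in miniature. This is a runnable toy, not the real integration: tiny hand-made 2-d vectors stand in for CLIP embeddings (which would come from a model such as `clip-ViT-B-32` and have hundreds of dimensions), and a plain dict with a brute-force cosine scan stands in for a Typesense collection with a `float[]` vector field. All function names here (`index_image`, `nearest`, etc.) are illustrative, not Typesense API calls.

```python
import math

# A dict stands in for a Typesense collection; URL -> embedding vector.
def index_image(index, url, vector):
    index[url] = vector

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(index, query_vector, k=2):
    # Linear scan for the k most similar vectors; a real vector store
    # would use an approximate index (e.g. HNSW) instead.
    ranked = sorted(index, key=lambda url: cosine(index[url], query_vector),
                    reverse=True)
    return ranked[:k]

# Step 1: embed and index all existing images.
idx = {}
index_image(idx, "cat.jpg", [1.0, 0.0])
index_image(idx, "dog.jpg", [0.9, 0.1])
index_image(idx, "car.jpg", [0.0, 1.0])

# Step 2: embed the uploaded image and fetch its nearest neighbors.
print(nearest(idx, [1.0, 0.05], k=2))  # → ['cat.jpg', 'dog.jpg']
```

In the real pipeline the only moving parts that change are the two stand-ins: the embedding call becomes a CLIP model invocation, and `nearest` becomes a Typesense search against the collection holding the vectors.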