# community-help
r
Hi all, I have been testing out Conversational RAG over the weekend and have it mostly working. One issue that I cannot seem to get past, though, is that when I ask it multiple questions about my demo environment, quite often it will respond to the question I asked before, but other times it will answer correctly. I can't quite tell if this is an issue on my end or with the assistant itself. When I look at the logs, it shows the correct question was sent, but the wrong answer was received (the answer to the previous question). Has anyone else had this happen, or could you provide any insight on where to look? Thanks!
j
Interesting. I wonder if the context window has run out?
CC: @Ozan Armağan Any ideas?
r
Hey @Jason Bosco I wouldn't think so, as in this case it is only the 5th question with this conversation_id before this happens, and quite a few times it happens after only 2 or 3 questions.

Ahh, after looking more closely, it seems it has to do with the returned documents being low quality / a mismatch.

First Query
Question: Who agreed to order more food for the holiday menu?
Response: Your high-priority items related to negative reviews about Bobs Italian are as follows:
1. Cold Lasagna and Rude Service
   ◦ The lasagna was cold in the middle, and the bread was stale. The waiter was rude and dismissive when the issue was raised.
2. Burnt Pizza and Poor Service
   ◦ The pizza was burnt and nearly inedible. The staff did not offer any apologies or compensation.
3. Burnt Fish
   ◦ The fish was burnt and nearly inedible. The staff did not offer any apologies or compensation.
These reviews reflect dissatisfaction with the food and service at the restaurant.

Documents Returned:
1. Cold Lasagna and Rude Service (Vector Distance: 0.165)
2. Burnt Pizza and Poor Service (Vector Distance: 0.167)
3. Average Experience (Vector Distance: 0.180)
4. Burnt Fish (Vector Distance: 0.180)
5. Loved the Experience (Vector Distance: 0.186)
6. Fantastic Dinner Experience (Vector Distance: 0.192)
7. Gluten-Free Menu Options (Vector Distance: 0.198)
8. Holiday Menu Planning (Vector Distance: 0.209)
9. Basil Stock Check (Vector Distance: 0.221)
10. Tomato Delivery Delay (Vector Distance: 0.256)

Second Query
Question: Who agreed to order more food for the holiday menu?
Response: Luca proposed a new seafood pasta dish, and Bob agreed to order more seafood supplies for the holiday menu.

Documents Returned:
1. Holiday Menu Planning (Vector Distance: 0.140)
2. Gluten-Free Menu Options (Vector Distance: 0.160)
3. Basil Stock Check (Vector Distance: 0.195)
4. Fantastic Dinner Experience (Vector Distance: 0.200)
5. Loved the Experience (Vector Distance: 0.212)
6. Staff Meeting (Vector Distance: 0.221)
7. Average Experience (Vector Distance: 0.224)
8. Cold Lasagna and Rude Service (Vector Distance: 0.225)
9. Burnt Pizza and Poor Service (Vector Distance: 0.234)
10. Burnt Fish (Vector Distance: 0.236)

I used OpenAI text embeddings 🤔
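(For anyone following along: the output above comes from a conversational search over a small demo collection. Below is a minimal sketch of the kind of request that produces it, assuming the Typesense Python client; the collection name `restaurant_docs`, the embedding field name `embedding`, and the model id `conv-model-1` are placeholders, not the exact names used here. The per-document vector distances come from the `hits` array of the search response.)

```python
import typesense

# Minimal sketch -- node details, API keys, collection and model ids are placeholders.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'TYPESENSE_API_KEY',
    'connection_timeout_seconds': 10,
})

search = {
    'searches': [{
        'collection': 'restaurant_docs',   # placeholder collection name
        'q': 'Who agreed to order more food for the holiday menu?',
        'query_by': 'embedding',           # auto-embedding field
        'exclude_fields': 'embedding',     # keep the response small
    }]
}
common = {
    'conversation': True,
    'conversation_model_id': 'conv-model-1',   # placeholder model id
    # 'conversation_id': previous_id,          # supplied on follow-up turns
}

resp = client.multi_search.perform(search, common)

# Generated answer plus the retrieved documents and their vector distances
print(resp['conversation']['answer'])
for hit in resp['results'][0]['hits']:
    print(hit['document'].get('title'), hit.get('vector_distance'))
```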
o
@Ryan Jones can you try using text-embedding-3 models from OpenAI?
r
@Ozan Armağan No problem, I can try that, would you suggest the small or large embeddings model?
o
Small should be enough
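(Rough sketch of what switching the auto-embedding field to `text-embedding-3-small` might look like in the collection schema, assuming your Typesense version supports that model string; the field names `title` / `content` and the collection name are illustrative only. Note that changing the embedding model generally means recreating the collection and re-indexing the documents so all vectors come from the same model.)

```python
# Sketch: collection schema with an auto-embedding field backed by
# OpenAI's text-embedding-3-small. Names below are placeholders.
schema = {
    'name': 'restaurant_docs',
    'fields': [
        {'name': 'title', 'type': 'string'},
        {'name': 'content', 'type': 'string'},
        {
            'name': 'embedding',
            'type': 'float[]',
            'embed': {
                'from': ['title', 'content'],
                'model_config': {
                    'model_name': 'openai/text-embedding-3-small',
                    'api_key': 'OPENAI_API_KEY',
                },
            },
        },
    ],
}
client.collections.create(schema)
```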
r
Thanks @Ozan Armağan I changed the embedding model, and after some code changes I was getting the correct responses, but then today I ended up getting the same response to the first question I asked, many times. So I think the issue is on my end, possibly in how I am handling conversation history. I had not come across this with other RAG platforms before, but they are also more "plug and play" than the Typesense implementation, which is totally understandable. I am just trying to figure out the best way to use a small (at this point) collection to get accurate documents / answers most of the time 🙂
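(One common source of "same answer repeated" behaviour is how the `conversation_id` is threaded between turns. Here's one rough way to structure it, building on the earlier sketch and its placeholder names; this is an assumption about the setup, not the exact code in use here.)

```python
def ask(client, question, conversation_id=None):
    """One conversational turn; returns (answer, conversation_id)."""
    common = {
        'conversation': True,
        'conversation_model_id': 'conv-model-1',   # placeholder model id
    }
    if conversation_id:
        # Only follow-up turns should carry the id returned by the first turn.
        common['conversation_id'] = conversation_id

    resp = client.multi_search.perform({
        'searches': [{
            'collection': 'restaurant_docs',   # placeholder collection name
            'q': question,
            'query_by': 'embedding',
            'exclude_fields': 'embedding',
        }]
    }, common)

    conv = resp['conversation']
    return conv['answer'], conv['conversation_id']

# Usage: start fresh, then reuse the returned id for follow-up questions only.
answer, conv_id = ask(client, 'Who agreed to order more food for the holiday menu?')
answer, conv_id = ask(client, 'What dish did Luca propose?', conversation_id=conv_id)
```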
o
Can you also please try increasing `max_bytes` for your conversation model?
r
I have used `16384` for all of my models so far, what does that number relate to?
o
It relates to the context window, in other words the maximum number of bytes (characters) that can be sent to the LLM in a single prompt.
Can we set it to a value like 65536?
r
Ah ok, sure, I can create a new model with a higher value.
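(A rough sketch of creating a new conversation model with a larger `max_bytes`, assuming a recent typesense-python client; the model id, LLM name, system prompt, and history collection below are placeholders, and whether `history_collection` is required depends on the Typesense server version.)

```python
# Sketch: register a conversation model with a larger context budget.
client.conversations_models.create({
    'id': 'conv-model-2',                          # placeholder model id
    'model_name': 'openai/gpt-3.5-turbo',          # placeholder LLM
    'api_key': 'OPENAI_API_KEY',
    'system_prompt': 'You are an assistant for a restaurant demo dataset.',
    'max_bytes': 65536,                            # raised from 16384
    'history_collection': 'conversation_store',    # required on newer versions
})
```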