# community-help
r
Hi all, I have been testing out Conversational RAG over the weekend and have it mostly working. One issue that I cannot seem to get past, though, is that when I ask it multiple questions about my demo environment, quite often it will respond to the question I asked before, but other times it will answer correctly. I can't quite tell if this is an issue on my end or with the assistant itself. When I look at the logs, it shows the correct question was sent, but the wrong answer was received (the answer to the previous question). Has anyone else had this happen, or could you provide any insight on where to look? Thanks!
j
Interesting. I wonder if the context window has run out?
CC: @Ozan Armağan Any ideas?
r
Hey @Jason Bosco I wouldn't think so, as in this case it is only the 5th question with this conversation_id before this happens, and quite a few times it happens after only 2 or 3 questions.

Ahh, after looking more closely, it seems it has to do with the returned documents being low quality / a mismatch.

First Query
Question: Who agreed to order more food for the holiday menu?
Response: Your high-priority items related to negative reviews about Bobs Italian are as follows:
1. Cold Lasagna and Rude Service
   ◦ The lasagna was cold in the middle, and the bread was stale. The waiter was rude and dismissive when the issue was raised.
2. Burnt Pizza and Poor Service
   ◦ The pizza was burnt and nearly inedible. The staff did not offer any apologies or compensation.
3. Burnt Fish
   ◦ The fish was burnt and nearly inedible. The staff did not offer any apologies or compensation.
These reviews reflect dissatisfaction with the food and service at the restaurant.

Documents Returned:
1. Cold Lasagna and Rude Service (Vector Distance: 0.165)
2. Burnt Pizza and Poor Service (Vector Distance: 0.167)
3. Average Experience (Vector Distance: 0.180)
4. Burnt Fish (Vector Distance: 0.180)
5. Loved the Experience (Vector Distance: 0.186)
6. Fantastic Dinner Experience (Vector Distance: 0.192)
7. Gluten-Free Menu Options (Vector Distance: 0.198)
8. Holiday Menu Planning (Vector Distance: 0.209)
9. Basil Stock Check (Vector Distance: 0.221)
10. Tomato Delivery Delay (Vector Distance: 0.256)

Second Query
Question: Who agreed to order more food for the holiday menu?
Response: Luca proposed a new seafood pasta dish, and Bob agreed to order more seafood supplies for the holiday menu.

Documents Returned:
1. Holiday Menu Planning (Vector Distance: 0.140)
2. Gluten-Free Menu Options (Vector Distance: 0.160)
3. Basil Stock Check (Vector Distance: 0.195)
4. Fantastic Dinner Experience (Vector Distance: 0.200)
5. Loved the Experience (Vector Distance: 0.212)
6. Staff Meeting (Vector Distance: 0.221)
7. Average Experience (Vector Distance: 0.224)
8. Cold Lasagna and Rude Service (Vector Distance: 0.225)
9. Burnt Pizza and Poor Service (Vector Distance: 0.234)
10. Burnt Fish (Vector Distance: 0.236)

I used OpenAI text embeddings 🤔
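(For anyone following along: the output above comes from a conversational search over a small demo collection. Below is a minimal sketch of the kind of request that produces it, assuming the Typesense Python client; the collection name `restaurant_docs`, the embedding field name `embedding`, and the model id `conv-model-1` are placeholders, not the exact names used here. The per-document vector distances come from the `hits` array of the search response.)

```python
import typesense

# Minimal sketch -- node details, API keys, collection and model ids are placeholders.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'TYPESENSE_API_KEY',
    'connection_timeout_seconds': 10,
})

search = {
    'searches': [{
        'collection': 'restaurant_docs',   # placeholder collection name
        'q': 'Who agreed to order more food for the holiday menu?',
        'query_by': 'embedding',           # auto-embedding field
        'exclude_fields': 'embedding',     # keep the response small
    }]
}
common = {
    'conversation': True,
    'conversation_model_id': 'conv-model-1',   # placeholder model id
    # 'conversation_id': previous_id,          # supplied on follow-up turns
}

resp = client.multi_search.perform(search, common)

# Generated answer plus the retrieved documents and their vector distances
print(resp['conversation']['answer'])
for hit in resp['results'][0]['hits']:
    print(hit['document'].get('title'), hit.get('vector_distance'))
```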
o
@Ryan Jones can you try using text-embedding-3 models from OpenAI?
r
@Ozan Armağan No problem, I can try that, would you suggest the small or large embeddings model?
o
Small should be enough
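(Rough sketch of what switching the auto-embedding field to `text-embedding-3-small` might look like in the collection schema, assuming your Typesense version supports that model string; the field names `title` / `content` and the collection name are illustrative only. Note that changing the embedding model generally means recreating the collection and re-indexing the documents so all vectors come from the same model.)

```python
# Sketch: collection schema with an auto-embedding field backed by
# OpenAI's text-embedding-3-small. Names below are placeholders.
schema = {
    'name': 'restaurant_docs',
    'fields': [
        {'name': 'title', 'type': 'string'},
        {'name': 'content', 'type': 'string'},
        {
            'name': 'embedding',
            'type': 'float[]',
            'embed': {
                'from': ['title', 'content'],
                'model_config': {
                    'model_name': 'openai/text-embedding-3-small',
                    'api_key': 'OPENAI_API_KEY',
                },
            },
        },
    ],
}
client.collections.create(schema)
```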
r
Thanks @Ozan Armağan I changed the embedding model, and after some code changes I was getting the correct responses, but then today I ended up getting the same response to the first question I asked, many times. So I think the issue is on my end, possibly in how I am handling conversation history. I had not come across this with other RAG platforms before, but they are also more "plug and play" than the Typesense implementation, which is totally understandable. I am just trying to figure out the best way to use a small (at this point) collection to get accurate documents / answers most of the time 🙂
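(One common source of "same answer repeated" behaviour is how the `conversation_id` is threaded between turns. Here's one rough way to structure it, building on the earlier sketch and its placeholder names; this is an assumption about the setup, not the exact code in use here.)

```python
def ask(client, question, conversation_id=None):
    """One conversational turn; returns (answer, conversation_id)."""
    common = {
        'conversation': True,
        'conversation_model_id': 'conv-model-1',   # placeholder model id
    }
    if conversation_id:
        # Only follow-up turns should carry the id returned by the first turn.
        common['conversation_id'] = conversation_id

    resp = client.multi_search.perform({
        'searches': [{
            'collection': 'restaurant_docs',   # placeholder collection name
            'q': question,
            'query_by': 'embedding',
            'exclude_fields': 'embedding',
        }]
    }, common)

    conv = resp['conversation']
    return conv['answer'], conv['conversation_id']

# Usage: start fresh, then reuse the returned id for follow-up questions only.
answer, conv_id = ask(client, 'Who agreed to order more food for the holiday menu?')
answer, conv_id = ask(client, 'What dish did Luca propose?', conversation_id=conv_id)
```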
o
Can you also please try increasing `max_bytes` for your conversation model?
r
I have used `16384` for all of my models so far, what does that number relate to?
o
It relates to the context window, in other words the maximum number of bytes (characters) that can be sent to the LLM in a single prompt.
Can we set it to a value like 65536?
r
Ah ok, sure, I can create a new model with a higher value.
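(A rough sketch of creating a new conversation model with a larger `max_bytes`, assuming a recent typesense-python client; the model id, LLM name, system prompt, and history collection below are placeholders, and whether `history_collection` is required depends on the Typesense server version.)

```python
# Sketch: register a conversation model with a larger context budget.
client.conversations_models.create({
    'id': 'conv-model-2',                          # placeholder model id
    'model_name': 'openai/gpt-3.5-turbo',          # placeholder LLM
    'api_key': 'OPENAI_API_KEY',
    'system_prompt': 'You are an assistant for a restaurant demo dataset.',
    'max_bytes': 65536,                            # raised from 16384
    'history_collection': 'conversation_store',    # required on newer versions
})
```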