In the previous post on Contextual Precision, we diagnosed a critical problem in our RAG system: poor retrieval quality was causing failures that we had also observed in the Faithfulness post. In this first of three related posts, we’re going to dig into how to improve that retrieval quality. This will be our first extended example of what testing a generative AI system really looks like.
Continue reading AI and Testing: Improving Retrieval Quality, Part 1