In the previous post, we got set up with DeepEval. Here we’re going to put that tool to use by looking at our first test case and our first quality metric.
Category: AI and Testing
AI and Testing: Evaluation and DeepEval
In previous posts in this series, I’ve largely been talking about how to use local LLMs by writing scripts and, along the way, I’ve been able to shoehorn in some testing ideas. We even wrote a bespoke test script together. In this post, I’m going to focus more specifically on testing by considering the idea of evaluation.
AI and Testing: Personal Marketability
In the posts in this series, I’ve been taking you through a lot of concepts and tooling. That’s going to continue but, for this post, it felt prudent to take a little break and talk about why doing all this can matter. That gets into interviewing and potentially being hired.
AI and Testing: Scaling Tests
In the previous post, we refactored a test case that we have been working on. In this post, we’re going to use that refactored test case and scale it up a bit.
AI and Testing: Refactoring Tests
In the previous post, we refined an AI test case that we had previously created as a testing example. In this brief post, I want to show a refactoring of that code. We will also align on the output of this test.
AI and Testing: Refining Tests
In the previous post, I provided an extended testing example where we wrote an “AI test case” together. This post will provide some more test thinking around that initial test case.
AI and Testing: A Testing Example
In this post, my goal is to write a relatively substantive test case and, while doing so, bring together many of the topics talked about in previous posts of this series.
AI and Testing: LangChain and Orchestration
Here I’m going to continue the thread from the previous post, where we started to look at the concept of Runnables, which is really what puts the “Chain” in “LangChain.”
AI and Testing: LangChain Messages
In the previous post, we got familiar with LangChain templates and dipped our toes into messages. In this post, I’m going to focus a bit more on those messages since these are the key to communicating with AI.
AI and Testing: LangChain Templates
In this post I’m going to follow the thread from the previous post and dig more into the LangChain ecosystem and start looking at the idea of templates for prompts.
AI and Testing: Local LLMs and LangChain
The previous post covered Ollama, which lets you run a local LLM on your machine. Here, I’ll focus on using that LLM and introduce two key properties of testability in this context. Doing so will introduce LangChain.
AI and Testing: Ollama and Models
In this post I want to take the initial steps to get some basic tooling available and operating. This is step one if you’re going to work in a technologist context with AI applications.
AI and Testing: Evaluating the Future
As our technocracy continues to grow and as (at least some) technologists continue to push us toward a potentially dehumanized and dehumanizing future, I want to focus on how we can work from within this technocracy to make sure that human experimentation is front and center.