AI – Page 2 – Stories from a Software Tester

AI and Testing: Improving Retrieval Quality, Part 1

written by Jeff Nyman

In the previous post on Contextual Precision, we diagnosed a critical problem in our RAG system: poor retrieval quality was causing failures that we also observed in the Faithfulness post. In this first of three related posts, we’re going to dig in a bit. This will be our first extended example of what testing a generative AI really looks like.

AI and Testing

AI and Testing: Contextual Precision

written by Jeff Nyman

In the previous post we looked at the Faithfulness metric with DeepEval and got some intuitions in place about how to start thinking about using metrics in general. In this post, we’ll look at a third metric.

AI and Testing

AI and Testing: Faithfulness

written by Jeff Nyman

In the previous post we looked at the Answer Relevancy metric with DeepEval and got some intuitions in place about how to start thinking about using metrics in general. In this post, we’ll look at a second metric that requires no faith but is all about being faithful.

AI and Testing

AI and Testing: Answer Relevancy

written by Jeff Nyman

In the previous post we got set up with DeepEval. Here we’re going to put that tool to use by looking at our first test case and our first quality metric.

AI and Testing

AI and Testing: Evaluation and DeepEval

written by Jeff Nyman

In previous posts in this series, I’ve largely been talking about how to use local LLMs by writing scripts and, along the way, I’ve been able to shoehorn in some testing ideas. We even wrote a bespoke test script together. In this post, I’m going to focus more specifically on testing by considering the idea of evaluation.

AI and Testing

AI and Testing: Personal Marketability

written by Jeff Nyman

In the posts in this series, I’ve been taking you through a lot of concepts and tooling. That’s going to continue but, for this post, it felt prudent to take a little break and talk about why doing all this can matter. That gets into interviewing and potentially being hired.

AI and Testing

AI and Testing: Scaling Tests

written by Jeff Nyman

In the previous post, we refactored a test case that we have been working on. In this post, we’re going to use that refactored test case and scale it up a bit.

AI and Testing

AI and Testing: Refactoring Tests

written by Jeff Nyman

In the previous post, we refined an AI test case that we had previously created as a testing example. In this brief post, I want to show a refactoring of that code. We will also align on the output of this test.

AI and Testing

AI and Testing: Refining Tests

written by Jeff Nyman

In the previous post I provided an extended testing example where we wrote an “AI test case” together. This post will provide some more test thinking around that initial test case.

AI and Testing

AI and Testing: A Testing Example

written by Jeff Nyman

In this post, my goal is to write a relatively substantive test case and, while doing so, bring together many of the topics talked about in previous posts of this series.

AI and Testing

AI and Testing: LangChain and Orchestration

written by Jeff Nyman

Here I’m going to continue the thread from the previous post, where we started to look at the concept of Runnables, which is really what puts the “Chain” in “LangChain.”

AI and Testing

AI and Testing: LangChain Messages

written by Jeff Nyman

In the previous post, we got familiar with LangChain templates and dipped our toes into messages. In this post, I’m going to focus a bit more on those messages since these are the key to communicating with AI.

AI and Testing

AI and Testing: LangChain Templates

written by Jeff Nyman

In this post I’m going to follow the thread from the previous post and dig more into the LangChain ecosystem and start looking at the idea of templates for prompts.

AI and Testing

AI and Testing: Local LLMs and LangChain

written by Jeff Nyman

The previous post covered Ollama as a way to get a local LLM running on your machine. Here, I’ll focus on using that LLM and introduce two key properties of testability in this context. Doing so will introduce LangChain.

AI and Testing

AI and Testing: Ollama and Models

written by Jeff Nyman

In this post I want to take the initial steps to get some basic tooling available and operating. This is step one if you’re going to work in a technologist context with AI applications.

AI and Testing

AI and Testing: Evaluating the Future

written by Jeff Nyman

As our technocracy continues to grow and as (at least some) technologists continue to push us toward a potentially dehumanized and dehumanizing future, I want to focus on how we can work from within this technocracy to make sure that human experimentation is front and center.

Testing AI

Navigating the AI Shift: A Tester’s Mandate

written by Jeff Nyman

It’s very clear that artificial intelligence has become more democratized than at any other time in history. It’s also fairly clear that this democratization will not only continue but likely accelerate. What is the mandate for quality and test specialists in this context?

Testing AI

AI-Powered Testing: Exploring and Exploiting with Reinforcement

written by Jeff Nyman

There’s a lot of talk out there about using large language models to help testers write tests, such as coming up with scenarios. There’s also talk about AI-based tools actually doing the testing. Writing tests and executing tests are both forms of performing testing. So let’s talk about what this means in a human and an AI context.

Text Classification, Thinking About AI

Text Trek: Navigating Classifications, Part 6

written by Jeff Nyman

In this final post of the series, we’ll train our learning model on our Emotions dataset. This post is the culmination of everything we learned in the first three posts and then implemented in the previous two. So let’s dig in for the final stretch!

Text Classification, Thinking About AI

Text Trek: Navigating Classifications, Part 5

written by Jeff Nyman

This post, and the following, will bring together everything we’ve learned in the previous four posts in this text classification series. Here we’re going to use the Emotions dataset we looked at in the last post and feed it to a model.
