I was giving a presentation to developers as well as engineering hiring managers who make decisions about hiring test practitioners. This came about because of recent hiring decisions, or rather, the lack thereof. What was brought up to me numerous times was the idea that testers are not being hired if they so much as hint at the idea of testing as distinct from checking. So let’s talk about this.
I’ve talked in the past about my perception that specialist testers need to be cross-discipline associative. And while I’ve implicitly given some ideas about what that means in various posts, here I want to be a bit more explicit.
There’s a lot of talk out there about using large language models to help testers write tests, such as by coming up with scenarios. There’s also talk out there about AI-based tools actually doing the testing. Writing tests and executing tests are both forms of performing testing. So let’s talk about what this means in both a human and an AI context.
One thing that’s often interesting is to define foundational terms within your discipline. It’s often even more interesting when you come across a discipline that seems to struggle with doing so. Is that the case for testing? Well, let’s talk about it.
I often frame whatever role I’m in as a Quality and Test Specialist. It’s not really a term or phrase that our industry agrees upon. Normally people want the word “Engineer” somewhere in their title, as if that term somehow wasn’t terribly vague. So let’s dig into what I mean when I talk about being a specialist.
In this final post of this series, we’ll look at training our learning model on our Emotions dataset. This post is the culmination of everything we’ve learned in the first three posts in this series and then implemented in the previous two posts in this series. So let’s dig in for the final stretch!
This post, and the following, will bring together everything we’ve learned in the previous four posts in this text classification series. Here we’re going to use the Emotions dataset we looked at in the last post and feed it to a model.
In this post, we’re going to look at the Emotions dataset that we briefly investigated in the previous post. Here we’re going to consider the basis of that dataset. Then we’ll load it up and see what we have to do in order to feed the data to a training model.
In this post, we’ll explore some particular datasets. The focus here is just to get a feel for what can be presented to you and what’s available for you to use. We’ll do a little bit of code in this post to get you used to how to load a dataset.
Here we’ll continue on directly from the first post where we were learning the fundamentals of dealing with text that we plan to send to a learning model. Our focus was on the tokenization and encoding of the text. These are fundamentals that I’ll reinforce further in this post.
Let’s start this “Thinking About AI” series by thinking about the idea of classifying text. Classifying, broadly speaking, relates to testing quite well. This is because, at its core, the idea of classification focuses on categorization of data and decision making based on data. More broadly, as humans, we tend to classify everything by some categories.
Most testers are aware of the idea of a “test case.” What people outside of testing often don’t know is how much debate can swirl around what a test case is or should be. I think it’s great to have discussions about this kind of thing but I also find that there can be a temptation to either simplify it too much or complicate it too much.
Many are debating the efficacy of artificial intelligence as it relates to the practice and craft of testing. Perhaps not surprisingly, the loudest voices tend to be the ones who have the least experience with the technology beyond just playing around with ChatGPT here and there and making broad pronouncements, both for and against. We need to truly start thinking about AI critically and not just reacting to it if we want those with a quality and test specialty to have relevance in this context.
Following on from computing eras but before getting to my “Thinking About AI” series, there’s one intersection I’d like to bring up which is the notion of “people’s computing.” This idea of people being front-and-center of computation, and thus technology, once held sway but has often been in danger from a wider technocracy.
This post will be a bit of a divergence from my normal posting style although very much in line with the idea of stories that take place in my testing career.
We come to the third post of this particular series (see the first and second) where I’ll focus on an extended example that brings together much of what I’ve been talking about but also shows the difficulty of “getting it right” when it comes to AI systems and why testing is so crucial.
This post continues on from the first one. Here I’m going to break down the question-answering model that we looked at, so that we can understand what it’s actually doing. What I show is, while decidedly simplified, essentially what tools like ChatGPT are doing. This will set us up for a larger example. So let’s dig in!
The idea of “Generative AI” is very much in the air as I write this post. What’s often lacking is some of the ground-level understanding to see how all of this works. This is particularly important because the whole idea of “generative” concepts is really focused more on the idea of transformations. So let’s dig in!
An interesting discussion came up on LinkedIn recently regarding whether automated tools “find bugs,” and I found the discussion around it to be emblematic of exactly what’s wrong with a lot of our testing industry these days. I find testers are fighting ever more abstract battles and becoming less relevant as they do so. But maybe I’m the one who’s wrong on that? Possibly! Let’s dig in.
In the first part of this post, I used a simple binary classification task to show some ideas around measures and scores and then provided some running commentary on how the tester mindset and skillset can situate in that context. That post was about depth; this post will be more about breadth.
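As a quick reminder of the kinds of measures that first part covered, here’s a hedged sketch of computing accuracy, precision, recall, and F1 for a binary classifier. The confusion-matrix counts below are made up for illustration, not taken from the post:

```python
# A sketch of common binary classification measures, computed from
# hypothetical confusion-matrix counts (not the post's actual data).

tp, fp, fn, tn = 40, 10, 5, 45  # true/false positives and negatives

accuracy = (tp + tn) / (tp + fp + fn + tn)  # overall correctness
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Part of the tester mindset here is noticing how the same counts can yield a flattering accuracy while hiding a weak precision or recall, which is why no single score should be trusted in isolation.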