Justin Searls from Test Double joins the party to talk about patterns he’s identified that lead to failure, minimalism, and of course, testing!
What’s up, JS Party?! It’s a great September day, a great September morn, according to Neil Diamond, and we are excited to catch up with Justin Searls. Justin, hello!
Hey, Nick! Good to talk to you.
Yeah, good to talk to you, too. I’m Nick, I’m gonna be hosting today, and joining me as well is Suz. Suz, what’s up?
What is up?! Thanks for having me back.
Great to have you. And then we have Chris, a.k.a. b0neskull. b0neskull, what is going on?!
Hello, everybody! Yay!
Yay! Now I have September morn stuck in my head, and I’m really excited about that. I’m a Neil Diamond fan, yes! So we’re here talking with Justin, and Justin - the first thing I just wanna dive right into is some of your hobby horses, or specifically, repeated patterns of failure that you see. Do you wanna kick us off with that?
Okay, so we’re just gonna dive right in.
Diving right in!
Yeah, there’s so many different ways that software projects fail… I started my career out doing big enterprise consulting for the public sector or financial institutions, and every project was two years, two million dollars, and big waterfall systems, and complex silos, and stuff… And what that taught me was a lot of ways not to write software well. But what I think is so fascinating is that no matter how much the industry progresses, we keep getting stuck on the same handful of issues over and over again.
One that comes to mind right off the bat is just unclear expectations. Our ability to communicate what software is, what it should do is so, so limited that even just getting a product owner or a product manager, somebody in a business to clearly decide what a piece of software should do, communicate clearly enough to a developer, articulate what is required of that software, and then for the developer to have the listening and questioning and introspection capabilities to successfully translate that into buttons, and screens, and a user experience that makes some kind of sense - that should be the baseline; that should be kind of just table stakes, right? But to do that well puts you in the top 1% of software teams.
[00:04:19.01] And I think a lot about the more nuanced conversations in that sort of exchange, of just “What are you even doing here…?” You know, like “Hey, businessperson, what’s this feature worth to you? If it takes a week, maybe that costs 5k. If it takes a month, maybe that costs 30k. If it goes on for a long time, how much are you willing to sink into this, so that I know when I need to cut and run, or call you up if things take too long?” Those conversations tend not to happen. Things like “How long do you expect this piece of software to live? Does it need to live for one year, three years, five years?” Of course, the default answer, because business people are used to not getting a lot of money to build new things, is that it needs to live forever, but that’s not realistic… So we don’t really think about what level of fidelity we should be building stuff in.
So all this just kind of gets silently presumed in the subtext by both parties whenever we’re talking about building stuff, and the net effect is that when things don’t go well, we have realized that we had mismatched expectations about what we were building, why… Things like the subtext, of like “Well, what was the cost of doing nothing, versus choosing a different path? What’s the opportunity cost if we don’t build this thing?” Those conversations just tend not to happen in most teams.
So do you think that the bigger disconnect is trying to work with a developer or a team to get what is desired built? Or is it really between the product owner and whoever the end user is? So is it building something that was the wrong thing, or is it building something that is the right thing, but it’s not built the way it was supposed to be?
Well, when I was speaking, I was mostly thinking about internal to engineering organizations, internal to businesses trying to build a thing to then get it to users… And of course, building the right thing and building the thing right are both responsibilities of kind of like the whole team… But what I’m really getting at here is that I feel like a lot of teams don’t have the proper communication channels and muscle built up to know what to ask in these conversations that become very routine, very mechanized, of like “Oh, every week we have a planning meeting, we go over this week’s story cards, or GitHub issues or whatever, and we triage stuff…” But we don’t ever really get into deeper conversations about kicking the tires on stuff.
For example, right now I’m selling my house. The weakest part of my house by far is the kitchen, which we never remodeled. But had I hired somebody to remodel that kitchen for me, you can bet that I’d be telling them “Look, I’m looking at moving out of there in a month. I want this to look really nice and show really nice, but if you find something and it’s gonna cost me $10,000 more to do it perfectly, please call me first. Please don’t do that.” And it’s because in basic commerce in our lives as consumers, we think really hard about where our money is going, and what level of quality is appropriate, or what the job is. And when we’re getting a service from somebody, we tend to think in terms of “How am I gonna get value out of this?”
But once it becomes about software, the people buying tend not to have any clue what software is, how it works… All they really see are these gigantic dollar numbers that developers either ask for, or earn as salary, and usually get stiff-armed or stonewalled when they start to ask about how long something’s gonna take, or try to get into details… Because developers tend to have built up a lot of defenses about being abused over deadline pressure, and other mechanisms of control that managers use to try to understandably arrive at a certain outcome with a process and with a thing in software that they just don’t really understand.
[00:08:08.12] So given that you’re in a unique position with your company, Test Double, in that you can provide that outside opinion and provide that outside clarity to teams who might be struggling and they’re just not quite sure why, and it could be this specific problem, how do you actually approach that conversation with those teams?
Yeah, that’s a great question. One of the reasons why Test Double as a consultancy - you know, our primary service is we have senior developer consultants who join client teams and work alongside them and integrate with them over the long haul… The biggest reason why that’s our service is because this stuff doesn’t get fixed overnight. Learning how to build trust, learning how to collaborate, learning how to communicate - all of that stuff tends to take a lot of time, and the problem states that got us to where we are, where maybe trust is low, or collaboration isn’t happening, or people don’t feel safe communicating openly and honestly - to unwind that just takes a lot of effort.
So this specific conversation is really – it’s a big chunk of what I guess our sales process is. Just last night I had a friend of my wife’s, Becky, from college; she’s a project manager now, she called me, and the programmer who built 95% of this thing that her company was going to depend on, the programmer just ghosted. And now she doesn’t even know what technology it’s built in. She doesn’t know if they have control of the source code. They don’t know where this thing’s hosted. All they know is they’ve spent a lot of money getting something, but they still don’t have it yet, and now they also lost the person who’s building it.
That’s kind of where my head’s at when I have these conversations - it’s mostly about educating businesspeople and managers to understand better what software is, how software development works at its best, and then how to even understand the failure mode that they’re in. So I could explain the situation that they’re in in the same way that if you got into some kind of legal trouble, a lawyer might have to let you know, like “Hey, you’re in kind of deep right now, and here’s what you need to be doing next.” That’s a big part of where I feel like we can rely on our experience as software developers by just utilizing a bit of empathy, and trying to make it understandable for somebody else.
Yeah, there’s a great point in the chat actually, from Rebecca, that came in, that says they also think it’s easier for the owner to understand what they’re getting with things that they can see, which I guess means sort of tangible… Whereas something such as underlying performance optimizations can be very difficult for them to see the value of that. I’m guessing that your position, and everybody who works at Test Double, has the opportunity to be able to break those things down, as you were just saying, right?
Yeah… You know, before we were a development consultancy as Test Double, my co-founder Todd and I were both at a consultancy that helped companies adopt agile software development practices, and helped entire teams and organizations sort of transform how they worked. And a big part of most of the agile methodology results in some kind of demo happening, with some kind of acceptance ritual, where a business person will click through the screen that they just asked for, and say “Yup, that’s it.” Or “No, this needs more work.” And to get to that point required a lot of technical innovations, to be able to turn work around in just a week and get a deployment out that fast… And obviously, with where we’re at now with push-button everything, it seems trivial…
[00:11:43.15] But I remember even at the time I had one client that was building a bibliography and citation generator. It didn’t have a user interface at all, and really the end product was “Can I tell this other service how to properly cite a source in a research paper?” And so we had to build a little bit of kind of like a visual scaffolding, so that the product owner could log in themselves, test it out, see whether or not green light/red light are things working or not working, based on an agreed set of criteria… Because there wasn’t necessarily a user interface that they’d be able to see. And we took it upon ourselves as engineers to provide that to our customer, the product owner, so that they could make informed decisions without necessarily just relying on us saying “Oh, just trust us.”
Right. Some of that “Just trust us” - there are other points brought up in the chat… One is like the things that are invisible to the product owner, that are part of the development process - like testing, like QA, like observability… In my experience, and in some of the places I’ve worked, those things have all been devalued, essentially, because they’re kind of abstract for a product owner. Do you see that happening a lot?
Yeah. And there are different ways to look at it. When I think in terms of transparency and communication on a team, I try to be really intentional about sharing information that a person can reasonably act on. For example, if my team tracks the number of story points that they accomplish every week, knowing that that’s kind of like a number that only makes any kind of sense as a lagging indicator, that’s like a benchmark of their productivity over time… But then I start publishing every single team’s story points, which are not apples to apples comparisons across an entire organization, somebody somewhere who’s probably got a financial brain as a manager or a leader in a company, is going to try and spreadsheetify it… Because what you’ve just done is give them some kind of data point, and they’re gonna try to optimize it based on what they know.
But if there’s a particular detail in the implementation of how you go about your work - for example testing, like you said; or maybe necessary refactoring before you implement a story - the more that you expose of how you work, without the context of “Hey, you don’t need to understand this. You don’t need to know about this, but just so you know, here’s the 15 bullet points that we walk through every time we ship you something that works”, if you just kind of give them “Hey, I’m gonna spend a day refactoring. Is that okay?”, you’re giving them a decision about something that they probably don’t have context of, that they probably don’t understand… And when you do that, you have to be open to the possibility they’re just gonna say “Well, no, I don’t know why that’s valuable, so could you just build the thing that I asked?”
That’s why I think that a part of the professionalism of just being a software developer is to communicate in terms that give the people that we’re working for the amount of context to actually make an informed decision that they can make. “Oh, this is probably gonna take 3.5 weeks”, and then in our heads we’re doing all the math about how long testing, and refactoring, and any sort of necessary infrastructural changes that we’re gonna have to make are gonna be… And only expose that to them if there’s a decision point that they could be well-informed to make. Otherwise, we’re just inviting unnecessary conflict and potentially kind of kneecapping our own ability to deliver the same kind of level of quality that the customer surely wants.
So kind of providing them with less of an a-la-carte menu of options, because some things we know are just not optional when it comes to delivering something that’s high-quality, right?
Exactly. There’s a reason that when I ordered my Apple Watch I didn’t have a checklist of “Oh, do you want a battery in that?” Apple is intentional about saying “This is the package. Take it or leave it.” And the reason for that is that when they give you an option, they want it to be meaningful. It’s why they don’t list RAM on hardly any of their products anymore outside of Macs.
Do you worry – in the context like for example your kitchen remodel, those kinds of decisions, or the lack of transparency of things like that might lead to a much larger bill, and it was technically necessary, but it doesn’t achieve the goal of selling your house, or making things pretty enough to sell your house…
[00:16:03.04] Yeah… And that’s why I think just tight feedback and showing your work as you go is really important. So whether it’s a kitchen remodel or a piece of software - if I’m paying somebody money and they disappear for three months, and I can’t see whether they’re making progress or not, I have every right to be concerned. But if they’re turning new work around every day, and there’s like a live website up that I can kind of play with and click around, and I have access to, and they’re also giving me on a daily basis “Hey, here’s what’s new today”, almost like a rolling kind of changelog of things - all that stuff works to build up my trust that you’re driven, you’re for real, you’re working on the things that I’m asking… And whenever there’s an issue or anything to discuss – you know, people are pretty adept at picking up the cues as to whether or not you’re hearing them and taking their concerns fully into account.
So continuing our discussion on patterns, another thing that I know you’re interested in, Justin, is minimalism. Why do you think that that’s important, and how do you go about considering minimalism in your designs and in your apps?
Yeah, so I think I’m almost ideologically a minimalist, and it’s a learned thing. I travel very light, in a 19-liter bag that I can live off for months on end. Even here at my house I own exactly two pairs of pants and five T-shirts. I tend to appreciate that added complexity weighs me down. Not just in the sense that I’ve got all these assets to worry about, things to track… Like, in software, if I’ve got a whole bunch of dependencies and I have the actual physical task of having to upgrade them all, it’s the cognitive load on me of all these different things that either I have to keep track of, or that present themselves as ways for stuff to break later. It tends to be something that we don’t think a whole lot about upfront at the moment, where that complexity accretes.
[00:20:20.10] And additionally, npm also grew up out of that, so lots and lots of small packages, of varying levels of maintenance applied to them… And I think that that put a lot of us, myself included, into a state of like “Well, if there’s a thing that can already do this, and it’s well-tested and it’s well-used in a lot of places, it’s better to just absorb a package than build my own little thing.” Even if it’s just one little module, even if it’s not fancy to do, say, role management.
Yeah, I think that’s an interesting point. I’m curious what your thoughts are on it going forward from here. Because if you do choose to go the route of like doing your own at the cost of having to maintain it, do you think that that might cause you trouble later down the line when requirements slightly change, and then you might have to pull it out and put in something else, or…? I guess I don’t know where I’m going with this…
Well, I think that the fallacy of the build versus buy decision, of whether to roll your own code or to adopt a dependency to do it for you is that like “Well, if I write it myself, then I’m on the hook. I own that thing.” But the reality is that whichever way you do it, you own that thing. You’re gonna be the one responsible for figuring out how to make the system keep working as needed over time. And so it’s just a series of trade-offs, like any other.
Over time, as I became more competent as a programmer, I realized that a lot of things that sound hard at first, there’s a pretty straightforward, simple way to implement them… And I got more comfortable just rolling my own solution to lots of common problems, especially once I’d seen role management for the 50th time in a web application; it doesn’t scare me in the same way. I’m not afraid that there’s gonna be some hilarious cacophony of edge cases that only a third-party dependency can provide me…
But additionally - and this is really the key point when it comes to failure modes or minimalism - the most important thing for every application that I’ve seen in my career is proper, consistent, thoughtful organization of code. If all of the code in your system is super well-named, super-organized, composed of small units, and scaffolded in a way that if I’m just looking at the top-level of your application, I can kind of drill down to the more specific implementationly bits from some higher-level thing, that kind of tells maybe what HTTP route I’m looking at… The better organized a codebase is, the more stable and maintainable it’s gonna be in the long-term.
If you’re a developer or on a team that has gotten very good at writing very consistent, well-organized code, the concept of writing more of it becomes less scary, because the total cost of ownership of the marginal module goes way down when they’re not so accidentally creative and disorganized.
[00:24:20.03] Yeah, but doesn’t that kind of go against minimalism, too? Because now – I’m thinking back to a JS Party episode a couple of episodes ago, with Ahmad Nassri, where he was basically stating that if you do go that route where you’re building the role management system, now you’re in the business of building your product plus being a role management building team, and having to deal with both of those. Does that go against minimalism at that point?
I think that it comes down to how do you visualize the application. When you’re visualizing the application as the number of lines in your Git repository, then yeah, it seems very counter to minimalism to be building a lot of stuff that is vital to your application, but maybe not core to it… But when you visualize your application as every line that you wrote, plus every line in every single npm package that you’re sucking in, and you realize that you’re just the tip of this massive glacier - in that broader context, a decision like “Yeah, I’m gonna write 75 lines to handle this role management stuff”, and it’s not a full-blown management solution, it’s just enough for what this particular system needs, I think that pulls in a perspective where you can make a rational and an informed decision to just keep things tidy, keep things clean…
And again, if you’re super-consistent, the marginal complexity and cognitive load of each additional line of code actually goes down significantly.
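To make the "just enough for what this particular system needs" idea concrete, here is a minimal sketch of hand-rolled role management in TypeScript. The role names, permission names, and the `can` and `authorize` helpers are all hypothetical, invented for this illustration; they are not taken from any system discussed in the episode.

```typescript
// Hypothetical roles and permissions, invented for this illustration.
type Role = "admin" | "editor" | "viewer";
type Permission = "read" | "write" | "manage_users";

// One lookup table is the whole "engine": each role maps to what it may do.
const PERMISSIONS: Record<Role, readonly Permission[]> = {
  admin: ["read", "write", "manage_users"],
  editor: ["read", "write"],
  viewer: ["read"],
};

interface User {
  name: string;
  roles: Role[];
}

// A user is allowed if any one of their roles grants the permission.
function can(user: User, permission: Permission): boolean {
  return user.roles.some((role) => PERMISSIONS[role].includes(permission));
}

// Guard for call sites that should fail loudly instead of branching.
function authorize(user: User, permission: Permission): void {
  if (!can(user, permission)) {
    throw new Error(`${user.name} may not ${permission}`);
  }
}

const sam: User = { name: "sam", roles: ["editor"] };
console.log(can(sam, "write"));        // true
console.log(can(sam, "manage_users")); // false
```

A sketch like this is far from a full-blown role management solution, which is the trade-off being described: a small amount of consistent, owned code in place of a dependency.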
And in some cases – and it’s not in all cases, obviously, because it depends (trademarked phrase that we love to say). I think you can apply minimalism in the context of both build tools, or like the building process, or just like the actual scaffolding process for teams to run and build and test, and actually code stuff in their development environments can become more minimalist as a result of that… Again, depending on whether you pick any tools from npm to do that task for you.
Suz, that’s a great point, and I think that the reason that I use the word “ideology” to describe how I think about minimalism is because it’s not just about code; it extends to what we’re even building. What kind of applications do you really like to use as a user? For me, it’s purpose-built, small, lean apps, that probably don’t have a lot of configuration options, that they give me a way to do something… And if I don’t like that way, then I can stop using that app. And if I do like that way, then I can free my mind of 80 different checkboxes and ways to customize.
And the number one vector for complexity in most software applications is what I call the “complaint to checkbox pipeline” of anytime somebody complains about something, it’s “Oh, I wish this worked this way”, or you get a piece of feedback, or your product owner says “Hey, somebody else in the business wants this to work that way”, and the superficial thing to do is “Well, we could just make a checkbox for that. We could just make an option for that.” And you could implement that and make it work that way for that person, but of course, then the cost is that your application becomes not just – there’s not just one more if-else somewhere; your application, the definition of what it is - it’s a combinatorial multiplication problem of all of the options that you’ve done so far. And if you’ve ever maintained a library that took in options and had a whole lot of configurability, you realize it just becomes almost untestable, almost unknowable whether a particular set of configurations is gonna work.
[00:28:20.16] So that’s why I think minimalism as an ethos makes a ton of sense, to just say “Okay, let me hear your complaint. Let me see if I can identify a root cause way in my core design, that I can change how it works to accommodate that, and not just make it better for you with a checkbox, but make it better for everybody who uses this thing. Or maybe I built a thing with my taste, and your taste isn’t suited for the thing that I built, and I’m sorry. You’re gonna have to find another thing, or live with the way that this works.”
Having that kind of backbone requires having autonomy and authority and permission to make hard decisions and tell people no, which of course, is another way that a lot of teams fail.
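The "combinatorial multiplication problem" described above can be shown with a little arithmetic. The sketch below is only an illustration: the function names and flag names are made up, and it assumes the options are independent of each other, which is the worst case for testing.

```typescript
// Each entry is how many values one option can take (a checkbox has 2).
function countConfigurations(valuesPerOption: number[]): number {
  return valuesPerOption.reduce((total, count) => total * count, 1);
}

// Enumerate every on/off combination for a list of hypothetical flag names.
function allCombinations(flags: string[]): Record<string, boolean>[] {
  let combos: Record<string, boolean>[] = [{}];
  for (const flag of flags) {
    const next: Record<string, boolean>[] = [];
    for (const combo of combos) {
      // Every existing combination splits in two: flag off and flag on.
      next.push({ ...combo, [flag]: false }, { ...combo, [flag]: true });
    }
    combos = next;
  }
  return combos;
}

console.log(countConfigurations([2, 2, 2]));               // 8
console.log(countConfigurations(new Array(25).fill(2)));   // 33554432
console.log(allCombinations(["compact", "dark"]).length);  // 4
```

Each new checkbox doubles the number of distinct applications that could, in principle, need testing, which is why one more option costs more than one more if-else.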
Mocha has – I don’t know off the top of my head; maybe 25, maybe 30 command line options… And I’ve rejected proposals for more than twice that amount. If I was to add every single option that every single person wanted, it would just become – yeah, it’s like an exponential growth of maintenance burden and awfulness.
The way that I’ve found - and this may not be applicable in necessarily your situation, where you’re working with businesses instead of open source - is to develop a way for them to fulfill their own needs by a plugin system, or something like that.
Yeah, that’s what I was thinking too, in terms of a command line utility… That one simple solution to get a lot of people what they need might just be like having a good, consistent output that you can pipe into other commands; like a real Unix philosophy of being able to take it and pipe it in and do whatever you want, just build up these chains… Similarly with like a frontend, or with an application. Maybe your escape hatch is something like you just have a robust API that they can plug into and do things with.
Yeah, I love all of that. Chris talking about Mocha, with having tons of options… We have a test framework that we wrote too, called teenytest, and it has – if you ask me how many command line options it has, maybe three… But what it does have is it has a way to write plugins for it that can hook into literally every single thing that it does… And to Nick’s point, because all it does is output TAP (the Test Anything Protocol), it can be piped eight ways from Tuesday, and run in a parallel fashion, and stuff. So it has the extensibility, and it has a set of plugins that you could either roll your own, or pull in, to get the level of bullet point features somebody might be looking for, without necessarily saddling a single codebase with hundreds and hundreds of ifs and elses to consider every single bizarre edge case.
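For readers who have not seen TAP, the sketch below shows roughly what a producer emits: a version line, a plan line (`1..N`), and one `ok` or `not ok` line per test. The `TestResult` shape and the `toTap` function are hypothetical names for this illustration and are not the API of teenytest, Tape, or Mocha.

```typescript
// Hypothetical result shape; real runners track much more than this.
interface TestResult {
  description: string;
  passed: boolean;
}

// Format results as TAP version 13 text: version, plan, then one line per test.
function toTap(results: TestResult[]): string {
  const lines = ["TAP version 13", `1..${results.length}`];
  results.forEach((result, index) => {
    const status = result.passed ? "ok" : "not ok";
    lines.push(`${status} ${index + 1} - ${result.description}`);
  });
  return lines.join("\n");
}

console.log(
  toTap([
    { description: "adds numbers", passed: true },
    { description: "handles empty input", passed: false },
  ])
);
// TAP version 13
// 1..2
// ok 1 - adds numbers
// not ok 2 - handles empty input
```

Because the output is plain, line-oriented text, any reporter or other tool that understands TAP can consume it from a pipe, which is the Unix-style escape hatch mentioned above.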
This is exactly why I use Tape, for very similar reasons - few options, easy extensibility, and also outputs TAP, so… I think we’re on the same page there.
I get asked a lot about why I use Tape, and some people haven’t heard of it just because I think it’s been around for a while and it doesn’t have async/await support out of the box. So, Promise support out of the box these days, but that’s easily resolved… So a lot of people haven’t seen it before, and they do say “Why don’t you use Mocha instead?” and I say “Well, this gives me pretty much everything that I need, and no more than that.”
[00:31:51.12] Yeah, there’s a philosophical difference, too. If you look at something like Mocha, which is maybe halfway between Tape and something like Jest, where it’s just like enormous and totally configurable – and some people prefer one thing or the other. I’m kind of curious, how does that relate back to this idea of, you know, saying no to options in the business. Like, what if really what you’re trying to build is this very complex thing that does everything? Is that just a recipe for failure? Because like we’ve seen with Jest - that’s successful and it works for a lot of people.
Getting back to the very first thing that I was trying to describe as being like a failure mode, which is unclear expectation setting… This doesn’t become a problem because software systems have too many checkboxes. It’s a problem when it’s death by a thousand paper cuts of unplanned features, where taking the time to understand what was gonna be built upfront… To your point, this business has 8,000 permutations that need to be able to be configured for each licensee of the software - if you tell me that on day one, I’m gonna really take seriously that I need to have a data-driven, probably a schema around that piece of data to be able to validate a proper configuration, and a whole subsystem that just focuses on making that configuration sacrosanct. And whatever it looks like, whether it’s a rules engine, or an adapter layer or something, to allow the core functionality of the system to respect lots of variability based on configuration.
But what ultimately happens is people typically build an MVP that works one way, and then on week three they have to add a second way, and then on week seven it’s like “Oh, and it has to work in this third context, too.” All of those just get kind of shoveled into like a router, or a controller, or a branch in some model code… There’s never the rainy day that you’re like “Oh yes, at the fifth checkbox - that’s the moment when I need to zoom out and thoughtfully refactor everything.” Because normally it’s that team that would struggle with rearchitecting something, especially when it’s still new.
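As a rough illustration of the "schema around that piece of data" idea, the sketch below declares every configurable behavior in one place and validates a licensee's configuration against it. The option names (`invoicing`, `currency`), the `OptionSpec` type, and the `validate` function are assumptions made up for this example; a real system would likely need richer rules.

```typescript
// Hypothetical option schema: every configurable behavior is declared once.
type OptionSpec =
  | { kind: "boolean" }
  | { kind: "enum"; values: readonly string[] };

const SCHEMA: Record<string, OptionSpec> = {
  invoicing: { kind: "boolean" },
  currency: { kind: "enum", values: ["USD", "EUR"] },
};

// Returns a list of problems; an empty list means the configuration is valid.
function validate(config: Record<string, unknown>): string[] {
  const errors: string[] = [];
  // Reject anything the schema does not know about.
  for (const key of Object.keys(config)) {
    if (!Object.prototype.hasOwnProperty.call(SCHEMA, key)) {
      errors.push(`unknown option: ${key}`);
    }
  }
  // Check every declared option for presence and type.
  for (const key of Object.keys(SCHEMA)) {
    const spec = SCHEMA[key];
    const value = config[key];
    if (value === undefined) {
      errors.push(`missing option: ${key}`);
    } else if (spec.kind === "boolean" && typeof value !== "boolean") {
      errors.push(`${key} must be true or false`);
    } else if (
      spec.kind === "enum" &&
      !(typeof value === "string" && spec.values.includes(value))
    ) {
      errors.push(`${key} must be one of ${spec.values.join(", ")}`);
    }
  }
  return errors;
}

console.log(validate({ invoicing: true, currency: "USD" }).length);  // 0
console.log(validate({ invoicing: "yes", currency: "USD" }));        // one error, about invoicing
```

The design choice here is that variability lives in data that can be checked up front, instead of in branches scattered through routers, controllers, and models.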
So I guess the last point to wrap up this section is how do you instill that mindset of constraints to the business owners or the stakeholders in the project?
I think that the number one thing is to make sure that you’re doing everything that you can to support them and that they be empowered to actually make a decision. The hardest thing about being a product owner, or when companies struggle with strong product management - the number one reason that they fail is that they are not really fully willing to give that person the ability to make decisions. So if that person has the authority to and has been empowered by and has permission to say no, then a lot of good stuff can happen, because then you can kind of work with them and negotiate with them in good faith in an ongoing way, and develop rapport, and develop a way of working together.
But if what they’re really doing is just kind of a human router of thousands of other business interests, and they just kind of have to say yes to everything by default, you’re gonna arrive at outcomes that look more like this one.
And I think back a lot to an old quote from Jony Ive with (I think it was) probably the second or the third iteration of the iPod… I remember at the time the kinds of things people were really angry that the iPod couldn’t do, stuff like it didn’t have an FM radio tuner, which sounds quaint now… What Jony said in this interview was like “We are very careful about adding new things to our products. Because once you add a new thing, you can never take it away.” And that is the same way that most software that we write for businesses works. The permission to add a new feature, to add a new test, to accommodate somebody’s need is usually given… But once we forget why we did that, we never feel safe just taking it away, and the pain that that might cause somebody down the stream is always higher than we bargained for.
Let’s just hope that’s not true with the touch bar.
[laughs] I think I’m on your team there, Nick.
Alright, so the last section, in our talks on minimalism, we did invoke Jest, Tape, Teeny and Mocha, and we have Justin and Chris here, so I think there’s probably a lot of testing knowledge here… So I just have to ask Justin - what’s your approach here, what are your thoughts on testing? Are you pretty good at it?
[laughs] Well, we named the company Test Double as sort of a joke. If you’re not familiar, Test Double is just a bit of jargon that a fella named Gerard Meszaros coined when he was writing a book called xUnit Patterns about unit testing… And it just means like a fake thing that you use to stand in for a real thing in your tests. I think he was probably referring to a stunt double… And it’s anecdotes like those that you get with my level of experience, but I’m not sure that high-quality ability to write tests is necessarily that much better… But honestly, these days when I think about testing, my primary focus is on making sure that people are well-situated to get a good return on their investment.
It’s still the case that a lot of systems have no tests, it’s still the case that the people who do have tests are sort of just doing it because a testing sounds good, or because somebody mandated that everything be tested. But beyond that, there’s a tremendous value in asking the question of like “What do we want out of our tests? How much are we willing to invest? Will we know we’re getting a good ROI out the other end? What can we be doing to set ourselves up to be even able to answer that question.”
So thinking about the individual purpose of each test, and kind of talking through that and designing specific tests for particular uses, as opposed to just sort of like a boolean state of “is tested/is not tested”, those are the sorts of conversations that I find really fun and engaging when I’m working with a new team.
So how do you determine if you’re getting value from your testing efforts?
I’m curious if anyone else on the call has an opinion on that.
I think it’s good if things get caught before they go out to production. I think that’s probably a very obvious statement to make, but it’s valuable if it’s catching things that humans are missing, in my opinion… And then you can always talk about how you can improve things, so that you don’t even have it get to the point where the test failed. That’s definitely the number one advantage that I see - that sort of feeling when you feel so much relief, because you’re like “Oh, thank God that didn’t go out to production like that.”
It’s way better than my answer… I was just gonna say that my test coverage number is green.
I find it valuable when a test was already written, and you go and make some unrelated change, and a test breaks somewhere else. That tells me that I’ve done something right in writing that test.
Now, you can go overboard with that almost daily. If I make a change and a test breaks, well then I’ve probably at least written a test that is not useless. As I said, it may go overboard, where it just becomes – you have so many tests that are so tightly coupled to your implementation and you can’t make a move without breaking a bunch of stuff. That’s going too far. But I definitely expect tests to fail, and I am happy when they do… So I’m on the right track there.
Yeah, and that’s actually probably a good way to talk about what’s our purpose with each test that we write… And the way that I arrive at that usually, especially when I’m looking at a test or watching people writing tests on a team, is I ask “Okay, so why should this fail? What is something that could happen, that you would expect to cause this test to fail?” That’ll tell you why that test exists.
One example might be what I call contract tests. Let’s say that you’re in an organization where you depend on a microservice that is managed by a different team, and you have very particular expectations of how that service behaves; maybe you call a certain number of APIs, in a certain order, and you need to expect a certain stateful outcome from that, and you’re the only person who uses it that way. A contract test, where you encode your expectation of how that thing should work, and you actually commit it into their repository, so it runs as part of their suite - when should it fail? Well, when they made a change that violated your expectations of how that service should work. And why is it in their repo, instead of your repo? Because you want that failure to get as close to the person writing the code in the moment that they’re thinking about whatever change broke them, so that they can make the effortless and cheap fix.
Whereas in most organizations, what would happen is either it goes all the way to production and you find out, or it’s something like, you know, it’s in my integration test suite, that I verify, but by then maybe that microservice was deployed weeks ago, and it’s far out of the mental context of the person, right?
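The contract-test idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CartService`, `InMemoryCartService`, and `checkoutContract` are hypothetical names invented for the example, not anything mentioned in the conversation, and the in-memory class merely stands in for a real service owned by another team.

```typescript
// Hypothetical view of a service that another team owns (assumption for this sketch).
interface CartService {
  createCart(): string;
  addItem(cartId: string, sku: string): void;
  checkout(cartId: string): { status: string; itemCount: number };
}

// Stand-in implementation so the sketch runs; the real service would live in the other team's repo.
class InMemoryCartService implements CartService {
  private carts: Record<string, string[]> = {};
  private nextId = 1;

  createCart(): string {
    const id = "cart-" + this.nextId;
    this.nextId += 1;
    this.carts[id] = [];
    return id;
  }

  addItem(cartId: string, sku: string): void {
    const items = this.carts[cartId];
    if (!items) throw new Error("unknown cart " + cartId);
    items.push(sku);
  }

  checkout(cartId: string): { status: string; itemCount: number } {
    const items = this.carts[cartId];
    if (!items) throw new Error("unknown cart " + cartId);
    delete this.carts[cartId];
    return { status: "paid", itemCount: items.length };
  }
}

// The consumer's contract: the calls we make, in the order we make them,
// and the stateful outcome we rely on. Returns a list of violated expectations.
function checkoutContract(service: CartService): string[] {
  const violations: string[] = [];
  const cartId = service.createCart();
  service.addItem(cartId, "sku-1");
  service.addItem(cartId, "sku-2");
  const receipt = service.checkout(cartId);
  if (receipt.status !== "paid") {
    violations.push("expected status 'paid' but got '" + receipt.status + "'");
  }
  if (receipt.itemCount !== 2) {
    violations.push("expected 2 items but got " + receipt.itemCount);
  }
  return violations;
}
```

Committed into the provider's repository and run in their suite, a function like `checkoutContract` would report a violation as soon as a change alters the status value or item count this consumer depends on, while the change is still fresh in its author's mind.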
So when I’m looking at a particular test, especially one that fails often, those are the kinds of questions that I’m asking… Because you can be much more targeted. We’re allowed to have as many test suites as we want. We can have as many names, and memes, and patterns of tests as we want to have. They don’t just have to be one gigantic bucket of them, sitting in one folder or two folders, one for unit tests and one for integration tests. We can design them to be purpose-built, and fit into certain classes.
And it’s not just about getting to a 100% coverage number, it’s about making the computer do work for us; like, they serve us, and not the other way around.
[00:43:40.06] This is kind of similar to a thing that Node does, and other projects… Mocha does a little bit of this, but other projects are trying to adopt… There’s a tool Node uses to test itself called CITGM (Canary in the Goldmine). What that does is it takes the version of Node, like a pre-release version of Node, and it runs the test suites of like – I don’t even know how many… Maybe it’s 50 of the top packages on npm, or something like that… And it runs those test suites. And if Node does something that breaks its dependents, Node’s gonna know. And in fact, this happened just a couple weeks ago before a Node release, where they made a change in Node that actually failed Mocha’s test suites, and they found that out, not me. So they came to me and said “This is failing. What’s going on here?”
So I went into Mocha and I fixed the bug going forward, but unfortunately, because not everybody’s gonna upgrade Mocha, but they might upgrade Node, it was basically impossible for Node to make that change, so they had to drop the change, so they did… But we’ll do this, where – we just actually started pulling in… Now, we don’t use webpack, but we started pulling in webpack as part of our testing suite… And we wanna make sure that when you bundle Mocha with webpack, it doesn’t fail. We do that with Chai, and RequireJS, for some reason… But that’s a way that open source projects can do a similar thing. Is that kind of the same idea?
Oh yeah, totally. And the funny thing about it, of course - this is an example where a tremendous amount of focus, and sort of the – you know, when you’re working on an open source thing (at least this has been true in my experience), you’re doing it on your own time; for me it mostly comes out of evenings and weekends. No one’s gonna give me an A+ on a performance review because I had 100% code coverage, because my library is really well tested… So it’s a forcing function, so it’s a focusing constraint, to say “I’m not gonna spend time writing tests that aren’t gonna be valuable to me later.” And that ultimately means that I need to have a mechanism for discerning what’s gonna be useful and what’s not gonna be useful or valuable as a test.
But I think for most teams, as you socialize the idea that everything should be tested and then kind of call it a day, you end up in a case where you have a lot of tests that are written that the lowest common denominator between them all is they make sure something works… Or even less, they exercise functionality in the system and it’s not even really clear how much of the system they’re really asserting does work… And when you don’t know exactly – if a test doesn’t encode, either through the type of test it is, or what the test says that it does, or what it’s trying to do… You’ve got all of these ways to express inside of the test when you’re writing, like “This should do this when this happens, because yadda-yadda.” That way, later on, five years from now, some new developer comes in and that test is failing.
The ideal case is that they can know when it’s safe to just start deleting old tests, for functionality that isn’t important anymore, or tests that aren’t providing value… But the status quo for our industry is that once a test gets committed, it becomes somehow sacred, and we never actually feel safe deleting it. So you see these build times just kind of super-linearly grow over time, where after a year maybe it’s like a 30-minute thing, and then after a few years it’s like “Oh man, we’ve got two hours. We’ve gotta parallelize this build.” And no one ever really feels like they have the permission to trim those down, because no one really understands what each test is doing.
Yeah, that’s a good point. That makes me wanna go back through a bunch of old tests and make sure I understand the Why. If the program suddenly decided to disable some feature, what tests then would we expect to be removed, and do I know that even? So if I’m unable to answer that question, well - that seems like kind of a problem waiting to happen.
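One way to make that question answerable is to record, next to each test, which feature it protects and what should cause it to fail. The sketch below is a hedged illustration only: the `TestRecord` shape, the feature labels, and both helper functions are assumptions made up for this example rather than a convention of any particular test framework.

```typescript
// Hypothetical metadata kept alongside each test (all field names are assumptions).
interface TestRecord {
  name: string;
  feature: string;   // the feature this test protects
  failsWhen: string; // the answer to "why should this fail?"
}

const suite: TestRecord[] = [
  { name: "gift card applies at checkout", feature: "gift-cards", failsWhen: "balance is not deducted from the order total" },
  { name: "gift card balance page renders", feature: "gift-cards", failsWhen: "" },
  { name: "order total includes tax", feature: "checkout", failsWhen: "tax is missing from the total" },
];

// Tests we would expect to delete if the given feature were switched off.
function testsToRetire(records: TestRecord[], retiredFeature: string): string[] {
  return records
    .filter((record) => record.feature === retiredFeature)
    .map((record) => record.name);
}

// Tests whose purpose nobody wrote down: candidates for a conversation before they become "sacred".
function testsMissingPurpose(records: TestRecord[]): string[] {
  return records
    .filter((record) => record.failsWhen.trim() === "")
    .map((record) => record.name);
}
```

With the sample data, retiring the hypothetical `gift-cards` feature points at two tests to remove, and one of those is also flagged as having no recorded reason to fail.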
[00:48:18.28] And another thing that we can think about, too - this kind of gets back to minimalism… I had a really fun experience at a client a few years ago, and they were just doing tests for the first time. It was a group of 40 people, and they were like “Well, we’ve never written a test before. That’s why you’re here. Teach us how to be a good organization”, I guess. And one of my projects there was helping a small group of three QA engineers. They had a lot of technical knowledge, a couple of them knew Ruby, a couple had used Selenium before, and they were asking me, “Okay, so what kind of integration tests should we write?” And I could have put them down the path of just like “Oh, well, see what the app does, look at the criteria, through the specifications, and for each story make sure that you’ve got a thing that clicks through the browser and does it.” And you know what - that’s going to lead to the same outcome, of like two years from now you’re just gonna have hours and hours and hours of redundant-looking things, where like 90% of the actual activity is signing up new users for the 80th time in a test suite…
So I said instead – I went up to a whiteboard and I drew a 5x5 grid, and I said “Okay, there’s 25 boxes in this grid, and each box is a minute. So you have a budget of 25 minutes, and your build is never, ever going to be slower than that, or you’re gonna start throwing away tests. And I want you as a QA department to work with the business to figure out what’s the most important stuff that you need to make sure works before a production deploy”, and that way – they were also adopting something akin to continuous deployment… That way what they could do is guarantee the business that testing isn’t going to hold up a deployment for more than 25 minutes. And it also, again, forced value-based decision criteria, like “Could we make this test faster? Could we make this part of the system faster? Is it really important for us to test this kind of musty admin page, when we could just do that manually, and get stuff out to production faster?” Making those kinds of ROI-based decisions on “Is this worth automating?” was really productive for them.
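The 25-box budget can be sketched as a simple planning function. This is a rough illustration, not the process the team actually used: the `CandidateTest` shape, the sample tests, their durations, and the greedy "most important first" rule are all assumptions made for the example.

```typescript
// Hypothetical description of a candidate end-to-end test.
interface CandidateTest {
  name: string;
  minutes: number;  // how long it takes to run
  priority: number; // business ranking, 1 = most important
}

interface BuildPlan {
  kept: string[];
  cut: string[];
  usedMinutes: number;
}

// Walk the tests from most to least important, keeping each one that still fits in the budget.
// Anything that does not fit is cut (made faster, tested manually, or dropped).
function planBuild(candidates: CandidateTest[], budgetMinutes: number): BuildPlan {
  const ranked = candidates.slice().sort((a, b) => a.priority - b.priority);
  const plan: BuildPlan = { kept: [], cut: [], usedMinutes: 0 };
  for (const test of ranked) {
    if (plan.usedMinutes + test.minutes <= budgetMinutes) {
      plan.kept.push(test.name);
      plan.usedMinutes += test.minutes;
    } else {
      plan.cut.push(test.name);
    }
  }
  return plan;
}

const candidates: CandidateTest[] = [
  { name: "checkout flow", minutes: 8, priority: 1 },
  { name: "sign up", minutes: 5, priority: 2 },
  { name: "search", minutes: 6, priority: 3 },
  { name: "admin reports", minutes: 9, priority: 4 },
  { name: "password reset", minutes: 4, priority: 5 },
];

const plan = planBuild(candidates, 25);
// plan.kept: checkout flow, sign up, search, password reset (23 minutes used)
// plan.cut: admin reports
```

The greedy rule is deliberately simple. The point of the exercise is that the budget is fixed and every test has to justify the boxes it occupies, so a cut test prompts the "make it faster or do it manually" conversation.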
Should we wrap there?
Yeah, I think that’s good.
Yeah… There’s so much more I wanted to ask, but… [laughter]
Well, I’m happy to come back if you’ll have me.
And we’ll use this as an excuse to call out, if Searls is comfortable, having people reach out to him on Twitter, if they have follow-up questions or something like that… Maybe we could have a comical little “Well, I would love to ask you so many more questions, but if I was to find you on social media, would that be okay?”
Oh, wow. Excellent segue. That was very graceful. Yeah, so I am a wide-open system; I have open DMs. My Twitter handle is my last name, @searls, which is difficult to spell phonetically… So it’s like Pearls, but with an S instead of a P. And yeah, I’d love to hear from anyone listening to this. If I can help you out in any way, it’s literally my job.
Thank you, Justin.
Yeah, thank you very much. I appreciate it.
Thank you all.
Our transcripts are open source on GitHub. Improvements are welcome. 💚