GOTO - Today, Tomorrow and the Future

Expert Talk: Cloud Chaos & How Contract Tests Can Help • Holly Cummins & Kevlin Henney

November 25, 2022 Holly Cummins, Kevlin Henney & GOTO Season 2 Episode 46

This interview was recorded at GOTO Amsterdam 2022 for GOTO Unscripted. gotopia.tech


Holly Cummins - Senior Principal Software Engineer on the Red Hat Quarkus Team
Kevlin Henney - Consultant, Programmer, Keynote Speaker, Technologist, Trainer & Writer

DESCRIPTION
Today cloud native and cloud transformation are more than buzzwords. However, most companies and development teams have not yet surpassed all the hurdles that come with moving to the cloud.
Holly Cummins and Kevlin Henney dismantle why many organizations think that by adding microservices to their cloud strategy they are ‘doing cloud right’, and explore how contract testing can help reduce the risks of microservices deployments.

RECOMMENDED BOOKS
Holly Cummins & Timothy Ward • Enterprise OSGi in Action
Kevlin Henney & Trisha Gee • 97 Things Every Java Programmer Should Know
Kevlin Henney • 97 Things Every Programmer Should Know
Henney & Monson-Haefel • 97 Things Every Software Architect Should Know
Pini Reznik, Jamie Dobson & Michelle Gienow • Cloud Native Transformation
John Arundel & Justin Domingus • Cloud Native DevOps with Kubernetes
Kasun Indrasiri & Sriskandarajah Suhothayan • Design Patterns for Cloud Native Applications
Alexander Raul • Cloud Native with Kubernetes


Intro

Kevlin Henney: Good day or good night wherever you are, whenever you are watching this. My name's Kevlin Henney. We are recording this GOTO Unscripted session at GOTO Amsterdam 2022, rescheduled from 2020. And I'm joined by Holly Cummins, who is Senior Principal Software Engineer on the Quarkus team at Red Hat. Holly Cummins, welcome.

Holly Cummins: Thank you.

Kevlin Henney: You're going to be giving a talk, which kind of brings a whole load of concerns together. So, could you tell us a little bit about it?

Cloud Chaos & Microservices Mayhem

Holly Cummins: The title of the talk is "Cloud Chaos and Microservices Mayhem," which is kind of reflecting some of the experiences that I've had in my previous life with IBM as a consultant in the IBM garage.

One of the things that I've seen is that we've been talking about cloud for a long time now, but I think as an industry, we haven't quite caught up to it yet. And we're still getting these things where the assumptions baked into our technology don't play well with cloud or... So, there's sort of the technology side but then there's the people side which as we all know is harder. So, we've set up all of these processes to try and make software engineering safe and reduce risk. And they were a good idea until the cloud came along and now they're a terrible idea.

Kevlin Henney: We're seeing, as you said, we're kind of well into the cloud era. I mean, if we can stretch it liberally back to about two decades and say that we've got that, but we are still, if you like, at a relatively youthful stage in terms of the infrastructure of what has become standard in terms of what a developer can rely on.

If we're talking about that, then we're also talking about what are most developers using, their day-to-day languages, their stack, and so on. There's a kind of a, as you said, the people are the harder bit. It's the practices as well as some of the tools.

This is a really big question, perhaps a little question: what are the challenges that you're seeing? Because it's obviously not a case of, "Hey, here's cloud. It's just like moving to a different OS. You are familiar with this and therefore this is identical." It's not. What are the challenges? What are the big hiccups that you are seeing?

Holly Cummins: So, some of them, I think we're sort of a bit further along with but there's a whole bunch. So, tracing and logging is something that I think we're kind of getting to grips with. So, I think probably all of us had the experience when we first started doing cloud, that we had everything going out to the logs and then the container died and it took the logs with it. And then we went, wait a minute. I needed those logs. Come back.

We had to learn, whatever you do with your logs, make sure they don't stay on the container. They have to go elsewhere. And then there's the sort of the next step up, which is yes but my application isn't just one container. My application is 600 containers. In order to have any hope of diagnosing a problem, I need to have some sort of correlation between what's going on in all these containers.
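As a rough illustration of the two lessons above (logs must leave the container, and lines from many containers need something to join them on), here is a minimal Java sketch. The class name, the field names and the JSON-style layout are assumptions made for this example and do not come from the interview or from any particular logging library.

```java
import java.util.UUID;

// Sketch only: names and log layout are illustrative assumptions.
class CorrelatedLog {

    // Builds one JSON-style log line that carries the correlation ID,
    // so lines written by different containers can be joined later.
    // Note: values are not escaped, which a real logger would have to do.
    static String format(String correlationId, String service, String message) {
        return String.format(
            "{\"correlationId\":\"%s\",\"service\":\"%s\",\"message\":\"%s\"}",
            correlationId, service, message);
    }

    public static void main(String[] args) {
        // One ID is created where the request enters the system
        // and is passed along with every downstream call.
        String correlationId = UUID.randomUUID().toString();

        // Each service writes to stdout rather than to a file in the container,
        // so a log collector outside the container can keep the lines.
        System.out.println(format(correlationId, "orders", "order received"));
        System.out.println(format(correlationId, "payments", "payment authorised"));
    }
}
```

Searching the collected logs for a single correlation ID then shows one request's path across services, which is the kind of correlation the 600-container case needs.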

So, I feel like we're maybe now entering a bit of a golden age of observability where we are seeing a lot of tooling maturity, and we're seeing the integrations and the convergence. We've got, you know, OpenTracing and OpenTelemetry, and they're kind of getting there. I think the adoption is maybe a little bit behind the standards. So, at every observability talk I go to, the speaker will usually ask, "So, who's using observability tools?" And there'll be sort of two sheepish hands that go up. So adoption's lagging a bit behind, but at least we know what we have to do on that one. Whereas I think on some of the other ones, we don't really even know what we should be doing yet.

So, things like releasing, I think we're still figuring out. We have a feeling that we should be releasing more continuously than we are, but we're not totally sure how to manage the testing for that. We're not totally sure how to manage the risk for that. We're definitely not sure how to persuade the business that we've got a handle on the risk, because that goes back to that process thing, where the business is using the processes that made sense 10 years ago. They may not make sense anymore.

But then you start to get into this horrible world of versioning, because even if you get everything right, and you've got the continuous delivery, and you've got the continuous deployment, if someone is consuming you, rather than you just being a web app, do they really want you to be doing continuous delivery? How do you guarantee the compatibility? Should we be going back to semantic versioning? Should we... So there's, I think, a whole bunch of questions there that none of us really know the answer to.
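To make the semantic versioning question concrete, the following Java sketch shows one way a consumer could check whether a provider's version is still compatible with the version it was built against. It is a simplified assumption for illustration: the class and method names are hypothetical, pre-release tags and build metadata are ignored, and it does not address the harder question raised here of whether version numbers alone are enough.

```java
// Sketch only: a hypothetical compatibility check, not a full semver parser.
class SemVer {
    final int major;
    final int minor;
    final int patch;

    SemVer(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // Parses plain MAJOR.MINOR.PATCH strings such as "2.4.1".
    static SemVer parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH: " + text);
        }
        return new SemVer(
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2]));
    }

    // True if a consumer built against `required` can use this provider version:
    // same major version, and this version is not older than the required one.
    boolean satisfies(SemVer required) {
        if (major != required.major) {
            return false; // a major bump signals a breaking change
        }
        if (minor != required.minor) {
            return minor > required.minor;
        }
        return patch >= required.patch;
    }

    public static void main(String[] args) {
        SemVer required = parse("2.3.0");
        System.out.println(parse("2.4.1").satisfies(required)); // prints "true"
        System.out.println(parse("3.0.0").satisfies(required)); // prints "false"
        System.out.println(parse("2.2.9").satisfies(required)); // prints "false"
    }
}
```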

Developers’ mental models

Kevlin Henney: I think it's an interesting thing that you're bringing out there, which is a lot of this is coming down to mental models. It's a collision of mental models. You described exactly that the issues with logging is because as developers, we hold a mental model of how the machine works, how our program works, this is how code works, and I do this.

And we have this mental model, even when we don't explicitly externalize it, of how all this fits together and how it works, and every developer, depending on their own career path, picks up a mental model. And you don't really realize you have those assumptions until they collide with something that tells you that was an assumption. You know, your assumptions are discovered by contradiction. "Oh, I had assumed that." At that moment, you had an assumption. Oh, so the logs went with the, ah, because it was in the container.

It was contained. Ah, okay. So, therefore. That's an assumption, but also you're talking about the business having particular assumptions as well. You know, here's the things that work for us and in business memories sometimes 10 years feels, you know, depending on how they look at it, it's either timeless, this is how we've always done it, or didn't we just have to adopt a whole load of stuff for people with longer memories. They're going, well, you want us to do stuff again, you know?

But also that idea of meaning that you're bringing out. I think the versioning thing is an interesting one because when we talk cloud, there's a whole, almost like word cloud. You know, you say cloud and actually, there's a whole set of associations, and buzzword bingo is played effectively.

That can be quite bamboozling, but it's a case of there's already a lot of stuff that I have to depend on. That has versions, and then I'm offering something. And that exact question that, you know, maybe other people don't want us to continuously... well, continuous delivery: the downside is the potential for continuous disruption. And normally people say, "Really, just stay still, we are building an application here, stop moving around," and that's a big challenge.

Holly Cummins: I think that the idea of mental models actually, I really like as a way of thinking about one of the other things I'm gonna be talking about, which is microservices. And I think that some of the challenges we see with microservices I think have exactly to do with the mental models.

So, the promise of microservices is that your mental model can get much smaller. Instead of having to hold the whole application in your head, all you have to do is hold the small piece of the application in your head. And as long as it's correct, the system is correct.

But of course, we all know it doesn't work that way. You can't make a correct system by just making lots of correct individual parts. And so, then there's this sort of contradiction between the promise of you have this very small mental model and oh actually, no, you have to hold the whole thing in your head but it's kind of quite hard to reason about and it's actually probably much harder to reason about than it was before the shift to microservices and how do we...what do we do? How do we make this work?

Kevlin Henney: I think that for a lot of developers and the reason people move to microservices is sometimes a, you know, it's, you know, why you move to microservices? Well, because everybody else is. That's not very compelling but it is, I'm gonna say, it's not unknown. There's a lot of people doing that. And so, therefore it means that they...perhaps those people are not really prepared for the surprises.

The fact that you are now suddenly saying, everything that you could rely on, you can't. There's latency, there's failure modes that you never even dreamt of. When I call, if I'm in a code...if I'm in a single process space and I'm calling a method on an object, there really is nothing that can go wrong with the call of that method. I don't sit there going like, "I wonder if the dot doesn't work." It's a method. I take my object and object dot method plus arguments. I'm expecting everything. I have a very simple mental model of how that works. The dot works, the delivery construct works, it's built into the language.

Unless the reference is null, in which case the null handling works, there are no surprises there. It all works. And I have a very simple model. I get the results back. It's all synchronous, single-threaded, utterly, quite genuinely reasonable in the original sense of the word reason.

Then we say, okay, microservices, let's divide this up. And suddenly it's just like, you can't rely on the dot. That whole thing. Actually, the delivery mechanism actually, that's a little more complex as well. Oh, don't assume that these are all synchronous and ordered either. Where's your data, and what's this quality?

All of these are potentially confounded. It's not that there aren't benefits to be had, but I think sometimes people underestimate how much of a shift that is to their day-to-day programming model.
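The contrast between "the dot just works" and a call that crosses the network can be sketched in Java. This is only an illustration under stated assumptions: `callWithTimeout` and `slowService` are hypothetical names, the "remote" service is simulated with a sleep, and returning a fallback value is just one of several ways to handle a slow or failed call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch only: simulates a remote call; names are illustrative assumptions.
class RemoteCall {

    // Runs `call` on another thread, waits at most timeoutMillis,
    // and returns `fallback` if the call is too slow or throws.
    static <T> T callWithTimeout(Supplier<T> call, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call)
            .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
            .exceptionally(error -> fallback)
            .join();
    }

    // Stand-in for a service on the other side of a slow network.
    static String slowService() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "late reply";
    }

    public static void main(String[] args) {
        // In-process style: the reply arrives well within the time limit.
        System.out.println(callWithTimeout(() -> "reply", 2000, "fallback"));

        // Latency: the caller gives up after 50 ms and uses the fallback.
        System.out.println(callWithTimeout(RemoteCall::slowService, 50, "fallback"));

        // Failure: the service throws, and the caller again uses the fallback.
        Supplier<String> broken = () -> {
            throw new IllegalStateException("service unavailable");
        };
        System.out.println(callWithTimeout(broken, 2000, "fallback"));
    }
}
```

None of this ceremony is needed for an ordinary method call in a single process, which is the shift in programming model being described.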

Holly Cummins: I think sometimes we end up in this sort of worst-of-both-worlds situation where you have all of the costs, because those are non-negotiable technically. You can do the right thing to manage them, and you get a service mesh or something to try and help, but fundamentally those don't go away.

But often you get the cost without the benefits because one of the benefits of microservices is the independent deployability but then that's kind of scary. So, then you still release everything in a batch. And then at that point, it's not like a double win, it's like a double loss.

Kevlin Henney: I think that is the interesting thing, that whole, you know, what's the value proposition? There's certain scaling properties, the independence of deployment, all of these things. And then you sometimes ask people and it's just like, "We don't need to scale." It's just like, well, why are you doing all of this? And, "But we've also got, you know, we need to deploy everything all at once."

Or you talk to people and you have this kind of conversation about how frequently are you deploying. And it comes back every few weeks. Maybe microservices are not where you're at. If you're getting friction in your dailies or multi-, multiple times a day, you need to be working in this space but that's not where people are.

So, there's a little bit of fashion-driven adoption. I get the sense that in the microservices space, and I was talking to Fred George about this one, there are people who are genuinely going for the original vision. That partitioning down, that independence. I understand the costs and there's a trade-off. That is the literal sense of trade-off. It's a trade. It's like, I'm prepared to pay the costs because I'm getting these benefits, and I'm actively using these.

There are people who are pursuing that, and then there's other people who are kind of like, well, you know, conference-driven development, everybody's talking microservices. But as you said, when you go into these things, and you ask about the architectures and the tooling and who's doing this, there's only a small smattering of hands but there's a lot of interest. But sometimes people are walking away going like, "Well, we need to be doing this." And they only pay the costs because they don't need those benefits yet. Maybe they do a little further down the line, but they find themselves in a surprising new world of clashing mental models, and they're trying to sell it to the business as well.

Holly Cummins: Yeah. I think often nobody wants to be the one who says, "It's 2022, but, no, we're not doing microservices." But I think, as you say, you have to sort of go back and ask: what problem are we trying to solve? Really, what problem are we trying to solve? And sometimes when you drill down, the problem we're trying to solve is that everybody else is doing microservices. It's like, well, is that really a problem?

Kevlin Henney: Yeah. That's right.

Holly Cummins: You decide.

Developer experience

Kevlin Henney: CV-driven development. Again, it's that idea of appropriateness and context. And I think that the modern software developer is challenged by this ridiculous amount of stuff that they have to know or could know. I mean the landscape is vast. And if a software developer enters the mobile world, they may never have to worry about what's going on at the back end, if that's their concentration.

Or embedded developers, they see a very different view. There are so many different parts to this landscape. Therefore it's very easy to say, well, those people over there are doing this. Maybe I should be doing this over here. Maybe you should but maybe you shouldn't.

But I think that brings another kind of...that brings another challenge, the poor developer. The poor developer in the modern landscape because there is so much to know. There's no way anybody can know everything.

I think a lot more these days, people have been talking about the developer experience. Recognizing and it's I know it's something that I've been quite keen on for a while is this idea of as a developer, I am a user of the tooling. As a developer, I am a user of our programming guidelines and conventions and our practices. I'm a client of that. What do I want from it? What can I get from this? What are the things that make life hard for that but also what can be made easier because the move to cloud is...you know, that because there are so many things because literally, it is a shopping list, it's a word cloud of possibilities and technologies. It can be quite confusing.

Holly Cummins: It's such a good question and we're sort of having this conversation about developer experience more and more now, but I think the reason that we're having it is because at the moment the developer experience in our landscape is pretty poor. I think sometimes we end up again with the tension between fashion and developer experience. So, if you look at Kubernetes, for example, it is fiendishly hard and it is much harder than some of its predecessors like Heroku and Cloud Foundry, and it gives you a lot more in terms of the flexibility.

But sometimes we, again, pay that cost, and we didn't actually need those other benefits, but no one wanted to be the person who said, "Actually, I'm kind of dumb, so could I not use Kubernetes, please?" Right? You know, that's not a conversation you want to have with your management.

So, then we end up sort of all in order to, you know, prove we're tough enough for Kubernetes. We sort of race towards Kubernetes, and now we're sort of doing that reset and say, "Wait a minute, does it have to be this hard? Couldn't we make it a bit easier?"

Kevlin Henney: Just because something can do everything doesn't mean you want to do everything. I think that's the interesting challenge: in development, we are normally brought up on a steady diet of abstraction and generality and all this kind of stuff. Therefore our platforms inevitably reflect that because they are trying to be, in some sense, general.

It's like an operating system. It doesn't know who its users are and it doesn't care. It has to be providing all of this. But to fully know it, and that's the problem, is that there are so many different things, and nobody wants all of these things, or rather, no, let me rephrase that, nobody needs all of these things all in one go. They've got this huge overwhelming learning curve. It's massive.

Holly Cummins: The sort of marketing towards what people want to want rather than what they actually need reminds me of a completely different domain when Febreze launched. So, you know, Febreze you sort of spray it around and it, you know, eats the odors by a magical chemical process. And when it launched, it completely flopped.

And the reason was that their sort of marketing, subtext of their marketing was "Are you kind of dirty and a bit gross? If so, this is the product for you." And even though there were people who needed it, nobody wanted to be the one who said, actually, yes, I'm your demographic.

So, they changed how they marketed it, and they changed it to, "Are you really quite clean? This is a nice little thing that you could do after cleaning as a little treat, you know, you spray Febreze around." And so, it was completely inappropriate for the actual capabilities of the product, but people wanted to be the person who was buying it. I think we sometimes see that with some of our technologies as well.

Kevlin Henney: I think that is incredible. I think you've actually hit the nail on the head there because there is this idea that normally whenever anybody is pushing technology, no matter what its specific possibilities are, either you don't... or the people that need to see that, don't see it. This is actually, you are the demographic, this would help you, or it gets marketed more generally or pushed more generally. In other words, hey, this solves a specific problem, but now let's push it to a broader thing.

I think that's a very human thing, but I think, again, it presents us with that challenge as a developer. Let's just say I've graduated. I've maybe I've spent the last few years at university. I've been doing a couple of languages, I've learned a few paradigms, I've got a bit of Java, and got a bit of Haskell, and suddenly discovered that actually, the job market has precisely zero interest in Haskell.

I've got my Java skills in there. And then maybe I did a module on machine learning, maybe I did whatever my final year was. I come out into the world, and I'm presented, you know, here's this mobile space, here's this cloud stuff, and then there's this embedded stuff over here, and there are all of these things, and I'm confronted with full stack development.

Again, that term is particularly interesting because of the full stack, where does cloud live in that? Really does that mean I have to know everything about Kubernetes? That kind of thing. How full is my stack? I'm presented with all of these possibilities, and that's hugely daunting. What do I do with my kind of raw programming skills? What do I do with the stuff I already know? It's like, where does that take me?

Internet Legacy

Holly Cummins: I think we have another challenge as well, which is a slightly different one, but I'm sort of quite conscious of it now, which is that we're starting to build up, not just a legacy in our industry, but we're starting to build up a legacy on the internet. So, now if I go and I google, "How do I do a REST service?" I find so many wrong answers that, you know, sort of seem to ignore the possibility of JAX-RS and that kind of thing.

You're doing it at this really low level and it's like, no, don't. There are higher-level APIs, but there's all this sort of cruft of knowledge that's obsolete that we don't really know how to get rid of yet.

Kevlin Henney: There's geology here, isn't there? There's kind of like there's a kind of time strata and that the internet as a source of wisdom, they accidentally reinforce where the problems are and that becomes quite interesting.

As you said, there are higher-level abstractions, but when somebody googles a very specific question, perhaps they don't know that it's one of those things. Sometimes in somebody's question, they're actually asking something else. It's just like, actually, you didn't wanna be operating at this level. We now have this. That's how we do things.

Holly Cummins: What problem are you trying to solve?

Kevlin Henney: Exactly. What problem are you trying to solve? Because you are actually giving me part of a solution. You've anchored on a solution, and we are very solution centric in this discipline. So, you've anchored on a solution. You're asking a question about the solution, whereas actually if somebody took a step back and said, ah, actually we can unask that question. If you use this technology or you use this, where you wanna be pitching is at this level, for which the question you're asking has already been solved and there is no way of waiting. Yeah. And that's not just a case of the stuff on the internet that's old, it's the fact that it's the way we ask our questions, that frame of knowledge. And so, kind of relating to a bit of that and the age and the accumulation of things.

So, a couple of years back, you contributed a couple of pieces to the book that Trisha Gee and I edited, "97 Things Every Java Programmer Should Know," and you had a piece in there, "Java should be fun." And at the moment, so just sort of recapping where you are with Quarkus that's basically trying to bring the Java world to the cloud more comfortably as it were.

And you made a number of really interesting points in the "Java should be fun" piece, one of which was you distinguish between fun and unfun. There are things that are unfun and it's just like boilerplate code, stuff like that. It's certain tedium of things. Then there are other things that are fun. Whether or not they radically change the way we think or whatever is not the issue, they make it engaging for the developer.

It's just like when Java introduced lambdas: suddenly lambdas and streams kind of loosened things up. Maybe they don't really change things at one level, but maybe they do at another. They make it more enjoyable.

But you also acknowledged the age of Java because it's not young anymore by any stretch of the imagination, but it's everywhere, and that's the thing: people are coming out of wherever they are with, I mean, it's a classic back-to-work training skill, but the world in which the Java of the 1990s... When Java came in, it was just like, hey, you can write applets. If you tell people now, they don't believe it. It's just like, really? You're supposed to write stuff in your browser. It's just like, but how did that ever work? It didn't, but it was a marketing ploy.

But then we see how Java has evolved through all of these things. We see it at the back end, all the way through that stack, and now we're throwing it into the cloud. Isn't that a bit much? Isn't that bewildering for the poor Java programmer?

Holly Cummins: I think Java is in a really good place at the moment. It does feel like it's having a bit of a renaissance, and I think it's now got what I think is probably quite a humane release cadence, which is not so fast that it causes terror and panic but not so slow that you're sort of there, weighed down, going, "But can't I use... can't I use...?"

Kevlin Henney: Yes.

Holly Cummins: Then were some of the...sort of the like what Quarkus is doing I think is really interesting because it sort of satisfies a personal belief of mine, which is that often you can solve two problems at the same time, which is really quite nice when that happens.

And we saw a similar thing in the opposite direction when I worked on WebSphere Liberty, which became Open Liberty. The problem we were trying to solve was the developer experience because the classic WebSphere was very much optimized towards running on big iron and optimized towards the sort of administrator experience.

Then when you were developing on it you needed quite a significant piece of hardware and it was a bit slow and it was like, come on, surely we can do better than this. We made something that had an OSGi kernel. It really had sort of the same libraries as traditional WebSphere, but it was light and fast and delightful to use and it started up in a few seconds. And that was just when the cloud was really starting to take off.

And then we discovered, oh, wait a minute. This thing that we wrote to have a really good developer experience happens to have exactly the right characteristics for the cloud as well because it's small and light and starts up quickly.

Kevlin Henney: Right.

Holly Cummins: And with Quarkus, they've kind of gone in the opposite direction and they've said, what would we have to do to make a really good runtime for the cloud? It's gotta be really small. It's gotta start up quickly. And then in order to do that, what they've done is they've shifted to doing a lot of the stuff at build time, rather than at runtime. So, all of the annotation scanning, all of that, which normally happens at runtime and is kind of slow, they said, okay, well, let's do it at build time. And in order to make that work, you've gotta do the bytecode manipulation.

And then they said, well, wait a minute. While we're doing the bytecode manipulation, there's all this boilerplate. And we have that boilerplate because to not have it would be too slow, but now that we're doing everything at build time, we can get rid of that boilerplate and get this delightful developer experience.

So, then one of the things they talk about is developer joy, which exactly goes back to Java should be fun when you see this kind of boilerplate, almost always it can be automated away.

Kevlin Henney: So, that's really interesting. Also, there's a kind of about how we move, how things... I thought everything moved in one direction. It's kind of like the, if you, I think if you stand back from the industry and watch it over long time periods, things slosh around.

An example of the sloshing around is everything ran on the mainframe. Then we started having PCs, and it kind of sloshed out towards there, and then we sort of reintroduced the network, and there's this kind of old-style wave machine moving where the center of gravity of my application is. We're moving a lot more intelligence into what's in my hand, but now we're realizing, well, actually I'm gonna move a whole load of other stuff back again.

It keeps going to and from, but also there's another timing thing about when we do things. So Java is a static language but has a very dynamic element to it that is resolved at runtime, optimized at runtime, but then you're saying like, okay, well, let's pull some stuff back into the build stage, which is as it were, it's offline for anybody who's that they don't have to...they don't see that. When you're running that, you don't see that.

That's that development time stuff but it's not runtime stuff. So, if we move some of that back then that means we have a slightly faster run time. This is kind of sloshing to and fro, which you don't get as much with say a dynamic language whereas no, no, we push all of that back to the runtime because perhaps it doesn't cost us as much for certain scenarios, but there seems to be this kind of sloshing movement. And one thing I was looking up on Quarkus was that startup times were really good.

Holly Cummins: Ridiculous.

Kevlin Henney: Yes, really good.

Holly Cummins: Completely stupid.

Is testing an obstacle?

Kevlin Henney: Yeah. And that's the whole thing: there's literally a latent expectation, an expectation of latency, with certain things. And that, going back to the developer experience, changes the feedback cycle, doesn't it? Which I guess, when we look at the bigger developer experience, takes us back to one of the reasons people sometimes object to or subconsciously resist, say, testing: they're saying, well, that's a slow step. Oh, that takes too long.

And that's something you're talking about, isn't it? Is the testing side of things. How does that kind of fit into that cycle as it were? How does it kind of conveniently fit within that? Because people feel there is an obstacle in their way, then they optimize the obstacle out of the way. And if they perceive testing like that, they kind of move it out the way. They don't test.

Holly Cummins: Yes. There are sort of two sides to that. One, continuing with the Quarkus side, is they've done some really interesting things. Again, I think because they had that problem that they were solving, and then they were in the bytecode: if you move everything to build time, then your build times are gonna be longer.

Then they said, well, we can't. You can't possibly have a development experience with a long build time. So, then they had it all be sort of quite continuous and dynamic. So, you run it in dev mode and it's just instantaneous and that's true for the testing as well.

They've done quite a clever thing where they have this thing called continuous testing, and it uses similar techniques to code coverage. So, it instruments your code. It knows what tests relate to what code.

If you change that code, it will run just those tests dynamically. So again, you get that instantaneous feedback, even though you're in this sort of notionally quite static mode. So, that's the one side, but then there's the psychological side of it as well. And I think part of it has to do with how bad we are at accounting for costs, and if something is visible, we really see it. This is the most inane statement ever.
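The idea of knowing which tests relate to which code can be sketched as a simple lookup. This is not how Quarkus implements continuous testing, which the interview describes only in outline; it is a hypothetical Java illustration, with made-up class and test names, of selecting tests from previously recorded coverage data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch only: a hypothetical test selector, not Quarkus's implementation.
class TestSelector {

    // Coverage map: test name -> classes that test executed when it last ran.
    private final Map<String, Set<String>> coverage = new TreeMap<>();

    // Stores what instrumentation observed during a test run.
    void recordCoverage(String testName, Set<String> classesTouched) {
        coverage.put(testName, new TreeSet<>(classesTouched));
    }

    // Returns, sorted by test name, the tests whose recorded coverage
    // includes the changed class. Only these need to be rerun.
    List<String> testsAffectedBy(String changedClass) {
        List<String> affected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            if (entry.getValue().contains(changedClass)) {
                affected.add(entry.getKey());
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        TestSelector selector = new TestSelector();
        selector.recordCoverage("PricingTest", Set.of("PriceCalculator", "Currency"));
        selector.recordCoverage("OrderTest", Set.of("Order", "PriceCalculator"));
        selector.recordCoverage("UserTest", Set.of("User"));

        // A change to PriceCalculator triggers two of the three tests.
        System.out.println(selector.testsAffectedBy("PriceCalculator")); // prints "[OrderTest, PricingTest]"
    }
}
```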

If something is visible, we really see it. If it's invisible, we don't see it. But I think when we do something like testing, it feels like an upfront cost to write the test. But sometimes I'll do something, and I'll try to track and I won't write the test. Then I find I'm like in the browser, just going refresh, refresh, change, refresh.

And I'm like, if I had a test running, this would've saved me all of these steps, and I would've known much more quickly whether it was actually working. But because there was that sort of small upfront cost, it feels like it's more expensive, even though it actually saves you so much down the line, even before you get to the "and I've just regressed it in production and I'm doing a demo in two hours and I need to fix it." All of those kinds of horrible embarrassments.

Kevlin Henney: But I think you're right. The visibility thing is not… if it's invisible, you can't see it. I mean, that's not to be underestimated because there's an awful lot. The nature of software tells us that we can't see everything about it. That most of the ideas are genuinely in our heads.

Therefore we don't notice those little things. We don't notice that thing that, you know, because it took time, we don't realize where the time might be saved and because it's also probabilistic. And obviously, it's not guaranteed.

If I write a test, there is no guarantee that I have eliminated all bugs or whatever. That's not a guarantee. So, therefore it's about probabilities. As humans, we are very poor at that, and we're not very good at attributing that, and then, exactly as you say, you can get caught in this loop. If you're not careful, you become so familiar, you don't even question it.

I certainly had that experience years ago. It was one of those trying-to-break-the-loop moments: we'd spent all morning trying to do something. Why don't we, you know, what about these other tests?

"We don't bother with those tests because they take too long to do." And I said, well, we've just spent all morning repeatedly poking and prodding this, perhaps a test would've saved us most of the morning. And it's that notion that we are not very good at attributing it, but because it had become reflexive by that point, we don't see it anymore. I think, again, it moves from initially annoying and visible to surprisingly invisible.

Holly Cummins: I think it's gruesome but it's completely the frog boiling thing, isn't it? That you don't notice it because it's so gradual. I think that's what we see with the complexity in our industry as well, is that the sort of the complexity builds up and up and up. Then eventually we'll have a disruption where someone will say, wait a minute, it doesn't need to be like this.

This is what Spring did many years ago. They said of J2EE, there's no way it needs to be like this. We can fix it. And then, you know, I think now we're sort of in another reset as well where Quarkus is sort of saying, I think it doesn't need to be like this. And I think we're still waiting for that "it doesn't need to be like this" with Kubernetes.

Contract testing

Kevlin Henney: There's like a post-Kubernetes that is going to happen at some point, there's that. But that's an interesting one, in that sense of that compounding complexity.

One thing you are also talking about is the testing side of things with looking at the contract testing. Testing between the parts. And that seems quite an intriguing thing because as we're openly talking about say microservices as the parts, I can individually reason about them but the relationships are a little bit looser. They are perhaps a little more...we have a blind spot there because something like a protocol is not a thing in the way that we normally think about code. If this expects something that this is not going to give it, that's kind of fundamental. It's not unit testing but it is still there's a kind of a gap there. How do you think we're moving forward on that?

Holly Cummins: It's a question I ask myself as well. So, on paper and in theory, contract testing fills an absolutely necessary middle ground between unit testing, which doesn't identify the problems with the system, and integration testing, which does but which is so expensive, and, you know, you shouldn't do too much of it. In practice, in my experience, most people haven't done contract testing and quite a lot of people haven't even heard of it.

It's an open question for me: why aren't we doing more contract testing? Because this is clearly solving the problem that we have created for ourselves with microservices, which is, I have no idea if the system works. How am I gonna know that? I need something that's in between those two but I think it can be quite hard to get your head around.

In order to do good contract testing, both sides need to talk to each other. And I think that challenges one of the assumptions that we had when we went to microservices, which is, I'm going to microservices and that means I don't need to talk to any other team ever. Well, you kind of do because you're kind of going to need to deploy together. I think, though, that's the sort of barrier to contract testing.

Kevlin Henney: So, that in other words, we've kind of come back to that kind of classic thing is that ultimately architecture is a description of architecture...as it is with buildings, it is how people are able to flow through space or rather how they were able to work together. That microservices are not gonna be truly independent because I need you to get my work done. This depends on that and so on.

So, there is this idea that we're reflecting something of social complexity. Well, I was going to say social structure but actually let's just cut to the chase, the complexity of people. The fact is, what is our understanding between ourselves, you know, to be reflected in code that we can rely on? I understand that you are sending something and this means this to me and this means that to you. We're talking about the same things. We have the appropriate expectation over time of how this is gonna behave but that's also a people thing.

I can render it in curly brackets at one level, but again, it comes down to the people thing. That perhaps is the… you refer, you talked about the people's challenges. In other words, it covers mental models but it also covers how we worked together. The architecture reflects that.

Holly Cummins: Then with contract tests, if I make a change that breaks you, what I really want is my build to break. And that I think, well, in theory, I want my build to break. In practice, I don't want my build to break ever. I think people get uncomfortable with contract testing because the idea is that you could do something, and my build breaks. And it's like, well, no, I don't want my build to break because of you. But, of course, if you don't do it at build time, you're gonna do it at production time. And that's where you're gonna have the break, which probably isn't where you want the break.
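To make "I want my build to break" concrete, here is a minimal hand-rolled sketch. It assumes a consumer team that writes down the fields it relies on and a provider team that checks its handler against that contract in its own build. The contract, the field names, and the two handler versions are all hypothetical, and no real contract-testing library is used.

```python
# Hypothetical contract written by the consumer team: the fields it reads
# from the provider's response and the type it expects for each one.
ORDER_CONTRACT = {"id": int, "status": str, "total_cents": int}


def contract_violations(response, contract):
    """Return a list of the ways a response breaks the contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Hypothetical provider, version 1: extra fields are fine, because the
# consumer only depends on what the contract names.
def get_order_v1(order_id):
    return {"id": order_id, "status": "shipped",
            "total_cents": 1999, "carrier": "post"}


# Hypothetical provider, version 2: a "harmless" rename that would
# break the consumer.
def get_order_v2(order_id):
    return {"id": order_id, "status": "shipped", "total": 19.99}


def verify_provider(handler):
    """Run in the provider's build; fails if the contract is broken."""
    violations = contract_violations(handler(42), ORDER_CONTRACT)
    assert not violations, violations


verify_provider(get_order_v1)  # passes: the contract is honoured

try:
    verify_provider(get_order_v2)
except AssertionError as broken:
    print("provider build would fail:", broken)
```

The second call fails at build time on the provider's side, which is the uncomfortable but cheaper alternative to the same break surfacing in production.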

Kevlin Henney: That's an interesting thing, again, it's the identifying the different roles, the different individuals, but technically the contract captures the space between us. And that's I think something, an organizational challenge, because we have historically, no matter what ownership models people tend to use there is this, well, that's ours and that's yours.

Then what does the contract...This is what I guess I've struggled with various different architectures and approaches over years is conveying to people this is the bit that is shared, and that means that it can either be something that collectively we are really good at, or we collectively disowned it. Collectively disownership as it were like it falls between the gap because we jointly disowned it. We jointly unshare it, so to speak. And the contract's trying to be this thing that is between the two.

Outro

Kevlin Henney: That kind of brings us full circle to the idea that there are the people, there's the technology. We've kind of gone through the layers and back out through the complexity with some potential solutions. So, hopefully, you found that useful. Please check out Holly Cummins's talk when it becomes available, and thank you very much. Thank you, Holly Cummins.

Holly Cummins: Oh, thank you, Kevlin Henney.
