GOTO - The Brightest Minds in Tech

JVM Performance Engineering • Monica Beckwith & Kirk Pepperdine

Monica Beckwith, Kirk Pepperdine & GOTO Season 5 Episode 3

This interview was recorded for the GOTO Book Club.
http://gotopia.tech/bookclub

Monica Beckwith - Performance Engineer at Microsoft & Author of "JVM Performance Engineering"
Kirk Pepperdine - Principal Java Engineer at Microsoft

RESOURCES
Monica
https://x.com/mon_beck
https://github.com/mo-beck
https://www.linkedin.com/in/monicabeckwith
https://www.codekaram.com

Kirk
https://x.com/javaperftuning
https://github.com/kcpeppe
https://www.linkedin.com/in/kirk-pepperdine
https://www.kodewerk.com

DESCRIPTION
Kirk Pepperdine and Monica Beckwith delve into the evolving world of performance engineering, focusing on Monica's book, JVM Performance Engineering.

They discuss key advancements in the Java Virtual Machine (JVM) since JDK 8, including garbage collection and cloud-native applications. The conversation underscores the significance of experimental design and benchmarking, advocating for a collaborative approach that blends theoretical knowledge with practical application.

Monica emphasizes the future of performance engineering lies in automation, AI, and machine learning, urging engineers from various disciplines to work together to navigate the complexities of distributed systems effectively.

RECOMMENDED BOOKS
Monica Beckwith • JVM Performance Engineering • https://amzn.to/3BkRoiO
Venkat Subramaniam • Cruising Along with Java • https://amzn.to/4dFuBwU
Markus Eisele & Natale Vinto • Modernizing Enterprise Java • https://amzn.to/3EsEtZ3
Kevlin Henney & Trisha Gee • 97 Things Every Java Programmer Should Know • https://amzn.to/3kiTwJJ
Dave Thomas & Andy Hunt • The Pragmatic Programmer • https://amzn.to/3azvUy3
Joshua Bloch • Effective Java • https://amzn.to/3ygmQJt
Diana Montalion • Learning Systems Thinking • https://amzn.to/3ZpycdJ

Intro

Kirk Pepperdine: Hi, I'm Kirk Pepperdine, and I am a principal engineer working at Microsoft. And with me, I have Monica Beckwith. She is also a principal engineer working at Microsoft. So we do work together. That's our disclaimer from the beginning. And I know that Monica has spent an inordinate amount of time. Would you say years?

Monica Beckwith: Writing the book or just...?

Kirk Pepperdine: Yes, just writing this book. What do we call it? "JVM Performance Engineering." That's just been published. So congratulations on getting that out the door. Because...

Monica Beckwith: Thank you.

Kirk Pepperdine: I know you spent a lot of time. Well, the book has been in the making for quite some time. And I guess we're here to talk about it.

Monica Beckwith: Yes. Yes. Thank you for having me. I know the book came out earlier this year. Ever since then, we have been trying to get together to have this kind of GOTO Book Club. This will be a part of the GOTO Series. And I'm glad that Kirk gets to be the one not only asking questions but also working with me on this because it gets very comfortable if we both know the space. And Kirk is one of my mentors. So I feel very honored to be here.

Kirk Pepperdine: It's so kind of you to say... It's safe to say that we push each other and learn a lot from each other in the process. So I guess the question is... The first question I had on my list of questions here is why this book and why now? I mean, there's a bunch of performance books out there written by a number of different people. And what's in this book that you thought you could bring to the space?

Monica Beckwith: I've covered this in the very...in the preface of my book. This book is an ode to JVM performance. That's what the book is about. So in the sense that it's a journey. And as the poem says, it's my tech odyssey. And it kind of builds on various JVM systems that I have worked with in my lifetime. I'm kind of like a bard singing tales of heroes and their exploits. So I'm kind of talking about the evolution of the JVM through the lens of performance engineering.

So we conceived this book around the time of this transformation of the JVM and the JDK. So Java 9 was on the horizon or almost released. And that's when the book was conceived. And what I saw was kind of a shift. And the new plane now was the world of modularity and microservices. So it extends beyond the deployment practices, but also explores the JVM and the JDK itself. So it talks about the evolution, like I said, of the JVM and the language, it also talks about the type system and its evolution. Then I talk about modularity and how that influences not only the application, like the use case of Java, but also the JDK itself, how it has gotten modular. It kind of builds on that and kind of weaves in performance engineering practices.

So I talk about logging, I talk about performance engineering principles, the approach, and then I dive into the memory management, the runtime, and kind of like startup and warm-up phases and how that all ties into the future. That's just the way the book evolved. It's a journey of how I evolved in the field of JVM and performance engineering as well. So we're working together with so many things and the JVM has such a big impact. And trying to look at it from the performance engineering lens is something that I wanted to provide.

I'm not sure if many books out there, which are all great in their perspective of deployment, but I wanted to give an under-the-hood perspective of the JVM, of what performance means and how do the various components kind of work together to make the JVM as performant as it is today, and what does the future look like. So that's kind of where I thought the book was much needed to provide that, you know, as an advocate of the JVM and performance engineering. So that's my perspective that I bring into the book.

Kirk Pepperdine: But then why now? Have things changed considerably since...? Well, you said you started this process when JDK 9 was coming out. And from a performance engineering point of view, what's changed since JDK 8?

Monica Beckwith: Great question. I think a couple of things have changed. In the GC world, we started wanting more control over how we manage and define pauses, GC pauses. We went the route of regionalized and incremental collection. From the JDK perspective, like I said, modularity is at the core of our JDK. Security, encapsulation, those are becoming the ground-up approach that we have brought into the JDK.
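To make the point about GC pauses concrete, here is a minimal sketch of how a running JVM reports collector activity through the standard `java.lang.management` API. The class name `GcPauseSnapshot` and the `allocateGarbage` helper are hypothetical and exist only for illustration. Note that what `getCollectionTime()` counts differs between collectors (concurrent collectors may report cycles and pauses separately), so the numbers are a rough signal, not a precise pause measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class: reads the JVM's own GC counters before and after some allocation churn.
class GcPauseSnapshot {

    /** Sums collection counts over all collectors; undefined values (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums the approximate accumulated collection time in milliseconds; undefined values (-1) are skipped. */
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Allocates short-lived byte arrays to give the collector work; returns the number of bytes allocated. */
    static long allocateGarbage(int chunks, int chunkBytes) {
        long allocated = 0;
        List<byte[]> window = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            window.add(new byte[chunkBytes]);
            allocated += chunkBytes;
            if (window.size() > 16) {
                window.clear(); // drop references so earlier chunks become garbage
            }
        }
        return allocated;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long timeBefore = totalCollectionTimeMillis();

        long bytes = allocateGarbage(2_000, 1 << 20); // roughly 2 GB of churn in 1 MB chunks

        System.out.println("Allocated bytes: " + bytes);
        System.out.println("Collections during run: " + (totalCollections() - countBefore));
        System.out.println("Approx. collection time (ms): " + (totalCollectionTimeMillis() - timeBefore));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
    }
}
```

Running the same class under different collectors (for example with `-XX:+UseG1GC` or `-XX:+UseParallelGC`) is one simple way to see how the reported collectors and their counters change.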

From the runtime perspective, we're talking about, you know, going into Project Loom space, virtual thread space. We are evolving in that. Or sometimes, you know, we are refining that. And this is all from the performance perspective. These are questions of the ecosystem. Microservices, containers, cloud, you know, these are demanding a growth in the JVM space. And the JVM is delivering that. We're talking about cloud native now. So cloud native is a big deal. And how is JVM at the center of this cloud native revolution?

So all these things are something that I think drove the book to what it is today. But also my personal need for the book was I wanted it to be educational. You know, I want performance engineering to be approachable. You know, and approachable not only for students, you know? My kids are these students. So I want my kid to pick up the book and, you know, go step by step, and at least understand the approach to performance engineering. And then hopefully, they will be, you know, working with the programs.

The programs are real. The examples that I provide are real-world examples like you and I can relate to. And I hope students can relate to that. It's also something that I wanted to write. You know, I wish I had this when I was a, you know, developer, a programmer. I wish I had an approach to JVM engineering and performance engineering kind of amalgamated together. I also want this to become like, you know, a part of our university curriculum. You know, when you learn about software engineering, I want, you know, performance engineering to go side by side. So we have a holistic engineer that, you know, the next generation that understands the concept of experimental design, right, that understands why hardware is so important.

Today, you can spin your container somewhere in the cloud on some machine and, you know, end up deploying it. That's not the end of the story. That's just the beginning of...your code is just the beginning of the story. And that's what I want people to take out of this book. So I think the timing was right and I think here we are.

Kirk Pepperdine: So adding more resources like this to the space actually, I think, helps with the educational process because it's traditionally something that universities haven't really tackled all that well in my experience. You know, what we tend to see is that you'll get a lot of education and algorithms and how things work but very little in the diagnostic space or, you know, how to break problems down, especially when things aren't working.

Monica Beckwith: Kind of demystifying the JVM, I mean, JVM is not just a black box.

Kirk Pepperdine: It's a piece of software.

Monica Beckwith: What does that mean? And how does your software affect this other piece of software that works on this hardware, you know, or through various stacks, works on this underlying hardware? And how do they all interact and work together to give you the performance that you need? And also, performance engineering is like the biggest thing too, right?

Performance Engineering and Its Role in Observability

Kirk Pepperdine: That's my next question, what is performance engineering? I mean, how do we define the space?

Monica Beckwith: I've given many presentations on this, and one of the... You know, so I keep asking myself this question, not only the first time that I actually gave a presentation on performance engineering, but every day, you know, as I'm refining my presentations on performance engineering. So I think I could define it as a systematic approach, right? And it's kind of ensuring that the systems, you know, whatever system, the software system or the design system, or kind of any system that you design meets kind of the broad range of non-functional requirements.

You're designing a system for functional requirements but there's this whole non-functional component that kind of falls under the big umbrella of performance engineering. I've given many presentations on this. I refer to these as the ilities, which are like the scalability, reliability, availability, maintainability, security, you know, all these things kind of become the non-functional requirements that we need to tackle. Some of them, you know, tie in with your functional requirements and some of them are something that you add on to your development life cycle so that your system can meet the service level objectives and agreements.

But the main thing that you have to understand, that we all have to understand, is that any component that we design, you know, has to enrich the user experience. And I recently gave a presentation of step-by-step in Java with performance engineering, and one of the biggest things is how we are doing with respect to the user experience. So SLAs, the functionality even, the maintainability, availability, all these things tie to the user experience. The end product is delivered to a user or for a user, and how do we enrich that experience?

That's my definition of performance engineering. It's about being proactive, kind of designing the test and then continuous improvements just like you would do for your software development life cycle. It's the same thing and it's an integral part of that life cycle. And through this book, I hope people bring that into the very early stage of that life cycle, you know, kind of so that your systems are more reliable and efficient, and kind of like a holistic approach to software and performance engineering.

Kirk Pepperdine: How does this tie into observability then?

Monica Beckwith: That's a very good question. I think measurement ties into observability. We have a component, and how does that component, first of all, integrate into the bigger picture? How do these subsystems talk to each other? And how do these...when a process is deployed, then how does that work? And all these things are answering the question how and the question what, you know? That's where observability helps us, right? So how, you know, you observe it, and then...and what is what you get out of the observability data. And that's how you explain what. You know, performance engineering is all about how, what, and the why, right? And so why, we'll figure it out, but the how and the what is where observability plays a big role. And you, Kirk Pepperdine, have a lot of experience in that space. So if you have anything to add on, please do.

Kirk Pepperdine: I think in your book, you touch on observability in a number of different areas. I think it's an important aspect of a system, obviously. And I like how you address it in the sense that a lot of people don't take into account the impact that observability has on the performance of their application. And I think part of the performance engineering in my mind is understanding what that impact is and trying to minimize it.

So from my perspective, I want to drive a particular behavior, right? And the data doesn't have to be perfect in order to drive that behavior in many, many cases, right? So we look at observability as basically being a property of the system. It's like how can we easily get that information out of the system and then again, how can we use that to drive the behaviors that we want in terms of, like you said, discovering what and then why so that we can take action to improve.

Monica Beckwith: Exactly. I think it's all a part of the fabric. And we're building the story, you know?

Kirk Pepperdine: Right.

Monica Beckwith: And that's where these kinds of mesh together and then we get to the why.

Kirk Pepperdine: Eventually, you get to the why. Generally, not in one step, but you go stepwise towards…

Monica Beckwith: No. I mean, performance engineering is a learning approach, right, or approach to learning in that sense that...

Kirk Pepperdine: Right. You're learning about the behavior of the system. It's rare that you go from...I think it's rare you go from like a single-data set to that's the problem that we need to be solving.

Monica Beckwith: Exactly.

Journey into Performance Engineering: Curiosity, Mentorship, and Hardware Innovation

Kirk Pepperdine: I think there's some gradual step towards it, I think in my experience. So, cool. Well, you know, I guess the other, like, soft question we have here before we get into the really fun ones is like, you know, what led you to performance engineering? Like, what is your personal journey from a high school student who typically doesn't know anything about the world, doesn't know what they want to do to all of a sudden, here I am a performance engineer? It's like, you know, what does that journey look like for Monica Beckwith?

Monica Beckwith: A high school...

Kirk Pepperdine: I guess more importantly, too, it's like if somebody's interested in this, how do they get started? So what from your experiences can you share to say, "Hey, if you're interested?"

Monica Beckwith: So my performance engineering journey, if I sum it up, it's like I don't know much. And that's how I started. I don't know and I want to know. And still today, I'm learning. You know, every day, we do something. We work together, Kirk, and there's like so much learning daily in every new aspect of what we're learning with respect to the Azure stack, with respect to our own needs to help our first-party partners, and we're learning.

We're learning the various technologies that are very new to me, especially having worked in a different domain but now looking at Windows and looking at the hypervisor because that's very... Again, it's different layers of performance engineering and we still keep on... Every time I've felt like I'm building new layers and finding new performance problems that... So it's very exciting. But I do want to emphasize that it is...to me, it was a very natural progression. I never, like, set out, as you said, like a high school student that's like, "I'm gonna be a performance engineer." It was never like that, you know?

Kirk Pepperdine: I'm gonna be in middle management.

Monica Beckwith: But then I feel that I was always meant to end up here, you know? Because it was so natural that it was like, "I should have told that high school student that this is what life was supposed to be for you anyway," you know? Because it's so...it's just a part of me now, I feel. And so I started with embedded systems, you know, when I started working. And it was about designing and conforming to military standards.

So the performance and the operations against these very restricted operating conditions were key drivers of my design. This experience kind of taught me to empathize with the design board, kind of ensuring failovers and availability, especially of the comms, the communications, right? And we're out there, you know, working in these harsh environments and comms fail, what do we do? You know, what's the backup? What's the time for availability of that? All these things were... And this is from the get-go, like from the very, you know...

Kirk Pepperdine: That almost sounds like life and death in that case, right?

Monica Beckwith: It was a project that, you know, involved sonars and other things. I was on the electronic side of it. So it was very important to be able to get the information and communicate it. It was an unmanned system, or unhumanized system, and it...

Kirk Pepperdine: Do you mean like a drone?

Monica Beckwith: I'm not going to dive into details, but we want to make sure that we get the data when we...as soon as we see something. It was really important. So kind of that empathy kind of drove me to design this robust system. And it had custom OS, custom code, you know, programming, microcontrollers, communication chip, all those things. It was very custom. And so again, tying back to the why, the what, and the how.

So I started the other way, you know? Start with the why. Like why do I need to do this? And then moved on to the how. Like how do I do this? And then the system, the design system with the software, everything became the what, you know? This is what. And so kind of like the three things, how, what, why, they kind of keep on becoming the core of my journey, and they still drive me today.

And it was very interesting because after that, I kind of went and started doing...you know, after that work, for a brief time, I did my masters and I took up a job with the Center for Astronomical Adaptive Optics at the University of Arizona. The focus was on precision, you know?

Kirk Pepperdine: Yeah. It was a completely different thing.

Monica Beckwith: Mechanical actuators and the design of the secondary mirrors, and how we...it's such a complex system. Like just the astronomy part of it, the adaptive optics part of it. But even the mechanical and electronics and everything, it kind of comes together and it was so thrilling to be a part of this whole...again, we're trying to, you know, remove out the aberrations in the atmosphere when we are capturing a picture. And so again, the what and the why, and then the how we're doing it. Right from designing and cooling down the actuators, which was the chiller system to the electronics board to the communications again, like how do we...or how are we doing the communication?

So I was a part of all those things. And it kind of taught me another performance engineering viewpoint, you know? And then of course, the whole history starts from...you know, the history with JVM and performance starts from when I joined AMD. And I was very fortunate to be a part of this groundbreaking work that was brought to mainstream by AMD. It's called the NUMA, you know, non-uniform memory access. The subsystem and the 64-bit addressable and data space, you know, the 64-bit processing.

We're now bringing compilers and JVM into the 64-bit mainstream land and teaching everything to speak the language of this...not just the addressable space like I said, but the performance and scalability of the NUMA systems, right? So just understanding what the application is supposed to do, you know? Again, just understanding what wasn't enough. We needed to go in the how do we do this, like deep dive in how these instructions actually, you know, get into and get processed by the system, by the hardware, by the architecture, how the data is brought in through the multi-layered subsystems, and then kind of bringing in the why, like why are we doing this, and how can we make it better. So kind of working with those three principles again, driving performance and kind of anticipating issues, and looking at the processor pipeline. You know, we call it front end and back end. So that encompassed both these aspects. And then no looking back.

Kirk Pepperdine: No. I guess it's kind of weird we've had, I guess, some similar experiences and...well, but quite different I guess. But I remember my first conference I guess. So it was like an internal-related one where I was working and one of the guys in a group started with this list of names on slides. And he just calls names on the slides, flips to the next ones, calls names on slides, flips to the next ones. He did this a few times, right? Just flipping through the slides. So he just calls the names.

"These are the peacekeepers that have died in these places and our job is to make sure this list doesn't get any longer." And at that point, you just... Something like it was just so impactful, and I realized that, you know, what you did in terms of performance reliability really mattered, could really really make a difference, I think. And from my perspective, I looked at it and said, "Okay. That's transformative in thinking about, you know, not all systems are like that."

I mean, certainly, if you're buying your...you know, ordering your lunch from, I don't know, your food delivery system is a little...not quite the same thing, but still, it sets a mindset, I think, where you said, "Okay, you know, these systems do need to be reliable because they...you know, in many cases, they can have significant impact on people I guess." And I think your astronomy thing was really cool because now you start talking about different hardware components and trying to get everything to work and everything. And it sort of leads into the next question I had which is like, you know, how important is knowledge of the hardware, you know, why should we care?

A lot of software developers will say, "Yeah, I don't really care about the hardware. It's not my thing." And I don't actually disagree with them from that perspective. But I think at some point in time, we do need to care, you know? What would you say about that? Because I mean you're talking about working with ARM, so I think that's the low level. Right now, I think you're debugging chips, right?

Monica Beckwith: Well, one of the things I wanted to go back to is what led me to performance engineering, and that's one of the things that I've been very thankful for in the book as well, is I was fortunate to have mentors like you, Kirk Pepperdine, and Charlie Hunt who got me through this journey, you know, in the sense that your insights when I went to your talks and I went to work with Charlie in the early days at Sun...actually since AMD and then Sun and Oracle, I kind of just looked at you and your presentations and your talks, and I just absorbed like a sponge. I was like, "Oh, wow." I didn't even... You know, because you're working so close to the JVM and working so close to the hardware and everything, you know, when I go and see these kinds of deployment scenarios that I didn't envision, I was like, "Oh, wow. There's so much to learn."

So it's kind of like the complexities of performance tuning and optimization. You all were the pioneers of kind of, you know, getting people like me interested and then wanting to make the JVM better. This kind of came back to, you know, I was lucky enough to work in the OpenJDK space before it was OpenJDK space, you know, Sun JDK, and kind of bringing your experiences and kind of working hands-on with the JVM, kind of making it work for everybody that has shared those experiences.

It really became a fundamental part of what I wanted to do and how I wanted to shape my performance engineering journey. For those that I've not mentioned in my book, I want to say thank you, Performance Engineering. I've learned from you all so much and we've worked together in so many spaces that it just...it's humbling. Now going back to your question of the hardware.

Kirk Pepperdine: The hardware, you know?

Monica Beckwith: The most fascinating part of it is that we have this piece of hardware, you know, or we are almost... That's where I come in. I'm not the verification or the chip designer or the subcomponent designer, the SoC designer. I come in at the point of: this is what we have, this is what we can do over the next couple of years, and we want your feedback. You know, we want to see how applications and various components, especially the compilers and the JVM, what we need to change, what we need to enhance and how can these subsystems benefit from this stuff that we have today, and how can they benefit from what we're going to do tomorrow.

So that's where I come in. So it's not just about, you know, let's go play with this, but let's identify scenarios where this is going to solve a real problem, you know? The presentation that I was talking about where you, Kirk, said, "Oh. You know, I encountered this in the addressable space." You know, back then, even though we had 64-bit, the transition from moving outside of the 4-gigabyte space...

Kirk Pepperdine: It was painful.

Monica Beckwith: Then the operating systems ate up a big chunk of that, like almost 50% of that space. So how do we reduce that footprint? How do we make sure that it's not all contiguous so we don't have to just go above the 2-gig bounds? So that's what drives innovation and performance. And just some simple things like large pages, right, you know, in the hardware. How do we use that? What's the point of having one gig addressable space? What is the importance of having one gig, you know, large pages? And how does that affect performance? And how do we get applications like databases out there to use those one-gig pages and be more impactful in performance?

That's where I have always, like, straddled... So innovation sometimes, you know, and we'll get into... You know, in the book I talk about the bottom up and the top down, but it's about innovations no matter where you're driving it. You know, sometimes there's an application need, sometimes there's an ecosystem need, and sometimes the hardware architects understand that this is what we need. You know, it's about... I was at ARM and, you know, we always used to talk about parallelism, you know, so instruction-level parallelism and data-level parallelism.

So we are innovating in these two spaces. How does the software take advantage of that with respect to concurrency, with respect to...and just parallel working, you know, optimizing the use of cores, and what does it mean? So it's just fascinating. There's so much out there and, you know, I hope I've captured some...you know, or at least I have made people curious with this book, I made people curious into understanding. Yes, there is a harmony that needs to be obtained here. And I talk in Chapter 5, dedicate this whole thing about harmonizing with hardware, what does concurrency mean, what are the primitives that we use because I want people to understand that nothing works in isolation. And a deployment is not done when you hit the deploy button. A deployment is like, you know, the holistic understanding of where it gets to.

Experimental Design & Benchmarking in Performance Engineering

Kirk Pepperdine: I want to read a part of your book here and just have you comment on it here, right? And it's about experimental design because I think, you know, one of the core activities or an activity that's core to performance engineering is benchmarking. And that I think all falls around what I would call experimental design.

And you write here, "Experimental design is a systematic approach used to test hypotheses about system performance. It emphasizes data collection and promotes evidence-based decision-making. This method is employed in both top-down and bottom-up methodologies to establish baseline and peak or stress performance measurements and also to precisely evaluate the impact of incremental or specific enhancements. The baseline performance represents the system's capabilities under normal or expected loads whereas the peak or stress performance highlights its resilience under extreme loads. Experimental design aids in distinguishing variable changes, assessing performance impacts, and gathering critical monitoring and profiling data." So I guess, you know, experimental design is core. And, you know, so can you just, like, elaborate on, you know, what you were trying...the message you were trying to drive home with that part?
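The distinction the quoted passage draws between baseline and peak (stress) measurements can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the book's method: the class `BaselineVsStress`, the stand-in workload `handleRequest`, and the client counts are all hypothetical, and "load" is modelled simply as the number of concurrent client threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run the same workload at an expected load and at an extreme load, then compare.
class BaselineVsStress {

    /** Result of one measurement run. */
    static final class Measurement {
        final String label;
        final int operations;
        final long elapsedNanos;
        final long checksum; // returned so the work cannot be optimized away

        Measurement(String label, int operations, long elapsedNanos, long checksum) {
            this.label = label;
            this.operations = operations;
            this.elapsedNanos = elapsedNanos;
            this.checksum = checksum;
        }

        double opsPerSecond() {
            return operations * 1e9 / elapsedNanos;
        }
    }

    /** Stand-in for the real unit of work (assumption: CPU-bound, no I/O). */
    static long handleRequest(int id) {
        long h = id;
        for (int i = 0; i < 1_000; i++) {
            h = h * 31 + i;
        }
        return h;
    }

    /** Runs opsPerClient requests on each of `clients` threads and times the whole batch. */
    static Measurement measure(String label, int clients, int opsPerClient) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                tasks.add(() -> {
                    long sum = 0;
                    for (int i = 0; i < opsPerClient; i++) {
                        sum += handleRequest(i);
                    }
                    return sum;
                });
            }
            long start = System.nanoTime();
            long checksum = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                checksum += f.get();
            }
            long elapsed = System.nanoTime() - start;
            return new Measurement(label, clients * opsPerClient, elapsed, checksum);
        } catch (Exception e) {
            throw new IllegalStateException("measurement failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        measure("warm-up", 2, 20_000); // discard: lets the JIT compile the hot path first
        Measurement baseline = measure("baseline", 2, 50_000); // assumed "expected" load
        Measurement stress = measure("stress", 16, 50_000);    // assumed "extreme" load
        for (Measurement m : new Measurement[] {baseline, stress}) {
            System.out.printf("%-8s ops=%d throughput=%.0f ops/s%n", m.label, m.operations, m.opsPerSecond());
        }
    }
}
```

A real experiment would change one variable at a time between runs and repeat each configuration several times; a dedicated harness such as JMH handles warm-up and dead-code elimination far more rigorously than this hand-rolled loop.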

Monica Beckwith: I'm gonna try to break it down a little bit, but I am fascinated by this concept of pie charts.

Kirk Pepperdine: Pie charts? I really like pie. I'm not sure about charts.

Monica Beckwith: I love pie too. But the way...it's about the sum of the components, right? The pie chart represents the sum of its components. And that's what makes it whole. And that's what I was trying to capture in that long sentence, but that's what experimental design is about, you know? So think of it like the complexity and the variables, right? Of course, the reproducibility part of it. Like you design an experiment and it's not reproducible. So that's something that you have to keep in mind. You have to keep in mind the real-world scenarios with respect to benchmarking. There's this whole component of noise and isolation and how we can be true to the real-world scenarios but also work in an isolated setup just to make sure that we are producing... So this all works together, the trade-offs.

Kirk Pepperdine: You're talking about load shaping also?

Monica Beckwith: It's amazing. And this is why I said pie chart because you're not... You know, it's the sum of the whole, right, you know? So for example one of the main challenges in experimental design and benchmarking is managing the complexity. And then you have a multitude of variables that you have to...that can influence the result, that will influence the result. And some of these variables, you don't even know. And this is why I was talking about integrating with, you know, the known knowns and the known unknowns as in like the top-down methodology and stuff like that.
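One common way to get a handle on the noise and reproducibility concerns raised here is to repeat a measurement and look at its spread rather than a single number. The following is a minimal sketch under assumptions: the class `TrialVariability` and its `workload` method are hypothetical stand-ins, and the statistics are the ordinary sample mean, sample standard deviation, and coefficient of variation.

```java
import java.util.Arrays;

// Hypothetical sketch: repeat a timed trial and quantify run-to-run variability.
class TrialVariability {

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1); needs at least two values. */
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Standard deviation relative to the mean; a rough, unitless noise indicator. */
    static double coefficientOfVariation(double[] xs) {
        return sampleStdDev(xs) / mean(xs);
    }

    /** Stand-in workload (assumption): fill and sort an array, return the median element. */
    static long workload() {
        int[] data = new int[200_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (i * 31) ^ (i >>> 3);
        }
        Arrays.sort(data);
        return data[data.length / 2];
    }

    /** Runs `warmups` untimed iterations, then returns the wall time of each timed trial in milliseconds. */
    static double[] timeTrialsMillis(int warmups, int trials) {
        long sink = 0;
        for (int w = 0; w < warmups; w++) {
            sink += workload();
        }
        double[] millis = new double[trials];
        for (int t = 0; t < trials; t++) {
            long start = System.nanoTime();
            sink += workload();
            millis[t] = (System.nanoTime() - start) / 1e6;
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps the workload result live
        }
        return millis;
    }

    public static void main(String[] args) {
        double[] millis = timeTrialsMillis(5, 20);
        System.out.printf("mean=%.3f ms, stddev=%.3f ms, cv=%.1f%%%n",
                mean(millis), sampleStdDev(millis), 100 * coefficientOfVariation(millis));
    }
}
```

A high coefficient of variation suggests that some uncontrolled variable (other processes, GC activity, frequency scaling, and so on) is influencing the result, which is a cue to isolate the setup further before comparing configurations.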

Kirk Pepperdine: This makes benchmarking difficult. And most of them are broken in some way. Most people get it wrong.

Monica Beckwith: There are so many talks. Like what was it, "Benchmarking Done Wrong," or, "Are You Marking What You Think You Are?" And, you know, there was this whole...

Kirk Pepperdine: And even in those talks, they're getting it wrong.

Monica Beckwith: Well, you know...

Kirk Pepperdine: So it's like you're getting it wrong. It's like, "Yeah, and you are too," you know?

Monica Beckwith: You and I have worked in this field, especially with benchmarking, for so long that, you know, I tell people, you know, if people wake me up in my sleep, you know, and at least like it reminds me of my dad. Like, you know, my dad would be sleeping and we would get up at like 5 a.m. trying to ask him some questions because I have a submission or something coming up. If it's a math question, like anything, you know, his eyes will be still closed and he will solve it. Like he was like that. You and I, Kirk, are probably like that with performance and benchmarking. You know, somebody wakes us up at 5 a.m., you know, and they tell us to write a benchmark or do it, right, given the components and everything else, I think we can...I think we both...

Kirk Pepperdine: Oh, I'd get it wrong. There's no doubt about it.

Monica Beckwith: I'm pretty sure you will get it right.

Kirk Pepperdine: I'll get it wrong. I'll get it wrong.

Monica Beckwith: I know you and you're very meticulous on that. It's because of...

Kirk Pepperdine: Even when you're meticulous, you're still going to get it wrong. I mean, it's just very difficult to get it right. And it's a lot of investigation, I think.

Monica Beckwith: But anyway, that's what experimental design helps you, right? Kind of understand the complexity, the variables. I think you do too, Kirk. I think it's very humble of you to say that, but I've worked with you so I know it.

Kirk Pepperdine: No. I get it wrong. Getting it wrong is not the problem. It's recognizing that it's wrong and then trying to understand why it's wrong, and then making the corrections I think is really the important thing, right? And then I think that's the latter steps that you see get missed out on, you know?

So I think part of getting through the benchmarking process is recognizing that there's some problem here and I guess now you're into a performance investigation and, you know, you talk about bottom up and top down, which one do you prefer, or do you actually mix them together? Like how do you use these bottom-up top-down designs?

Monica Beckwith: So I think as you touched on, Kirk, you know, benchmarking often struggles to simulate real-world conditions, right? When we're talking about experimental design, we're thinking about, you know, trying to capture some real-world scenarios with synthetic benchmarks, you know, kind of make sure that we're capturing the production workload and kind of being more realistic, more repeatable. You need to have some understanding or at least a deeper understanding of the system and the workload being tested.

So that brings us to the methodology and the approach, and that's where the top-down methodology or the bottom-up methodology comes in. It's about how or where our knowledge lies, right, in the sense that when we talk about top down, we are talking about the highest level, as in starting all the way at the top. Remember when we started this conversation we talked about user experience, right, and the overall system performance. So it starts all the way up there and then drills down into the details of identifying bottlenecks or areas of optimization.

It kind of relies on the big picture of known knowns, right? We know what we know. We know about the application, we know about the ecosystem, we know about the libraries, we know... And sometimes we know about the user experience and expectations and then we know about...that we have a problem, right? As opposed to bottom-up, where it starts from the lowest level, for example, code or system internals, and builds up the understanding of how these details impact the overall performance.

It's like looking under the hood and then looking up to figure out how we can gain the impact. Remember I was talking about large pages and hardware, you know, all those things. So it's about a lower-level detail. Like even from the JVM perspective, designing a garbage collector to be efficient falls into the bottom-up methodology stack, you know? And so we know the garbage collector, we're trying to improve its efficiency in the design of the garbage collector space, and then we want to look up and try to see how we can benefit and what are the real-world scenarios that are going to benefit from that.

And more often than not, and, Kirk, you have experience with that as well, you know, these approaches are complementary, as in they're not mutually exclusive and most often they're used together. You know, you may start with a top-down approach to identify general areas of concern and then go bottom up. And that's what it is all about.

Kirk Pepperdine: I flip between them all the time. Because I'll use top down to focus and bottom up to understand, you know, between the two, you know?

Monica Beckwith: Exactly. That's because it's switching from the bigger picture of what you know, and then where you are or what you're trying to improve. It's always like, you know...it's the different views that you can see, you know, if you have a video call, right? You can view the room view or the individual speaker view. That's what it is all about.

What these approaches, the methodologies tell us is how to approach the problem in a systematic fashion and when to use... I think one of the concerns that people have always asked me is, when do we use one versus the other? So the top down like I said, you're dealing with the system, you don't know the...there are unknowns that you want to peel out and you know what some of those unknowns are, but like you've identified a problem and that kind of helps if you have a statement of work and that guides you through okay, known knowns, known unknowns, and stuff like that.

Then the bottom up is when you are already aware of the special area...specific area that needs optimization. You already know a particular function is slow, or you dive into the code or the system component, and it could be even at the hypervisor level, you know, something is happening that we're doing differently, or a scheduler...OS-level scheduling effects. There are so many things, but we know, we've identified, and now we want to drill. So they kind of go hand in hand with each other, but those approaches also tackle different problem spaces.
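Illustration (not part of the conversation): the top-down step described above, where the whole is split into its components before drilling into one of them, can be sketched as a simple share calculation in the spirit of the pie chart analogy. This is a minimal Java sketch under stated assumptions. The class name TopDownBreakdown, its methods, the component names, and the timing numbers are all hypothetical and are not measurements from the interview or the book.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class TopDownBreakdown {

    // Converts per-component times into fractions of the total,
    // i.e. the slices of the pie. Insertion order is preserved.
    static Map<String, Double> shares(Map<String, Long> nanosByComponent) {
        long total = 0;
        for (long nanos : nanosByComponent.values()) {
            total += nanos;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : nanosByComponent.entrySet()) {
            double share = total == 0 ? 0.0 : e.getValue().doubleValue() / total;
            result.put(e.getKey(), share);
        }
        return result;
    }

    // Returns the component with the largest time: the candidate for a
    // bottom-up investigation. Returns null for an empty map.
    static String dominant(Map<String, Long> nanosByComponent) {
        String best = null;
        long bestNanos = Long.MIN_VALUE;
        for (Map.Entry<String, Long> e : nanosByComponent.entrySet()) {
            if (e.getValue() > bestNanos) {
                bestNanos = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up numbers standing in for a measured request-time breakdown.
        Map<String, Long> observed = new LinkedHashMap<>();
        observed.put("application code", 40_000_000L);
        observed.put("garbage collection", 120_000_000L);
        observed.put("I/O wait", 40_000_000L);

        shares(observed).forEach((name, share) ->
                System.out.printf("%-20s %.0f%%%n", name, share * 100));
        System.out.println("Drill into: " + dominant(observed));
    }
}
```

The breakdown only narrows the focus. Understanding why the dominant component is expensive is the bottom-up part, and it needs tools suited to that layer (for example GC logs or a profiler).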

The Future of Performance Engineering

Kirk Pepperdine: Right. So just in wrapping up here, I guess, what do you see as the future of performance engineering? Briefly.

Monica Beckwith: I think automation. Let's be honest, right? We are already there. We're talking about AI, machine learning, automated tools, you know? I've had the pleasure of collaborating with various performance engineers who see that and who are contributing towards these kinds of automated tools and trying to understand real-time issues and kind of prediction in a manner.

And this prediction can be at different levels, you know, active, passive, offline. But they kind of help guide the performance engineering effort, you know? The future is also in the way we're evolving with respect to the hardware. Like performance engineering is keeping pace with hardware evolution with respect to accelerators. The last chapter of my book is dedicated to exotic hardware, which is like these kinds of accelerators and how the JVM, you know, coming back to the crux of my book, it is all about the JVM, so the future and how the JVM is evolving and getting into the space of being more compatible with these kinds of accelerators and being able to address that new space.

The evolving tool sets, new tools and methodology, you talked about it, Kirk, as well. You know, we want to be able...the observability, all these advancements that we're doing in the field of performance engineering, you know, observability is at the heart of it. And you're only going to see it being much, much better, you know, focusing on collaboration across disciplines. We're no longer data scientists alone and hardware engineers alone. We're not working in silos. We're collaborating.

I want to learn...if you're a hardware engineer, I want to learn. I want to see your data engineering, you know, the big data problem. I want to help you be able to, you know, bring in the performance engineer and learn from you, and bring the performance engineering expertise across our domains and share it. Distributed systems will always be complex in the sense that new evolution, new technologies will always increase that complexity, add more layers to the complexity. But this is why we all need to collaborate across disciplines so that we help each other understand the complexity and make it easier for everybody to operate and coexist.

Outro

Kirk Pepperdine: Well, thank you for your time, insights, and just the ability to just sit and to have a nice conversation with you. It's really nice.

Monica Beckwith: Thank you again, Kirk. I mean, I really appreciate you taking the time to read my book, first of all, and also, like, to ask these questions.