
How Honeycomb Is 2Xing Its Engineers with AI

Emily Nakashima shares why engineering managers should become more technical, explains Honeycomb's internal mandate to 2X engineers with AI, and discusses why productivity metrics are the wrong thing to measure and why observability has never mattered more in a world of non-deterministic AI-generated code.
Hosted by Ankit Jain, Co-founder at Aviator
Guest: Emily Nakashima, SVP of Engineering at Honeycomb

About Emily Nakashima

Emily serves as SVP of Engineering at Honeycomb. A former manager and engineering leader at multiple developer tools companies, including Bugsnag and GitHub, Emily is passionate about building best-in-class, consumer-quality tools for engineers. She has a background in product engineering, performance optimization, client-side monitoring, and design.

Growing Into Leadership

I was a long-time startup person. And the thing I love about startups is that so often, if you think you can have more impact, people will let you do that. That was certainly what I found at Honeycomb when I joined back in 2017: founders willing to take a chance and say, I'm going to let you try to do something you haven't done before, because you've earned my trust up to this point. Not every set of founders is willing to do that. If you're thinking about career progression and trying to pick the next startup you work at, looking at companies that have a track record of giving people a chance to do something new can be a good indicator.

For people in larger companies, the most important thing that some engineering leaders miss is that you really have to have a perspective on not just what is good engineering in your part of the company, but what makes that part of the business really successful. 

Not just how do we keep the product up, how do we ship a great quality experience to customers, but do you track the most important business metrics for that part of the company? And do you have a story for how improving your part of engineering makes those metrics move? The people who can connect those dots are the ones most often given the chance to step up to the next level.

The Flattening of Engineering Org Charts

This is a really interesting moment where we're seeing the same thing happen to the management and IC ladders. Maybe for different reasons, but in both cases we're really losing a lot of roles at the lowest end of the ladder. On the IC side, there's a lot of conversation about it, and we don't know exactly what's going to happen, but we understand that a lot of companies are hiring fewer people overall, so those roles at the lower end are contracting.

For managers, the reasoning is a little bit different. It's been much more about this push for efficiency, especially coming out of the ZIRP era. All of a sudden efficiency is so much more important to engineering teams, and people are seeing places where they can squeeze the manager-to-IC ratio to just be more efficient overall. To me, it really feels like a short-term optimization.

Managers Should Become More Technical

I would actually so much rather see them be a little more embedded in the team's technical work and do more to move engineering work forward. I think the space is changing so fast that managers really benefit from getting as much hands-on experience as possible. I can understand why companies want to do these much flatter org charts and have 20 engineers report to one manager, but I'm also old enough to remember the early 2010s when we did that and it didn't go well. It just means that you stop hearing about concerns from a lot of your employees until it's too late to address them.

If you're going to make a change, I do think you have to pick one. You can all of a sudden have 20 people reporting to one manager, or you can ask them to get more technical. Asking them to do both is just nonsensical. There are only so many hours in the week.

In general, I think it is a good moment to be asking managers to get a little deeper in the technical work. As engineering managers, so much of our taste and judgment is really formed from the work that we did as ICs. And if the way that we build software is changing, a lot of people are just not going to understand that as well secondhand as they would if they really go get their hands in the work.

The 2X Mandate: Aspirational and Achievable

We did a top-down founder memo last summer, saying that we really believe in this new technology, we really want people to spend time experimenting and learning about it, and that we should all try to 2X our impact with AI over the next year.

You've heard some maybe more outlandish mandates, and I think that one is both aspirational and achievable. Mostly in the early days, we just set employees free with all the tools and said, Hey, go experiment, go figure it out. By maybe the six-month mark, I think we had gotten most of the upside we were going to get from that approach.

And then at a certain point it becomes clearer that you need to have systems, processes, a really clear internal AI platform that you're driving people toward.

So that's the phase we've moved into now. It's no longer just, Here's the buffet of every possible tool under the sun, and here's a company credit card you can go charge a bunch of tokens to. We're starting to standardize more. Within the last quarter, we've spun up a team specifically focused on an AI platform for engineering tools. That team is still finalizing its roadmap, but it's centralizing a small set of tools and then making everything work together really well.

Not Focusing on Measuring AI Productivity

There's a lot of fear when you ask, How do you measure AI impact and AI productivity? So many people, not just engineers but people across software companies, are worried about what this means for their job. As soon as we did the 2X mandate, the first question we always got from engineers was, How are we going to measure this? How do you know if we're doing enough?

We've really tried to de-emphasize measurement internally because I worry about people trying to game the metrics. I really worry about that company that's trying to 2X their token spend, because there are ways you can do that that return no value to the company.

We actually get a lot of value out of self-reporting. And I think this takes trust; you have to have a relatively high-trust organization. But if I ask engineers on my team, Hey, do you think you're at 2X or not, what people give back in terms of self-reporting actually aligns pretty well with what I've seen. When it's working, you can see it, and these measurement questions go away a little bit.

The thing I would like to try is having every engineer self-report once a month or once a quarter, and then having the manager do a self-report for the team on how much they have increased their impact with AI. I would guess that would be one of the most accurate or most valuable measurements we'd have.

The Future of Observability and AI SRE

Observability feels more essential than ever. And I'm not just saying that as someone who works at an observability company. I think people in the observability community understood for a long time that it was a myth that you'd ship code into production and its behavior would be deterministic until you did another release. We have not been living in that world for a long time, but a lot of teams fooled themselves into believing that if it passed the unit tests, they knew exactly what it was doing in production.

A really nice thing for us observability people is that everyone has now acknowledged they don't think that's true. 

What I see is an expectation of fewer chores. Many people never actually wanted to go tag their columns in the tool. They didn't want to go check the graphs after each release. They didn't want to look at the dashboard every morning. And I love that we're now moving into a world where we can have computers do that stuff so that people don't have to.

As far as AI SRE, I am optimistic about that tool category with a caveat. Those tools, just like our actual SREs, benefit from tons of context and experience. The ones that try to build an AI SRE experience without deep access to all of this observability data about your systems, I think, often demo really well, but do they actually find novel issues in production before your engineers do?

I see an AI SRE as making it a great experience to respond to a production incident. If you have to be up at two in the morning, as soon as you get up, it's: This is what's going on, this is what we think happened and why, and this is the code change we think would fix it. And then you, as the human, get to validate that instead of having to put your glasses on and figure out what's going on.
