

Ryan is an experienced developer, product manager, and founder with 25 years of experience building developer tools at startups, Microsoft, GitHub, and now Google. Ryan leads product teams responsible for developer onboarding, code authoring, build and deployment systems, logging, observability, and end-to-end developer experiences.
In late 2020, I was leading one of the infrastructure teams at GitHub when we saw a massive surge in traffic to GitHub servers. We were thinking, is this some kind of takedown? We started doing some research and figured out this was OpenAI.
They were using our public APIs in a way that threatened to take down the world's open-source repositories. They were very reasonable, though, and worked with us to find a more rational way to get copies of some of this publicly available data, one that wouldn't jeopardize the stability and reliability of the GitHub service.
Because of that collaboration, we got investment from Microsoft, early access to some of the GPT models, and started tinkering around with them.
Within the first two weeks or so, we settled on an idea: autocomplete already existed in just about every IDE at the time, and it looked like we could use these models to create a supercharged autocomplete inside the IDE. That was the initial advent of GitHub Copilot.
We brought GitHub Copilot to market as a technical preview within about three to four months. It was probably one of the most exciting times I've experienced as a product manager. It's not often that you get 1.2 million or so developers signing up for a technical preview within a couple of weeks. It felt like riding a rocket ship with the windows down.
These last five years of autonomous engineering are very similar to the five stages of autonomous driving.
Those first days with GitHub Copilot were stage one: slightly supercharged predictive text. Stage two is simple chat: I've got a panel in the side of my IDE, I ask a question about the code in front of me, and I get a response back. Stage three: I ask a question in natural language or submit a prompt, and the system can reason over the entire application. Stage four: agents are running, many of them in parallel, not just initiated by the human at the keyboard. Some of them may even be responding to SDLC events, but still with a fair amount of human oversight; we don't trust them quite enough yet to go completely off on their own.
Stage five, we're all sitting in the back seat watching the agents do their thing. We give them some general direction about where we want to end up, and they figure out the problem from there.
By my read today, the vast majority of the industry is sitting somewhere between stage three and stage four. Many of us are more comfortable using something like Cursor, Claude Code, or Gemini CLI: we're at the keyboard, giving the models instructions, letting them iterate, but very much still in the driver's seat. Our hands are still firmly on the wheel.
The more courageous, more risk-tolerant teams are using agents with a bit more event-driven operation, but often still with some kind of human review after the fact.
By the end of 2026, I think we're going to, as an industry, be planted pretty firmly in stage four, at least in terms of the capabilities. The question is: are the people ready? Are the teams ready? Are the organizations ready?
There's this William Gibson quote: "The future is already here; it's just not evenly distributed." We are very much living that right now, and what I see is uneven distribution across two different dimensions.
The first is skills. When I look at the most sophisticated engineers and their ability to work with agents, farming out tasks and letting agents write code, iterate, and deploy in parallel, those folks are way ahead of probably 60 to 80% of the population, which is still doing predictive text, still doing single-shot chatbot-style development. We've got a real gap in our overall adoption of these tools, these technologies, and these new engineering practices.
The second is how we're deploying AI across the SDLC. The last five years have been a journey of using AI for authoring code. Code is now cheap; you can generate gobs and gobs of it. But when you look at the rest of the engineering life cycle (designing infrastructure, troubleshooting and investigating live-site incidents, performing retrospectives, optimizing deployment costs, building out and managing CI/CD pipelines), there's a huge other part of the practice of engineering that we've barely touched.
DevOps and SRE are the next frontier of AI, and we have barely scratched the surface there.
For anyone managing people, especially large teams, the first thing I'd suggest is to find an individual or a small team to be your A team: superstar engineers who can handle a bit of chaos because they know their craft inside and out. Give them new tools. Let them practice, iterate, and fail a little bit, but then succeed a lot. Your goal is a team, or individuals, whom the rest of your organization already respects and who have credibility. You want them to prove out the new practice, and then you hold them up as a model.
Second, you build a group of champions. Once one team has proven out the practice and the toolset, these champions are responsible for spreading those practices: holding office hours, leading hackathons, and working with each team to train and enable the change you want to create.
Third, especially if you've got hundreds of engineers, you want to be very methodical about how you roll out the change. You don't want to just say, Hey everyone, we're going to use these new tools now; get at it. That isn't going to work. People have deliverables; they've got milestones they're working towards.
It takes time to learn a new tool, a new practice. Ideally, you take each unit (think of an engineering director with 30 or so reports), get them all in a workshop together, and give them two to four hours to install the tools, practice using them, share tips and tricks, and have an a-ha moment: a moment of camaraderie where they all get their first introduction to the tools together.
Then measure adoption across your teams. Not lines of code, not PRs, not productivity, just whether teams are generally adopting the tools. You need to know whether the tool is actually part of the equation before you can judge its effect on the overall engineering system. Once you've validated adoption, with roughly 70% or so of the team using the tools, you can start measuring your engineering fundamentals. Is the number of defects per PR increasing? Is the number of security vulnerabilities increasing? Is your deployment frequency changing in any meaningful way?
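The sequencing above can be sketched in a few lines: gate any judgment of engineering fundamentals on adoption first crossing a threshold, then flag regressions against a baseline. The field names, the 70% threshold, and the 10% deployment-frequency tolerance are illustrative assumptions, not a prescribed schema.

```python
# Sketch of the gating logic: measure fundamentals only after adoption lands.
ADOPTION_THRESHOLD = 0.70  # illustrative; "roughly 70% of the team"

def adoption_rate(team):
    """Fraction of a team's engineers actively using the new tools."""
    return team["active_users"] / team["engineers"]

def ready_to_measure_fundamentals(team):
    return adoption_rate(team) >= ADOPTION_THRESHOLD

def fundamentals_regressions(baseline, current):
    """Flag metrics where the current period looks worse than the baseline.
    Worse means higher defect or vulnerability counts, or a meaningful
    (here, >10%) drop in deployment frequency."""
    flags = []
    if current["defects_per_pr"] > baseline["defects_per_pr"]:
        flags.append("defects_per_pr")
    if current["security_vulns"] > baseline["security_vulns"]:
        flags.append("security_vulns")
    if current["deploys_per_week"] < 0.9 * baseline["deploys_per_week"]:
        flags.append("deploys_per_week")
    return flags
```

The point of the structure is that `fundamentals_regressions` is only meaningful for teams where `ready_to_measure_fundamentals` is already true; before that, any metric movement is noise from partial adoption.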
What you're looking for here is not so much whether you're accelerating across the board in terms of productivity. What you're mostly looking for is, are you regressing against any of your engineering fundamentals as your team adopts a new tool or a new practice?
The hands-down best thing you can do, whether you're a small team or a large team, is go out there and talk to your developers. You hire your engineers, and most of us pay them handsomely because they're good at what they do. They know their code, they know the application, and they know their engineering systems.
The best thing we ever did at both Microsoft and Google is engineering surveys. Just two questions: How satisfied are you with your engineering systems, and why? Then you look at the people who are dissatisfied, group the verbatims, and look for clusters of words. If a phrase like "flaky tests" shows up frequently when someone is dissatisfied, go talk to maybe five to twelve of them, get a bit more color, and drive improvement on that particular pain point over the next three months.
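That verbatim-grouping step is simple enough to sketch. Here's a toy version, assuming you've exported responses as satisfied/why pairs; the phrase list and responses are invented for illustration.

```python
from collections import Counter

# Hypothetical pain-point phrases to look for in dissatisfied verbatims.
PAIN_PHRASES = ["flaky tests", "slow builds", "code review"]

responses = [
    {"satisfied": False, "why": "Flaky tests keep blocking my merges."},
    {"satisfied": False, "why": "CI is fine but flaky tests waste my mornings."},
    {"satisfied": True,  "why": "Builds are fast and tooling is solid."},
    {"satisfied": False, "why": "Slow builds make iteration painful."},
]

# Count how often each pain phrase appears among dissatisfied respondents.
counts = Counter()
for r in responses:
    if r["satisfied"]:
        continue
    text = r["why"].lower()
    for phrase in PAIN_PHRASES:
        if phrase in text:
            counts[phrase] += 1

# The top grouping is the pain point to dig into with follow-up interviews.
print(counts.most_common(1))  # → [('flaky tests', 2)]
```

In practice you'd discover the phrases from the verbatims themselves (n-gram counts or clustering) rather than hand-listing them, but the shape of the analysis is the same: filter to dissatisfied, count recurring themes, then go talk to the people behind the biggest one.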
Context really becomes the most important thing that you can invest in, and what I typically look for is this intersection of a few different planes.
First, there is the context of you as a human: the personalized way I like to work. We all have different color themes in our code editors, different keyboard shortcuts we've become married to, different ways of working that suit our personal style.
Then we have the application we're working in. This often gets expressed in an agents.md file or something similar, checked into the repository.
Then you have your team context. Not just how this application works, but how it connects with the rest of your team's processes and workflows. Then, way up at the top level, the platform, the compliance, policy, governance, and security practices that everyone has to abide by regardless of team. And then, for each individual task you're performing, the task prompt.
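To make the middle layers concrete, the application, team, and platform context might be captured in a short agents.md checked into the repository. The file below is entirely hypothetical; every project name, command, and policy in it is an invented example of the kind of thing each layer holds.

```markdown
# agents.md (hypothetical example)

## Application context
- Monorepo service: `checkout-api`, Go 1.22, gRPC between services.
- Run `make test` before proposing any change; never edit generated `*.pb.go` files.

## Team context
- All changes land via PR; CI must be green before requesting review.
- Feature work branches from `main`; releases are cut weekly.

## Platform / policy context
- No new third-party dependencies without a license review.
- Secrets come from the vault integration, never from checked-in config.
```

The personal layer stays in your local editor settings, and the task layer arrives as the prompt itself; the layers in between live in files like this, where every agent working in the repository can see them.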
Getting engineers to think about things in all of those different layers and how they marry those different levels of context together is probably the biggest shift in the way developer workflows are starting to move.
That requires a radical change in the way developers work. You can't wander up to your keyboard and just start typing code anymore. You really have to think about how you're marrying context from all of these places together.
If you go far enough out, code reviews will eventually become less relevant. We don't all look at the compiled code after the compiler is done these days, but there was a time, 50 years ago, when you actually did have to inspect it.
My team already uses agents to automate a lot of the work of code reviews. That doesn't mean we've eliminated code reviews; it just means we use agents to do the first and second pass to reduce the time necessary for human senior engineer-level inspection of the code. There's always a lot of low-hanging fruit that an agent can take care of. Plus, agents are great at deduping and triaging PRs and getting them to the right people.
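As a toy illustration of that kind of first pass (this is not a depiction of our actual tooling; the ownership map, PR fields, and heuristics are all invented), an agent's dedupe-and-route step might look like:

```python
# Map path prefixes to reviewer groups; purely illustrative ownership data.
OWNERS = {"ci/": "build-team", "api/": "api-team"}

def route(pr, owners=OWNERS):
    """Pick a reviewer group from the first file path matching an owned prefix."""
    for path in pr["files"]:
        for prefix, team in owners.items():
            if path.startswith(prefix):
                return team
    return "triage-queue"  # no owner matched; needs a human look

def dedupe(prs):
    """Collapse PRs with identical normalized titles, keeping the first seen."""
    seen, unique = set(), []
    for pr in prs:
        key = pr["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(pr)
    return unique
```

A real agent would use semantic similarity rather than exact title matches and would read ownership from something like a CODEOWNERS file, but the division of labor is the point: the mechanical dedupe-and-route pass is cheap to automate, which is exactly the low-hanging fruit before senior engineers ever see the PR.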
Ultimately, the path to minimizing the necessity of code review is good test-driven development and high specificity in the natural language instructions, so that automation can validate that the code performs as it should and produces the right outcomes. The goal of engineering has always been to automate as much as possible so that we have higher confidence that the code is performing as it should.
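A minimal sketch of that idea: when the natural-language instruction is specific enough, it translates directly into executable checks, so trusting the generated code doesn't require a human to eyeball it. The instruction, function, and inputs below are all hypothetical.

```python
import re

# Hypothetical instruction given to an agent:
#   "slugify(title) lowercases the title, replaces runs of non-alphanumeric
#    characters with single hyphens, and strips leading/trailing hyphens."

def slugify(title):
    # Stand-in for agent-generated code under review.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# The instruction's specificity becomes automated validation:
assert slugify("Hello, World!") == "hello-world"
assert slugify("  -- GitHub Copilot --  ") == "github-copilot"
```

A vague instruction ("make URL-friendly titles") leaves those assertions unwritable, which is exactly why specificity, not just test coverage, is what lets automation carry the review load.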
As we automate more and more of the code generation and authoring, the intent of the agents at each step is going to become far more important, and tracing that intent will become equally important. You'll want that traceability in a code review, to better understand the decisions an agent made and to judge whether it made good choices about the tests it was writing as well as the code. That will still require human oversight: someone exercising judgment on the code and the decisions behind it.
