I’ve been thinking recently about what sets apart my coworkers who’ve done the best work.
You might think that the main thing that makes people really effective at research or engineering is technical ability, and among the general population that’s true. Among my Anthropic coworkers, though, we’ve restricted the range by screening for extremely high-percentile technical ability, so the remaining differences, while they still matter, aren’t quite as critical. Instead, people’s biggest bottleneck eventually becomes their ability to get leverage—i.e., to find and execute work that has a big impact-per-hour multiplier.
For example, here are some types of work at Anthropic that tend to have high impact-per-hour, or a high impact-per-hour ceiling when done well (of course this list is extremely non-exhaustive!):
- Improving tooling, documentation, or dev loops. A tiny amount of time fixing a papercut in the right way can save hundreds of users hours of debugging or schlepping through inefficient workflows.
- Identifying promising research directions. Things like character and computer use started off as a fraction of one person’s time, taking a bet that some technique applied to some problem was going to work really well, and ended up being a major influence on Anthropic’s research direction.
- System design. This tends to be a small part of the overall time to execute a project, but when done well, it can make the final system way better and save lots of work.
- Collecting and digging into data. It’s almost always the case that, if we start looking at some new source of data that we didn’t previously have, we discover huge problems/opportunities for improvement, and it changes our prioritization a lot.
  - Interestingly, even versions of this work that seem very rote are often extremely high leverage. For example, you might think that a task like “look at 100-1000 individual data points for anomalies” could be delegated to a junior hire, but in fact, the people who are most obsessive about this are our most senior researchers and managers—and they often find severe issues by doing it that we wouldn’t have noticed otherwise.
I think of finding high-leverage work as having two interrelated components:
- Agency: i.e. some combination of the initiative/proactiveness to try to make things happen, and relentlessness and resourcefulness to make sure you’ll succeed.
- Taste: you need a good intuition for what things will and won’t work well to try. Taste is important both “in the large” (picking important problems) and “in the small” (picking approaches to solving those problems that will work well); I usually see people first become great at the latter, then the former.
Without taste, you’re likely to work toward the wrong thing. Without agency, even if you work toward the right thing, you’re likely to get nowhere.
How can someone improve at these? Mostly, I think, by practicing: there’s no substitute for good feedback loops with reality. So maybe the most important takeaway here is that you have permission to try to exercise good taste and high agency! But below are some more specific pieces of advice that have helped me improve on these dimensions.
I. Agency
Understand and work backwards from the root goal
One of the easiest ways to get more leverage is to take a goal you’re already trying to accomplish, and figure out a better way to accomplish the same thing. (For example: changing your experiment design to be more directly relevant to the high-level question; finding an 80/20 way of building a tool; just deciding not to do something entirely; etc.) In order to do this, you need to keep careful track of the high-level context and priorities that your specific project is aimed at, so that you can understand and judge the trade-offs.
Obviously, whoever’s supervising your project will make an attempt to point you at the most effective way of achieving the goal that they can think of. But finding the most effective way of doing something takes a lot of time and attention, they probably have a lot of other things to do, and they also have much less detailed context than you, so they’re liable to miss things!
For example, when I did a work trial at Anthropic in late 2022, I was assigned the task of building a custom scheduler for our compute cluster to fix some unfortunate behaviors in the default scheduler that were making researchers’ lives hard. But after seeing the team work for a week and thinking about the system design, I started to worry that this project would end up imposing a high maintenance burden on the team that owned the cluster, which was already overloaded, and I ended up suggesting that we postpone it. It turned out that this second-order effect would have made the project net-negative for the root goal of “keep the clusters working well”!
Don’t rely too much on permission or encouragement
A counterintuitive fact about the highest-leverage projects is that they’re often not obviously high-leverage to most people in advance (because if they were, they would already have been done a long time ago). That means people are often skeptical of the value of pursuing them.
That skepticism can be for a couple reasons (in the real world, most projects have some mix of both going on):
- People might underrate the impact that the project will have. For example, I find that people often underestimate the value of work related to metrics, instrumentation, or generally looking at data, because you can’t predict what specific things you would learn or what specific impact you’d have as a result. Because of that, many teams consistently under-invest in looking at data.
- The project might seem too hard to execute in a high-impact way. For example, someone I knew built a system to identify unhealthy hosts in a data center by parsing and categorizing every kernel log message, which ended up saving $X00m/yr (iirc). When he proposed the project initially, others thought it would take a whole team multiple quarters, probably because it’s extremely boring and detail-oriented; but because this person was good at focused, detail-oriented work even when it was boring, he finished it in a couple months.
A personal example: shortly after I joined Anthropic’s inference team, a couple folks on the team proposed completely rewriting our inference service codebase, because they felt the current version was overcomplicated. I was somewhat skeptical of this, based on the standard software engineering heuristic of “most rewrites fail and most people are over-optimistic about how long they will take,” although I didn’t push back very hard because I knew I didn’t have much context. It turned out that the rewrite took about a week, was an immediate integer-multiple efficiency improvement, and made it much easier to make further efficiency improvements. I’m glad that people didn’t listen to my skepticism!
One tactic for addressing this dynamic is that instead of asking people for approval to go do something, you can just tell them what you intend to do (implicitly giving them space to object or course-correct if they feel strongly).
Of course, the same “high leverage projects are non-obvious” phenomenon also means that some of your self-driven high-impact project attempts will probably fail, because you misjudged their impact or difficulty. Because of this, you should think of them as bets that might or might not pay off, and—at least until you’ve proven your ability to make good bets over fairly long time horizons—it’s best to keep these as part of a portfolio with low-risk projects as well, to avoid the situation where you bite off a single high-risk project, it doesn’t pan out, and people lose trust in your ability to pick your own projects.
Make success inevitable
A common trait of high-agency people is that they take accountability for achieving a goal, not just doing some work.
There’s a huge difference between the following two operating modes:
- My goal is to ship this project by the end of the month, so I’m going to get people started working on it ASAP.
- My goal is to ship this project by the end of the month, so I’m going to list out everything that needs to get done by then, draw up a schedule working backwards from the ship date, make sure the critical path is short enough, make sure we have enough staffing to do everything, figure out what we’ll cut if the schedule slips, be honest about how much slack we need, track progress against the schedule and surface any slippage as soon as I see it, pull in people from elsewhere if I need them…
Mode 1 makes you a leaky abstraction—if your project is critical, someone else needs to be constantly monitoring it and figuring out how to resolve blockers. My concept handle for being in mode 2 is “making success inevitable,” because “inevitable” is the bar where other people can stop spending substantial fractions of mental energy on worrying about the project.
People who can be trusted to make something inevitable are really rare, and are typically the bottleneck for how many different things a team or company can do at once. So if someone else is responsible for making your project inevitable, you’re consuming some of that scarce resource; if you’re the one making your own projects inevitable, you’re a producer of that resource, and you’re helping unblock a key constraint for your team.
(Of course in practice you will never achieve full inevitability, but getting closer to it still makes a huge difference!)
II. Taste
Find your angle
I wrote above that you should expect high-impact projects to be non-obvious because if they were obvious, they’d have been done already. This points to another interesting dynamic, which is that it’s quite rare for different people’s “zone of best taste” to overlap very much. Instead, the quality of most people’s taste is highly idiosyncratic and area-specific, where areas can be as localized as e.g. “language model personality design” or “what blog post titles will get upvoted on Hacker News” (one of my own less-useful areas of good taste…)
For this reason, an important signal to keep track of is: where is your taste the best?
I’ve noticed a lot of people underestimate their own taste, because they expect having good taste to feel like being very smart or competent or good at things. Unfortunately, I am here to tell you that, at least if you are similar to me, you will never feel smart, competent, or good at things; instead, you will just start feeling more and more like everyone else mysteriously sucks at them.
For this reason, the prompt I suggest here is: what does it seem like everyone else is mysteriously bad at? That’s probably a sign that you have good taste there.
It’s okay if this prompt doesn’t immediately yield anything; you might just be on a team with a lot of really good people, where you don’t have a really unique angle on anything yet. Even so, a weakened version of this question—where do people seem the least competent, relative to what you’d expect?—is a useful gradient signal!
Think real hard
One way of thinking about taste is that it’s about the quality of your predictive models and search heuristics. If I design the experiment this way, what will I find? If I design the tool this way, how easy will it be to use? If I write the doc this way, how much will it resonate with people?
Doing enough search and prediction to come up with great ideas takes time. The first domain that I got some degree of taste in was software design, and I remember a pretty clear phase transition where I gained the ability to improve my designs by thinking harder about them. After that point, I spent a lot of time iterating on many different design improvements—most of which I never implemented because I couldn’t come up with something I was happy enough with, but a few of which turned into major wins.
The easiest way to improve at this is just to try it! Whenever you’re debating what to do, explicitly ask yourself “what do I predict will happen if I choose option A?” and try to unroll the trajectory. Even if you think you’re already intuitively predicting the results of your choices, I’ve found it helps surprisingly much to be explicit—one of my manager role models asks me this (“what do you think will happen?”) every time I ask him for advice and it’s kind of silly how often it helps me realize something new. (For bonus points, revisit your predictions afterwards.)
Things that I’ve found benefit from a lot of thinking time:
- What to work on. Changing your prioritization is often the single biggest lever for improving your leverage!
- “Design” broadly construed—whether that’s experiment design, system design, org design, process design, outlining blog posts, etc.
- How projects could have gone better and what I should learn from them—see below.
(See also: Think real hard. Although if you’re someone whose natural failure mode is to overthink things, it’s possible you should reverse this advice.)
Reflect on your thinking
If taste is about the quality of your predictive models and search heuristics, it’s important to wring out every possible update to these from the data that you get.
For that reason, many of the most effective people I’ve worked with also do the most metacognition, i.e., reflecting on their own (and their team’s) work and thought processes, and figuring out how to improve them. They’re often the people who are most likely to identify improvements to our processes or mental models—things like:
- Wow, we should have added this instrumentation months ago because we learned so much valuable stuff from it. I should probably up-weight my heuristic that “any time you’re not looking at data about something important, you’re probably unaware of some fire.”
- I think we’re working inefficiently because we’re overestimating how close we are to being done; we need to step back and improve our tooling. Also, I’ve noticed we get into this pattern kinda often; I’ll try to notice it earlier next time!
- In retrospect, we could have avoided this problem if we’d thought about X earlier—let’s keep that in mind next time we’re in a similar situation.
- I think our key constraint here is we don’t have enough people with Y skillset. Can we move people around somehow to get more of the right people working on the problem?
Often, they don’t just do their own metacognition but also help drive “group metacognition” by sharing these reflections with the team, scheduling retrospectives, etc. Even if each of the lessons here is small individually, they compound over time to help people and teams become much more effective.
My best habit for encouraging myself to metacogitate more is a weekly review. The format and typical outcomes of my reviews have evolved a lot since I first wrote about them, but I still do them and find them extremely valuable for helping me improve at whatever I’m currently focused on improving at!
(See also / further reading: Chris Olah’s Research Taste Exercises)