Definition of Success

How do you define success as part of a software development project? How do you measure success? How do you integrate parameters of success into analytics done within a project?

Agile analytics isn’t a novel concept by any means. Things like feedback loops and process-oriented development seem to integrate flawlessly into the analytics paradigm, at least on paper. Heck, there’s even the Build-Measure-Learn framework for continuous development. It would be difficult to argue that analytics doesn’t have a role in something with “measure” in the name!

However, the past three years of working at Reaktor, one of the world’s top agile technology houses, have introduced me to a whole new set of problems with integrating an “analytics mindset” into an agile workflow, or an “agile mindset” into the analytics process.

The crux of the matter is how to reconcile the time-boxed methodology of a sprint with the ever-fluctuating, contextual parameters of change. To put it succinctly: it’s difficult to measure the impact of development work in a way that would establish a robust feedback loop.

One of the main things working against agile analytics is the very thing that fuels sprints in general: the Definition of Done (DoD). In Scrum, one of the most popular agile frameworks, the DoD is essentially a checklist of criteria that each task/feature/sprint must satisfy in order for the work to be considered done. The DoD has a lot of things going for it:

  • It’s negotiated by everyone in the team.

  • It’s not monolithic.

  • It’s adjusted to better respond to an ever-changing business context.

  • It quantifies the success of the agile workflow.

In previous posts and talks on the topic, I have recommended adding Analytics to the DoD in any development project. The idea is that with analytics as a keyword in the DoD, the ever-important question of “Should we measure this?” is always considered when developing features. “No” is a perfectly valid answer to this question - it doesn’t make sense to measure everything. What’s important is that the discussion happens in the first place.

Anyone working in analytics can share anecdotes of times when they wanted to check some usage statistics from the available data, only to discover that the necessary tags were never deployed to begin with. Adjusting the DoD can help avoid this.

However, having a Definition of Done has proven to be inadequate for establishing goals fuelled and validated by measurement. Even though a sprint or a feature branch can be deemed done by the criteria established in the DoD, it still doesn’t mean that the feature was a success.

Doing the right things vs. doing things right

It’s tempting to treat something being done and something being completed as equivalent. However, I argue that there’s a distinction between the two. The first can be formalized with tools such as the Definition of Done. The second is harder to pin down, because it requires considering what counts as a success.

Consider the following story in a backlog:

Implement a single-page checkout flow.

A feature like this could have a very clear Definition of Done, and it would be very easy to validate whether the feature can be shipped:

  • Test coverage has been updated to include the new code.

  • Documentation has been written to cover how the new checkout works.

  • Deployment to production is successful.

  • The new checkout is measured in Google Analytics.

  • The new checkout is deployed to 50% of visitors (A/B test).

The new checkout is done when it passes these requirements. The team members working on the new checkout can look at these steps and plan their work accordingly.
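To make the analytics and A/B test items above concrete, here is a minimal sketch of what “measured in Google Analytics” could look like for the new checkout. It assumes GA4 with gtag.js; the checkout_variant parameter, the variant names, and the bucketing helper are hypothetical and for illustration only.

```typescript
// Minimal sketch: tagging the new single-page checkout with GA4 (gtag.js).
// `checkout_variant` is a hypothetical custom parameter used to separate the
// A/B test variants in reports; all names here are illustrative.

declare function gtag(...args: unknown[]): void; // provided by the GA4 snippet on the page

type CheckoutVariant = 'single_page' | 'legacy';

// Deterministic 50/50 bucketing so a returning visitor always sees the same variant.
function assignVariant(visitorId: string): CheckoutVariant {
  const hash = [...visitorId].reduce((acc, c) => (acc * 31 + c.charCodeAt(0)) >>> 0, 0);
  return hash % 2 === 0 ? 'single_page' : 'legacy';
}

function trackCheckoutStart(variant: CheckoutVariant, cartValue: number): void {
  // `begin_checkout` is a standard GA4 ecommerce event.
  gtag('event', 'begin_checkout', {
    currency: 'EUR',
    value: cartValue,
    checkout_variant: variant,
  });
}

function trackPurchase(variant: CheckoutVariant, orderId: string, revenue: number): void {
  gtag('event', 'purchase', {
    transaction_id: orderId,
    currency: 'EUR',
    value: revenue,
    checkout_variant: variant,
  });
}
```

With something like this in place, the DoD items about measurement and the 50% rollout become verifiable: the events exist, and the variant parameter lets anyone compare the two flows later.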

But what determines if the new checkout flow is a success? It’s a technical success once deployed to production, but what actually validates that the new checkout brings added value to the organization?

A new set of criteria needs to be devised for this purpose. These criteria need to take into account the following:

  • Success is subjective. Different stakeholders might have different ideas for what qualifies as a success.

  • Success is temporal. What is successful today might not be successful tomorrow, especially in a constantly evolving business landscape.

  • Success is non-binary. It’s not always possible to pass a true/false verdict on the success of a developed feature. Sometimes there are varying degrees of success.

Definition of Done is a great tool, because it guides us to do things right. However, it lacks the scope to guide us in the prioritization of the tasks at hand. It lacks the ability to determine if we’re doing the right things.

A Definition of Success for the example in this chapter could be something like:

  • This feature is successful for the team when the checkout flow has been deployed to production.

  • This feature is successful for the client when the A/B test shows statistically significant results that the new checkout increases revenue per user (a rough sketch of such a check follows this list).

  • This feature is successful for the end user when the new flow decreases checkout abandonment.
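The client criterion above hinges on statistical significance, which is worth spelling out. Below is a rough sketch of what that check could look like: a simple two-sample comparison of revenue per user with a normal approximation (reasonable for large samples). The variant names, sample numbers, and the 95% threshold are illustrative, not a prescribed methodology.

```typescript
// Rough sketch: is revenue per user higher in the new checkout variant?
// Two-sample z-test with a normal approximation; fine for large sample sizes.
// All numbers below are made up for illustration.

interface VariantStats {
  users: number;        // users exposed to the variant
  meanRevenue: number;  // average revenue per user
  variance: number;     // sample variance of revenue per user
}

function zScore(control: VariantStats, variant: VariantStats): number {
  const standardError = Math.sqrt(
    control.variance / control.users + variant.variance / variant.users
  );
  return (variant.meanRevenue - control.meanRevenue) / standardError;
}

const legacy: VariantStats = { users: 12000, meanRevenue: 4.1, variance: 180 };
const singlePage: VariantStats = { users: 12100, meanRevenue: 4.6, variance: 190 };

const z = zScore(legacy, singlePage);
// 1.96 is the two-sided critical value at the 95% confidence level for a normal distribution.
console.log(z > 1.96 ? 'Looks like a significant lift' : 'Not significant yet - keep the test running');
```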

If we take this thought experiment to its conclusion, we’ll find that completion stems from the combination of a feature being done and a feature being done successfully. It’s possible for the two to overlap, such as in the success criterion for the team above, and it’s possible that success is impossible to determine for certain features and particular stakeholders.

Definition of Success

Definition of Success is a series of questions you ask when prioritizing (grooming) or adding new stories to the backlog. The questions are:

  • If this feature is developed, what determines whether it is a success to the end user?

  • If this feature is developed, what determines whether it is a success to the project owner / client?

  • If this feature is developed, what determines whether it is a success to the team?

By asking these questions, you are forced to think about the impact of your project work, not just the outcome. Optimally, impact is something you can measure - something you can validate with data and analytics. But it’s perfectly fine to establish success criteria that are less tangible.

Let’s take a look at some examples.

| Feature | DoS (end user) | DoS (client) | DoS (team) |
| --- | --- | --- | --- |
| Single-page checkout | Drop in Checkout Abandonment | Increase in Revenue per User | Deployed successfully to production |
| Migrate from AWS to Google Cloud | ??? | Reduced costs for pipeline management | More familiar tools available for pipeline management |
| Auto-complete site search | Find more relevant search results faster | Increase in Revenue from site search | Improved performance of the search feature |
| Marketing dashboard | More relevant campaigns | More visibility to ROI | More visibility to ROI |

In the first example, establishing success criteria for something with as much potential for impact as a new checkout flow is fairly easy. We can determine hard-and-fast metrics for success, with clearly defined goals that need to be surpassed for the feature to be a success.

A very technical task, such as migrating a cloud backend from one service provider to another, can be more difficult to align with success criteria. What does success look like for the end user in a migration like this? Sometimes it’s perfectly fine to not have an answer for all three vectors. In fact, only when you can’t find success criteria for any of the three vectors should you be concerned about the validity of the feature in the first place.

The final example shows that more than one of the three vectors can share the same success criteria. In this case, building a marketing dashboard improves the visibility of marketing efforts, and this is beneficial to the client as well as the team that built the dashboard. The success criterion for the end user is again difficult to pin down. In the “long run”, more visibility to the current marketing efforts should result in better campaigns for the benefit of the end users as well.

Success === Value

In the end, we want to build valuable and value-adding things. We want the value of development work to be, at the very least, the sum of its parts. Optimally, we want to surpass expectations and build features that combine to produce unexpected value in unexpected ways.

One thing that unites the two streams of being done and being successful is value in and of itself. A feature can be considered successfully done if it produces a net increase in value to the parties that have a stake in the project.

But value can be difficult to measure. It can be defined with the same terminology we use when talking about micro and macro conversions in analytics. There is value that can be directly derived from the development of a feature - typically something with a currency symbol in front of it. Then there is value that is inferred indirectly, usually only after some time has passed.

To refer to the examples earlier in this article, we could say that the new single-page checkout flow has direct value, because we can measure its impact on the revenue generated by the site when compared to the old checkout flow (preferably in an A/B test!).

Similarly, we can say that creating a new marketing dashboard has indirect value, because we hope that by making the data more transparent we can help build better campaigns that will eventually increase the value of our marketing process.

Regardless of how you pin it down, you should always be able to describe good success criteria by using value as a focal point.

Fuzzy success

Here’s one final thought to wrap up this article.

I want to emphasize that Definition of Success is a communication tool rather than a set of strict validation criteria. It simply doesn’t make sense to block development while waiting for some feature to produce enough usable data to determine what its impact was.

Validation of whether a feature was successful or not can take a long time. Think of Search Engine Optimization efforts, for example. It might be months before any statistically (and intuitively) significant results emerge. It wouldn’t make sense to block the development of a critical new feature while you wait for this data.

Thus, because of this fuzzy, multi-dimensional definition of success, it’s so much more difficult to determine whether something was successful than to check if the development work is done.

What to do, then?

Well, I’ll reiterate: Definition of Success is a communication tool. It’s an approach that helps you evaluate development tasks based on their value potential. It’s more important as a discussion topic than as a directive that guides your development work (like Definition of Done can be).

One thing you can do is the exercise described earlier in this article. Set aside time for a session where the whole team takes a good hard look at the backlog of your project. As a team, try to figure out success criteria for each story still waiting for development work. Remember the three questions:

  1. If this feature is developed, what determines whether it is a success to the end user?

  2. If this feature is developed, what determines whether it is a success to the project owner / client?

  3. If this feature is developed, what determines whether it is a success to the team?

You should find that this exercise alone can unearth things about the backlog items you wouldn’t have thought of before.

If you can come up with quantifiable, measurable metrics of success - all the better. Make sure these are being measured when the features are deployed, and make sure the team knows how to access this data.
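One lightweight way to keep the outcome of this exercise visible is to record the success criteria next to each backlog item, together with the metric and the place where the team can check it. The structure below is a hypothetical sketch, not a feature of any particular backlog tool - the field names and the example entry are purely illustrative.

```typescript
// Hypothetical structure for recording Definition of Success criteria per backlog item.
// Field names and the example entry are illustrative; adapt to whatever your backlog tool supports.

type Stakeholder = 'end user' | 'client' | 'team';

interface SuccessCriterion {
  stakeholder: Stakeholder;
  criterion: string;     // what success looks like for this stakeholder
  metric?: string;       // a measurable proxy, if one exists
  dataSource?: string;   // where the team can go to check the metric
}

interface BacklogItem {
  story: string;
  definitionOfSuccess: SuccessCriterion[];
}

const singlePageCheckout: BacklogItem = {
  story: 'Implement a single-page checkout flow',
  definitionOfSuccess: [
    {
      stakeholder: 'end user',
      criterion: 'Checkout abandonment drops',
      metric: 'Checkout abandonment rate',
      dataSource: 'Google Analytics funnel report',
    },
    {
      stakeholder: 'client',
      criterion: 'Statistically significant increase in revenue per user',
      metric: 'Revenue per user (A/B test)',
      dataSource: 'A/B testing tool',
    },
    {
      stakeholder: 'team',
      criterion: 'New flow deployed successfully to production',
    },
  ],
};
```

The format itself doesn’t matter - what matters is that the criteria, the metrics, and the place to look them up are written down where the whole team can see them.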

I think this exercise is good for team-building, too. You are communicating together, as a unit, what the long-term value of your work is. It’s a great opportunity to build motivation, since success criteria often make your overall goals clearer. You want the project to be a success, too, so making sure that all the tasks outlined in the backlog contribute to the success of the entire project makes a lot of sense.

Summary

I wanted to write this article because I think there’s just far too little talk about value and success in daily development work. I bet it applies to marketing teams, too, and not just developers. I bet it applies to business designers, to PR, to recruitment, to business owners, to shareholders, to communities, and to the universal laws that govern our existence!

It’s difficult to talk about success because it’s so subjective. Yet here I am, asking you to deploy a tool designed for general use, where the very focus is on this subjective notion of success.

But I guess that’s my point. It is subjective, but it’s also something you absolutely should devote time to. The discussion should also be extended to the entire team. Everyone involved in project work should understand that there is a whole undercurrent of communication, where team members are actively thinking in terms of value and what it might mean to different stakeholders.

Maybe I’ve lived in a bubble, but it’s striking how rarely long-term, quantifiable success is brought up for discussion in agile contexts. I totally understand the focus on time-boxed sprints, where success is the net outcome of multiple sprints that are all “done” and “complete”. But I think this ignores the shift in many development team dynamics, where designers, analysts, CROs, SEOs, social media managers, business owners, and sales managers are now also part of the mix. Thinking of development work as something that only involves software developers is so last season.

Thus, when these “fuzzier” roles are added to the mix, the discussion around things like value and success must be extended to cover more than “test coverage at 100%”, “peer-reviewed”, and “well-written code”.

How this can be done systematically and comprehensively is still on the drawing board. But I do hope this concept of a Definition of Success rings true - at the very least as a discussion topic and thought experiment, if nothing else.

Huge thanks to my Reaktor colleagues, especially Aleksi Lumme, Jaakko Knuutila, and Matias Saarinen, for their collaboration on fleshing out the Definition of Success.