When a timely “SCRIM” saves the day; [SC-RI-M]: SCRUM RISK MANAGEMENT .. a “Ghost Overlay” Story!

The Waterfallists (a term I use for the hardcore followers of the old methods of managing work) have traditionally tried to frame Agile as a promoter of reckless, hurried, under-assessed production practices and market gambling that, according to the Waterfallists’ belief system, would expose the team and the organization to all sorts of high-impact Risks.

Interestingly enough, the empirical evidence has shown that Agile frameworks (with SCRUM as the most popular one) have been far more successful in delivering the value that the market needs at the time of deployment, and more accurate in creating business value, than Waterfall ever could be.

For a quick comparison between the two schools, we can list a number of Risks that Scrum can manage much more easily than Waterfall:

  1. Risk of Missing the Market window of delivery: Waterfall projects can take months (if not years) before they engage customers through production deployment, and as a result are constantly exposed to the ever-worsening risk of missing the mark. Scrum’s short delivery cycles allow for targeting the market with early customer engagement, by deploying an incrementally growing product: starting from a minimum viable solution and building on it based on the feedback collected from customers at each release. This re-targeting ability allows for re-alignment with the market at the end of each cycle, before starting the next one (with cycles as short as 1 week and up to 30 days maximum).
  2. Risk of Budget Shortfalls: Waterfall projects’ long duration makes it extremely important to have an accurate estimate of the budget needed to finish the product delivery, yet that estimate is always error-prone and a best guess that often misses. Scrum’s steady cost visibility over short delivery cycles makes estimating the team’s spending much easier and far more accurate.
  3. Risk of Over-spending through Over-delivering: Waterfall projects’ long delivery time frame, lack of market experimentation and adherence to a fixed Requirement set that dreads any change make them prone to delivering more than what the market needs (and, quite possibly, delivering the wrong functionality that is not attractive to customers), thus over-spending on unnecessary work. Scrum teams’ short cycles of delivery and market feedback gathering, through incremental release of minimum viable product pieces, make them far more efficient in discovering what customers would find attractive and how much should be delivered to sufficiently answer their needs.
  4. Risk of High Cancellation Costs: Waterfall projects’ long duration and lack of discovery ability about the deliverables’ viability on the market mean that it can take a long time before the stakeholders come to the conclusion that the product they want delivered is no longer viable, and cancel the project after spending a large part of the budget without any gains in return. A Scrum team, by contrast, can quickly stop spending resources on a solution that is not viable in the market, since it can solicit customer feedback much earlier and decide whether to continue or not.
  5. Risk of Changing Requirements (aka Scope Creep): While Waterfall projects dread Scope Creep (i.e. needed updates to the Requirements due to details missed at the Requirement phase, or new information that calls for a change in path), they keep running into it, as their one-time Requirement phase needs to either capture the perfect set of details on what needs to be delivered, or stay prone to Scope Creep for the entire lifetime of the project (which is always the case). Scope Creep puts pressure on Waterfall projects’ budgets, delivery schedules and quality. Scrum teams easily adopt new information into their delivery work and re-align the backlog with the most up-to-date understanding of the requirements at the time. As their product deployment draws a market response, they can bring in the changes required to better target the market at the next release.
  6. Risk of Wrong Requirements: For the same reasons we mentioned for Changing Requirements, Waterfall projects are also prone to capturing a misunderstanding of what the customers need at the time of information gathering. While such cases are comparable with Scope Creep, they can be far more damaging if the mistakes hidden in the Requirements carry undetected into the later phases of the project, where it takes a great deal of re-work to fix the Requirements, update the Designs and re-do the Code before the product can go out. Of course, there is always the chance that some of the Requirement mistakes stay undetected to the end of the project, get released to the market and are left for customers to find (which then needs more work and funding to fix both the product and the damaged reputation of the company). Scrum can fish out wrong Requirements much more easily and quickly, as it sends incremental product pieces into the market in short cycles, and the team can re-group and send the needed fixes out in a subsequent release before they become a large issue.
  7. Risk of Infeasible Technical Approaches: While Waterfall projects’ long duration and lack of early experimentation would make it a costly and long journey before they find out an untested technical approach would not be a feasible solution for them, Scrum teams can get to that point much faster and with much less spending of resources.

SCRIM (SC.RUM RI.SK M.ANAGEMENT)

is a term – (c) Lean Agile Council – that I coined a while ago to refer to the Ultra-light Risk Management Framework [aka Ghost Overlay]: the abilities and techniques that SCRUM practitioners can install on top of the SCRUM framework to help them utilize SCRUM’s great power in handling Negative Risks and to exploit and amplify the Positive Risks that exist in the ever-changing landscape of modern markets.

Scrum, by design, has embedded risk mitigation abilities which virtually remove or significantly diminish the potential impact of some of the key Risks that Waterfall projects face, Risks that can lead to their total ruin.

As we mentioned above, one key Risk item that has traditionally broken the back of more than half of failing Waterfall product projects, is the Market Risk, especially when it comes to the danger of delivering, at the end of the project lifespan, something that the market does not need anymore.

Scrum’s iterative nature and short delivery cycle (anywhere from 1 week to a month in length) make it strong in responding to changes in the Market’s appetite and needs, through early engagement of customers in testing and trying the early versions of the product, soliciting their feedback and observing market responses, and taking corrective or improving actions to re-calibrate the production efforts towards a product that constantly and closely matches customer needs.

Scrum’s short cycles also contain Retrospective sessions, which are the team’s self-assessment and quick feedback loop for identifying dangers and threats to the team’s success in delivering the next release, and a great opportunity to brainstorm on decisions against the identified Risk items and the proper responses that the team would like to put in place.

Scrum’s empirical approach – its experimentation in the market through incremental delivery of product and early engagement of customers – allows for taking small risks during the short delivery cycles, so we can collect the market’s feedback and take corrective action to bring the product into alignment with what customers want.

This approach breaks down large Risks into smaller ones and avoids many future ones by clarifying how the product should evolve in the market.

The Daily Stand-up (aka Daily Scrum) meeting of the Scrum Team allows for a 24-hour pulse check on the production activities of the team, which provides a great opportunity to keep track of any rising Risk impacts and to catch newly identified Risk items.

Scrum’s Retrospective sessions are also great in assessing the effectiveness of previous action items in Risk removal or mitigation measures and to revamp and update the responses.

We should note that Risk identification sessions should be separate events with a focus on the entire Scrum Delivery Pipeline, to ensure we are not just focused on what is happening – or happened, in the case of Retrospective sessions – in a single Sprint.

We can use the Retrospective sessions to gauge the progress and effectiveness of our Risk Response measures that existed during that Specific Sprint.

Scrum’s flexibility allows it to embrace changing market demands and shifting requirements, and to exploit them as a Positive Risk by creating opportunities to test new ideas and their success in the market within a short delivery cycle.

Meanwhile, unstable and shifting requirements are a severely Negative Risk – a true nightmare – for Waterfall projects, as their structure is designed around following the path they started on, delivering the Requirements they identified at the beginning of the project and gambling on the stability of those Requirements to the end of the project.

In Scrum, the ownership of Risk management is shared among all members of the Scrum team. Scrum Master, Product Owner and the Development Team are all stakeholders and owners of the Risk items that are identified and processed.

This essentially turns SCRIM (Scrum Risk Management) into a Collaborative Risk Management model where the joint brain power of the team creates much better decisions, leading to much better estimation of impact and of the needed Responses, and so to much more successful Risk Management than the Waterfall approach.

The SC.RI.M Framework

According to the Oxford Dictionary, the word Risk, in its current spelling, entered the English language around the mid-17th century.

One key definition of the word Risk according to that dictionary is:

“(Exposure to) the possibility of loss, injury, or other adverse or unwelcome circumstance; a chance or situation involving such a possibility”

This shows that the word Risk has traditionally been used in reference to its Negative impact, which we call Negative Risk here.

Risks are factors that may affect the outcome of what we are planning to do (aka our deliverables) due to our encounter with uncertainty.

SCRIM establishes an Ultra-Light “Ghost” overlay on top of the SCRUM framework with multiple connecting points [aka “Sniffer Plugins”] to the process.

It is designed to have minimal overhead while enabling live Risk tracking throughout the Sprint, every time we have a chance to allocate a few moments of a Ceremony to do a Pulse-Check on the Risks that the team is facing (or planning to handle).

This diagram shows the SCRUM framework (as per Scrum.org)

This diagram shows the SCRIM framework sitting on top of SCRUM:

The Risk Register (and its corresponding Response stories) feeds into the Product Backlog (which is owned and maintained by the Product Owner).

It is important to note that the Product Backlog holds all the Risk Response stories, so we are effectively handling – aka Responding to – the Risk using the Product Backlog, while the Risk Register acts as the lightweight tracking table where we can see the trace of Risks coming into and going out of our horizon over time, and whether we still worry about them to the same extent (or have a better measure of them now).

The Product Owner brings the needed Risk Response stories to the Sprint Planning session and presents them – alongside the product stories – to the Development team.

The Development team then review the presented stories and negotiate the work with the Product Owner and make the final decision on what they can bring in (pull into) the Sprint. This includes a selection of Risk Response stories that are ranked as high priority by the Product Owner.

As the Sprint starts and the Scrum team work through the process, they have their Daily Scrum (Daily Stand-up) meetings to review the status of work and progress and discuss impediments and resolutions.

This meeting also serves as a Risk Status Check to allow for sharing any newly identified Risk or any changes in the status of previously identified ones which may need updates to the Risk Register and the Response stories that are in the backlog (or in rather less common cases, Response stories that are being worked on in current Sprint).

If the changes would affect the Sprint backlog beyond what the Development team can absorb, the Scrum team would set up an ad-hoc meeting to assess the impact and revise their plan accordingly.

Once the Sprint is complete, and during the Sprint Review session, another Risk status check will be performed to capture any Sprint ending Risk changes that may have occurred. Sprint Demo also may lead to revelation of new Risks that the Scrum team may want to bring in for assessment.

Sprint Retrospectives are good opportunities to check the quality with which the Sprint ran through the Scrum Process; they are also a good window to assess the effectiveness of any active Risk Response that the Scrum Team has put in place, and to decide whether changes are needed in the Risk Register and Response stories to re-align and strengthen those measures.

There will also be an on-going Risk Status check for the product that is active in the market, to ensure effective monitoring of market-initiated risks and concerns.

Scrum automatically covers the response to a number of Risks through its nature, and can easily address many others, but before responding to Risks, we need to follow the basic process of identification, assessment, evaluation and response planning.

[Step One] Identifying the reason to SCRIM ::: Risk Identification

The starting step of SCRIM is to make sure we are actually looking at Risk items. The Scrum Team and your extended stakeholder list (including your business side) can be key contributors to the identification process.

Naturally people with expertise in certain areas are your best sources in finding Risks in their field, but anyone having noticed anything that can potentially cause a deviation from our goal, may have identified a Risk which we would like to assess.

We can get that information from stakeholders by talking to them individually, setting up brainstorming sessions and soliciting their feedback during the SCRUM ceremonies.

Product Owner can also ask Business to contribute to the better understanding of the requirements by making an attempt to identify risks they can think of at the time.

Scrum Team can also use past experiences as a source for identifying what can go wrong in similar upcoming scenarios.

It is important to decide whether the identified Risk item is a Positive Risk or Negative Risk.

Positive (aka Helpful) Risk items are the ones that would help the Scrum Team with delivery of the objectives (by amplifying the gain from elements that the team is dealing with), such as a sudden promotion plan by the vendor that would allow for much higher processing capacity in our Development servers for the same budget we have in place.

Negative (aka Harmful) Risk items work in the opposite way, causing hurdles and impediments to our productivity and performance in delivery of the objectives.

The team can then benefit from using SWOT Analysis in classification of the Risks.

SWOT Analysis, as per Wikipedia, is defined as follows:

“SWOT analysis (or SWOT matrix) is a strategic planning technique used to help a person or organization identify strengths, weaknesses, opportunities, and threats related to business competition or project planning.

It is intended to specify the objectives of the business venture or project and identify the internal and external factors that are favorable and unfavorable to achieving those objectives.

Users of a SWOT analysis often ask and answer questions to generate meaningful information for each category to make the tool useful and identify their competitive advantage. SWOT has been described as the tried-and-true tool of strategic analysis.

Strengths and weakness are frequently internally-related, while opportunities and threats commonly focus on the external environment.

The name is an acronym for the four parameters the technique examines:

  1. Strengths: characteristics of the business or project that give it an advantage over others.
  2. Weaknesses: characteristics of the business that place the business or project at a disadvantage relative to others.
  3. Opportunities: elements in the environment that the business or project could exploit to its advantage.
  4. Threats: elements in the environment that could cause trouble for the business or project.”

By combining the Negative/Positive classification with Internal/External distinction, we can create a SWOT Matrix which will further categorize Risk Items that we have found among the 4 classes we learned about here.

Image Source: Wikipedia

This two-dimensional cross-classification helps the team with an enriched Risk Identification experience.

What we define under the intersection of “Helpful” and “External”, dubbed as “Opportunities”, shows us the Positive Risk items that we would want to amplify.

These are the Risks whose manifestation we hope for, so that we can take advantage of their positive impact on our goals (they would be expected to carry a reverse cost of impact, ending up saving us money or raising our income from the release).

The items under the crossing of “Harmful” and “External”, dubbed as “Threats” are the Risks that need to sit at the top of our list for further assessment, evaluation and Response planning.
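The cross-classification described above can be sketched in a few lines of Python. This is only an illustration, not part of SCRIM itself: the names `Effect`, `Origin` and `classify` are made up for this example.

```python
from enum import Enum


class Effect(Enum):
    HELPFUL = "helpful"   # Positive Risk
    HARMFUL = "harmful"   # Negative Risk


class Origin(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


# The four SWOT quadrants, keyed by (effect, origin).
SWOT_QUADRANTS = {
    (Effect.HELPFUL, Origin.INTERNAL): "Strength",
    (Effect.HARMFUL, Origin.INTERNAL): "Weakness",
    (Effect.HELPFUL, Origin.EXTERNAL): "Opportunity",
    (Effect.HARMFUL, Origin.EXTERNAL): "Threat",
}


def classify(effect: Effect, origin: Origin) -> str:
    """Return the SWOT quadrant for a Risk item."""
    return SWOT_QUADRANTS[(effect, origin)]


# A vendor promotion is helpful and comes from outside the team.
print(classify(Effect.HELPFUL, Origin.EXTERNAL))
```

Tagging each Risk item with these two attributes at identification time is enough to place it in the matrix later.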

Just remember …

As a Risk item is called out, the next step is to verify whether what we are looking at is the root of the Risk, or just an outcome / shadow of the main reason for worrying. This way, whatever response we put in place will have the impact we were planning for.

[Step Two] Which one is worth the SCRIM? ::: Assess & Evaluate Risks

Two questions to be answered about each of the identified Risks at this point are:

What is the likeliness of each one to happen?

What is the extent of damage it would cause?

In answering the second question, we need to consider all the damages and losses that would be incurred if the identified risk were to manifest itself.

Anything from damages to the company’s reputation, creating legal problems, loss of sales, loss of property and cash and so on, would fall under that category.

This part is rather complex and painstaking, but the more accurate we are in assessing and evaluating the cost of this risk item, the better we can prepare for responding to it.

We also need to consider the cost of responding to the Risk (when it is a reactive response that needs to be triggered under certain thresholds – like the Risk becoming imminent – this response may need resources and funding that we need to be aware of).

To measure the impact, we can use Qualitative and / or Quantitative approaches. Quantitative would provide the most accurate prediction of the cost of the risk item but we may not always have enough information at the time of the risk identification to perform that. In such cases, a Qualitative approach (like T-Shirt Sizing or 1-5 level or …) would give us a sense of the magnitude of the cost and would help us in prioritization of the risk items for the response planning.

Our Risk Scoring (aka Risk Rating) can be as simple as multiplying the probability of the Risk’s occurrence by the estimated cost of its manifestation.

(We may decide to incorporate certain additional factors in our formula that may weaken or worsen the impact of the Risk, depending on our corporate structure and other influencing elements in our response).
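As a minimal sketch of the scoring described above, assuming a probability between 0 and 1 and a cost in whatever unit the team uses: the function names and the 1-5 scale for the qualitative variant are assumptions made for this example, not a prescribed formula.

```python
def risk_score(probability: float, cost: float) -> float:
    """Quantitative score: probability of occurrence times estimated cost."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * cost


def qualitative_score(likelihood_level: int, impact_level: int) -> int:
    """Qualitative score, assuming both inputs are rated on a 1-5 scale."""
    for level in (likelihood_level, impact_level):
        if level not in range(1, 6):
            raise ValueError("levels must be between 1 and 5")
    return likelihood_level * impact_level


# A 25% chance of a 40,000 loss scores 10,000 in the same currency unit.
print(risk_score(0.25, 40_000))

# A likely (4) but minor (2) Risk on the assumed 1-5 scale.
print(qualitative_score(4, 2))
```

Any additional weighting factors the team decides on would be extra multipliers inside these functions.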

Note that we may not be able to plan for any mitigation for some of the identified Risks because they are out of access or beyond our capacity to respond to.

We end up having to accept them as they are (we will talk about it later).

[Step Three] Which Ones to first SCRIM at? ::: Prioritizing the Risks

The outcome of our Risk Assessment and Evaluation provides us with a list of Risk items sorted by their calculated Risk score (or order of magnitude).

Depending on the work that we do, we may end up having to deal with a large list of risks which we would manage by categorizing them as high, medium or low.

Prioritization is best done with participation of your entire team (even with your extended Stakeholders, if you can have them in the meeting), as we need to have as many perspectives in the session as possible to cover as many angles as we can.

This calls for transparency on what we are dealing with, and a joint effort through the collaboration of all participants (including your customers, or representatives as close to their world as possible).

We will then give the higher priority of our time and funding to the higher items and make our way down the list as we address them, until we run out of resources for responding to any more items.
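The prioritization steps above can be sketched as follows. This is an illustration under stated assumptions: the `Risk` fields, the band thresholds and the sample Risks are all invented for the example, and the funding rule (stop at the first Response we cannot afford) is just one simple reading of “until we run out of resources”.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Risk:
    name: str
    score: float          # Risk score from the assessment step
    response_cost: float  # estimated cost of the planned Response


def band(score: float, high: float = 10_000, medium: float = 2_000) -> str:
    """Bucket a score as high, medium or low (thresholds are assumptions)."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def plan_responses(risks: List[Risk], budget: float) -> List[Risk]:
    """Fund Responses from the highest score down until the budget runs out."""
    funded = []
    remaining = budget
    for risk in sorted(risks, key=lambda r: r.score, reverse=True):
        if risk.response_cost > remaining:
            break  # out of resources for the next item on the list
        funded.append(risk)
        remaining -= risk.response_cost
    return funded


# Hypothetical sample data.
risks = [
    Risk("Key vendor API deprecation", 12_000, 3_000),
    Risk("Test environment outage", 1_500, 500),
    Risk("Regulatory change", 6_000, 4_000),
]
for r in plan_responses(risks, budget=7_000):
    print(band(r.score), r.name)
```

A team that would rather skip an unaffordable item and keep going down the list could replace `break` with `continue`.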

[Step Four] How do you SCRIM for each of the Risks? ::: Risk Responding

Regardless of how many types of Risks your team would face during their production activities, the type of Response that can be put in place has a limited number of options:

AVOIDING THE RISK:

Is when the team can put changes in place that would draw a path away from the collision course with the identified Risk.

If we have the luxury of being able to avoid a Risk, it would be optimal to take it. Needless to say, Avoiding a Risk does not mean that we are avoiding the cost of Responding to that Risk, but rather we are paying that expense forward – in the form of actions we take for avoidance – to steer away from its consequences!

TRANSFERRING THE RISK:

This is done when we have the option of getting another team or department (or even a 3rd party) to worry about Responding to the Risk, or simply to compensate us for the costs that the Response would incur.

Examples include buying insurance against a certain risky situation (like fire or theft), or leasing the services of a 3rd party, such as a Cloud Service Provider like Amazon Web Services or Microsoft Azure, so that they, instead of us, worry about the Risks that threaten the functional availability of infrastructure components (like Web Servers and Networking Equipment) and deal with those Risks on their own, should they happen.

MITIGATING THE RISK:

In many cases, we do not have the luxury of simply Avoiding the Risk, and would not have the option of Transferring the Risk Response responsibilities to someone else.

In such cases, we need a plan for reducing the impact of the Risk, once it manifests itself. This plan may be a combination of changes in delivery timeline, teams’ allocation to product streams, funding for hiring extra hands and other measures.

Since a Scrum Team is designed by nature to be able to quickly respond to changing conditions, and based on the fact that the Scrum delivery cycle is very short, mitigation planning should consider what the team can do to absorb the impact and what needs to be available to the team (in the form of contingencies) to help them weather that storm and survive it.

ACCEPTING THE RISK:

This is set aside for Risks that are outside our reach and we cannot Avoid or Transfer them and there is not really anything special we can do about them beforehand.

This means that we accept those Risks as they are and deal with their outcome whenever they happen.

Many Risks, such as geopolitical ones fall into that category. Looming market crashes, wars and natural disasters (that would go beyond insurance coverage) would be in that category as well.

Once we have decided about each Risk’s category and the type of response we can muster for them, the team needs to make sure they are captured in a Risk Log or Risk Roster (or even a Kanban backlog) for tracking and recurring re-visits and re-assessments.
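One possible shape for such a Risk Log is sketched below. The four Response types come from the list above; everything else (the `RiskEntry` fields, the `backlog_stories` helper and the sample entries) is a hypothetical layout for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Response(Enum):
    AVOID = "avoid"
    TRANSFER = "transfer"
    MITIGATE = "mitigate"
    ACCEPT = "accept"


@dataclass
class RiskEntry:
    title: str
    response: Response
    owner: Optional[str] = None  # filled in once someone takes ownership
    response_stories: List[str] = field(default_factory=list)


def backlog_stories(register: List[RiskEntry]) -> List[str]:
    """Collect the Response stories that feed into the Product Backlog."""
    return [story for entry in register for story in entry.response_stories]


# Hypothetical sample entries.
register = [
    RiskEntry("Data center fire", Response.TRANSFER,
              response_stories=["Renew insurance policy"]),
    RiskEntry("Market crash", Response.ACCEPT),  # nothing to do beforehand
]
print(backlog_stories(register))
```

The register stays a lightweight tracking table, while the stories it yields are what actually get worked on through the backlog.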

During the Risk Response Planning, Scrum Master ensures that all the Scrum Team members participate in brainstorming on needed action items and mitigation measures that the team needs to take on against them.

Product Owner makes sure the action items are captured with adequate details in the backlog and are presented to the team in the Sprint Planning so we would move towards implementing the measures.

In cases where the selected Risk Response is not something we can implement ahead of time, but one we would need to execute if the Risk’s evaluated impact reaches a certain value, the action item can be pulled into the Sprint when that trigger fires.
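A trigger of this kind can be as simple as a threshold comparison at each Risk status check. The sketch below assumes each reactive Response carries its own threshold; the `ReactiveResponse` type and its field names are invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReactiveResponse:
    story: str                # the Response story waiting in the backlog
    current_impact: float     # latest evaluated impact of the Risk
    trigger_threshold: float  # impact level at which the Response is needed


def stories_to_pull(responses: List[ReactiveResponse]) -> List[str]:
    """Return the stories whose trigger has fired."""
    return [r.story for r in responses
            if r.current_impact >= r.trigger_threshold]


# Hypothetical sample data.
responses = [
    ReactiveResponse("Switch to backup supplier", 8.0, 5.0),
    ReactiveResponse("Hire contract testers", 2.0, 6.0),
]
print(stories_to_pull(responses))
```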

[Step Five] Who is responsible to SCRIM at the Risks? ::: Owners of Risk Responses.

Unless the Risk Response is assigned to (and owned by) someone on the Scrum team (or in the extended Stakeholder circle), it will not be effectively in play when it needs to be executed against the odds that are turning that Risk into a clear and present issue.

Risk ownership, in the essence of Scrum (and as inherited from Agile), needs to be Democratized.

This means your team (or extended Stakeholders) need to self-organize to identify who among them is the most relevant person to take ownership of each Risk Response, and of the recurring re-visit of the Risk item: to see if the Risk is still a Risk, whether it is still evaluated at the same impact level, and whether the Risk Response that we put together for it is still viable to the same effect we assumed it would have.

That may seem a lot to take in for someone who decides to voluntarily accept the ownership of a Risk and its Response, but on the bright side, the re-assessment is team work, and brainstorming will be the power behind re-validation of the Risk and its Response.

[Step Six] Are all these Risks still SCRIM worthy (at all? Harder? Less?) ::: Keeping the Risks on your Radar!

Risk Monitoring is the recurring process of re-assessing the Risks that we had identified as SCRIM-Worthy, to see how they measure now.

The re-assessment may also reveal that some of the current owners of the Risk / Response sets are no longer the most relevant people to own them, due to changes in their roles and responsibilities in the organization (it is not always about them moving to another department or getting promoted to another position, as they may also take a long leave of absence for personal reasons, including parental leave, which would effectively turn those Risk / Response sets into orphans that need to be adopted by someone else!)

Risk Kanban Board

Tracking the Risks can be done in a number of ways. One effective way is to have a Risk Kanban board to track the current status of each Risk Response and its progress during the lifespan of each identified Risk.

Due to the continuous nature of Scrum and its delivery model, tracking the Risks in Kanban would provide a live and on-going visualization of the Risk side of the production pipeline.

As Risks expire [no longer considered a threat], they would simply be removed (and tossed in a historical parking lot for future reference in similar cases, where we can benefit from our previous assessments and response planning).

As new Risks are identified and brought in, the action items related to our Responses can be kept in the backlog until the time they are needed to execute.
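To make the idea concrete, here is a toy in-memory version of such a board. The column names are assumptions chosen for this sketch (the article does not prescribe any), and a real team would more likely use its existing Kanban tool.

```python
from typing import Dict, List


class RiskBoard:
    # Hypothetical column names; adjust to the team's own workflow.
    COLUMNS = ("Identified", "Assessed", "Response planned", "Monitoring")

    def __init__(self) -> None:
        self.columns: Dict[str, List[str]] = {name: [] for name in self.COLUMNS}
        self.parking_lot: List[str] = []  # expired Risks kept for reference

    def add(self, risk: str) -> None:
        """Place a newly identified Risk in the first column."""
        self.columns[self.COLUMNS[0]].append(risk)

    def _remove(self, risk: str) -> None:
        for items in self.columns.values():
            if risk in items:
                items.remove(risk)
                return
        raise KeyError(f"{risk!r} is not on the board")

    def move(self, risk: str, to_column: str) -> None:
        """Move a Risk to another column."""
        if to_column not in self.columns:
            raise ValueError(f"unknown column: {to_column!r}")
        self._remove(risk)
        self.columns[to_column].append(risk)

    def expire(self, risk: str) -> None:
        """Take an expired Risk off the board and park it for future reference."""
        self._remove(risk)
        self.parking_lot.append(risk)


board = RiskBoard()
board.add("Vendor lock-in")
board.move("Vendor lock-in", "Assessed")
board.expire("Vendor lock-in")
print(board.parking_lot)
```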

Risk Burndown Chart

A Risk burndown chart (like the Sprint burndown) may prove useful in certain cases where we want to track the retirement path of specific Risk items, though it would not be too useful if the Risks are entering, expiring, mitigating and closing all the time throughout our Scrum delivery process.

The Risk burndown chart was first introduced by John Brothers in 2004, and it was based on using Days as the measure of Risk Impact. For example, the Risk of QA not having enough time to finish their work by release time might be assumed to have an impact of 5 days, with a 20% probability of ever happening, so the Risk Exposure (the cost of that Risk) would be estimated as 5 days x 20% = 1 day. That Risk, along with other Risks whose impacts are measured in days, would then be put in a chart with the Risk Impact (Days) as the Y-axis and your Sprints as the X-axis. Over time, the chart would show how the total Risk Impact declines (which is only a meaningful visualization if no Risk is added or expired and no previous Risk impacts have changed in value).
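The arithmetic behind such a chart is small enough to sketch. The first Risk in the sample data below is the QA example above (5 days at 20%); the second Risk and all later probabilities are made-up values, and the function names are assumptions for this illustration.

```python
from typing import List, Tuple

# One estimate per Risk: (impact in days, probability between 0 and 1).
RiskEstimate = Tuple[float, float]


def exposure_days(impact_days: float, probability: float) -> float:
    """Risk Exposure in days: impact multiplied by probability."""
    return impact_days * probability


def burndown(sprints: List[List[RiskEstimate]]) -> List[float]:
    """Total Risk Exposure per Sprint, i.e. the Y values of the chart."""
    return [sum(exposure_days(impact, prob) for impact, prob in risks)
            for risks in sprints]


# The same two Risks re-estimated at the start of three Sprints.
sprints = [
    [(5, 0.2), (10, 0.5)],
    [(5, 0.1), (10, 0.3)],
    [(5, 0.0), (10, 0.1)],
]
print(burndown(sprints))
```

Because the same Risks appear in every Sprint here, the declining totals are comparable; once Risks are added or expire, the line stops telling a clean story, which is the limitation noted above.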

SCRIM in SCALED AGILE (Enterprise Level)

As the number of Scrum teams within a delivery group rises, the effort required among them to keep their production efforts synchronized grows drastically towards the breaking point.

For a group of N people to have communication channels among them, there needs to be

Number of Communication Channels needed = N x (N-1)/2

This means that if you have 3 Scrum Teams of 7 members each, you will need 21 x (21-1)/2 = 210 communication channels. If the number of teams grows to 10 (70 people), that number jumps from 210 to 70 x (70-1)/2 = 2,415 communication channels.

This means that if the Scrum teams keep communicating peer-to-peer as before to ensure they are in sync, you would need hundreds of conversations per day to try to keep the joint effort organized, which is essentially impossible to make work as you would wish!
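The channel count above is easy to check with a few lines of Python. The function name is made up for this example; the formula and the team sizes are the ones used in the text.

```python
def communication_channels(people: int) -> int:
    """Number of pairwise channels among N people: N x (N-1) / 2."""
    if people < 0:
        raise ValueError("people must not be negative")
    return people * (people - 1) // 2


team_size = 7
for teams in (3, 10):
    people = teams * team_size
    print(teams, "teams,", people, "people:",
          communication_channels(people), "channels")
```

The count grows roughly with the square of the head count, which is why adding teams hurts so much more than the added people alone would suggest.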

Scaled Agile solutions such as SAFe (Scaled Agile Framework), Scrum @ Scale, LeSS (Large-Scale Scrum) and Nexus provide the collaboration structure and inter-team synchronization needed to allow for the organized joint production efforts of several Scrum teams within a delivery group, without having to go through hundreds of communication channels to do so.

(Note: I am not going to compare the pros and cons of these agile scaling solutions in this article, but will get back to it later so stay tuned!)

From the SCRIM point of view, these frameworks create the collaborative structure that is needed for effective, early identification of Risks, their assessment and evaluation, and their Response planning.

Some would have their own Risk review and evaluation events and some would leave that to the Scrum teams’ discretion on how and when they would want to hold them, but either way, their collaborative structure is a great endorser of SCRIM and successful Risk management through it.

Now is the time to SCRIM at your largest Risk Factor ::: Your Team Members!

Scrum teams are self-managing teams of cross-functional experts who are allowed to pull the work they can in a Sprint, and are trusted to decide the best way to do their work and provide good quality, timely, incremental products.

Your team members are the heart and soul of Scrum, and their ability to do all of that is the main factor in the success of your delivery structure. That makes the Human Risk the largest Risk factor you need to mitigate.

Hiring the best people you can find (not just in technical expertise, but also in people skills, team play, collaboration and co-ownership of responsibilities) is key to building up your great Scrum teams.

You would also need to continuously raise their Scrum Maturity level, Technical Ability and Team skills and Morale through Training, Support, Appreciation and Care.

Scrum is nothing without the people who build it so cherish what you have and invest in them.

Cheers

Arman Kamran (The Agilitizer)