Having worked on numerous high-profile projects, I am still surprised to find pushback from stakeholders and delivery teams against including dedicated story points and capacity for the not-so-‘sexy’ error handling and UI error / warning messages from iteration (sprint) 1.
There is growing acceptance of adding UX design to the project team to provide the vision, through the creation of design prototypes that help secure budgets and buy-in from stakeholders. Many of these design presentations, however, concentrate on the happy path scenario only. Story backlogs are created, estimates generated and iterations sliced and diced, giving everyone a sense of ‘we can do this’, with a clear, agreed roadmap in place.
Sound bites such as ‘Minimum Credible Release’ are likely to be shared, and pats on the back, high fives and talk of just how big the bonus will be may become commonplace.
But what happens on the run-in to production implementation? End-to-end system testing starts, or worse, UAT. Tests are passing, confidence is rising… Then bang! A downstream system goes down for maintenance, and nobody on the project team knew. The UI can’t cope and starts crashing at will. Those in the business responsible for UAT are now panicking, as it’s 2 weeks until launch and the platform is not stable. That bonus that was a sure bet that morning is starting to move from a deposit on the Ferrari to a child’s toy Ferrari… How could this have happened?
No-one thought about the what-if scenario, so it is all hands to the pump… We must go live in 2 weeks (I at least want to test drive the Ferrari!). This is where Project Management needs to ensure that, during project initiation and backlog creation, the definition and estimation phase includes representation from a variety of functions, such as QA, Business Analysis, Development and the business. Taking the UX design as the starting point should lead to a decomposition of each functional area. All of the integration touch-points need to be clearly defined so that resiliency or error handling stories can be captured, from which the QA team can define suitable acceptance tests to aid in-sprint development and testing.
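To make the idea of an error handling story with its own acceptance tests a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and purely illustrative: the `DownstreamUnavailable` exception, the `ViewModel` shape, the `load_balance_view` function and the wording of the warning are assumptions, not taken from any real project. It shows one integration touch-point where the unhappy path (downstream system in maintenance) degrades to a warning message rather than crashing the UI.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class DownstreamUnavailable(Exception):
    """Hypothetical error raised by an integration client when the downstream system is down."""


@dataclass
class ViewModel:
    """What the UI renders: either the data, or a warning the user can understand."""
    balance: Optional[str] = None
    warning: Optional[str] = None


def load_balance_view(fetch_balance: Callable[[], str]) -> ViewModel:
    """Happy path returns the balance; unhappy path degrades to a warning message."""
    try:
        return ViewModel(balance=fetch_balance())
    except DownstreamUnavailable:
        # The error handling story: tell the user what is going on instead of crashing.
        return ViewModel(
            warning="Balances are temporarily unavailable. Please try again later."
        )


def downstream_in_maintenance() -> str:
    """Stand-in for the downstream call during a maintenance window (assumed behaviour)."""
    raise DownstreamUnavailable("maintenance window")


# Acceptance-test style checks that QA could define alongside the story.
happy = load_balance_view(lambda: "100.00")
unhappy = load_balance_view(downstream_in_maintenance)

assert happy.balance == "100.00" and happy.warning is None
assert unhappy.balance is None and unhappy.warning is not None
```

The point is less the code than the pairing: each integration touch-point gets a story for what the user sees when it fails, and a test that simulates the failure well before UAT does it for you.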
Taking the time up-front to fully size stories for the functional and non-functional requirements, though it increases the initial estimate, should help to reduce risk factors and potential failure points, especially in the run-up to production implementation. There is likely to be pushback from the business from the outset as estimates increase, because it is difficult for stakeholders to visualise the value that is being added to the system. The project team will need to provide a solid case for the investment, calling upon previous experiences and championing the case for the production support team, in order to minimise outages once the product is deployed to production.
If these unhappy paths are not included from the outset, those teams that are “Agile” (quotes intentional!) will soon be forced to support production maintenance. The likelihood is that ‘special’ teams, boldly given names such as “Rapid Response” or “Tiger”, will be formed, tasked principally with ensuring the production environment is stable and resilient to failover. Creating such teams will impact the project team’s velocity against the delivery roadmap for which the stakeholders have provided budget.
As with all projects, there is likely to be a trade-off. It is very unlikely that a project team will be able to code for all eventualities, and the stakeholders may have a high appetite for risk and decide not to invest fully in unhappy path resiliency. However, the exercise is still a useful tool for populating the backlog. Architectural analysis should quickly highlight the main project / integration risks so that mitigation plans can be agreed sooner, and engaging downstream systems earlier should assist the co-ordination of production release dependencies, reducing risk still further.
In summary, the goal is to learn from mistakes and make the next phase / project better. Providing a robust and resilient system to users and stakeholders will raise confidence in the platform, which will be useful at budget allocation time.