5 Risk Management Lessons From an Avoidable Disaster
Rocket science is tough, and rockets have a way of failing.
Someone once asked people on Twitter to name the first big news event they remembered from childhood as a way to reveal their age.
The event had to be something you not only heard about, but also understood, if only on a basic level.
Thousands participated in the fun exercise, and some patterns emerged.
People born in the late 1950s or early 1960s named the 1969 moon landing. Those born in the late 1970s or early 1980s named the 1989 fall of the Berlin Wall. Other frequently mentioned events were the Kennedy assassination, Nixon’s resignation following Watergate, and 9/11.
I had no doubt about my own answer: the 1986 Challenger explosion. It was the top answer for those born in the mid-1970s. The tragic images on TV of the shuttle exploding in midair became etched in my memory. Naturally, I was very interested in the Netflix series Challenger: The Final Flight, and I highly recommend it.
After watching the series, you understand this was an avoidable disaster. The launch had already been delayed, and it was very cold the night before it finally went ahead. There was a serious risk that the O-ring seals of the solid rocket boosters could fail in cold temperatures, and one of them did fail, allowing pressurized burning gas to escape from the rocket booster and ignite the external fuel tank.
The Netflix series on the Challenger disaster includes fascinating interviews with key individuals. There are five important risk management lessons we can learn from the incident and interviews. They’re summarized below.
1) Risks Must be Taken to Succeed
NASA suffered some tragic incidents, including the 1967 Apollo 1 fire, the 1986 Challenger explosion, the 2003 Columbia disaster, and a few others. Despite these tragedies, the American space program remains the most successful in history. That’s not a coincidence.
NASA was perfectly aware of the risks associated with a reusable space shuttle. They sought to minimize those risks, but they knew they could not eliminate them entirely. NASA deliberately accepted that success hinged on taking risks.
Rather than trivializing the tragedies and fatalities, the Netflix series highlights a lesson applicable to everyone: You can’t live or move forward without risk. Being 100% risk-averse leads to stagnation and failure. The key is to calculate risks and manage the trade-offs. NASA knew there were risks with the space shuttle, but they accepted them as the price of success. People can agree or disagree whether the price was too high.
2) Aggressive Schedules and Goals Increase Risks
As you watch the series, it can be tempting to pick a scapegoat for the disaster. But one of the interviewees said about a NASA official: “He’s not a bad guy; he was just put in a bad situation.”
NASA had aggressive targets for space shuttle flights in 1986 and beyond, and the launch of Challenger had already been delayed significantly. As a result, NASA officials were under immense pressure to launch Challenger. When organizations set aggressive, or even unrealistic, schedules and goals, they put employees in a difficult situation where mistakes become more likely.
Direct factors can increase risks of incidents: physical hazards, employee behavior, equipment malfunction, biological or chemical agents, inefficient processes, etc. But there are also indirect factors, such as an organizational environment set by management that increases pressure and stress on employees. This can lead to a situation where even a well-intentioned and competent professional could make mistakes.
3) Sometimes Middle Management Knows Better
The Solid Rocket Boosters (SRBs) of the space shuttle were manufactured by Thiokol. As a contractor to NASA, Thiokol executives were also under heavy pressure.
Thiokol engineers and managers knew low temperatures could affect the O-rings of the SRBs, creating a risk of explosion. On January 27, 1986, the day before the launch, they discussed weather conditions with NASA during a conference call, where they expressed their concerns and recommended the launch be delayed again.
Thiokol executives initially supported the recommendation of their own engineers and managers, but they were severely criticized by NASA officials. A second conference call then took place without Thiokol engineers; only Thiokol executives and NASA officials were present. Thiokol executives overruled their own engineers and recommended the launch take place, much to the satisfaction of NASA. Later that night, a Thiokol engineer told his wife that Challenger would explode.
The lesson here is that middle management and engineers are closer to operations and therefore have better knowledge of potential risks. On critical safety issues, upper management should defer to their judgment instead of overruling them. It’s normal for executives to take risks to meet aggressive schedules or business objectives, but they must also know when to follow the recommendations of middle management, especially when lives are at stake.
4) Trust a Contractor’s Recommendation
Thiokol’s initial recommendation was to delay the launch because of concerns about low temperatures, which was not what NASA wanted to hear. NASA officials then put heavy pressure on Thiokol executives to change their recommendation. The rest is history.
Ironically, it’s the reverse situation in many industries (e.g., oil and gas, construction, manufacturing). The contracting organization puts pressure on a contractor to improve safety practices and raise safety standards. But in the case of the Challenger disaster, NASA put pressure on Thiokol to reduce its safety standards.
If a contractor is more conservative, prudent, and risk-averse, it’s best to follow their recommendation because they know the product or service they’re delivering much better than the contracting organization.
5) Zero-Risk is a Journey, Not a Destination
It’s impossible to be perfect, but aiming for perfection is possible. It’s the same with risks of incidents. Organizations can’t completely eliminate risks, but they can continuously move towards zero-risk.
NASA conducted a thorough investigation after the Challenger disaster. The SRBs were redesigned to eliminate the problem with the O-rings, and many lessons were learned, both technical and organizational, regarding communications, decision-making, management judgments, etc.
NASA’s space shuttles were grounded for more than two years, until September 29, 1988, when Discovery was launched. Unfortunately, disaster struck again when the space shuttle Columbia disintegrated upon reentering the atmosphere on February 1, 2003. Another investigation took place, and many organizational and technical changes were made.
Even with comprehensive investigations where root causes are identified and corrective and preventive actions are taken, risks of incidents can never be completely eliminated. Even if technology is flawless, there will always be a risk of human error. Rather than seeing the effort as futile, the proper mindset is to treat each incident as an opportunity to get better and further reduce risks.
That’s what happened at NASA. Changes made after each incident did not eliminate risks of future incidents. But lessons were learned after each event, which helped to further reduce risks. Space shuttle flights continued after the Columbia disaster until the final flight performed by Atlantis, which launched on July 8, 2011. A total of 135 flights took place between 1981 and 2011, of which 133 safely returned to Earth, or 98.5%.
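For readers who want to check the arithmetic behind that percentage, here is a minimal sketch. It uses only the two figures quoted above (135 flights, 133 safe returns); the function and variable names are illustrative and not taken from any NASA source.

```python
# Figures as quoted in this article: 135 shuttle flights, 133 safe returns.
TOTAL_FLIGHTS = 135
SAFE_FLIGHTS = 133


def success_rate(safe: int, total: int) -> float:
    """Return the share of flights that ended safely, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * safe / total


lost = TOTAL_FLIGHTS - SAFE_FLIGHTS
rate = success_rate(SAFE_FLIGHTS, TOTAL_FLIGHTS)

print(f"Success rate: {rate:.1f}%")
print(f"Loss rate: {100.0 - rate:.1f}%")
print(f"Roughly one loss per {TOTAL_FLIGHTS / lost:.1f} flights")
```

Seen from the other side, a 98.5% success rate means about one flight lost in every 67 or 68, which is why that remaining risk still mattered.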
Incidents are never pleasant, but they provide opportunities to learn and improve. And it’s important to capitalize on these opportunities for the sake of all employees, and to demonstrate that those who lost their lives didn’t die in vain.