But some of the biggest engineering failures in history had nothing to do with QA/QC. If you’re looking for examples, NASA has plenty.
1. Space Shuttle Challenger
On Jan. 28, 1986, the Space Shuttle Challenger broke apart 73 seconds into its flight, resulting in the deaths of all seven crew members. The spacecraft’s disintegration was caused by the failure of an O-ring seal in its right solid rocket booster at liftoff, which took place under unusually cold conditions for which the shuttle had not been certified.
In July 1985, one of the engineers at Morton Thiokol, the company that manufactured the solid rocket boosters, wrote a letter to the company’s vice president anticipating the disaster.
“The mistakenly accepted position on the joint problem was to fly without fear of failure and to run a series of design evaluations which would ultimately lead to a solution or at least a significant reduction of the erosion problem,” wrote Roger Boisjoly. “This position is now drastically changed as a result of the SRM 16A nozzle joint erosion which eroded a secondary O-ring with the primary O-ring never sealing.”
“If the same scenario should occur in a field joint (and it could), then it is a jump ball as to the success or failure of the joint because the secondary O-ring cannot respond to the clevis opening rate and may not be capable of pressurization. The result would be a catastrophe of the highest order - loss of human life.”
Boisjoly’s warning went unheeded.
He later published a report entitled Ethical Decisions – Morton Thiokol and the Challenger Disaster. In it, Boisjoly describes the engineering presentation made during a teleconference between Morton Thiokol, the Kennedy Space Center and the Marshall Space Flight Center the night before the launch.
“[After the presentation,] Joe Kilminster [vice president of the rocket booster program] asked for a five-minute, off-line caucus to re-evaluate the data and as soon as the mute button was pushed, our general manager, Jerry Mason, said in a soft voice, ‘We have to make a management decision.’ I became furious when I heard this, because I sensed that an attempt would be made by executive-level management to reverse the no-launch decision,” wrote Boisjoly.
“The caucus constituted the unethical decision-making forum resulting from intense customer intimidation,” Boisjoly wrote. “NASA placed MTI in the position of proving that it was not safe to fly instead of proving that it was safe to fly. Also, note that NASA immediately accepted the new decision to launch because it was consistent with their desires and please note that no probing questions were asked.”
This disaster didn’t happen because of a faulty O-ring. The part performed exactly according to its design specifications, and if the shuttle had been launched on a warmer day, the seal wouldn’t have been a problem. In other words, the Challenger was not a QA/QC failure.
Engineers are under enormous pressure to complete jobs under budget and ahead of schedule, which can lead to the temptation to think like a manager first and an engineer second—or not at all. The Challenger represents the severe consequences of giving in to that temptation.
2. Mars Climate Orbiter
In the case of the Mars Climate Orbiter, the cost was $327.6 million. The robotic space probe was launched on Dec. 11, 1998, to study the Martian climate, atmosphere and surface changes, and to act as a relay for the Mars Polar Lander.
On Sept. 23, 1999, communication with the spacecraft was lost during its orbital insertion.
An investigation revealed that the spacecraft’s altitude was significantly lower than the intended 150-170 km. Post-failure calculations demonstrated that the spacecraft’s trajectory would have taken it within 57 km of the surface, where it most likely disintegrated from atmospheric stresses.
The cause of the error turned out to be a discrepancy between Lockheed Martin’s software, which generated results in U.S. customary units, and NASA’s, which was designed to accept metric units. Consequently, outputs in pound-force seconds were read as if they were inputs in newton-seconds.
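To see how a mismatch like this can slip through, consider a minimal Python sketch. It is an illustration only: the function name, unit labels and sample value are hypothetical and are not taken from the actual Lockheed Martin or NASA software. It assumes the "pound-seconds" in question are pound-force seconds, and it shows what happens when a bare number crosses an interface without its unit.

```python
# Illustrative only: the names and the sample value are hypothetical and
# are not taken from the actual flight or ground software.

# 1 pound-force second expressed in newton-seconds (exact by definition).
LBF_S_TO_N_S = 4.4482216152605


def impulse_to_si(value, unit):
    """Convert an impulse value to newton-seconds, rejecting unknown units."""
    if unit == "N*s":
        return value
    if unit == "lbf*s":
        return value * LBF_S_TO_N_S
    raise ValueError(f"unsupported impulse unit: {unit!r}")


# A producer reports an impulse of 10 lbf*s. A consumer that ignores the
# unit label silently treats the bare number as 10 N*s.
reported = 10.0
misread = reported                          # taken as newton-seconds
correct = impulse_to_si(reported, "lbf*s")  # converted explicitly

print(f"misread: {misread:.2f} N*s")
print(f"correct: {correct:.2f} N*s")
print(f"understated by a factor of {correct / misread:.2f}")
```

Each program in this sketch is internally correct, which is the point: the consumer understates every impulse by a factor of about 4.45, and nothing fails until the unit travels with the value and the interface refuses anything it does not recognize.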
The lesson here—aside from the obvious, which is always check your math—lies in the importance of oversight and the value of communication between customers and providers. After countless cycles of design, development and testing, it’s easy to assume that all the obvious errors have been caught. As the Mars Climate Orbiter illustrates, sometimes it’s the simplest errors that slip through.
However, despite the simplicity of the error, it did not occur because of a failure in QA/QC. The quality professionals who reviewed the Lockheed and NASA software would have found that both were working perfectly. The real problem was interoperability, an issue that remains a major concern across industries today.
3. Space Shuttle Columbia
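On Feb. 1, 2003, the Space Shuttle Columbia broke apart during re-entry, killing all seven crew members, after a piece of foam insulation broke off the external fuel tank during launch and struck the orbiter’s left wing. Post-accident investigations concluded that mistakes during installation were the likely cause of the foam breaking off. As a result, employees at the Michoud Assembly Facility in Louisiana were retrained in how to apply foam to the fuel tanks.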
However, the cause of the accident goes much deeper than incorrectly applied insulation. The Columbia Accident Investigation Board (CAIB) identified fundamental organizational and structural issues in NASA which compromised the safety of shuttle missions.
Safety, timeliness and cost-effectiveness are universal engineering values, as important to manufacturers as they are to NASA. However, these values are often in tension with one another. Balancing them requires careful negotiation, and putting that balancing act in the hands of a single person is, more often than not, a recipe for disaster.
From the Launch Pad to the Shop Floor
The stories of the Challenger, Mars Climate Orbiter and Columbia illustrate how high the price of failure can be in engineering. More importantly, they show that engineering catastrophes are not always the result of failures in QA/QC.
Do you have other examples of engineering disasters that had nothing to do with quality? Comment below.