Chris Clearfield is the CEO of System Logic and the co-author of Meltdown: What Plane Crashes, Oil Spills, and Dumb Business Decisions Can Teach Us About How to Succeed at Work and at Home.
The whole is more complex than the sum of the parts.
That’s the lesson we can take from the crashes of Lion Air Flight 610 in 2018 and Ethiopian Airlines Flight 302 in 2019, two 737 Max airplanes that are the subject of hearings in the U.S. Congress this week. The hearings have focused on three issues: a feature of the 737 Max called the Maneuvering Characteristics Augmentation System (MCAS); the question of why Boeing built an aircraft that turned out to have deadly flaws; and the role of regulators in allowing a fundamentally unsafe plane to fly.
Boeing designed MCAS to make the Max fly more like earlier generations of the 737, by masking differences caused by the Max’s larger and more fuel-efficient engines. But when MCAS became confused by erroneous input from a malfunctioning sensor, as happened on both Lion Air 610 and Ethiopian Airlines 302, MCAS repeatedly countermanded pilots and pitched the nose of the airplane down.
In his testimony, Boeing chief executive Dennis Muilenburg laid out the technical solutions that would fix MCAS. Going forward, the system will compare data from the two sensors, it won’t repeatedly activate if pilots resist it, and it won’t create more force than a pilot can override by pulling on the airplane’s control yoke.
It seems like Engineering 101: Don’t make an aggressive flight-control system – one that can overpower pilots – reliant on one sensor. How did Boeing, with its tens of thousands of experienced engineers and strong incentives to build safe airplanes, get this wrong? Or, as one senator asked, when did Boeing know that there was a problem, and did the company hide it from regulators to meet an aggressive certification time frame?
These are important questions, but they don’t tell the whole story. That’s because the MCAS problems stem from the underlying challenge of complexity. Boeing followed the certification practices and the rules that the Federal Aviation Administration (FAA) required. The problem, as aviation-safety expert Chris Hart testified, is that following the rules doesn’t guarantee safety. The failures of complex systems such as modern, automation-reliant airplanes are driven by interactions between parts, rather than the failures of individual components – and that’s not what the FAA’s piecewise certification process is designed to uncover.
MCAS was among nearly a hundred changes that Boeing needed to prove were safe. Early in the Max’s design, engineers added MCAS as a niche system that would almost never turn on. But as Boeing expanded the system’s role, engineers made it increasingly powerful without accounting for the knock-on effects. The design had changed massively, but engineers didn’t think about it any differently, reasoning that capable pilots could manage any issues.
Indeed, Boeing engineers ran tests that showed that pilots could overcome an MCAS malfunction: Once pilots recognized the issue, they merely had to flip two easy-to-reach switches to disable the system – something that crews were already trained to do in several other situations.
But those tests were unrealistic; they happened in the simulator under idealized conditions. When the problem occurred in the real world – that is, when a failed sensor provided erroneous input and caused MCAS to incorrectly engage – the cockpit became awash with conflicting warnings: altitude disagreement, airspeed disagreement and a shaking control stick that indicated that an aerodynamic stall was imminent. Amidst all this, it was possible for a crew to identify and disable the malfunctioning system, but that would have required pilots to overcome their surprise and recognize a common factor in a cacophony of seemingly unrelated warnings. The illusion of effective testing made Boeing engineers overconfident in their understanding of how crews would handle MCAS failure. That overconfidence prevented them from seeing the new, more powerful system as more dangerous.
Boeing’s approach to testing points to another lesson in managing complexity: the need to consult with outsiders. As complexity increases in any business, so does the cost of missing things. And outsiders – especially when they point out uncomfortable flaws in our thinking – become more and more valuable. Boeing’s MCAS tests were done by test pilots who knew the airplane and its systems inside out rather than by crews flying in the real world. By relying on insiders, Boeing made an unintentional bet that every 737 Max crew, even on their worst day, would be roughly as good as their test pilots.
While it might seem like these kinds of challenges are unique to aviation, that couldn’t be further from the truth. The same kind of complexity that Boeing struggled with affects all sorts of companies. Target’s expansion into Canada, for example, was challenged by a complex supply chain, a bet on software that managers didn’t understand and a lack of outsiders giving critical input. Nor are public-sector programs immune, as the fraught rollout of the Phoenix pay system demonstrates.
Amidst these challenges lies a solution. We need to realize that the success of a complex undertaking is about more than getting isolated parts correct. We need to test our systems realistically so that we uncover problems rather than confirm our optimism. And we need to incorporate outsiders who can cause us to rethink our simplifying assumptions. That’s the only way we can learn how to manage the whole system, rather than being stuck in its parts.