Applying systems thinking

I can't help it: I was trained to apply critical thinking when faced with complex problems. But I also enjoy figuring out how complex things work and how to make them work better. I'll start this chapter with a real-life story of how I used systems thinking to discover and help resolve a series of design problems affecting an advanced manufacturing facility. But bear with me for a moment while I first describe the educational, business, and technical training that allowed me to look at the situation differently than the engineers who were designing and developing the manufacturing plant, its equipment, and its processes.

Benefitting from interdisciplinary studies

The charter of the U.S. Naval Academy is to prepare young Navy and Marine Corps officers for leadership roles in military combat, spanning a diverse set of highly technical disciplines. As a result, every midshipman receives a multidisciplinary education that includes subjects from science, mathematics, engineering, economics, and the humanities.

I was commissioned as a United States Marine Corps officer and later graduated as a flight officer from the Navy's flight school in Pensacola, Florida. I went on to obtain additional training and, over eight years, flew more than 2,000 hours in the F-4 Phantom, in both its reconnaissance and fighter/attack models.

The F-4 was an extremely complex aircraft with tens of thousands of separate parts making up the airframe and its electrical and electronic, pneumatic, hydraulic, jet propulsion, fuel, and weapon systems. Everything in the plane was designed to support its combat or reconnaissance missions. Because of the aircraft's complexity, any number of individual component failures could negatively impact the aircraft's mission and safety. Systems design and redundancy helped minimize those issues.

I also obtained an MBA while I was in the Marine Corps, with an emphasis on finance, statistics, and management principles. This was my educational and technical background when I took my first civilian job at a high-tech manufacturing firm. Specifically, my assignment was to serve as the project manager overseeing the development of an advanced manufacturing facility to build what were then state-of-the-art printed circuit boards for a Department of Defense client. As it turned out, the diversity of my education and my work with complex systems, such as the F-4, became instrumental in my ability to assess the critical issues facing this start-up manufacturing facility.

Understanding printed circuit board manufacturing

Printed circuit boards provide the electrical connections between power sources, other circuit boards and system components, and the electronic components, such as integrated circuits, capacitors, and resistors, that are mounted on the circuit boards. Each circuit board contains layers of fiberglass and epoxy resin that have copper foil bonded to one or both sides. Electrical circuits are printed and chemically etched into the copper foil.

The fiberglass layers are stacked and compressed together under heat to form the hardened circuit board. Then, holes are drilled through the circuit board in specific places, and a conductive metal is plated through these holes to form the electrical connections between layers. In this approach, referred to as through-hole technology, the drilled and plated holes are mapped to hit metal pads, printed on each layer, that serve as the electrical connection points between layers.

One of the issues with through-hole technology is that the holes drilled through the circuit board take up a lot of space in the interior layers of the board and constrain the location of connection points. For example, if layers one and five in an eight-layer board need a connection point, the through-hole forming that electrical connection takes up space on the top and bottom of the board, which limits the number of electronic components that can be mounted there. It also passes through the other layers, which limits the space available to etch the circuitry.

In contrast, our circuit boards were designed to support a large number of surface-mounted components that had to fit in relatively small physical spaces on both the top and bottom of the circuit boards. In a manufacturing process called surface-mount technology (SMT), the surface-mounted components are attached to copper pads that are etched on the board surfaces, as opposed to being connected through wire leads that are inserted into plated holes drilled through the boards. With SMT, our shop could fit many more electronic components on each board than was feasible with the through-hole process.

To eliminate the through-holes, our engineers defined processes to build circuit boards layer by layer, in a manner more analogous to the way integrated circuits are fabricated. This approach allowed the drilling and plating of holes between each discrete layer, and not through the whole board. Our engineers also used lasers to design and etch much smaller circuit patterns on each layer, which enabled more complex circuitry, and to discretely drill much smaller holes between each new layer added to a circuit board. This strategy provided more space on the surface of each board to attach surface-mounted components.

Adding layers increased defects

The downside of this approach was a huge increase in the number of manufacturing steps. The manufacturing process for each layer of the board included all the steps of a through-hole board, plus a few additional steps, and those steps were then multiplied by the number of layers. So each eight-layer discrete-hole board required more than eight times as many manufacturing steps as a through-hole board. The increased number of steps per board also increased the number of potential defects and failures per board.

For example, to keep the math simple, let's suppose a through-hole board has a theoretical failure rate of 1 defect per 1,000 manufacturing steps and that each board has 100 manufacturing steps. In this scenario, we should expect an average of 0.1 defects per board, or 1 failure for every 10 boards produced. However, if our discrete-hole boards have the same number of manufacturing steps per layer, times 8 layers, then each board goes through 800 steps, and we should expect an average of 0.8 defects per board, or 1 defect for every 1.25 boards.
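The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the simplified scenario above: the function names are made up for this sketch, and it assumes every manufacturing step carries the same defect rate, which a real process would not.

```python
def expected_defects_per_board(defect_rate_per_step: float, steps_per_board: int) -> float:
    """Average defects per board, assuming the same defect rate at every step."""
    return defect_rate_per_step * steps_per_board


def boards_per_defect(defect_rate_per_step: float, steps_per_board: int) -> float:
    """Average number of boards produced for each defect encountered."""
    return 1.0 / expected_defects_per_board(defect_rate_per_step, steps_per_board)


# Figures taken from the simplified example in the text.
DEFECT_RATE = 1 / 1000   # 1 defect per 1,000 manufacturing steps
STEPS_PER_LAYER = 100    # steps for a through-hole board, or for one layer
LAYERS = 8               # layers in the discrete-hole board

through_hole = expected_defects_per_board(DEFECT_RATE, STEPS_PER_LAYER)
discrete_hole = expected_defects_per_board(DEFECT_RATE, STEPS_PER_LAYER * LAYERS)

print(f"Through-hole:  {through_hole:.2f} defects per board")
print(f"Discrete-hole: {discrete_hole:.2f} defects per board")
```

The point of the sketch is that the expected defect count scales linearly with the number of steps, so multiplying the steps by eight multiplies the expected defects per board by eight.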

Evaluating the manufacturing facility and processes as a system

I joined the team when the participating organizations were six months into the pilot development phase. The engineering team had obtained equipment that they took over from another prototype printed-circuit development team. Space was extremely limited, and the equipment was shoe-horned into the old development shop with no logical design to support the flow of work through the manufacturing processes. In addition, the engineers had to borrow equipment from other shops in the facility as they designed some of their new manufacturing processes. All of the new equipment was custom built and required months to build and deliver.

I spent my first two months working in the shop with the fabricators, across all shifts, helping to build the circuit boards. I wanted to understand the processes intimately. Within the first month, I began to get a very unsettled feeling that we would not be able to deliver the volume of boards required under our contract or stay within the approved budget. It was clear to me that the engineers were optimizing their workflow around their own processes, but were not looking at the operations of the manufacturing system as a whole. They were all very bright individuals, but their primary direction was to focus on developing the individual equipment and processes within their area of expertise.

I ran some simple queuing theory and defect-per-unit calculations, and the results were not good. We were not going to make our production numbers, and we would be late in delivery and over budget. I would like to say that my concerns were immediately addressed, but I was the new kid on the block, and nobody wanted to hear this news from me.

Fixing a broken system

As you might imagine, when we went into the initial production phase, the truth was exposed, and we found ourselves the bottleneck in a $1-million-per-day systems development project. Now that is not a fun place to be, and all of a sudden, we had more senior management attention than anyone wants. The company brought in an expert on Failure Mode and Effects Analysis (FMEA), and he and I were assigned to figure out how to fix the design problems of our complex manufacturing system. FMEA is an approach to evaluating components, assemblies, and subsystems to identify potential failure modes in a system, along with their causes and effects.

I also started looking at the manufacturing systems from the perspective of Six Sigma and the theory of constraints. In the end, it was not our engineering processes, defect rates, or our state-of-the-art manufacturing equipment that were the drivers behind our failures to meet our delivery and cost obligations. We could address all our failures and meet our performance and cost objectives through the simple realignment of the equipment and matching product flow rates across the manufacturing processes. Plus, we had to strategically add equipment capacity to some stages of the development flow.
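To make the idea of matching flow rates concrete, here is a minimal Python sketch that finds the slowest stage in a line and shows how lightly loaded the other stages are when the line runs at that pace. The stage names and rates are hypothetical, chosen only for illustration; they are not figures from the actual facility.

```python
def find_bottleneck(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return (stage, rate) for the slowest stage, which caps the line's throughput."""
    return min(stage_rates.items(), key=lambda item: item[1])


# Hypothetical stage capacities in boards per hour (illustrative only).
stage_rates = {
    "lamination": 12.0,
    "laser_drill": 9.0,
    "plating": 20.0,
    "plasma_etch": 6.0,
}

bottleneck, line_rate = find_bottleneck(stage_rates)

# Fraction of each stage's capacity actually used when the whole line
# can only move as fast as its slowest stage.
utilization = {stage: line_rate / rate for stage, rate in stage_rates.items()}

print(f"Bottleneck: {bottleneck} at {line_rate} boards/hour")
for stage, used in utilization.items():
    print(f"  {stage}: {used:.0%} of capacity used")
```

In this made-up line, speeding up plating or lamination would do nothing for deliveries, because the plasma etch stage sets the pace. That is the sense in which adding capacity only at selected stages can be enough.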

Netting out the problems

I have spent a number of paragraphs explaining the complexities that we dealt with in that prototype circuit board manufacturing facility. But now, let's set out what the real problem was. Can you precisely state in one sentence what the real issue was?

Quite succinctly, the engineers developing the equipment and processes were practicing local optimization.

Local optimization simply means that each engineer was evaluating the needs of their equipment and processes in isolation, as if they had no effect on the other elements that made up the manufacturing system as a whole. And it's a complex problem to solve. For example, my first attempt was to use relatively simple queuing theories to evaluate the flow across the development activities. Those models gave me an approximation of the issues we would face when the plant went live, but they were only approximations, and I knew the situations would, in fact, be much worse—and they were.
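As an illustration of what a relatively simple queuing calculation looks like, the sketch below uses the textbook single-server M/M/1 model. That choice is an assumption made for this example, since the exact models used at the time are not recorded here, and the arrival and service rates are hypothetical.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict[str, float]:
    """Steady-state M/M/1 results for one workstation (rates in boards per hour).

    Assumes random (Poisson) arrivals, exponential service times, one server,
    and first-come, first-served order. Real batch processes violate all of these.
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue grows without bound when arrival_rate >= service_rate")
    rho = arrival_rate / service_rate  # utilization of the workstation
    return {
        "utilization": rho,
        "avg_boards_in_queue": rho**2 / (1 - rho),
        "avg_wait_hours": rho / (service_rate - arrival_rate),
    }


# Hypothetical workstation that can process 5 boards per hour.
moderate = mm1_metrics(arrival_rate=4.0, service_rate=5.0)   # 80% utilized
heavy = mm1_metrics(arrival_rate=4.5, service_rate=5.0)      # 90% utilized

print(moderate)
print(heavy)
```

Even in this idealized model, raising utilization from 80% to 90% more than doubles the average wait, which hints at why a model that ignores batching, setups, and recirculating flows would understate the problem.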

The mathematical models in queuing theory could not adequately reflect the queues that would form over hundreds of mixed/matched batch processes and cycle times, extensive setup requirements, and multicyclical flows. For that kind of modeling, we needed to use simulation modeling, which we ultimately did. And our simulation models were amazingly accurate, all things considered.

However, this was also about the time that Eliyahu Goldratt wrote his famous book, The Goal: A Process of Ongoing Improvement (Goldratt, 1984). This book introduced the theory of constraints to the world's manufacturing communities. With the theory of constraints, we learned how important it is to match production flows in even the most complex manufacturing systems. Now we had a practical theory as to how to go about solving our complexity issues. Bear with me through one more section as I explain how we resolved our systems-oriented complexity issues.

Addressing causes and effects

The primary causes and effects of our issues were the mismatched capacities and processing rates of the manufacturing equipment, and how long a board could stay in-queue between certain processes before the integrity of the board was compromised, resulting in higher defect rates. For example, if a board came out of the plating process and had to wait for a plasma etcher to free up, moisture built up in the board while it waited, and parts of the board would explode inside the plasma etcher. To phrase it more technically, there would be violent delamination, or separation between layers of the board.

The problem was that the plating line could simultaneously plate many more boards than the plasma etchers could handle on delivery. Plus, the plasma etching process took longer than the plating and was often bottlenecked by other downstream processes. The initial fix was as simple as installing an oven between the two processes to keep the boards dry while sitting in-queue. We also had to add capacity in the plasma etchers to better match flow rates across the combined manufacturing processes.
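The mismatch between a batch plating line and slower plasma etchers can be illustrated with a very small simulation. This is a sketch under stated assumptions, not a reconstruction of our actual simulation models: the function name, batch sizes, cycle times, and dwell limit are all hypothetical, and boards are assumed to be etched one at a time in arrival order.

```python
import heapq


def simulate_line(num_batches, batch_size, plating_interval,
                  etch_time, num_etchers, max_dwell):
    """Return (longest_wait, boards_over_limit) for the queue ahead of the etchers.

    All times are in minutes. A batch leaves plating every `plating_interval`
    minutes; each board then takes the first etcher to become free.
    """
    etcher_free_at = [0.0] * num_etchers  # min-heap of times each etcher frees up
    heapq.heapify(etcher_free_at)
    longest_wait = 0.0
    boards_over_limit = 0
    for batch in range(num_batches):
        arrival = batch * plating_interval
        for _ in range(batch_size):
            free_at = heapq.heappop(etcher_free_at)
            start = max(arrival, free_at)      # board waits if no etcher is free
            wait = start - arrival
            longest_wait = max(longest_wait, wait)
            if wait > max_dwell:               # board sat in queue too long
                boards_over_limit += 1
            heapq.heappush(etcher_free_at, start + etch_time)
    return longest_wait, boards_over_limit


# Hypothetical numbers: 10 boards leave plating every hour, each board needs
# 10 minutes in an etcher, and a board is at risk after 30 minutes in queue.
for etchers in (1, 2, 3):
    longest, at_risk = simulate_line(
        num_batches=4, batch_size=10, plating_interval=60.0,
        etch_time=10.0, num_etchers=etchers, max_dwell=30.0)
    print(f"{etchers} etcher(s): longest wait {longest} min, {at_risk} boards over the limit")
```

With these made-up numbers, a single etcher leaves 36 of the 40 boards waiting past the limit, while three etchers leave none. In the same toy model, a drying oven between the two processes would correspond to raising `max_dwell`, and added etcher capacity to raising `num_etchers`.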

There were other process bottlenecks that occurred as boards in process cycled back through the same equipment that was busy working on other boards in production. The cyclical nature of the discrete-hole development process, building each circuit board one layer at a time, meant that the boards recycled continuously back through the same equipment.

The required changes were much more challenging to implement in production than they would have been had we addressed the plant design issues as a whole before we designed, purchased, and installed all the capital equipment. For example, we had to offload some of the work in our low-volume, high-tech facility to the company's high-volume shop to make room to move all our existing equipment plus new equipment to support a better flow and capacity. We had to purchase additional equipment to align capacities across the fabrication process, causing further delays in production. We had to move the equipment. We had to redesign equipment and tooling to further support the flow of work. We also had to redesign the product's substrate to use a less dangerous and less costly material.

In the end, our budget ballooned to more than three times the original planned costs to build the advanced interconnected facility. On the other hand, the efficiencies gained provided significantly more capacity than was needed to support our original client's demand, and our per-board costs were less than half of our original budgeted cost estimates.

As a result of our team's efforts, the new circuit boards had the speed and low-cost economics to support the emerging very high-speed integrated circuit (VHSIC) chips, which expanded the company's market opportunities for the advanced circuit boards. Moreover, the commercial VHSIC option might never have been considered under the economics that justified the development of the advanced circuit board shop for our government client.

This real-life story, though somewhat lengthy, demonstrates how systems complexity issues are challenging to resolve. Hopefully, you got the point that local optimizations were killing the productivity of the entire plant. If we had continued to optimize individual processes and not looked at the manufacturing system as a whole, it's likely the whole program would have been shut down. This type of analysis is a prime example of how systems thinking was key to fixing problems that had nothing to do with the viability of the underlying development tasks and processes.