Chip Technology, Geopolitics and the CAD Industry

This article is an update of an article originally published on Architosh in the Fall of 2021.

Rising tensions between the West (specifically the United States) and China are ushering in large-scale change in the semiconductor industry. With nearly all leading-edge node semiconductor manufacturing located in Taiwan or South Korea—and not in the US with Intel in its typical leadership position—the US government has stepped in to assist in a new era of industrial nationalism. In early June of 2021, the US Senate passed legislation (the US Innovation and Competition Act, USICA) that includes 52 billion USD of federal funding to accelerate domestic semiconductor research, design and manufacturing under what is known as the CHIPS for America Act.

Considered an issue of national security, semiconductors power nearly every type of digital device and certainly every type of computer system running an operating system, including military systems. The US has traditionally led the world in chip design and manufacturing. While design leadership remains in US hands, its national manufacturing champion, Intel, has faltered.

On top of security concerns, the global semiconductor industry is behind on capacity and unable to meet demand. There is a ripe economic opportunity, and every major global economy—the US, the EU, China and Japan—wants more of the burgeoning action.

Intel’s new semiconductor fabrication facility at the company’s Ocotillo campus in Chandler, Arizona.

On the back of US government investment in the domestic semiconductor industry, Intel's new CEO Pat Gelsinger announced a $20 billion investment in two new fabs at Intel's Chandler, Arizona campus, where Fab 42 is already fully operational producing 10nm node chips.

The new US Chips Act could spur the development of up to 10 new chip manufacturing factories. Intel has promised two new ones in Arizona (see above). Similar EU plans call for self-sufficiency in the design and manufacture of semiconductors within the EU. The US, which held a 37 percent share of semiconductor and microelectronics production in 1990, holds just a 12 percent share today.

While China aspires to self-sufficiency in semiconductors, it lacks domestic companies that can develop and manufacture the equipment, non-wafer materials and wafer materials used in the manufacture of semiconductors. The US and EU dominate the critical equipment market, with nearly zero equipment makers in Taiwan and a single-digit share in China. What China does have is a growing "fabless" chip design industry.

The global democratization of semiconductor design, development and manufacturing is altering the possible futures for the computer software industry. This can have significant implications for design and engineering software in the decade ahead. The once-stable "Wintel" duopoly (Windows and Intel) has largely de-coupled. In December of 2020, Microsoft announced that it was designing its own ARM-based chips for servers and Microsoft Surface devices. The server chips are for the company's own Microsoft Azure cloud services datacenters. This play largely imitates rival Amazon, which designed its own ARM-based chip (Graviton2) to power its AWS datacenters.

Microsoft Windows PCs are no longer the center of computing; rather, they persist in our midst much like fossil-fueled cars amid the EV revolution. While they remain the CAD industries' primary equipment, they are supplemented by a rapidly changing landscape of new types of smaller devices. The promises of cyclical economic improvement from the Wintel hegemony first slowed and then rather abruptly collapsed in the past few years with Intel's manufacturing hiccups (more on that below).

This is a moment of overlap, in which an old paradigm is slowly being replaced by a new one.

Moore's Law: Then and Now

Since Intel's founding and the emergence of the x86 CPU architecture, Moore's Law has largely held its promise. Specifically, Moore's Law—named after Gordon Moore, an Intel co-founder—says that the number of transistors on microchips doubles every two years. That requires a compound annual growth rate of roughly 41 percent.
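To see where the 41 percent figure comes from: a doubling every two years means the annual growth rate r must satisfy

```latex
(1 + r)^2 = 2 \quad\Longrightarrow\quad r = \sqrt{2} - 1 \approx 0.414
```

or roughly 41 percent compounded annually.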

Moore's Law across decades of chip advancements from Intel, Motorola, ARM, Apple, IBM and others. Note the doubling of chip transistors approximately every two years—the 41 percent annual compound rate that is Moore's Law. Yet, there is a pattern emerging that leads to the next image… (Image: Wikimedia Commons)

In the 70s and 80s, Intel's chief microprocessor competitors such as Motorola and IBM largely kept the semiconductor industry red hot with advancements. AMD added competitive pressure in the 90s and into this century and the competition for servers, in particular, led Intel to push for larger and more powerful server CPUs. 

In the years from about 1994 to 2007, we can see from the above chart a massive gap between powerful server chips like IBM's Power6, Intel's Itanium 2 and AMD's K10—all with over 500 million transistors—and the ARM Cortex-A9, with fewer than 50 million transistors.

Yet, something changes from 2008 onward as ARM advances at a steeper rate than everyone else in the chip industry (see the green line in the chart below).

Connecting the dots: the Moore's Law chart above, this time tracing Intel's progress (in red) and ARM's (in green). It is clear that from 2008, ARM's progress gets steeper while Intel barely keeps up with Moore's Law. By 2016, Intel slows even more while ARM accelerates.

Then, in 2013, ARM licensee Apple came out with the A7, a landmark 64-bit SoC (system on a chip) with over one billion transistors. Apple's iPhone chip now had more transistors than IBM's Power6 from 2007, the year the iPhone was introduced. In just six years, a chip in a phone packed more transistors than one of the world's most powerful server chips from IBM.

ARM's ascendance is just one factor in the global semiconductor tidal shift. Intel's falling off Moore's Law is another (more on that later). No other company signifies the democratization forces in the global semiconductor industry as much as ARM, which licenses its chip designs to anyone who wants to work in the ARM ecosystem.

ARM, Apple and Intel 

In the latter half of the first decade of this century, ARM (Advanced RISC Machines) emerged as a clear leader in powerful processors for mobile devices and equipment. Once Apple, one of ARM's co-founders, experienced the RISC-based ARM CPUs in the Newton line, the company became devoted to ARM chips and has used them since the original iPhone in 2007, the device that launched the smartphone era. ARM's chip development continued at a fast clip. Today ARM owns the mobile device market at the platform architecture level.

The competitive pressure to deliver more processing power and longer battery life at the same time put ARM chips on a steeper performance curve, destined to match and overtake Intel x86 in performance per watt. While Intel has plans to become the market leader once again—as measured in performance per watt, not just absolute performance—Apple holds the lead at the moment.

The Apple A15 Bionic is Apple's latest SoC, powering the upcoming iPhone 13 line. Remarkably, the A15 Bionic doesn't make much progress in CPU performance over the A14. Apple appears to have been hit by a talent exodus from its semiconductor ranks to both Nuvia and now Rivos. Still, the A15 Bionic manages to beat the GPU performance of any other smartphone chip in existence (including its own A14) by 50 percent. The A15 Bionic has 15 billion transistors, just 1 billion fewer than the Apple M1 chip.

However, Apple is destined to face steep competitive pressure from none other than its ex-chief CPU architect Gerard Williams III, who left Apple in 2019 to form Nuvia. Qualcomm swiftly acquired Nuvia for $1.4 billion, reportedly after multiple billion-plus offers from Qualcomm rivals.

Apple largely built its world-class semiconductor design team from the ground up after acquiring PA Semi in the spring of 2008. A Forbes story notes that Steve Jobs wanted to ensure Apple could differentiate its new iPhone from a raft of new competition. The acquisition was a blow to Intel, which had hoped to convince Apple to build future mobile devices on its Atom processor. Why Intel failed to secure a footing in the smartphone chip market is a larger story best told another day, but suffice it to say that failure helped secure ARM's dominance.

Nuvia's planned custom ARM-based Phoenix NUMA chip (blue region) would exceed Apple's chips. (Picture courtesy of Nuvia)

While Intel and AMD fought it out in a post-PowerPC era (Apple announced the Mac's move to Intel in 2005), ARM quietly advanced its chip architecture to squeeze every ounce of computing performance out of every watt. While many smartphone makers used largely unaltered ARM chip designs for their smartphone CPUs, Apple had a special license with ARM to develop custom ARM-based chips with proprietary logic.

As you can see from the two charts above, the ARM world has caught and surpassed the Intel x86 world. New Intel CEO Pat Gelsinger says Intel will retake the performance-per-watt crown by 2025. Given recent history and ARM's inherent architectural advantages, that claim seems like a risky bet. If anything, Intel (and AMD too) will face significant new competition on performance per watt from the likes of another breakout company, Rivos Inc.

Process Technology—Intel Falters

We will discuss Nuvia and Rivos in the next section. What is critical to understand now is how Intel fell behind in performance—not just to the ARM-based chips at Apple but even to AMD.

Intel began to stutter in its manufacturing leadership a few years ago, but things took a horrible turn for the worse in the summer of 2020, when Intel announced a significant delay in its next manufacturing milestone: Intel would not move to its 7nm process node for several years. That process technology has since been renamed "Intel 4" and is due in the second half of 2022.

From its earliest days, Intel orthodoxy held that it could lead the world in semiconductors if its chip designers could work directly with its manufacturing engineers—something not easily done when working with chip foundries halfway around the world. That was the philosophy, but not necessarily the reality. As a case in point, Dutch semiconductor equipment maker ASML—which manufactures the incredibly complex lithography systems critical to the production of chips—partnered with Intel back in 2012 to develop extreme ultraviolet (EUV) lithography systems for the next era of tiny chips. But ASML also partnered with Samsung and TSMC on the same technology. Today, TSMC alone is estimated to possess half of all the EUV lithography machines in the world.

A final assembly of an ASML EUV lithography machine. These units cost over 150 million USD and can require up to six months to install before use. (Picture courtesy of ASML)

TSMC's 5nm node-based chips—like Apple's M1 processor in its new Macs—are entirely reliant on ASML's EUV lithography machines, each of which costs upward of $150 million and can take four to six months to install before use. Intel's earliest 10nm chips pursued the smaller node with conventional deep-ultraviolet lithography and quad patterning, but that approach failed. While Intel has since worked out its 10nm chips, which it is shipping today, ASML's EUV machines come into play when Intel's 7nm process ramps in the near future.

Intel's chip manufacturing problems may ultimately stem from the attrition that comes with global specialization. Companies that try to do the whole chip themselves eventually succumb to ecosystems of smaller, specialized companies, each willing to risk its own capital to master one highly competitive piece of the process. In the mid-80s, Intel abandoned the RAM market because it could not compete with major Japanese rivals that poured massive capital into new factories to produce the world's best RAM (random-access memory) chips.

Now, with TSMC and Samsung producing hundreds of millions more chips for the smartphone market—a market far larger in unit volume than computers—the Asian chip foundries have more capital and larger valuations. With leading-edge chip fabs costing tens of billions of dollars to build, the market leaders can ramp new process nodes more quickly than their smaller rivals. In simple terms, they have the money to get started on new process node technology sooner.

Complicating matters for Intel, its tight-knit relationship between chip design and chip manufacturing meant that when the company ran into trouble with its 10nm node ramp back in 2018, it had no outside fabs to turn to. That's because Intel chips are design-optimized for Intel's own chip-making tools, which third-party foundries don't own. Intel couldn't just go to Samsung and say, "make this design for us."

The decades-long advantage Intel held by designing chips with tight linkages to its manufacturing tools became a curse once chips shrank to dimensions where electrons behave in unexpected ways at the near-atomic scale. The solutions required novel materials and redesigns and pushed Intel into an unprecedented situation.

Meanwhile, contract chip fabs in Asia worked out such issues more swiftly, because standard ARM-based chip designs and custom designs from AMD, NVIDIA and Apple carried no such linkages between the designs and the tools used to produce them.

AMD's Rising Star

With Intel's unique situation coming back to haunt it in the latter years of the last decade, nearly every major rival made significant progress, capturing market share and outright performance leadership. Intel's primary rival in computer chips, AMD, forged ahead with brilliant new CPU designs manufactured in Asia.

AMD's flagship CPU, the Ryzen 9 of the Ryzen 5000 series, is fabricated by TSMC on a 7nm process that does not yet use EUV. Still, AMD leads the world in the absolute best balance between single-core and multi-core chip performance. Its 16-core Ryzen 9 5950X boasts average single-core scores of 1,689 with a multi-core score of 16,681. While Intel's 11th-generation Core i9-11900K boasts slightly better single-core scores (1,853), at 8 cores its multi-core score is a long way off.

In essence, AMD is delivering industry-leading single-core performance with high multi-core performance to boot. This is the kind of balanced top performance that matters to the CAD industry.

Nuvia and Rivos—Ex-Apple Startups

Earlier, we noted that Apple leads the world in performance per watt. Its M1 processor, for example, boasts Geekbench single-core scores of over 1,700. By comparison, Intel's 11th-generation chips score slightly higher than Apple's M1—1,757 and 1,853 for the Intel Core i9-11900KF and Intel Core i9-11900K, respectively.

However, those Intel chips consume vastly more energy. The M1 has a published TDP (thermal design power) of 39 watts; the Intel Core i9-11900K has a TDP rated at 125 watts. For Intel CEO Pat Gelsinger to state that Intel will take the performance-per-watt crown by 2025 sounds almost too fantastic to be true.
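A back-of-the-envelope division of those Geekbench scores by the rated TDPs shows the size of the gap Gelsinger must close (TDP is only a crude proxy for actual power draw under a single-core load, so treat these as rough figures):

```latex
\text{M1: } \frac{1700 \text{ pts}}{39 \text{ W}} \approx 44 \text{ pts/W}
\qquad
\text{i9-11900K: } \frac{1853 \text{ pts}}{125 \text{ W}} \approx 15 \text{ pts/W}
```

Even by this rough measure, Apple's efficiency lead is roughly threefold.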

Intel isn't just facing serious competition from AMD and Apple. It is also competing against new startups like Nuvia and Rivos. Let's look at why these new chip startups are important.

As mentioned, Gerard Williams III was Apple's instrumental chief CPU and SoC architect before he left to start Nuvia in 2019. He reportedly left with some 100 engineers from Apple and founded Nuvia with Manu Gulati (former lead SoC architect at Google) and John Bruno (former systems architect at Google).

Gerard Williams III (middle) was Apple's chief CPU architect and was largely responsible for the market-leading A-series custom ARM-based SoC chips that power iPhone and iPad devices. Apple sued Williams almost immediately after he left to form Nuvia, alleging he recruited Apple engineers while still employed at Apple. Williams has countersued. (Image: Nuvia/Qualcomm)

This formidable team is now a part of Qualcomm after the company acquired Nuvia in early 2021. And while Nuvia initially aimed at performance-per-watt leadership for chips in the datacenter with its planned Phoenix CPU, Qualcomm leadership seems to have a different idea. The Nuvia team is reportedly working on the Phoenix technology—which is ARM-based—with plans to use it to compete directly with Apple in mobile devices like tablets, smartphones and small laptop computers on Chrome and Windows platforms.

If Williams' departure from Apple wasn't enough, new chip startup Rivos was also formed by an exodus of Apple semiconductor veterans. Rivos Inc. is still in stealth mode. Unlike Qualcomm's Nuvia team, Rivos is focused on RISC-V platform technology, not ARM, and is targeting datacenters—the market Nuvia was originally aiming for.

RISC-V is an open specification and an open platform, but that does not make it an open-source processor. Both RISC-V and ARM are based on RISC (reduced instruction-set computing) architecture, while Intel x86 has primarily employed CISC (complex instruction-set computing) architecture throughout its history.

Multiple sources explain the differences between RISC and CISC processors. To summarize, RISC designs use fewer processor clock cycles per instruction and a standardized load-store model, in which memory is accessed only through explicit load and store instructions. The big takeaway is that RISC aims at reducing overall clock cycles, which makes it superior for power consumption. It should not be surprising, then, that RISC-based ARM chips have led the world in mobile device semiconductors, where battery life means everything.
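A minimal sketch of what the load-store model means in practice: for a simple in-memory increment, a CISC ISA like x86 can modify memory in a single instruction, while a RISC ISA must issue an explicit load, compute, and store. The assembly in the comments is illustrative, not exact compiler output.

```c
/* One line of C... */
void increment(int *counter) {
    (*counter)++;
}

/*
 * ...lowers differently on the two architecture families:
 *
 * CISC (x86) — memory can be an instruction operand:
 *     add dword ptr [rdi], 1    ; read-modify-write in a single instruction
 *
 * RISC (ARM64) — load-store model; memory is touched only by loads/stores:
 *     ldr w8, [x0]              ; load from memory
 *     add w8, w8, #1            ; compute in a register
 *     str w8, [x0]              ; store back to memory
 *
 * Each RISC instruction is simple and fixed-length, which keeps decode
 * logic small and pipelines predictable — part of why RISC designs can
 * spend less energy per unit of work.
 */
```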

When Apple first had discussions with PA Semi a few years before the 2008 acquisition, it considered putting PA Semi chips inside future Mac computers. PA Semi founder Daniel W. Dobberpuhl and his team wished to design an enormously powerful chip based on the PowerPC architecture (RISC) that used little power. Before the acquisition, in February of 2007, PA Semi debuted a 64-bit dual-core microprocessor that was 300 percent more energy efficient than any comparable chip, consuming just 5 to 13 watts at 2GHz.

This was the team that formed the basis of Apple's A-series chips, leading the industry in performance per watt. But more of Apple's semiconductor talent keeps leaving for chip design startups of their own. This may be normal industry behavior, and it cuts both ways: Apple is reportedly trying to recruit Nuvia engineers.

Like Nuvia, Rivos may be another serious competitor, not just to Apple but also to AMD and Intel. The company aims to deliver the first high-performance RISC-V core. The new chip startup has garnered many senior CPU architects from Apple, Google, Marvell, Qualcomm, Intel and AMD.

The semiconductor market is incredibly competitive at a time when chip demand is outpacing supply, when leadership is increasingly global, and when the core technologies and talent are becoming more democratized across geopolitical regions.

Impacts on the CAD Industry

With all these landmark tidal shifts in a semiconductor industry once dominated by Intel and far more US-based, the next decade may see the complete upending of the Wintel duopoly in the software and hardware industries. Windows itself is being more robustly rewritten for the ARM architecture.

The main impact on the CAD and 3D software industries comes from the massive code rewrites necessary for companies to respond to the times. At present, ARM hasn't just matched Intel x86—it has achieved performance-per-watt superiority. That matters as much in cloud computing as in mobile computing. Amazon, Microsoft, Google and Apple are all moving toward ARM-based datacenters because they run cooler and cheaper.

"Such a sea change will be punishing for CAD industry incumbents who lack the experience and expertise in multi-platform and multi-device development."
At any moment, a chip designer could design an ARM chip that matches the larger die sizes of AMD's biggest parts, for example. What would that performance be like? How would the industry respond?

Fujitsu's A64FX is an ARM-based chip that powers the world's fastest supercomputer. It is the first ARM chip to implement the ARM Scalable Vector Extension (SVE) instruction set, increasing vector lengths from the standard 128 bits up to 512 bits. If NVIDIA does in fact acquire ARM, what is to stop it from entering the ARM server and desktop CPU market and melding its GPU technologies into ARM SoCs and CPUs?
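What makes SVE notable is that code written for it is vector-length-agnostic: the same binary runs with 128-bit vectors on a small core and 512-bit vectors on the A64FX. Here is a minimal sketch using the public ACLE SVE intrinsics (the function and loop are illustrative, not Fujitsu's code):

```c
#include <arm_sve.h>
#include <stdint.h>

/* Vector-length-agnostic array addition. svcntw() reports how many
   32-bit floats fit in one vector on the machine actually running the
   code — 4 on a 128-bit implementation, 16 on the A64FX's 512-bit
   vectors — so the same binary scales without a recompile. */
void vla_add(float *dst, const float *a, const float *b, int64_t n) {
    for (int64_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_s64(i, n);   /* predicate masks the tail */
        svfloat32_t va = svld1_f32(pg, a + i);
        svfloat32_t vb = svld1_f32(pg, b + i);
        svst1_f32(pg, dst + i, svadd_f32_x(pg, va, vb));
    }
}
```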

In truth, what is keeping a company like Apple from developing a monster ARM chip with technologies like this? Perhaps only scale and return on investment. Perhaps it should partner with Fujitsu or someone else on new larger-die HPC ARM chips.

With some CAD and 3D developers moving their software solutions over to Apple Silicon (ARM-based Apple SoCs), Architosh has learned what the process involves. Vectorworks, the leading CAD solution on the Mac, had over 120 dependencies in its code that needed to be rewritten from x86 to ARM—dependencies it had to work out with third-party developers. Most CAD, BIM and 3D software includes multiple dependencies, from physics engines, digital terrain modeling engines and CFD engines to geometric modeling kernels like Parasolid and ACIS, plus innumerable rendering and visualization engines. This will be a disruptive process at various levels depending on each application's legacy dependencies. It also provides an opening for newcomers to react and develop more quickly with innovative new CAD industry offerings—Shapr3D, for example.
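To make the nature of those rewrites concrete, here is a generic, hypothetical illustration (not Vectorworks code) of one of the most common culprits: hand-written SIMD. A hot loop written with x86 SSE intrinsics simply will not compile for ARM until someone writes a parallel NEON path:

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(_M_X64)
  #include <xmmintrin.h>            /* SSE: 128-bit x86 vector intrinsics  */
#elif defined(__aarch64__)
  #include <arm_neon.h>             /* NEON: 128-bit ARM vector intrinsics */
#endif

/* Scale an array of vertex coordinates — the kind of hot loop found in
   geometry and rendering engines. Each architecture needs its own path. */
void scale_points(float *pts, size_t n, float s) {
    size_t i = 0;
#if defined(__x86_64__) || defined(_M_X64)
    __m128 vs = _mm_set1_ps(s);
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(pts + i, _mm_mul_ps(_mm_loadu_ps(pts + i), vs));
#elif defined(__aarch64__)
    float32x4_t vs = vdupq_n_f32(s);
    for (; i + 4 <= n; i += 4)
        vst1q_f32(pts + i, vmulq_f32(vld1q_f32(pts + i), vs));
#endif
    for (; i < n; i++)              /* scalar tail (and generic fallback) */
        pts[i] *= s;
}
```

Multiply one such block by the 120-plus dependencies Vectorworks reports—many maintained by third parties—and the scale of an x86-to-ARM port becomes clear.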

If Intel's Pat Gelsinger meets his stated mission of Intel taking the performance per watt crown by 2025, then only minor and slow disruption will occur in the engineering software markets like CAD, BIM and professional 3D. The Windows-dominated engineering software world will largely remain on Intel X86 codebases, while incumbents slowly deploy newly written ARM-based software applications for the plethora of ARM-based devices that will likely remain dominant even past Gelsinger's 2025 timetable. 

However, if Intel fails to meet its goal—and this author personally feels the odds are against it taking the performance-per-watt crown—then only AMD will be left to hold off an ARM onslaught. Operating systems and software applications will rapidly migrate to a sea of best-in-class, next-generation ARM-based computing devices, which have become ever more critical in the "remote-work" reality of the post-pandemic context.

Such a sea change will be punishing for CAD industry incumbents who lack the experience and expertise in multi-platform and multi-device development. These companies will suffer a paradox familiar to Intel—when what has allowed you to streamline and succeed becomes your handicap to adaptation. 

PostScript

Since this article’s original publication in late September of 2021, several new chips have emerged from both Apple and Intel. For its part, Intel released its 12th-generation Intel Core processors. The Core i9-12900K is now the single-threaded king with a Geekbench 5 score of 1,991. This makes Intel the CAD performance leader once again, as most CAD and BIM applications are largely single-threaded. Autodesk Revit is the prime example in the AEC market.

Apple is hardly sitting still. Its original M1 chip now has three bigger siblings: the M1 Pro, M1 Max and M1 Ultra. These SoCs don’t boost single-core processing much over the original M1, but they do demonstrate massive gains in multi-core processing and memory management.

The CAD industry’s strongest chip for multi-core processing (for fully threaded workflows) is AMD’s Ryzen Threadripper 3990X. The chip has 64 cores, boasts 38.7 billion transistors and is physically huge. It also runs hot, with a TDP of 280 watts.

Now compare this with Apple’s latest M1 Ultra, which boasts a Geekbench 5 multi-core score of 24,055, just shy of the Threadripper’s 25,315. Yet Apple has achieved this multi-core score with a chip that has 20 CPU cores instead of 64. Even more impressive, it achieves 90 percent of its full performance at under 100 watts—a far cry from the AMD Ryzen Threadripper’s 280 watts. And to add insult to injury, Apple’s latest M1 Ultra has a reported GPU performance equivalent to NVIDIA’s RTX 3090 GPU at 200 watts less power.
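Dividing each multi-core score by its power figure makes the efficiency gap explicit (using the M1 Ultra’s reported sub-100-watt draw and the Threadripper’s 280-watt TDP as rough denominators):

```latex
\text{M1 Ultra: } \frac{24{,}055}{\sim\!100 \text{ W}} \approx 240 \text{ pts/W}
\qquad
\text{Threadripper 3990X: } \frac{25{,}315}{280 \text{ W}} \approx 90 \text{ pts/W}
```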

Part of the secret to Apple’s SoC performance advantage lies in the architecture of a system on a chip. Apple’s unified memory sits in the chip package and is shared among the chip’s computational units. That means there is no copying of data between CPU and GPU—they share and access the same memory.
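From the programmer’s side, the difference looks roughly like the sketch below, using the CUDA runtime API to stand in for a conventional discrete-GPU system (a hedged illustration of the copy overhead, not Apple’s or NVIDIA’s recommended pattern):

```c
#include <stddef.h>
#include <cuda_runtime.h>   /* CUDA runtime API for the discrete-GPU case */

/* Discrete GPU on a PCIe card: data must be copied into the card's own
   memory before the GPU can touch it, then copied back afterward. */
void process_discrete(float *host_buf, size_t n) {
    float *dev_buf = NULL;
    cudaMalloc((void **)&dev_buf, n * sizeof(float));
    cudaMemcpy(dev_buf, host_buf, n * sizeof(float),
               cudaMemcpyHostToDevice);            /* copy #1: over PCIe  */
    /* ... launch a GPU kernel that works on dev_buf ... */
    cudaMemcpy(host_buf, dev_buf, n * sizeof(float),
               cudaMemcpyDeviceToHost);            /* copy #2: back again */
    cudaFree(dev_buf);
}

/* Unified memory (Apple-style SoC): the CPU and GPU address the same
   physical memory, so both operate on the same buffer and the two
   round-trip copies above disappear entirely. */
```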

Apple’s M1 Ultra delivers 800GB/s of memory bandwidth, and the M1 Max delivers half that. This is significantly higher than AMD and Intel systems with conventional DDR4/5 memory. Even AMD Ryzen Threadripper 3990X-based systems peak at under 100GB/s.

The bottom line is that nobody is sitting still in this industry, and Apple will face challengers with future ARM-based SoC chips from the likes of Qualcomm (via Nuvia) and others. Notably, NVIDIA’s ARM-based Grace chip foretells possible future NVIDIA ARM chips aimed at desktop supremacy.