Using High Performance Computing: It’s About Time

ANSYS has sponsored this post.

Predicting wind noise around the Alfa Romeo Giulietta with Fluent. (Image courtesy of FCA Italy.)

Are infinitely powerful computers an engineer's dream? It's not so far-fetched. These days, compute power is more available than ever; you just can't see it. It's not in your next workstation. It comes in computers the size of pizza boxes, stacked in floor-to-ceiling racks, crammed with GPUs, CPUs and massive amounts of storage, all networked and online, in row after row of racks in buildings that stretch as much as a quarter mile in each direction. These data centers are all over the world; over 1,800 are located in the U.S. alone. They are popping up in cornfields and deserts, wherever there is space and power.

Data centers are popping up everywhere. Compute power may never be enough for engineers, but with over 1,800 data centers in the continental U.S. alone, each crammed with processors and storage, we are living in an age where computing is cheaper and more plentiful than ever. The San Francisco Bay Area alone has 61. (Image courtesy of Datacentermap.com.)

The vast scale and number of these data centers is what we call cloud computing. The cloud is where your split-second search and your video streaming already happen, where your project collaboration and file storage already live, and, for the purposes of this article, where your engineering simulation should be.

The most common engineering simulations, finite element analysis (FEA) and computational fluid dynamics (CFD), solve systems of equations based on the nodes of a mesh. The mesh is an approximation of the original non-uniform rational B-spline (NURBS) based CAD geometry. The finer the mesh, the more faithfully it follows the geometry, and the more accurate the results. But as element size shrinks toward the infinitesimal, the number of elements and degrees of freedom approaches infinity. Even a simple part can be meshed so finely that solving it takes hours on a supercomputer.
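To see why, consider a rough back-of-the-envelope sketch (the part size and the uniform element fill below are assumptions for illustration, not ANSYS code): in three dimensions, halving the element edge length multiplies the element count, and with it the degrees of freedom, by roughly eight.

```python
# Rough illustration: how element count grows as a 3D mesh is refined.
# The part volume and uniform element fill are hypothetical assumptions.

part_volume_mm3 = 100 * 100 * 100            # a notional 100 mm cube

for edge_mm in [10, 5, 2.5, 1.25]:           # halve the element edge each step
    elements = part_volume_mm3 / edge_mm**3  # element count scales as 1/h^3
    print(f"{edge_mm:5.2f} mm elements -> ~{int(elements):,} elements")
```

Each halving of the element size buys accuracy at roughly eight times the element count, which is why "mesh it finer" quickly collides with solve time.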

Moore’s Law Was Not Enough

But are supercomputers, at a cost of tens of millions of dollars, the way to go for today's engineering simulations? An alternative is tying ordinary computer hardware together in massive networks to create what is known as high performance computing (HPC). The economies of scale of HPC have made solutions faster and cheaper than ever, outpacing even Moore's law.

Gordon Moore, who would go on to co-found Intel, famously predicted in 1965 that the number of components per integrated circuit would double every year, in what became known as Moore's law.

Yes, Moore's law has made for smartphones more powerful than the computer that put Apollo astronauts on the moon, and mobile workstations now run circles around your dad's minicomputer. But the wealth of computing on the cloud is faster by a long shot.

With HPC on the cloud, the simulation you once submitted today so it would be ready after breakfast tomorrow can begin and end within a coffee break, at a cost of fractions of a cent per second. You might wonder why we should buy local workstations at all. Why not just plug into the massive grid of unlimited speed and power that this new world of computing and data centers offers?

Enter HPC

Gordon Moore could not have guessed that there would be such a thing as HPC. HPC networks take advantage of parallel processing, with each processor handling a piece of the calculation. If the software can keep every processor fed, calculations that would otherwise run one after the other all run at once. For a job with enough calculations in it, that parallel approach, even on commodity processors, beats sequential processing on the fastest single computer.

In its early days, HPC was expected to get faster in direct proportion to the number of compute nodes added. However, as Gene Amdahl pointed out in what is now known as Amdahl's law, the time savings of parallel processing are fundamentally limited by the work that cannot be split up. When a long calculation ties up a single node, it does no good to have other processors available; they sit idle while that process cranks away. The total runtime can never drop below the time the longest serial stretch of the job takes on a single node.
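Amdahl's limit is easy to see with a quick sketch (the 95 percent parallel fraction below is an assumed figure, chosen only for illustration):

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the
# work that can run in parallel and n is the number of cores.

def amdahl_speedup(parallel_fraction, cores):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

p = 0.95  # assumption: 95 percent of the solve parallelizes
for cores in [1, 8, 32, 128, 1024]:
    print(f"{cores:5d} cores -> {amdahl_speedup(p, cores):5.1f}x speedup")

# No matter how many cores are added, the speedup never exceeds 1 / (1 - p), or 20x here.
```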

Creating and maintaining an HPC data center makes sense if your simulation needs are massive and regular. For smaller operations, an HPC appliance configuration may suffice. (Image courtesy of ANSYS.)

Mesh Complexity

To perform a simulation, you start with a 3D CAD model and make a mesh. But only the greenest, or laziest, engineer would mesh the entire CAD model as-is. Many details on a CAD model don't matter to the analysis. The mesh will be too big, and solving it will take too much time. Removing the useless details, a process called defeaturing, costs valuable engineering time.

But what if you had computing power at your disposal so vast that you could hand it the entire CAD model, meshed, details and all? If that model solves in less time than a carefully defeatured one, why bother with the tedious exercise of defeaturing? Let the computers solve it. You can find better things to do.

After the time-honored and time-consuming task of defeaturing is over, one push of a button creates millions, if not billions, of finite elements (for a structural simulation) or cells (for a flow simulation). The diligent engineer, by tradition, now goes to work: sorting out how the nodes are to be connected, how contact surfaces are to be established and how meshes can be refined, all while staying conscious of the number of degrees of freedom. Each node represents up to six degrees of freedom (DoF), and each DoF requires a calculation. Collectively, these calculations can overpower a supercomputer. You may not get a solution, or enough of them, in the time you have available.
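A quick count shows how fast the system of equations grows (the mesh size below is a hypothetical example, not from any particular model):

```python
# Back-of-the-envelope count of unknowns in a structural FEA model.
# The node count is a hypothetical example.

nodes = 5_000_000        # a large but not unusual structural mesh
dof_per_node = 6         # up to 3 translations and 3 rotations per node

unknowns = nodes * dof_per_node
print(f"{unknowns:,} equations in the global system")  # 30,000,000 unknowns
```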

“Engineers are constrained by turnaround time limitations,” said Wim Slagter, director of HPC and cloud alliances at ANSYS. “So, they are spending time and effort on changing their model in order to be able to run it on their existing hardware or to get acceptable runtimes. They must make their models smaller. They are trading off accuracy of results by reducing the number of elements, the number of features—or using a less advanced turbulence model in their CFD application just to get acceptable runtimes.”

Limited by Hardware?

And while computer service providers can be expected to have the latest, fastest hardware, the same cannot be expected of an engineer forced to use old hardware because of capital expenditure constraints. There is quite a bit of angst over mesh size, as models are trimmed to fit within the capacity of Old Faithful, your aging workstation.

“We found out that a large portion of engineers run on relatively old hardware,” noted Slagter, citing a survey ANSYS did with Intel. “We know they could benefit from recent hardware, the latest processor technology. Our software takes advantage of the latest Intel processors, for example. People can improve their performance using those recent Intel processors tremendously. And by the same token, simulation models are getting bigger in terms of size and complexity. Engineers have to run an increasing number of design variants to ensure robustness, for example. And we see also an increasing number of engineers who are computer-bound or constrained by the lack of computer capacity.”

To find out how long it takes to recoup the investment in a new workstation, ANSYS offers an ROI calculator.

How often do you limit the size and amount of detail in simulation models due to turnaround time limitations? (Image courtesy of ANSYS.)

In a recent survey by ANSYS and Intel with 1,800 engineers responding, 75 percent felt forced to limit the size and amount of detail in simulation models just to complete the solution in the required time. The vast majority of respondents (87 percent) were not happy with the results, though.

How often does “defeaturing” produce less useful results? (Image from a survey by ANSYS.)

“Of course, engineers are computer-bound. Also, they are constrained by turnaround time limitations,” said Slagter. “That became very clear through our survey. About 40 percent of the respondents limit the size or amount of detail for nearly every simulation model. And 35 percent of the respondents limit it more than half of the time.

“That means that customers, engineers, are constrained by turnaround time limitations because they are spending time and effort on changing their model in order to be able to run it on their existing hardware, or to be able to squeeze it on their existing hardware, or to be able to get acceptable runtimes,” Slagter added. “So, they are actually making the model smaller, or maybe having a trade-off on accuracy because they are reducing the number of elements or the number of features—or maybe they take a less advanced turbulence model in their CFD application, in order to get acceptable runtimes.”

Death by Defeaturing

The urge to see how fine a mesh you can get away with is constant. How fine can you go before your results stop coming back in time? To get to a manageable problem size, engineers pull out all sorts of tricks: a fine mesh in critical areas, a coarse mesh everywhere else. We agonize over how small the elements can be, and we stick to lower-order elements. We model materials as homogeneous, ignoring the microscopic scale where material properties can vary greatly. We assume steady state, because time stepping means one solution per time step, and who has time for that? A rounded edge will demand a fine mesh, so it is turned back into a sharp corner. In the best case, you have reduced the number of elements, cells and degrees of freedom without affecting the outcome; the stresses or flows do not change in the areas you care about. Sure, it took a little time, but it was worth it, right?

So, here we are. We go around a part or volume removing chamfers, fillets, protrusions, holes and other details, or we set a minimum feature size to cut down the number of elements. Although CAE tools often claim they can fully defeature a part, and it may be fun to watch the tiny holes and fasteners disappear from your model, veteran analysts have learned to be wary of wholesale push-button automation and resort to defeaturing more or less manually, a time-consuming and laborious effort.

We remove detail, mesh, count the elements, then repeat. Each time we mesh, we find the number of degrees of freedom still too high and submit the model to another wave of defeaturing. All that so the model fits, will run at all, and will run in time.

There must be a better way.

A CAD model is "defeatured," a manual or semiautomated process that seeks to reduce the size of the mesh used for simulation. Do it well and details that do not affect the simulation are safely removed. However, each removed feature reduces the mesh's fidelity to the original geometry. Do it badly and the simulation will miss critical areas. (Image courtesy of TransMagic.)

To decrease the turnaround time of a simulation, engineers see defeaturing a model as a necessary evil. Defeaturing is a learned skill, and careless defeaturing can reduce the accuracy of results. In one study in which one feature after another was turned off, the accuracy of the results (with the fully featured part representing 100 percent accuracy) dropped to as low as 32 percent.

As we work our way around the CAD model, removing details so the mesh will fit in our computers and in our schedules, we cross our fingers hoping the detail we remove is not critical—like that bolted joint.

In the worst-case scenario, removing detail, or "defeaturing," can be dangerous. You might have glossed over a critical area and ignored a potential cause of failure.

The stress concentration factor around a hole is 3X. Let’s hope that hole was not “defeatured” in an attempt to reduce problem size. (Image courtesy of FractureMechanics.com.)

Let's say you follow a routine procedure as prescribed by FEA wisdom and remove all features below a certain size from the CAD model. All holes of that size or smaller disappear. What if one of those holes is in an area of high stress? The stress concentration factor at the 3 and 9 o'clock positions around a hole in a part under pure tension is 3. What if that would have produced stresses past a failure criterion?
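A quick check shows what is at stake (the nominal stress and yield strength below are made-up illustration values, not from any real part):

```python
# Peak stress at the edge of a small circular hole in a wide plate under
# uniaxial tension, using the classical stress concentration factor Kt = 3.
# The nominal stress and yield strength are illustrative assumptions.

Kt = 3.0                     # theoretical Kt for a small hole in a wide plate
nominal_stress_mpa = 90.0    # assumed far-field stress in the defeatured model
yield_strength_mpa = 250.0   # assumed material yield strength

peak_stress_mpa = Kt * nominal_stress_mpa
print(f"Peak stress at the hole: {peak_stress_mpa:.0f} MPa")
if peak_stress_mpa > yield_strength_mpa:
    print("The hole region exceeds yield. Defeaturing that hole hid a failure risk.")
```

A defeatured model reporting only the 90 MPa nominal stress would look comfortably safe; the real part would not be.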

But let's say you are fortunate, and experienced and talented enough to avoid such danger. You may now be the victim of a more insidious, creeping danger, one that grows with each change. The effect of defeaturing is cumulative. The defeatured model grows more approximate with each detail you discard. Remove enough detail and you will have to squint to see any resemblance to the original geometry. The mesh will now have low fidelity to the actual geometry.

The Cost of Defeaturing

Let's say a simulation takes a hundred hours, from getting the model out of the CAD program to the final report. In rough orders of magnitude, say the first step, removing unnecessary complexity from the model, or defeaturing, takes a tenth of that, or 10 hours. By all accounts, this is a conservative number. With an engineer's time worth $100 an hour (again, rough orders of magnitude, one significant digit of precision, but bear with me), that comes to $1,000 per analysis. A full-time analyst can be expected to perform 10 of these per year, which makes for $10,000 in defeaturing labor per analyst.
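In code form, the same one-significant-digit estimate looks like this (all figures are the round numbers above, not measured data):

```python
# Rough cost of defeaturing per analyst, using the article's round numbers.

hours_per_analysis = 100      # from CAD model to final report
defeaturing_fraction = 0.10   # about a tenth of the effort spent defeaturing
hourly_rate_usd = 100         # engineer's time, one significant digit
analyses_per_year = 10        # per full-time analyst

cost_per_analysis = hours_per_analysis * defeaturing_fraction * hourly_rate_usd
print(f"Defeaturing cost per analysis: ${cost_per_analysis:,.0f}")             # $1,000
print(f"Per analyst per year: ${cost_per_analysis * analyses_per_year:,.0f}")  # $10,000
```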

Decreasing Time to Market

The world is getting more complex, the drumbeat of technology gets quicker, and products appear faster.

In a study of product development over 15 years, product cycles were found to have shortened by about 25 percent across all industries. In fast-moving consumer goods, the design cycle is half of what it was in 1997, according to a 2012 study of European companies by Roland Berger consultants. Over the same period, the number of products in those industries tripled.

Simulation has been getting more sophisticated to keep up. The previous generation of analysis may have had one set of engineers doing structural analysis, another set doing aerodynamics, and another studying acoustic effects. Now those processes may all be combined in a multiphysics environment. This puts an increasing strain on simulation teams and their hardware.

“The complexity lies in the details and in phenomena that exist on a small scale—which can be transient in nature, like acoustics models or combustion phenomena or multi-physics applications,” confirmed Slagter.

“The other reason is that markets expect more innovative products that come to market faster,” he added. “To make things more difficult, we have requirements that conflict with each other. In order to differentiate products from the competition, engineers will need to explore more design iterations, do more trade-off studies, make course corrections quickly—all leading to new designs that can win against the competition.”

With an increase in the amount and sophistication of analyses, the result of market pressures, engineers bump up against the limits of their computers.

“Engineers may also be looking at various manufacturing processes and multiple operating conditions,” said Slagter.

Ideally, to save time, all the design variations could be run in parallel. However, that would require a workstation and a CAE software license for each one, resources most organizations do not have. HPC simulation gets around this problem: all the design variations can be submitted at once and solved in parallel on multiple cores, in what ANSYS refers to as "embarrassingly parallel" computing.
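The idea is simple enough to sketch. Each design variant is an independent job, so all of them can be dispatched at the same time. The run_variant function and variant names below are hypothetical placeholders, not ANSYS's API:

```python
# Sketch of "embarrassingly parallel" design-variant runs: each variant is an
# independent job, so they can all be in flight at once.
# run_variant() and the variant names are hypothetical placeholders.

from concurrent.futures import ProcessPoolExecutor

def run_variant(variant_name):
    # Stand-in for submitting one solver job for one design variant.
    return f"{variant_name}: solved"

if __name__ == "__main__":
    variants = ["baseline", "thicker_rib", "larger_inlet", "lighter_bracket"]
    with ProcessPoolExecutor() as pool:
        for result in pool.map(run_variant, variants):  # variants run side by side
            print(result)
```

On a single workstation the pool is limited to the local cores; on an HPC cluster, each variant can land on its own set of nodes.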

ANSYS offers a "more affordable licensing model," according to the company's white paper, that solves multiple design variations with only one set of application licenses. This, in effect, removes the once-insurmountable barrier of one license and one workstation per solution and enables the vast time savings of running all design variations simultaneously rather than sequentially. The small additional cost of each added design variation only increases the ROI of HPC.

Simulate vs Test

Simulation can reduce the time required if it replaces physical testing. Consider, for example, the amount of testing required before autonomous vehicles are certified. Billions of miles must be driven in test vehicles. Even with a fleet of a thousand cars, there is still a ridiculous number of combinations of environments and situations to cycle through. By the time all the testing is done, no one reading this will be alive. Every year, another forty thousand people die in human-driven vehicles in the U.S. alone, which is argument enough to shorten the testing cycle, and simulation is the only safe way to replace testing.

A proper simulation can be done in a fraction of the time it takes to perform real road testing. The billions of miles of testing can be driven at breakneck speed in simulation, in parallel on parallel processors. Mistakes can be corrected without risk to humans. With the technology under the public microscope, even one death can be a huge setback, as the death of Elaine Herzberg, struck by an Uber self-driving test vehicle, demonstrated. Ironic as it may be, the death of one person by one self-driving car stalled an industry that has the potential to prevent the deaths of tens of thousands.

How Much Faster, Exactly?

How much faster can a simulation actually be when you no longer leave it to the aging beast of a workstation located under your desk?

The time savings afforded by multiple cores can be dramatic at first. If a CFD simulation takes 16 days on a single core, moving to 32 cores might reduce the simulation time to four hours, slashing more than two weeks from the schedule. Doubling the core count from 32 to 64 might cut the time in half again, from four hours to two. As more cores are added, however, the effect becomes less dramatic, until further increases provide only incremental gains measured in minutes or seconds.

“The incremental value of HPC decreases as the number of cores increases,” confirms Slagter. “We have a value-based pricing model to accommodate that.”

You might expect the cost of employing more cores to grow in proportion to the number of cores. That is a misconception about HPC, according to Slagter. Running 2,000 cores instead of 20 incurs a cost premium of only about 1.5 times, not the 100 times the core count increase might imply.

Flow simulation is arguably the most computer-bound of all engineering simulations. CFD models are composed of millions, if not hundreds of millions, of cells, making for overnight runs on local workstations, if they will run at all. That might suit the orderly world of a lone analyst content to line up a single job, push a button, get on with their evening and return in the morning to see the results. But in the world of multi-person simulation teams in the aerospace, automotive and fast-moving consumer goods industries, with ever-shrinking product cycles, more variations in product lines and ever-increasing regulatory demands to satisfy, the queue is clogged with hundreds of simulations and the clock is ticking very loudly. The valiant workstation, chugging away on all processors, cannot keep up.

Submitting a solution to ANSYS HPC cloud can be done through the ANSYS interface. (Image courtesy of ANSYS.)

Happy HPC Customers

"The ANSYS Cloud service built into ANSYS Mechanical provides intuitive, easy access to HPC directly from the application. For large, high-fidelity models, ANSYS Cloud reduced our solution times by 5 and 6 and cut the entire simulation workflow by half,” says Marcos Blanco, Mechanical Simulation Engineer at LEAR Corporation

"High-efficiency equipment is critical for improving plant performance in the oil and gas industry. ANSYS Cloud enables Hytech Ingeniería to calculate large and complicated geometries within hours, instead of days or weeks -- resulting in significant time savings,” says Luis Baikauskas, Process Engineer for Hytech Ingeniería.

RJM CleanAir Gas Burner combustion simulation. RJM found simulation that took a week could be done in a day using HPC. (Image courtesy of ANSYS.)

RJM International, maker of industrial burners for power generators and other large combustion plants, such as those found in refineries and steelworks, was doing combustion simulation. The scale and complexity of the burners required simulations that took a week to solve. Moving the simulation to HPC cut the time by 86 percent, with simulations now being done daily.

Modine Manufacturing, maker of heating systems, found that its personal workstations, each with 10 cores and 64 GB of memory, would take three months to solve a transient heat transfer problem. The company could not tie up its workstations for weeks, much less months, so it had to look for a faster solution. Using Rescale's HPC, the company was able to reduce the solution time to days. An additional bonus for Modine: it saved much of the time it would have spent building and testing physical systems because it could simulate instead.

See for Yourself

ANSYS invites you to see the difference HPC makes for yourself with its free benchmark program, which uses "expert" configurations of high-end, simulation-capable Windows 10 workstations: one with an Intel Xeon Gold 6148 processor with 16 cores, 192 GB of memory and 2 TB of storage, costing $8,000; another with an Intel Xeon Gold 6148 (2S) processor with 32 cores in use, 384 GB of memory and 2 TB of storage, costing $12,000; and a third configuration, a cluster with 128 cores.

“We will run the model and tell you how long it takes so you can compare the turnaround time to your workstations,” said Slagter.

If you would like to kick the tires of HPC in the cloud and experience it yourself, ANSYS offers a cloud trial at www.ansys.com/cloud-trial so you can find out how much quicker simulation can be. When you submit your model for simulation, you can choose 16, 32, 64 or 128 cores.