GPUs, CPUs, and OpenSees: Where Acceleration Makes Sense
- silviamazzoni
- Sep 12
- 3 min read
There’s been growing interest in developing GPU-based approaches for OpenSees. That’s exciting work, and it’s important to encourage innovation. At the same time, we should be careful about the objective of this development. It’s probably not as simple as “put your stiffness matrix on a GPU and the problem will run faster.” The real question is: which parts of the problem actually map well to GPU strengths?
I have my own strong opinion on the subject, so I asked ChatGPT to help me put it into a linear argument. This is a process where I do a significant brain dump into the prompt and ChatGPT writes it out in a legible manner. I will write a separate blog about my "strong opinions," but for now, let's stay on topic. Here we go.
CPUs vs. GPUs in a Nutshell
- CPUs (Central Processing Units) have relatively few powerful cores. They excel at tasks requiring branching logic, irregular memory access, and adaptive algorithms.
- GPUs (Graphics Processing Units) have thousands of lightweight cores. They're ideal for workloads that apply the same simple operation to huge arrays of data: image processing, dense matrix multiplication, machine learning.
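To make the contrast concrete, here is a small, purely illustrative Python sketch (no OpenSees code): the first computation applies one identical operation across a large array, exactly the shape GPUs devour; the second is a branchy, history-dependent loop in the spirit of a simple elastoplastic material update, which has to run one step at a time.

```python
import math

# GPU-friendly: one identical multiply-add applied uniformly to a large
# array. Every element follows the same instruction path, so thousands
# of lightweight cores can each take a slice.
x = [i / 1_000_000 for i in range(1_000_001)]
y = [2.0 * xi + 1.0 for xi in x]

# CPU-friendly: branching and history dependence. Each step depends on
# the accumulated plastic state and may take a different path, which
# serializes the work; this is the shape of a nonlinear state update.
def hysteretic_update(strains, yield_strain=0.5):
    plastic = 0.0
    stresses = []
    for eps in strains:                        # inherently sequential
        trial = eps - plastic
        if abs(trial) > yield_strain:          # data-dependent branch
            plastic += trial - math.copysign(yield_strain, trial)
            trial = math.copysign(yield_strain, trial)
        stresses.append(trial)
    return stresses

path = hysteretic_update([0.2, 0.6, -0.1, -0.8])
```

The second loop cannot be split across cores the way the first one can: step n needs the plastic state left behind by step n-1.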
Why Nonlinear Analysis Doesn’t Fit the GPU Model
A nonlinear response history analysis in OpenSees isn’t about repeating simple operations at scale. Instead, the heavy lifting comes from:
- Sparse, irregular systems of equations (not the dense, regular ones GPUs love).
- Element state updates that involve conditional logic and history dependence.
- Convergence bottlenecks, where an entire analysis can stall because a single element won't converge.
- Adaptive solution strategies, where time steps and iterations constantly change.
These are exactly the kinds of tasks that GPUs struggle with. CPUs (and clusters of CPUs) are far better suited.
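The sparse-system point is easy to picture in code. Below is a toy CSR (compressed sparse row) matrix-vector product for a hypothetical 4x4 stiffness matrix; the indirect `cols[k]` lookups and varying row lengths are the gather-style, irregular memory access that breaks the uniform pattern GPUs prefer:

```python
# A hypothetical 4x4 tridiagonal "stiffness" matrix in CSR storage:
# [ 4 -1  0  0]
# [-1  4 -1  0]
# [ 0 -1  4 -1]
# [ 0  0 -1  4]
vals = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
cols = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
rowptr = [0, 2, 5, 8, 10]

def csr_matvec(vals, cols, rowptr, x):
    prod = []
    for i in range(len(rowptr) - 1):
        acc = 0.0
        for k in range(rowptr[i], rowptr[i + 1]):  # row lengths vary
            acc += vals[k] * x[cols[k]]            # gather: irregular access
        prod.append(acc)
    return prod

prod = csr_matvec(vals, cols, rowptr, [1.0, 1.0, 1.0, 1.0])
```

A dense matrix-matrix multiply touches memory in perfectly regular strides; this loop jumps wherever the sparsity pattern sends it, which is why sparse solvers on GPUs are a specialized research topic rather than a free win.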
Where GPUs Might Help
GPU development for OpenSees should be guided by clear goals: accelerating specialized kernels (e.g., sparse solvers), running ensembles of analyses (parameter sweeps, uncertainty quantification), or exploring hybrid methods. Chasing the idea of just moving the global stiffness matrix onto a GPU misses the bigger picture.
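The ensemble case is worth a sketch, because it is the pattern most likely to pay off regardless of hardware: each analysis in a sweep is independent, so they parallelize trivially. Here `run_analysis` is a placeholder for launching one OpenSees model (in practice you would use separate processes or cluster jobs rather than threads); it returns a fake "peak response" so the pattern runs on its own:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder for one full OpenSees analysis. Real code would build and
# run a model for these parameters; here we fake a peak response.
def run_analysis(params):
    damping, scale = params
    return scale / (1.0 + 10.0 * damping)

# Parameter sweep: damping ratio x ground-motion scale factor.
sweep = [(d, s) for d in (0.02, 0.05) for s in (1.0, 1.5)]

# Every item is independent, so the pool just farms them out.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_analysis, sweep))
```

No communication is needed between runs, which is exactly why uncertainty quantification and parameter studies scale so much better than a single monolithic analysis.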
Smarter Models Beat Faster Hardware
The most effective way to speed up OpenSees analyses is almost always through modeling decisions:
- Fibers: You don't need a million fibers per section, or a uniform grid. Distribute fibers intelligently: increase density where the loading direction makes it matter. For beams, which primarily act in 1-D bending, horizontal layers of fibers are sufficient.
- Section choice: Fiber sections are only useful when the interactions (axial–moment or biaxial) and their variation during the analysis are significant enough to affect stiffness and strength. For example, axial forces in beams are typically small and nearly constant, so fiber sections add unnecessary cost; under rigid-floor constraints they can even create fictitious effects by inducing artificial axial loads at yield. In those cases, simpler phenomenological sections perform better.
- Elements and materials: Some element and material formulations in OpenSees are extremely detailed but slow analyses dramatically. Use the simplest option that still captures the behavior you care about.
- Solution strategies: Solver choice, convergence tests, and tolerances often matter more than hardware. You can also add adaptive solution strategies for cases of nonconvergence, so you don't lock yourself into very small time steps across the whole analysis. Some algorithms need more steps but converge faster because each step takes fewer iterations; others take fewer steps but need more iterations per step. Matching the algorithm to your problem often yields better efficiency than any hardware upgrade.
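The adaptive-stepping idea reduces to a small control loop. This sketch shows only the step-control logic; `try_step` is a stand-in for an OpenSees analyze-one-step call and simply pretends that large steps stop converging past t = 0.5:

```python
# Stand-in for one analysis step: pretend big steps fail after t = 0.5,
# so the halving logic below actually gets exercised.
def try_step(t, dt):
    return dt <= 0.01 or t < 0.5

def adaptive_analyze(t_end, dt0, min_dt=1e-4):
    t, dt, steps = 0.0, dt0, []
    while t < t_end - 1e-12:
        if try_step(t, dt):
            steps.append((round(t, 6), dt))   # record the successful step
            t += dt
            dt = min(dt0, dt * 2)             # grow back toward target step
        else:
            dt /= 2                           # halve on nonconvergence
            if dt < min_dt:
                raise RuntimeError(f"no convergence at t={t}")
    return steps

steps = adaptive_analyze(t_end=0.6, dt0=0.04)
```

The payoff: the analysis runs at the large step wherever it can and drops to small steps only around the difficult region, instead of paying the small-step cost everywhere.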
Parallelism and Communication
Even in distributed computing, scaling isn’t free. Communication overhead can outweigh benefits if the problem size is small. Smarter partitioning and load balancing matter just as much as raw node counts.
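A back-of-the-envelope model makes this concrete. Assume, purely for illustration, that parallel time is compute/p plus a per-node communication overhead that grows with p. For a small problem the communication term quickly swamps the gain; a large problem keeps scaling:

```python
# Toy scaling model: time(p) = compute / p + overhead * (p - 1).
# The constants are illustrative, not measured on any real cluster.
def speedup(compute, overhead, p):
    return compute / (compute / p + overhead * (p - 1))

small = [round(speedup(1.0, 0.05, p), 2) for p in (1, 2, 4, 8)]
big   = [round(speedup(100.0, 0.05, p), 2) for p in (1, 2, 4, 8)]
```

In this model the small problem actually gets slower going from 4 to 8 nodes, while the large one stays close to ideal speedup, which is the usual shape of the tradeoff even if the real constants differ.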
Final Thoughts
The future of OpenSees and GPU development isn’t about “putting your stiffness matrix on the GPU.” It’s about finding the right fit: targeting specific kernels, leveraging GPUs for ensembles of analyses, or combining CPUs and GPUs where appropriate. But for single nonlinear response history simulations, the real performance gains come from intelligent modeling, thoughtful section and element choices, and efficient solution strategies.
