AI Framework Built on Thermodynamic Laws Solves Decades-Old Polymer Simulation Challenge

"The information you throw away is exactly the information that controls how the material flows."
Thomas O'Connor describes the core problem that has plagued polymer simulation for decades.

For half a century, the complexity of polymer systems has outpaced our ability to simulate them faithfully, leaving a gap between what nature does and what computation can afford. A team from Carnegie Mellon University and the University of Pennsylvania has now bridged that divide by embedding the laws of thermodynamics directly into the architecture of a machine-learning framework, so that the model cannot violate energy conservation or the Second Law. The work, published in the Proceedings of the National Academy of Sciences, suggests that the deepest advances in artificial intelligence may come not from data alone, but from pairing algorithmic flexibility with the hard-won wisdom of physics.

  • Fifty years of polymer modeling have forced scientists to choose between physical accuracy and computational speed — a compromise that leaves engineers unable to reliably predict how real materials will hold, flow, or fail.
  • Even modern machine-learning approaches have stumbled here, breaking thermodynamic laws when pushed to scales that actually matter for industrial design.
  • Researchers anchored a neural network to the metriplectic bracket — a mathematical structure from non-equilibrium thermodynamics — so that energy conservation and the Second Law are satisfied before a single parameter is trained.
  • A self-supervised learning strategy allows the framework to infer hidden variables like entropy and microstructure directly from particle motion, meaning models can now be trained from experimental video rather than costly atom-by-atom simulations.
  • Validated against star polymers and dense colloidal suspensions — two notoriously resistant test cases — the framework outperformed state-of-the-art graph neural networks and captured rare flow-driving events that had long eluded modeling.
  • Open-source releases in PyTorch and LAMMPS now put thermodynamically consistent, data-driven polymer simulation within reach of academic labs, national laboratories, and industry at engineering scale.

For fifty years, materials scientists have confronted the same stubborn problem: polymers are too complex to simulate faithfully. A single chain holds tens of thousands of atoms; a real sample holds billions. The properties engineers care about — how an adhesive grips, how a copolymer self-assembles, how a film resists tearing — only emerge at scales that atom-by-atom computation cannot reach.

The standard workaround, coarse-graining, trades physical fidelity for speed by replacing clusters of atoms with simpler particles. But compression destroys information. Conventional coarse-grained models can capture equilibrium structure or large-scale dynamics, rarely both, and they routinely fail to represent the entropic and viscous forces that actually govern whether a material holds or fails. Machine learning offered flexibility but not a cure: even ML models tended to violate fundamental physics when pushed to the scales that matter for design.

A team from Carnegie Mellon University and the University of Pennsylvania has now published a different approach in the Proceedings of the National Academy of Sciences. Their framework does not merely approximate thermodynamic behavior — it is architecturally incapable of violating it. The key was not a machine-learning innovation but a physics one: the researchers embedded the metriplectic bracket, a mathematical structure from non-equilibrium thermodynamics, directly into the skeleton of their neural network. Energy conservation and the Second Law are enforced by construction, before training begins. CMU's Thomas O'Connor described it as strong domain science enabling better machine learning — the structure they needed had been waiting in the polymer rheology literature for decades.
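For readers who want the mathematics, one common way to write a metriplectic system is the two-generator form below. The symbols (x for the coarse-grained state, E and S for energy and entropy, L and M for the reversible and dissipative operators) are standard textbook notation chosen here for illustration, not notation taken from the paper.

```latex
\begin{aligned}
\dot{x} &= L(x)\,\nabla E(x) + M(x)\,\nabla S(x), \\
L^{\top} &= -L, \qquad M^{\top} = M \succeq 0, \\
L\,\nabla S &= 0, \qquad M\,\nabla E = 0, \\
\frac{dE}{dt} &= \nabla E^{\top} L\,\nabla E + \nabla E^{\top} M\,\nabla S = 0, \\
\frac{dS}{dt} &= \nabla S^{\top} L\,\nabla E + \nabla S^{\top} M\,\nabla S
              = \nabla S^{\top} M\,\nabla S \ge 0.
\end{aligned}
```

Any L and M with these symmetry and degeneracy properties, including ones produced by a neural network, conserve energy and never decrease entropy. That is the sense in which the guarantees can hold before any training takes place.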

A companion innovation handles what experiments cannot easily label. Because entropy and internal microstructure are nearly impossible to measure directly, the team developed a self-supervised strategy that lets the network discover these hidden variables by watching how particles move — making it possible to train models from experimental video rather than expensive atomistic simulations alone.

Testing on star polymers and dense colloidal suspensions — two cases that have long resisted coarse-graining — the framework recovered both structure and non-equilibrium dynamics where state-of-the-art graph neural networks failed, and captured the rare localized rearrangements that drive flow in dense suspensions.

Open-source implementations in PyTorch and LAMMPS, validated at the scale of millions of coarse-grained particles, now make thermodynamically consistent polymer simulation accessible across academia and industry — opening new ground in adhesion, self-assembly, self-healing, and fracture research.

For fifty years, materials scientists have faced a stubborn wall: polymers are too complex to simulate faithfully. A single polymer chain contains tens of thousands of atoms. A real-world sample, whether a melt, a composite, or an adhesive, holds billions. The properties that actually matter to engineers (how an adhesive sticks, how a block copolymer self-assembles into nanostructures, how a film stretches without tearing) only emerge at length and time scales that atom-by-atom simulation cannot reach. The computation is simply too expensive.

The workaround has always been coarse-graining: replace clusters of atoms with simpler particles, shrink the model, make it fast enough to run. But speed comes at a price. When you compress the system, you lose information. Conventional coarse-grained models can usually capture either the equilibrium structure or the large-scale dynamics, but rarely both. More critically, they fail to capture the entropic and viscous forces—the ones that actually govern how polymers flow, relax, and dissipate energy. Those forces determine whether a material will hold or fail. Machine learning has been flexible enough to help, but even ML approaches tend to break down when asked to preserve these fundamental physics.
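As a rough illustration of what coarse-graining does, the sketch below replaces each run of consecutive atoms with a single bead at the group's center of mass. The function name `coarse_grain` and the choice to group consecutive atoms are assumptions made for this example; the article does not say which mapping the researchers use.

```python
def coarse_grain(positions, masses, atoms_per_bead):
    """Replace each consecutive group of atoms with one bead.

    Each bead is returned as (total_mass, center_of_mass).
    Hypothetical mapping for illustration only.
    """
    beads = []
    for start in range(0, len(positions), atoms_per_bead):
        group = range(start, min(start + atoms_per_bead, len(positions)))
        m_tot = sum(masses[i] for i in group)
        com = tuple(
            sum(masses[i] * positions[i][d] for i in group) / m_tot
            for d in range(3)
        )
        beads.append((m_tot, com))
    return beads


# Six equal-mass atoms on a line, merged three at a time into two beads.
positions = [(float(i), 0.0, 0.0) for i in range(6)]
masses = [1.0] * 6
beads = coarse_grain(positions, masses, atoms_per_bead=3)

# Coordinates kept before and after: the difference is the discarded detail.
atom_dof = 3 * len(positions)
bead_dof = 3 * len(beads)
print(beads, atom_dof, bead_dof)
```

The bead model keeps only the centers of mass. The internal motion of each group is gone, and that discarded motion is where the entropic and viscous forces described above come from.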

A team from Carnegie Mellon University and the University of Pennsylvania has now published a solution in the Proceedings of the National Academy of Sciences. They built a machine-learning framework that lets coarse-grained models achieve both accuracy and speed at once, while being mathematically incapable of violating the laws of thermodynamics. The architecture learns polymer dynamics directly from data, whether from simulation or experiment, but it cannot violate energy conservation or the Second Law.

The breakthrough did not come from machine learning itself, but from polymer physics. The researchers anchored their neural network to a mathematical structure called the metriplectic bracket, a tool developed decades ago within the non-equilibrium thermodynamics community to describe how soft materials flow and deform. By translating that structure into the skeleton of their neural network, they created a framework that conserves energy and obeys the Second Law of Thermodynamics before any parameters are even trained. Thomas O'Connor, an assistant professor of materials science and engineering at CMU who co-led the work, described it as strong domain science enabling better machine learning. The mathematical structure they needed was already waiting in the polymer rheology literature.
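To make "by construction" concrete, here is a minimal sketch of one way to build dynamics that conserve energy and never decrease entropy whatever the learned parameters are. It uses a projection trick chosen for brevity, which is an assumption of this example and not necessarily the construction in the paper. The matrices `A` and `B` stand in for neural-network outputs, and all function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def project_out(v, g):
    """Return v with its component along g removed (orthogonal projection)."""
    gg = dot(g, g)
    if gg == 0.0:
        return list(v)
    c = dot(v, g) / gg
    return [vi - c * gi for vi, gi in zip(v, g)]


def metriplectic_rhs(grad_E, grad_S, A, B):
    """dx/dt = L grad_E + M grad_S for arbitrary square matrices A and B.

    L = P_S (A - A^T) P_S is antisymmetric and annihilates grad_S.
    M = P_E (B B^T) P_E is symmetric positive semidefinite and annihilates grad_E.
    P_S and P_E project out the directions of grad_S and grad_E.
    """
    # Reversible (energy-conserving) part.
    u = project_out(grad_E, grad_S)
    w = [a - b for a, b in zip(matvec(A, u), matvec(transpose(A), u))]
    reversible = project_out(w, grad_S)
    # Irreversible (entropy-producing) part.
    v = project_out(grad_S, grad_E)
    bbt_v = matvec(B, matvec(transpose(B), v))
    irreversible = project_out(bbt_v, grad_E)
    return [r + i for r, i in zip(reversible, irreversible)]


# Random "untrained" parameters and random gradients.
random.seed(0)
n = 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
grad_E = [random.uniform(-1, 1) for _ in range(n)]
grad_S = [random.uniform(-1, 1) for _ in range(n)]

x_dot = metriplectic_rhs(grad_E, grad_S, A, B)
dE_dt = dot(grad_E, x_dot)  # zero up to floating-point round-off
dS_dt = dot(grad_S, x_dot)  # non-negative up to floating-point round-off
print(dE_dt, dS_dt)
```

Because the guarantees come from the algebraic form of L and M, they hold for random, untrained values of `A` and `B`. Training only changes which thermodynamically valid dynamics the model represents.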

A second innovation addresses what cannot be measured directly. Entropy and internal microstructure are nearly impossible to label in an experiment. The team developed a self-supervised learning strategy that lets the network discover these hidden variables on its own, simply by watching how particles move. This means models can now be trained straight from experimental video rather than only from expensive atomistic simulations.

They tested the method on two difficult cases. For star polymers—branched structures whose dynamics have resisted most coarse-graining attempts—the framework recovered both the radial structure and the non-equilibrium dynamics even at aggressive levels of simplification, where state-of-the-art graph neural networks failed. For a dense colloidal suspension filmed under oscillatory shear, the framework learned a model directly from high-speed video that captured the rare, localized rearrangement events that drive flow. This connection between microstructure and macroscopic behavior had been elusive for years.

The researchers have released open-source implementations in PyTorch and LAMMPS, a molecular-dynamics engine widely used across academia, national laboratories, and industry. The LAMMPS version has been tested at the scale of millions of coarse-grained particles, opening a path to thermodynamically consistent, data-driven models of polymeric materials at engineering scale. The work is intended to enable new research in polymer adhesion, self-assembly, self-healing, and fracture, problems where accurate simulation has long been out of reach.

"By building the laws of thermodynamics into the architecture itself, we get models that recover those missing entropic and viscous forces, not as an approximation but by construction."
— Thomas O'Connor, Carnegie Mellon University
"It's a nice example of strong domain science enabling better machine learning. The mathematical structure we needed to make our ML models respect the laws of nature was ready and waiting in the polymer rheology literature."
— Thomas O'Connor
The Hearth Conversation: Another angle on the story
Inventor

Why has polymer simulation been so hard for so long?

Model

Because the things engineers care about—how a material actually behaves—only show up at scales that are too big to simulate atom by atom. You'd need a computer the size of a building and a thousand years of runtime.

Inventor

So coarse-graining is the obvious answer. Why hasn't it worked?

Model

Because when you simplify the model, you throw away the exact information that controls how the material flows. You can get the structure right or the dynamics right, but not both. And you almost always lose the entropic forces that determine whether something will hold or fail.

Inventor

What made this team's approach different?

Model

They didn't start with machine learning. They started with physics—a mathematical structure from polymer rheology that describes how soft materials actually deform. Then they built the neural network around that structure, so it's mathematically impossible to violate thermodynamics.

Inventor

That sounds like they're constraining the model. Doesn't that limit what it can learn?

Model

The opposite. By building in the laws of nature, they actually let it learn more. The model doesn't waste capacity trying to find shortcuts that break physics. It learns the real forces.

Inventor

How do they handle things you can't measure, like entropy?

Model

They let the network figure it out on its own by watching how particles move. It discovers the hidden variables without being told what they are. That means you can train on real experimental video instead of just expensive simulations.

Inventor

What does this actually enable?

Model

Models that can predict how polymers will behave in real applications—adhesives, self-healing materials, things that fracture. And they've made the tools open-source and tested them at scales that matter to industry. It's not just a paper. It's a tool people can use.
