Penn engineers use mollifier layers to solve inverse PDEs faster, more reliably

Sometimes the answer is not more computing power but smarter mathematics.
A doctoral candidate reflects on why the Penn team's approach succeeds where conventional AI scaling fails.

From the surface of a disturbed pond, one cannot easily know what caused the ripple — and yet science has long depended on exactly that kind of backward reasoning. Engineers at the University of Pennsylvania have quietly reframed this challenge by reaching not for more computing power, but for a mathematical idea from the 1940s, adapting it into a method called Mollifier Layers that allows AI systems to solve complex inverse equations more reliably and with far less strain. The work suggests that in an era of relentless computational scaling, elegance and efficiency may still be found in the depth of older ideas.

  • AI systems tasked with solving inverse partial differential equations — the math that works backward from observation to cause — routinely collapse under noisy data and higher-order complexity, consuming enormous memory and producing unreliable results.
  • In one benchmark, a standard physics-informed neural network estimated a key derivative with only 0.21 correlation to the true answer, a near-total failure that shows how fragile conventional approaches become as problems get harder.
  • Penn engineers bypassed the problem by inserting a smoothing step — borrowed from mathematician Kurt Otto Friedrichs' 1940s mollifier technique — before derivatives are calculated, shifting the computational burden away from error-prone gradient tracing.
  • The results were striking: training time fell from over 3,300 seconds to under 340, peak memory dropped more than tenfold, and parameter recovery climbed from 0.44 to 0.99 correlation on the hardest test problem.
  • The method is already being applied to chromatin biology, where it helps infer the hidden chemical reaction rates that govern gene expression — opening potential pathways toward therapies for cancer, aging, and diseases of cell identity.

A ripple on water reveals that something disturbed the surface — but not what caused it, or with what force. This gap between observation and origin sits at the center of inverse partial differential equations, the mathematical tools scientists use to work backward from what they can see to what they cannot. Across biology, materials science, and climate modeling, these equations are indispensable. And yet when data grows noisy and the math grows complex, the AI systems built to solve them tend to buckle.

Researchers at the University of Pennsylvania encountered this wall directly. Vivek Shenoy's lab studies chromatin — the folded structures of DNA and protein inside cell nuclei — and could observe how chromatin organized itself but could not reliably infer the chemical processes driving those changes. The bottleneck, the team realized, was not the neural network architecture. It was the differentiation process itself: the way AI systems calculate how things change by tracing backward through their own layers, a method that becomes unstable and memory-hungry as problems grow harder.

Their answer came not from building larger models, but from revisiting a mathematical technique developed by Kurt Otto Friedrichs in the 1940s. Mollifiers are smoothing functions — they clean a signal before derivatives are calculated, rather than computing derivatives directly from noisy output. The Penn team embedded this idea into a new layer within physics-informed neural networks, shifting the hardest computational work toward stable analytical operations.
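In symbols, the smoothing step rests on a standard property of convolution. The notation below is illustrative and not taken from the paper: u is the signal, η_ε is a smooth kernel of width ε, and u_ε is the smoothed signal.

```latex
u_\varepsilon(x) = (\eta_\varepsilon * u)(x) = \int \eta_\varepsilon(x - y)\, u(y)\, dy,
\qquad
\frac{d^k u_\varepsilon}{dx^k}(x) = \int \frac{d^k \eta_\varepsilon}{dx^k}(x - y)\, u(y)\, dy
```

Every derivative falls on the kernel, whose derivatives are known in closed form, so the noisy signal itself is never differentiated.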

The performance gains were substantial. On a fourth-order reaction-diffusion problem — among the most demanding benchmarks — training time fell from 3,386 seconds to 335, peak memory dropped from 2.75 gigabytes to 0.23, and parameter recovery accuracy rose from 0.44 to 0.99 correlation. Across all tests, the mollified approach delivered six to ten times better efficiency than standard methods.

For Shenoy's lab, the payoff is immediate: the method can now infer reaction rates from noisy microscopy images of chromatin domains just 100 nanometers across — structures that regulate which genes are active and therefore shape cell identity, aging, and disease. If those reaction rates can be tracked and eventually altered, doctoral candidate Vinayak Vinayak suggests, it may become possible to redirect cells toward healthier states, opening new avenues against cancer and age-related conditions.

The technique's implications reach further still — into materials science, fluid mechanics, climate modeling, and anywhere hidden quantities must be estimated from imperfect data. Challenges remain: the approach depends on choosing the right smoothing kernel and currently struggles near boundaries and irregular grids. But the deeper lesson may be the most durable one: in a field defined by the pursuit of scale, the clearest path forward sometimes runs through mathematics that has been waiting, largely forgotten, for decades.

A ripple on water tells you something disturbed the surface, but not what caused it or how hard the impact was. That simple observation sits at the heart of one of science's thorniest mathematical problems: working backward from what you can see to figure out what made it happen.

Researchers across biology, materials science, and climate modeling face this puzzle constantly. They observe a temperature field spreading through a material, watch cellular structures organize themselves, track weather patterns shifting—but the underlying forces and rules remain hidden. For years, scientists have relied on inverse partial differential equations, or inverse PDEs, to bridge that gap. The trouble is that traditional approaches, especially those powered by artificial intelligence, start to fail precisely when the data gets messy and the math gets complicated.

Engineers at the University of Pennsylvania say they have found a cleaner path forward. Their method, called Mollifier Layers, rethinks how AI systems calculate derivatives—the mathematical operation that measures how things change. Instead of chasing the modern trend toward bigger models and more raw computing power, the Penn team dusted off a mathematical technique from the 1940s and adapted it for physics-informed machine learning. The work appears in Transactions on Machine Learning Research and will be presented at NeurIPS 2026.

The core problem lies in how conventional AI handles inverse PDEs. When a neural network tries to solve these equations, it typically calculates derivatives by tracing backward through its own layers repeatedly—a process called automatic differentiation. This works adequately for simple problems. But when equations involve higher-order derivatives, when data contains noise, or when both conditions exist together, the system becomes a resource hog. Memory consumption balloons. Training time stretches. Accuracy deteriorates. In one test using physics-informed neural networks, or PINNs, peak memory jumped from 0.21 gigabytes to 2.70 gigabytes when solving a fourth-order reaction-diffusion problem. More troubling, the network's ability to recover the underlying mathematical structure collapsed—in one benchmark, a PINN estimated a key derivative with only 0.21 correlation to the true answer, a sign the system had drifted badly off course.
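One way to see why higher-order derivatives are so sensitive to noise is to perturb a signal with a small oscillation. This is a textbook illustration rather than an analysis from the paper, and the symbols δ (noise amplitude) and ω (noise frequency) are assumptions introduced here.

```latex
u_\delta(x) = u(x) + \delta \sin(\omega x)
\quad\Longrightarrow\quad
\frac{d^k u_\delta}{dx^k}(x) = \frac{d^k u}{dx^k}(x) + \delta\,\omega^k \sin\!\left(\omega x + \tfrac{k\pi}{2}\right)
```

A perturbation of size δ becomes an error of size δω^k in the k-th derivative. For a fourth-order equation and ω = 100, that is an amplification of 10^8.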

Vivek Shenoy, a materials science professor at Penn, had encountered this wall firsthand. His lab studies chromatin, the folded DNA and protein structures inside cell nuclei. They could observe how chromatin organized itself, could model the structures they saw, but could not reliably infer the chemical processes driving those changes. "The more we tried to optimize the existing approach, the clearer it became that the mathematics itself needed to change," Shenoy says. The bottleneck was not the neural network design. It was the differentiation process itself.

The solution came from mollifiers, a mathematical smoothing technique developed by Kurt Otto Friedrichs in the 1940s. Instead of calculating derivatives directly from a neural network's noisy output—which amplifies errors—the Penn system inserts a mollifier layer that smooths the signal first, then computes derivatives through fixed mathematical operations. This shifts the hardest computational work away from repeated gradient calculations and toward an analytical smoothing kernel. The result is simultaneously lighter on memory, faster to train, and more stable when data is imperfect.
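The smooth-then-differentiate idea can be sketched in a few lines of NumPy. This is a one-dimensional toy under stated assumptions, not the Penn team's layer: the kernel is a Gaussian (the article does not specify which kernel the team uses), the signal is synthetic, and finite differences stand in for differentiating a noisy output directly. The function name `mollified_derivative` and the width `eps` are hypothetical.

```python
import numpy as np

def mollified_derivative(u, eps, dx):
    """Estimate u'(x) by convolving u with the derivative of a Gaussian kernel.

    Returns the estimate and `half`, the number of samples at each end that
    are contaminated by zero padding and should be ignored.
    """
    half = int(np.ceil(4 * eps / dx))
    s = np.arange(-half, half + 1) * dx
    eta = np.exp(-s**2 / (2 * eps**2))
    eta /= eta.sum() * dx                 # normalise: the kernel integrates to 1
    d_eta = -s / eps**2 * eta             # analytic derivative of the kernel
    return np.convolve(u, d_eta, mode="same") * dx, half

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2 * np.pi, 2000)
dx = x[1] - x[0]
u_noisy = np.sin(x) + 0.01 * rng.standard_normal(x.size)   # synthetic noisy signal
du_true = np.cos(x)

du_direct = np.gradient(u_noisy, dx)                        # differentiates the noise too
du_mollified, half = mollified_derivative(u_noisy, eps=0.1, dx=dx)

inner = slice(half, -half)                                  # skip boundary artefacts
rmse_direct = np.sqrt(np.mean((du_direct[inner] - du_true[inner]) ** 2))
rmse_mollified = np.sqrt(np.mean((du_mollified[inner] - du_true[inner]) ** 2))
print(f"direct: {rmse_direct:.3f}   mollified: {rmse_mollified:.3f}")
```

The interior mask is needed because the convolution pads the signal with zeros, which is a small-scale version of the boundary difficulty the researchers acknowledge.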

When the team tested the approach on three increasingly difficult problems—a first-order equation, a second-order heat equation, and a fourth-order reaction-diffusion system—the mollified models consistently outperformed standard ones. In the first-order case, mollified PINNs achieved 0.97 temporal correlation versus 0.36 for standard networks, while using less memory and training time. The gap widened dramatically for harder problems. On the fourth-order reaction-diffusion benchmark, mollified models cut training time from 3,386 seconds to 335 seconds and reduced peak memory from 2.75 gigabytes to 0.23 gigabytes. Parameter recovery improved from 0.44 correlation to 0.99. Across all tests, the method delivered 6 to 10 times better efficiency.
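For readers who want the ratios behind the fourth-order figures, the arithmetic is short. The numbers are the ones quoted above; the dictionary keys are labels chosen for this illustration.

```python
# Figures quoted in the article for the fourth-order reaction-diffusion benchmark.
standard = {"train_seconds": 3386.0, "peak_memory_gb": 2.75}
mollified = {"train_seconds": 335.0, "peak_memory_gb": 0.23}

# Ratio of standard to mollified cost for each measure.
ratios = {key: standard[key] / mollified[key] for key in standard}
for key, ratio in ratios.items():
    print(f"{key}: {ratio:.1f}x lower with mollifier layers")
```

On these figures, training time falls by a factor of about 10 and peak memory by a factor of about 12. The article does not say how the overall "6 to 10 times" efficiency figure is defined.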

For Shenoy's lab, the practical payoff centers on chromatin domains roughly 100 nanometers across that regulate which genes are accessible and therefore active. Because gene expression shapes cell identity, aging, and disease, understanding how these domains form and change matters profoundly. The mollified approach enabled researchers to infer reaction rates from noisy microscopy images with high accuracy, potentially connecting nanoscale chromatin remodeling to cancer, aging, and cell fate. "If we can track how these reaction rates evolve during aging or cancer," says doctoral candidate Vinayak Vinayak, "this creates the potential for new therapies. If reaction rates control chromatin organization and cell fate, then altering those rates could redirect cells to desired states."

The method's reach extends far beyond chromatin. Inverse PDEs appear whenever scientists need to estimate hidden quantities—diffusivity in materials, conductance in electrical systems, reaction rates in chemistry, forcing terms in weather models—from incomplete or noisy measurements. The Penn framework could streamline work across materials science, fluid mechanics, genetics, and climate modeling. The researchers also suggest the principle might apply to forward models, operator learning, and neural ODE systems, anywhere that stable, efficient gradient calculation matters. Still, the approach has limits. Performance depends on choosing the right smoothing kernel, and the current implementation struggles near boundaries and on irregular grids. Future work will need to explore adaptive kernels and boundary-aware formulations. But the core insight stands: sometimes the path forward in AI is not more computation, but better mathematics.
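As an illustration of estimating a hidden quantity from noisy measurements, the sketch below recovers a diffusivity D from synthetic observations of the heat equation u_t = D u_xx. Everything in it is an assumption made for the example: the Gaussian kernel, the kernel widths, the noise level, and the least-squares fit. It is not a physics-informed neural network and not the Penn implementation.

```python
import numpy as np

def gaussian_kernels(eps, h):
    """Sampled Gaussian kernel and its first two analytic derivatives (grid spacing h)."""
    half = int(np.ceil(4 * eps / h))
    s = np.arange(-half, half + 1) * h
    eta = np.exp(-s**2 / (2 * eps**2))
    eta /= eta.sum() * h                               # unit integral
    d1 = -s / eps**2 * eta
    d2 = (s**2 / eps**4 - 1 / eps**2) * eta
    return eta, d1, d2, half

def convolve_axis(field, kernel, h, axis):
    """Discrete convolution with `kernel` along one axis of a 2-D array."""
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same") * h, axis, field)

rng = np.random.default_rng(0)
D_true, k = 0.5, 1.0                                   # hypothetical hidden diffusivity
x = np.linspace(0.0, 2 * np.pi, 400)
t = np.linspace(0.0, 1.0, 200)
dx, dt = x[1] - x[0], t[1] - t[0]
T, X = np.meshgrid(t, x, indexing="ij")                # axis 0 = time, axis 1 = space
u_clean = np.exp(-D_true * k**2 * T) * np.sin(k * X)   # exact solution of u_t = D u_xx
u_noisy = u_clean + 0.01 * rng.standard_normal(u_clean.shape)

eta_x, _, d2_x, half_x = gaussian_kernels(0.3, dx)
eta_t, d1_t, _, half_t = gaussian_kernels(0.05, dt)

# Smooth in both directions; derivatives come from the kernels, not from the data.
u_t = convolve_axis(convolve_axis(u_noisy, eta_x, dx, 1), d1_t, dt, 0)
u_xx = convolve_axis(convolve_axis(u_noisy, d2_x, dx, 1), eta_t, dt, 0)

inner = (slice(half_t, -half_t), slice(half_x, -half_x))   # skip zero-padded edges
D_mollified = np.sum(u_t[inner] * u_xx[inner]) / np.sum(u_xx[inner] ** 2)

# Baseline: finite differences applied directly to the noisy data.
fd_t = np.gradient(u_noisy, dt, axis=0)
fd_xx = np.gradient(np.gradient(u_noisy, dx, axis=1), dx, axis=1)
D_direct = np.sum(fd_t[inner] * fd_xx[inner]) / np.sum(fd_xx[inner] ** 2)

print(f"true D = {D_true}, mollified = {D_mollified:.3f}, direct = {D_direct:.3f}")
```

Because the heat equation is linear with constant coefficients, the smoothed field satisfies the same equation, so smoothing both derivative estimates with the same kernels does not bias the fitted D. The direct estimate is swamped by noise in the second derivative.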

The more we tried to optimize the existing approach, the clearer it became that the mathematics itself needed to change.
— Vivek Shenoy, University of Pennsylvania materials science professor
If reaction rates control chromatin organization and cell fate, then altering those rates could redirect cells to desired states.
— Vinayak Vinayak, doctoral candidate in materials science and engineering

The Hearth Conversation: Another perspective on the story
Inventor

Why does working backward from observations turn out to be so much harder than working forward from rules?

Model

Because you're trying to infer invisible causes from visible effects. A forward problem says: given these rules, what happens? An inverse problem says: I see this outcome, what rules or parameters must have produced it? That's fundamentally underdetermined—many different causes could produce the same effect, especially if your measurements are noisy.

Inventor

And the AI systems that tried to solve these problems kept running into a wall?

Model

They did, specifically when calculating derivatives. The standard approach traces backward through the network repeatedly to measure how things change. That works fine for simple equations, but with higher-order derivatives and noisy data, the system becomes unstable and memory-hungry. It's like trying to measure the slope of a jagged, noisy line by zooming in closer and closer—eventually you're just amplifying the noise.

Inventor

So the Penn team's solution was to smooth the signal first?

Model

Exactly. They borrowed a 1940s mathematical technique called mollification—essentially applying a smoothing filter before calculating derivatives. Instead of fighting the noise directly, you suppress it first, then measure change on the smoothed version. It's a different kind of trade-off: you lose some fine detail, but you gain stability and efficiency.

Inventor

What does that mean for actual science?

Model

In Shenoy's case, it means being able to infer how chemical reactions organize DNA inside cell nuclei from noisy microscopy images. That's not just an academic win—if you can understand what controls chromatin organization, you might be able to redirect it therapeutically. The same principle applies to materials science, weather modeling, anywhere you need to extract hidden parameters from messy real-world data.

Inventor

Is this a permanent solution, or does it have its own limitations?

Model

It's a real improvement, but not a silver bullet. The choice of smoothing kernel matters—you have to balance noise suppression against losing important high-frequency features. And the method struggles near boundaries and on irregular grids. But the core insight is sound: sometimes the answer is not more computing power, it's smarter mathematics.
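The kernel trade-off described in this answer can be seen in a small numerical experiment. The sketch assumes a Gaussian kernel and a synthetic signal with one slow and one fast component. The three widths are hypothetical values chosen to show the narrow, moderate, and wide regimes.

```python
import numpy as np

def mollified_derivative(u, eps, dx):
    """First derivative via convolution with the derivative of a Gaussian kernel."""
    half = int(np.ceil(4 * eps / dx))
    s = np.arange(-half, half + 1) * dx
    eta = np.exp(-s**2 / (2 * eps**2))
    eta /= eta.sum() * dx                              # unit integral
    return np.convolve(u, -s / eps**2 * eta, mode="same") * dx, half

rng = np.random.default_rng(1)
x = np.linspace(0.0, 2 * np.pi, 4000)
dx = x[1] - x[0]
# Slow component plus a genuine fine-scale feature, then measurement noise.
u_noisy = np.sin(x) + 0.3 * np.sin(10 * x) + 0.02 * rng.standard_normal(x.size)
du_true = np.cos(x) + 3.0 * np.cos(10 * x)

widths = [0.005, 0.02, 0.5]           # hypothetical widths: narrow, moderate, wide
results = [mollified_derivative(u_noisy, eps, dx) for eps in widths]
edge = max(half for _, half in results)
inner = slice(edge, -edge)            # compare all widths on the same interior points

errors = [np.sqrt(np.mean((du[inner] - du_true[inner]) ** 2)) for du, _ in results]
for eps, err in zip(widths, errors):
    print(f"eps = {eps:<5}  RMSE = {err:.3f}")
```

The narrow kernel lets noise through, the wide kernel erases the fast component along with the noise, and the moderate width gives the lowest error of the three.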
