Digital Twins in Process Plants: Where They Pay Off

Every process plant already runs models. The steady-state heat-and-mass balance the process engineers keep in a spreadsheet. The first-principles simulation that sized the columns at design. The regression a controls engineer fitted to last year's historian to guess at a quality number the lab confirms hours later. None of those is a digital twin, and treating them as one is a fast way to burn a capital budget. A twin is a model that stays bound to the running asset, fed by live measurement, correcting itself as the plant ages. That binding is the hard part. It's also where most projects quietly come apart.

So this is a memo about the binding, not the brochure. Where a process-plant twin pays back the instrumentation and the modelling effort, where it decays into an expensive screensaver on the control-room wall, and how you tell which one you're about to build before the money's gone. I'll assume you know what a process simulation is and what a sensor does. The interesting questions live in the gap between them.

What separates a twin from a model that happens to run

Among the distinctions on offer, the most useful is also one of the oldest. Reviewing the manufacturing literature in 2018, Kritzinger and colleagues sorted everything people call a "twin" into three tiers, graded by how data actually moves between the physical object and its digital counterpart (Kritzinger et al., 2018). The grading is unglamorous and it settles most arguments before they start.

Tier	Data flow physical ↔ digital	What you actually have
Digital model	Manual both ways	A simulation. Useful for design and study; blind to today's plant.
Digital shadow	Automatic one way, physical to digital	A live mirror. It watches, infers, and reports, but can't act.
Digital twin	Automatic both ways	A mirror with a return path: model state feeds back to influence the asset.

Read that table honestly and most of what gets sold as a plant twin is a digital shadow. That's not an insult. A good shadow is worth building. But the word "twin" implies the return path, the automated influence back onto the process, and that path is what carries the cost and the risk. If your project has no plan for how a model conclusion becomes a setpoint, an alarm threshold, or a scheduling decision, you're funding a shadow and calling it a twin. Be clear which one the budget is for.
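The grading in the table is mechanical enough to write down. A minimal sketch, assuming each direction of data flow can be labelled simply as manual or automatic; the names `Flow` and `classify_tier` are illustrative and not drawn from Kritzinger et al.

```python
from enum import Enum


class Flow(Enum):
    MANUAL = "manual"
    AUTOMATIC = "automatic"


def classify_tier(physical_to_digital: Flow, digital_to_physical: Flow) -> str:
    """Return the tier implied by how data moves in each direction."""
    if physical_to_digital is Flow.AUTOMATIC and digital_to_physical is Flow.AUTOMATIC:
        return "digital twin"
    if physical_to_digital is Flow.AUTOMATIC:
        return "digital shadow"
    # Manual both ways is the table's "digital model". The table does not cover
    # an automatic write path with a manual read path; this sketch treats that
    # case as a model too, which is an assumption rather than part of the source.
    return "digital model"


# A dashboard fed live from the historian, with no automated return path:
print(classify_tier(Flow.AUTOMATIC, Flow.MANUAL))
```

Run it against the project as funded, not as pitched, and the answer is usually the middle tier.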

The idea is older than the label. Grieves traces the concept to a 2002 product-lifecycle lecture, the "Conceptual Ideal for PLM", built on three elements: the physical thing, its virtual counterpart, and the data connection binding them. The framing that matters for operations came later, with Grieves and Vickers arguing the twin's real job is to surface unpredictable, undesirable behaviour in a complex system before it shows up on the physical side (Grieves and Vickers, 2017). For a process plant that's the whole pitch in one line: catch the excursion in the model, not in the relief valve.

The reference architecture, and where the data actually comes from

There's now a standard that says all of this in committee language. ISO 23247 defines a digital twin in manufacturing as "a fit for purpose digital representation of an observable manufacturing element with synchronization between the element and its digital representation" (ISO 23247-1:2021). Two phrases in there earn their place. "Synchronization" is the binding again, stated as a requirement rather than an aspiration. "Fit for purpose" is the quiet escape hatch: you model what you'll act on, at the fidelity the decision needs, and nothing more. A twin that mirrors every flange in a unit at full physics is a research project, not an operating tool.

The reference architecture is worth knowing because it forces the awkward questions to the surface. NIST's implementation guidance lays out the ISO entities plainly: the Observable Manufacturing Element (the pump, the column, the line); a data-collection and device-control entity that reads it and, where allowed, actuates it; a core entity that holds the digital representation and its analytics; a user entity, which is often an existing MES or ERP rather than a person; and a cross-system entity for translation, security, and data assurance across domains (Shao, 2021). The same guidance splits the plumbing into proximity, access, service, and user networks, and pulls operational data off the floor using established protocols such as MTConnect. In a process plant that role usually falls to OPC-UA over the historian, with Modbus or fieldbus underneath for the older instruments.

ISA-95 stops being a poster on the wall right about here. A twin reaches from Level 0 instrumentation up to Level 3 and Level 4 systems, and every boundary it crosses is a place where timestamps drift, units disagree, and a tag means one thing in the DCS and another in the data lake. The synchronization the standard asks for isn't a slogan; it's a budget line for time alignment, buffering, and dealing with the sensor that drops out for ninety seconds every shift. Get the data contract wrong and the cleverest model downstream is fitting noise to noise.

A worked example of the contract: a level transmitter sampled at one rate, a lab assay landing every few hours, and a flow meter logging on change-of-value will never line up on their own. Someone has to decide the master clock, the interpolation rule, and what the twin does when a channel goes stale rather than wrong. Those choices belong to the data engineer long before the modeller opens a notebook, and skipping them is the commonest reason a promising twin never gets past its demo.
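Those three choices can be sketched in a few lines. This is one possible contract, not the only one: it assumes a fixed-step master clock, a hold-last-value interpolation rule, and a per-channel maximum age beyond which the channel is reported as missing. The tag names, timestamps, and age limits are invented for illustration.

```python
from bisect import bisect_right
from typing import Optional

Sample = tuple[float, float]  # (timestamp in seconds, value)


def hold_last(samples: list[Sample], t: float, max_age_s: float) -> Optional[float]:
    """Last value at or before t, or None if there is none or it is too old."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t) - 1
    if i < 0:
        return None  # nothing recorded yet on this channel
    ts, value = samples[i]
    if t - ts > max_age_s:
        return None  # stale: report missing rather than pass on an old number
    return value


def align(channels: dict[str, list[Sample]], max_age_s: dict[str, float],
          start: float, stop: float, step: float) -> list[dict]:
    """Resample every channel onto one master clock."""
    n = int((stop - start) // step) + 1
    rows = []
    for k in range(n):
        t = start + k * step
        row = {"t": t}
        for tag, samples in channels.items():
            row[tag] = hold_last(samples, t, max_age_s[tag])
        rows.append(row)
    return rows


channels = {
    "level": [(0.0, 50.0), (60.0, 51.0), (300.0, 52.0)],  # periodic, with a drop-out
    "flow": [(0.0, 12.0), (90.0, 12.5)],                  # logged on change of value
    "lab": [(30.0, 0.82)],                                # assay, hours apart
}
max_age = {"level": 90.0, "flow": 600.0, "lab": 4 * 3600.0}
rows = align(channels, max_age, start=0.0, stop=180.0, step=60.0)
for row in rows:
    print(row)
```

The design choice that matters is the `None`: a stale channel surfaces as missing, so whatever sits downstream has to decide what to do about it instead of quietly fitting an old value.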

How fast does the binding need to be? That depends entirely on the loop you're closing. A twin advising a maintenance planner can tolerate minutes or hours of lag. A twin sitting next to model-predictive control on a fast loop cannot. Decide the timescale first, because it sets the sampling rate, the network, and ultimately whether the model can be physics-based at all or has to be something cheaper that runs in the time you have.
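As a rough check, the timescale question reduces to a budget. The sketch below assumes the delays simply add and that the whole chain should fit within a fixed fraction of the loop period. The 50% margin and every number in the examples are placeholders, not recommendations.

```python
def loop_budget_ok(loop_period_s: float, data_latency_s: float,
                   solve_time_s: float, actuation_latency_s: float,
                   margin: float = 0.5) -> bool:
    """True if measure + solve + act fits within `margin` of the loop period."""
    total = data_latency_s + solve_time_s + actuation_latency_s
    return total <= margin * loop_period_s


# Advising a maintenance planner once an hour: a five-minute physics solve fits.
print(loop_budget_ok(3600.0, data_latency_s=60.0, solve_time_s=300.0, actuation_latency_s=0.0))

# The same solve next to a ten-second control loop does not.
print(loop_budget_ok(10.0, data_latency_s=1.0, solve_time_s=300.0, actuation_latency_s=1.0))

# A surrogate that answers in half a second does.
print(loop_budget_ok(10.0, data_latency_s=1.0, solve_time_s=0.5, actuation_latency_s=1.0))
```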

Where the twin earns its keep

Modelling choice is the fork in the road. Rasheed, San and Kvamsdal frame it cleanly from a modelling perspective: physics-based models carry known mechanism and generalise well but are heavy to compute, while data-driven models are fast and flexible but only as honest as the data they saw, and they extrapolate badly past it (Rasheed et al., 2020). For real plants the answer is usually neither alone. You keep first principles where the chemistry and the transport are well understood, and you bolt on data-driven pieces where the behaviour is messy, drifting, or simply not worth deriving from scratch. The hybrid is what survives contact with a feedstock change.

A concrete case makes the trade-offs legible. Heusel and colleagues built a twin for an industrial dynamic crossflow filtration unit clarifying grape juice, and they didn't run full mechanistic physics in the loop because it was too slow to advise the operator in time (Heusel et al., 2024). Instead they fitted a metamodel, a second-degree polynomial regression trained on 125 simulated runs of the underlying mechanistic model, so the twin could recommend a new setpoint within seconds and refresh it every five minutes during a run. Mean productivity over a year of real batches rose from roughly 466 to 536 litres per hour against the prior season, about a 15% gain, and the difference was statistically significant (p = 0.040). The metamodel also reached its valve-opening criterion about an hour early in some runs, which is exactly the kind of honest, documented limitation a shadow would never confess to.
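The metamodel pattern is easy to show in miniature. What follows is not the authors' model: it is a one-variable toy in which `slow_model` stands in for an expensive mechanistic simulation, a second-degree polynomial is fitted to a handful of its runs by ordinary least squares, and the cheap fit is then searched for a setpoint. The function names, the input range, and the eleven-point design are all assumptions made for the sketch.

```python
def slow_model(x: float) -> float:
    """Stand-in for an expensive mechanistic simulation (toy, exactly quadratic)."""
    return 2.0 + 3.0 * x - 0.5 * x * x


def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_quadratic(xs: list[float], ys: list[float]) -> list[float]:
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]
    return solve(ata, aty)


def recommend_setpoint(coeffs: list[float], lo: float, hi: float, steps: int = 200) -> float:
    """Cheap online step: grid search of the surrogate inside the fitted range."""
    c0, c1, c2 = coeffs
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: c0 + c1 * x + c2 * x * x)


# Offline: run the slow model over a design grid, then fit once.
train_x = [0.5 * k for k in range(11)]  # 0.0 to 5.0
train_y = [slow_model(x) for x in train_x]
coeffs = fit_quadratic(train_x, train_y)

# Online: evaluating the polynomial is cheap enough to repeat every refresh.
best = recommend_setpoint(coeffs, lo=0.0, hi=5.0)
print(coeffs, best)
```

The split is the point: all the expensive physics runs happen offline, and what runs in the loop is a polynomial. The search is deliberately confined to the range the fit was trained on.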

Generalise from that case and the conditions for a twin paying off come into focus. First, there's a decision that genuinely moves on a timescale you can act on, and acting better is worth real money or real risk reduction. Second, there's a measurement you can't get directly, fast enough, any other way, so the model is buying you information rather than redecorating information you already have. Third, the model can be validated against something trustworthy, and re-validated as the plant changes. Hit all three and a twin earns out: real-time optimisation, soft sensing between lab samples, what-if rehearsal of a grade change before you commit the feed. Miss any one of them and you're back to building a shadow, or worse, a model.

Of those three uses, soft sensing is usually the one that pays first and argues least. A validated model standing in for a slow lab assay gives an operator a continuous number to steer by between samples, and the same inferential value can seed the twin's other functions. What-if rehearsal is the next rung: run the candidate grade change or the startup sequence in the model, watch where it pushes constraints, and only then commit the physical plant. That rehearsal is worth most on the manoeuvres that are rare, expensive, or unsafe to learn by trial on the real unit. Real-time optimisation against a live model is the highest rung and the one that demands the return path, the guardrails, and the maintenance budget all working at once, which is why it's also the one most often promised and least often delivered.
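A soft sensor in its simplest form is a model plus a correction that the lab keeps honest. The sketch below assumes an additive bias nudged toward each new lab result. The class name, the `bias_gain` of 0.3, and the toy linear model are illustrative, and a real unit would also need to handle the delay between drawing the sample and getting the result.

```python
class SoftSensor:
    """Inferential estimate with a slowly updated bias from lab results."""

    def __init__(self, model, bias_gain: float = 0.3):
        self.model = model          # callable: dict of inputs -> raw estimate
        self.bias_gain = bias_gain  # 0 ignores the lab, 1 trusts the latest result fully
        self.bias = 0.0

    def estimate(self, inputs: dict) -> float:
        """Continuous value the operator steers by between samples."""
        return self.model(inputs) + self.bias

    def lab_update(self, inputs_at_sample: dict, lab_value: float) -> None:
        """Move the bias part-way toward the model error seen at sample time."""
        error = lab_value - self.model(inputs_at_sample)
        self.bias += self.bias_gain * (error - self.bias)


# Toy model: a quality number inferred from one process variable.
sensor = SoftSensor(lambda u: 0.5 + 0.01 * u["reflux"])
print(sensor.estimate({"reflux": 30.0}))    # before any lab result
sensor.lab_update({"reflux": 30.0}, lab_value=0.83)
print(sensor.estimate({"reflux": 30.0}))    # nudged toward the lab
```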

Where it doesn't

Now the failure modes, because they're more instructive than the wins and nobody puts them in the case study. These limitations aren't edge cases; they're the base rate, and most stalled twin projects died on one of the five below. The first is structural: a twin claimed where there's no return path. Plenty of plant "twins" are dashboards over a steady-state model with a manual data feed. That's a digital model wearing a twin's name tag, and the gap shows the day the plant moves and the screen doesn't.

The second is drift, and it's the quiet killer. A data-driven twin validated beautifully against a year of historian starts to lie the moment the feedstock, the fouling state, or a replaced sensor pushes the process outside the data it learned from. The reference you validate against keeps moving too; the lab method gets recalibrated, an analyser ages. What matters isn't accuracy at commissioning. It's whether anyone owns that accuracy in month nine, and whether there's a defined trigger to retrain or fall back to manual. Without that owner, operators stop trusting the displayed value and route around it, and the twin is dead while still drawing licence fees.
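A defined trigger can be as plain as a rolling error with two thresholds. This is a sketch of the idea rather than a recommended design: the window length and both limits are placeholders someone has to own, and the monitor can't tell whether it's the model or the reference that moved.

```python
from collections import deque


class DriftMonitor:
    """Rolling mean absolute error between the twin and its reference."""

    def __init__(self, window: int, retrain_at: float, revert_at: float):
        self.residuals = deque(maxlen=window)
        self.retrain_at = retrain_at
        self.revert_at = revert_at

    def update(self, predicted: float, reference: float) -> str:
        """Record one comparison and return the action it implies."""
        self.residuals.append(abs(predicted - reference))
        mae = sum(self.residuals) / len(self.residuals)
        if mae >= self.revert_at:
            return "revert_to_manual"
        if mae >= self.retrain_at:
            return "retrain"
        return "ok"


monitor = DriftMonitor(window=3, retrain_at=0.5, revert_at=1.0)
actions = [monitor.update(10.0, ref) for ref in (10.1, 10.8, 11.0, 11.5)]
print(actions)
```

What the code can't supply is the name of the person who gets paged when it returns anything other than "ok".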

The third is the compute-versus-clock squeeze. Full first-principles fidelity is often too slow for the loop you wanted to close, which is precisely why the grape-juice team reached for a metamodel. Reduced-order models, surrogates, and regression metamodels are the usual escape, but each one trades mechanism for speed, and that trade has to be made on purpose and written down, not discovered later when the twin confidently extrapolates somewhere its surrogate was never fitted.
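One way to write that trade down is to store the range the surrogate was fitted on and check every query against it. A minimal sketch, assuming a per-variable box is a good enough description of the training data; the variable names and values are hypothetical. A box check is necessary rather than sufficient, since corners of the box may never have been sampled.

```python
def fitted_envelope(training_inputs: list[dict]) -> dict:
    """Per-variable (min, max) over the points the surrogate was fitted on."""
    names = training_inputs[0].keys()
    return {
        n: (min(p[n] for p in training_inputs), max(p[n] for p in training_inputs))
        for n in names
    }


def outside_envelope(point: dict, envelope: dict) -> list[str]:
    """Names of inputs outside the fitted range; an empty list means inside."""
    return [n for n, (lo, hi) in envelope.items() if not lo <= point[n] <= hi]


training = [
    {"feed_rate": 400.0, "pressure": 1.0},
    {"feed_rate": 500.0, "pressure": 1.5},
    {"feed_rate": 600.0, "pressure": 2.0},
]
envelope = fitted_envelope(training)

print(outside_envelope({"feed_rate": 550.0, "pressure": 1.2}, envelope))  # inside
print(outside_envelope({"feed_rate": 650.0, "pressure": 1.2}, envelope))  # feed too high
```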

The fourth is integration debt. Rasheed and colleagues are blunt that data quality, legacy systems, and interoperability are first-order obstacles, not footnotes, and the state-of-the-art reviews say the same thing from the industry side (Tao et al., 2019). In an older process plant the twin's hardest engineering is rarely the model. It's getting clean, time-aligned, well-named data out of three generations of controllers that were never meant to talk to each other.

And the fifth is the one that keeps me cautious: that return path is an attack surface. The thing that makes a twin a twin rather than a shadow is its ability to write back toward control, and a write path into a process is a safety and security concern before it's a feature. Any real bidirectional deployment belongs inside an IEC 62443 zone-and-conduit design, with the twin treated as an untrusted source until proven otherwise. A shadow that only reads is far easier to defend, which is another reason a shadow is often the right answer rather than a lesser one.
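Treating the twin as untrusted has a concrete shape at the point of write. The sketch below is illustrative only and is no substitute for a zone-and-conduit design: it assumes every proposed setpoint passes a range check, a rate limit, and an explicit operator approval before anything is written. The class name and the limits are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Guardrail:
    lo: float
    hi: float
    max_step: float  # largest change allowed in a single write

    def vet(self, current: float, proposed: float, operator_approved: bool) -> Optional[float]:
        """Return the setpoint to write, or None to leave the current value alone."""
        if not operator_approved:
            return None  # human in the loop: no approval, no write
        if not self.lo <= proposed <= self.hi:
            return None  # reject out-of-range requests rather than clamp them silently
        step = max(-self.max_step, min(self.max_step, proposed - current))
        return current + step


rail = Guardrail(lo=40.0, hi=60.0, max_step=2.0)
print(rail.vet(50.0, 55.0, operator_approved=True))   # rate-limited move
print(rail.vet(50.0, 55.0, operator_approved=False))  # held: not approved
print(rail.vet(50.0, 75.0, operator_approved=True))   # held: out of range
```

Rejecting an out-of-range proposal outright, rather than clamping it to the limit, is a deliberate choice here: a twin asking for something impossible is a symptom worth seeing.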

A build-or-skip test before you commit the budget

Strip all of it down and a usable test fits on an index card. Run a candidate through it before anyone writes a model.

Name the decision. What setpoint, threshold, or schedule changes because of the twin, and on what timescale? No named decision, no twin.
Name the missing measurement. What does the model tell you that you can't already measure directly and fast enough? If you can just buy the sensor, buy the sensor.
Name the validation. What truth do you check the model against, how often, and who owns the retrain-or-revert trigger when it drifts?
Name the return path, and its guardrails. If data flows back toward control, where's the zone boundary and the human in the loop? If nothing flows back, admit you're building a shadow and price it as one.
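The four lines above also fit in a function. A sketch, assuming an empty string stands for "nobody could name one"; the field names and verdict strings are made up for illustration, and the example candidate is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    decision: str             # setpoint, threshold, or schedule that changes
    missing_measurement: str  # what the model tells you that a sensor can't
    validation_owner: str     # who owns the retrain-or-revert trigger
    return_path: str          # zone boundary and human in the loop, or "" if none


def build_or_skip(c: Candidate) -> str:
    """Walk the four questions in order and stop at the first failure."""
    if not c.decision:
        return "skip: no named decision"
    if not c.missing_measurement:
        return "skip: buy the sensor"
    if not c.validation_owner:
        return "skip: no validation owner"
    if not c.return_path:
        return "build a shadow"
    return "build a twin"


candidate = Candidate(
    decision="column reflux setpoint, reviewed hourly",
    missing_measurement="overhead purity between lab samples",
    validation_owner="unit process engineer",
    return_path="",
)
print(build_or_skip(candidate))
```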

Most candidates fail at least one line, and that's a good outcome. A failed line usually points at the cheaper, better project hiding underneath: a steady-state simulator for design studies, a couple of well-maintained soft sensors to close a measurement gap, or a clean historian and a dashboard that earns its keep as an honest shadow. We see plants reach for a full twin when a soft sensor would have done the job at a tenth of the lifecycle cost, and reach for a dashboard when the economics clearly justified closing the loop. The discipline is matching the tier to the decision.

The plumbing underneath all three tiers is the same work regardless: ruggedized sensing, time-aligned edge telemetry, and models that are maintained rather than abandoned. That's the part we build, whether the right answer turns out to be a shadow or a twin, and getting an honest data foundation in place first is what an edge telemetry and analytics platform is for. The modelling, and the retrain-or-revert discipline that keeps it alive, is the harder and longer commitment. Decide the tier on the merits. Then build the smallest thing that closes the loop you actually named.
