Soft Sensors: Keeping an Inferred Measurement Honest

A soft sensor earns its keep the day a number you can't otherwise see in real time starts driving a control loop or a shift decision. Composition at the top of a distillation column. Melt index out of a polymer reactor. Biomass in a fermenter. The lab can tell you these things, but the answer comes back hours after the process has already moved on. A soft sensor reads the cheap, fast measurements you already have (tray temperatures, pressures, flows, reflux) and infers the expensive, slow one continuously, between lab samples.

That much is settled engineering. What gets less attention is the second life of the model: what happens after commissioning, when an inferential estimator that validated beautifully against a year of historian data starts to drift, and an operator quietly stops trusting the displayed value. This is a memo about that second life. How to pick a model you can defend, how to validate it against a reference that itself keeps moving, and how to keep it honest while the plant ages around it.

What the soft sensor is actually standing in for

Strip away the machine-learning vocabulary and an inferential sensor is a stand-in for a measurement you would rather make directly but can't, for one of three reasons: there's no online analyzer rugged or cheap enough, the analyzer that exists is slow or unreliable, or the only trustworthy method is an offline lab assay that arrives once a shift. Kadlec, Gabrys and Strandt, in their 2009 review of data-driven soft sensors in the process industry, frame the model as the learned relationship between the easy-to-measure input variables and the hard-to-measure output, applied online to estimate that output continuously (Kadlec et al., Computers & Chemical Engineering, 2009).

It helps to be precise about which job you are asking it to do, because the jobs don't have the same tolerances. The same review groups the uses roughly into four: online prediction of a quality variable that otherwise waits on the lab, process monitoring, fault detection, and acting as a backup when a hardware analyzer drops out. A backup estimator that holds a loop together for an hour during an analyzer outage can be looser than a primary estimator that closes a composition loop every minute of every day. Decide which one you are building before you argue about model type.

Refinery and petrochemical practice is the clearest example. Advanced process control sits on top of a base regulatory layer, and the inferential layer feeds it product properties that no fast analyzer reports cleanly: research octane number, Reid vapour pressure, density, distillation cut points. Yokogawa's own description of advanced process control puts the soft sensor squarely in this role, processing dozens of measurements together to estimate a property in real time so that a model-predictive controller has something to push against between lab results (Yokogawa, Fundamentals of Advanced Process Control, 2015). The economic logic is simple. If you can see the property now instead of after the next assay, you can run closer to the spec limit without crossing it.

Choosing a model you can still explain in a year

The temptation is to reach for the most expressive model available and let it sort out the correlations. Resist it, at least until you know the data can support it. You're not obliged to spend that complexity if the data won't pay it back. The dominant data-driven techniques in the literature are the unglamorous ones: multiple linear regression, principal component regression, and partial least squares, with neural networks and support-vector regression brought in when the relationship is genuinely nonlinear. The 2021 Frontiers review of soft sensors for bioprocesses found partial least squares regression to be the most prevalent approach in that field, and the reason generalises well beyond bioprocessing (Brunner et al., Frontiers in Bioengineering and Biotechnology, 2021).

Process data is collinear by nature. Tray temperatures up a column move together; flows and pressures are coupled by the same hydraulics. Partial least squares handles that by projecting the inputs onto a small set of latent variables that capture the covariance with the target, then regressing on those. You trade a little raw accuracy for a model whose internals you're able to inspect: loadings that tell you which measurements actually carry the signal, scores you can trend to spot when the process has wandered outside the space the model was trained on. When a neural network's estimate goes wrong at three in the morning, what does the shift engineer do with it? A latent-variable model at least gives them a contribution plot to argue with.
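To make the projection concrete, here is a minimal sketch of a one-latent-variable PLS1 fit in plain Python. It is an illustration under stated assumptions, not a production implementation: the function names `pls1_one_component` and `pls1_predict` are ours, only a single component is extracted, and the inputs are mean-centred but not scaled.

```python
import math

def pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model on mean-centred data.

    X is a list of rows (samples x inputs), y a list of targets.
    Returns (x_mean, y_mean, w, b): w is the unit weight vector and
    b regresses the centred target on the score t = Xc . w.
    """
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: the input-space direction with maximum covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("inputs carry no covariance with the target")
    w = [v / norm for v in w]

    # Scores: each sample projected onto the latent direction.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, b

def pls1_predict(model, x):
    """Return (estimate, score) for one input vector x."""
    x_mean, y_mean, w, b = model
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + b * score, score

# Two tray temperatures that move in lockstep (made-up numbers).
X = [[100.0, 110.0], [102.0, 112.0], [104.0, 114.0], [106.0, 116.0]]
y = [1.0, 2.0, 3.0, 4.0]
model = pls1_one_component(X, y)
estimate, score = pls1_predict(model, [105.0, 115.0])
```

The two inputs in this toy data are perfectly collinear, which would leave ordinary least squares on the raw inputs with a singular problem. The latent-variable fit still works, and the weight vector `w` shows how much each measurement contributes to the signal.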

None of this is an argument against nonlinear models. It is an argument for earning them. If a linear latent-variable model leaves structured error that tracks a known nonlinearity (an exponential temperature dependence, a phase change), then a nonlinear method is justified and you have the residuals to prove it. Reaching for deep models first, on a few thousand correlated samples, usually buys you a fit to noise and a maintenance liability. Start simple, measure what the simple model misses, and add complexity only against evidence.

And the model choice isn't where most of the effort goes anyway. The 2009 review is explicit that the characteristics of process-industry data are what decide whether a soft sensor succeeds, and anyone who has built one will recognise the ratio: the algorithm is the small part, the data work is the rest. Historian data arrives with missing samples, frozen values held over comms dropouts, outliers from transmitter glitches, and operating regions visited so rarely the model barely sees them. Before any regression runs, you spend real time selecting which periods represent normal operation, removing samples taken during startups and shutdowns and known upsets, and deciding how to treat the gaps. A model trained on dirty data inherits every problem in it, and no amount of cleverness downstream recovers what the input selection threw away or let through.
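As a rough illustration of that screening step, the sketch below flags missing samples, frozen runs, and gross outliers in a single historian series before it reaches any regression. The function names, the run length of five, and the 3.5 threshold are our assumptions for the example, not values from the reviews cited here. The outlier test is a plain median-absolute-deviation score, one common choice among several.

```python
from statistics import median

def flag_frozen(values, min_run=5):
    """Mark samples inside a run of at least min_run identical consecutive values."""
    flags = [False] * len(values)
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            if values[start] is not None and i - start >= min_run:
                for k in range(start, i):
                    flags[k] = True
            start = i
    return flags

def flag_outliers_mad(values, threshold=3.5):
    """Flag points far from the median, scaled by the median absolute deviation."""
    present = [v for v in values if v is not None]
    if not present:
        return [False] * len(values)
    med = median(present)
    mad = median(abs(v - med) for v in present)
    if mad == 0:
        return [False] * len(values)  # no spread to scale against
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for roughly Gaussian data.
    return [v is not None and abs(0.6745 * (v - med) / mad) > threshold
            for v in values]

def usable_mask(values, min_run=5, threshold=3.5):
    """True where a sample is present, not frozen, and not a gross outlier."""
    frozen = flag_frozen(values, min_run)
    outlier = flag_outliers_mad(values, threshold)
    return [v is not None and not f and not o
            for v, f, o in zip(values, frozen, outlier)]

# Made-up series: a frozen run of 10.2, a glitch at 55.0, one missing sample.
series = [10.0, 10.1, 9.9, 10.2, 10.2, 10.2, 10.2, 10.2, 10.0, 55.0, None, 9.8]
mask = usable_mask(series)
```

A mask like this only handles the mechanical part. Choosing which periods count as normal operation still needs someone who knows the plant's history.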

Validation against a reference that keeps moving

Here is the awkward truth at the centre of soft-sensor work: you are validating an estimate against a reference that has its own uncertainty. The lab assay isn't ground truth. It's a measurement, with a sampling protocol, a calibration history, and an error bar of its own. If you treat lab values as exact, you will tune the model to chase lab noise and then wonder why it looks worse in production than it did in your test set.

Metrology discipline pays off at exactly this point. The reference value your model is trained and judged against should itself be traceable. NIST defines metrological traceability as the "property of a measurement result whereby the result can be related to a reference through a documented unbroken chain of calibrations, each contributing to the measurement uncertainty" (NIST, Metrological Traceability). Read that carefully: traceability is a property of the result, not of the instrument, and every link in the chain adds uncertainty. The practical consequence for a soft sensor is that the residual you compute (model estimate minus lab value) contains both model error and reference error (when the two are independent, their variances add), and you can't tune away the second part. Knowing the lab's uncertainty tells you how small your validation residuals can ever honestly get.
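A small sketch of that arithmetic, assuming model error and lab error are independent so that their variances add in the residual. The function name and the numbers are hypothetical.

```python
import math

def model_error_estimate(residuals, lab_std_uncertainty):
    """Split validation RMSE into an observed total and a model-only estimate.

    Assumes independent model and lab errors:
        rmse^2 ~= model_error^2 + lab_std_uncertainty^2
    Returns (rmse, model_only_error). The model-only part is clipped at
    zero when the residuals are already inside the lab's own noise.
    """
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    model_var = rmse ** 2 - lab_std_uncertainty ** 2
    return rmse, math.sqrt(max(model_var, 0.0))

# Made-up residuals (estimate minus lab value) and a lab standard
# uncertainty of 0.3 in the same units.
rmse, model_only = model_error_estimate([0.5, -0.5, 0.5, -0.5], 0.3)
```

With these numbers the validation RMSE is 0.5 and the part attributable to the model is about 0.4. If the RMSE were already at or below 0.3, further tuning would only be fitting lab noise.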

So how do you validate without fooling yourself? A few rules we hold to. Validate on data the model never saw during training, separated in time rather than shuffled at random, because process data is autocorrelated and random splitting leaks the future into the test set. Time-align the lab sample to the process conditions it actually represents, accounting for residence time and sample-loop dead time; an estimate judged against a lab value from the wrong moment will look wrong even when it is right. And report the model's error in the same units and against the same uncertainty as the lab, so a reviewer can see whether the estimate is within the reference's own noise band or genuinely outside it.
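The first two rules can be sketched in a few lines. The example below assumes timestamps in seconds, a single fixed dead time, and process rows stored in time order. The function names are ours, and a real deployment would likely need a flow-dependent residence time.

```python
from bisect import bisect_right

def chronological_split(samples, train_fraction=0.7):
    """Split (timestamp, ...) tuples so every test sample is later than every
    training sample. No shuffling, so the future cannot leak into training."""
    ordered = sorted(samples, key=lambda s: s[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def align_lab_sample(lab_time, dead_time, process_times, process_rows):
    """Return the process row in effect at (lab_time - dead_time).

    process_times must be sorted ascending and parallel to process_rows.
    Returns None when the target time precedes the first process row.
    """
    target = lab_time - dead_time
    idx = bisect_right(process_times, target) - 1
    return process_rows[idx] if idx >= 0 else None

# Process scans every 60 s; the lab sample was drawn at t=200 s and the
# sample loop plus residence time is assumed to be 90 s.
times = [0, 60, 120, 180, 240]
rows = ["scan@0", "scan@60", "scan@120", "scan@180", "scan@240"]
matched = align_lab_sample(200, 90, times, rows)  # the row at t=60

train, test = chronological_split([(t, t * 0.1) for t in range(10)])
```

The alignment picks the latest scan at or before the shifted time, so a lab value is never paired with process conditions that occurred after the material it represents left the unit.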

One more validation habit earns its place: decide in advance what the model is allowed to be wrong about. A soft sensor trained on a year of data knows the operating space it saw, and nothing else. When the process moves into a region the training set never covered (a new feedstock, a rate it never ran, a recovered exchanger that changed the temperature profile), the model is extrapolating, and its error there is unbounded by anything in your validation. So pair every estimate with a measure of how far the current inputs sit from the training space (the latent-variable scores make this cheap to compute) and treat an estimate from outside that space as a guess, not a measurement. An operator who knows the model is extrapolating will treat the number with the suspicion it deserves. One who doesn't will trust it straight off a cliff.
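One cheap way to compute that distance is a Hotelling-style T² statistic on the latent-variable scores. The sketch below assumes the training scores are already mean-centred and uses an empirical quantile of the training T² values as the limit. That is our simplification for the example; an F-distribution limit is a common alternative. All names are hypothetical.

```python
def fit_score_variances(training_scores):
    """Per-component variance of mean-centred training scores."""
    n = len(training_scores)
    a = len(training_scores[0])
    return [sum(row[k] ** 2 for row in training_scores) / (n - 1)
            for k in range(a)]

def hotelling_t2(scores, variances):
    """Sum of squared scores, each scaled by its training variance."""
    return sum(t * t / v for t, v in zip(scores, variances))

def empirical_t2_limit(training_scores, variances, quantile=0.99):
    """Limit taken from the training data itself (an assumption of this sketch)."""
    values = sorted(hotelling_t2(row, variances) for row in training_scores)
    idx = min(len(values) - 1, int(quantile * len(values)))
    return values[idx]

def is_extrapolating(scores, variances, limit):
    return hotelling_t2(scores, variances) > limit

# Made-up two-component scores from a training set.
train_scores = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, 1.0], [2.0, -1.0]]
variances = fit_score_variances(train_scores)
limit = empirical_t2_limit(train_scores, variances)

inside = is_extrapolating([1.0, 0.5], variances, limit)   # False
outside = is_extrapolating([6.0, 0.0], variances, limit)  # True
```

Displaying the flag next to the estimate is the point. The operator sees not only the number but whether the model has any business producing it.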

Drift is the adversary, not accuracy

A soft sensor almost never fails by being inaccurate on day one. It fails by being accurate on day one and slowly wrong by month six. Catalysts age, heat exchangers foul, feedstock slates change, a sensor gets recalibrated and shifts its bias. The relationship the model learned is a snapshot of a process that doesn't hold still. In machine-learning terms this is concept drift, and it is the single most common reason inferential sensors get switched off.

The follow-up review by Kadlec, Grbić and Gabrys treats adaptation as the core problem rather than an afterthought, and sorts the available mechanisms into three families: moving-window techniques that retrain on a sliding block of recent data, recursive techniques that update the model incrementally as each new sample arrives, and ensemble methods that maintain several models and reweight them as conditions change (Kadlec et al., Computers & Chemical Engineering, 2011). Each has a failure mode worth naming. A moving window that is too short chases noise and forgets valid behaviour the process will revisit; too long, and it adapts too slowly to matter. Recursive updates can quietly walk the model away from physical sense if the incoming labels are bad. Ensembles spread the risk but multiply the number of things you have to keep an eye on.
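To show the moving-window idea in its simplest form, here is a sketch that refits a one-input linear model on the most recent lab-labelled samples only. The class name and window size are assumptions for illustration. A real inferential would refit a multivariate model, and the window length is exactly the tuning decision described above.

```python
from collections import deque

class MovingWindowLinear:
    """Refit y = intercept + slope * x on the last `window` labelled samples."""

    def __init__(self, window):
        self.buffer = deque(maxlen=window)  # oldest samples fall out automatically
        self.intercept = 0.0
        self.slope = 0.0

    def update(self, x, y_lab):
        """Add one (input, lab value) pair and refit on the current window."""
        self.buffer.append((x, y_lab))
        n = len(self.buffer)
        if n < 2:
            return
        mx = sum(p[0] for p in self.buffer) / n
        my = sum(p[1] for p in self.buffer) / n
        sxx = sum((p[0] - mx) ** 2 for p in self.buffer)
        if sxx == 0.0:
            return  # no variation in the window; keep the previous coefficients
        sxy = sum((p[0] - mx) * (p[1] - my) for p in self.buffer)
        self.slope = sxy / sxx
        self.intercept = my - self.slope * mx

    def predict(self, x):
        return self.intercept + self.slope * x

model = MovingWindowLinear(window=3)
for x, y in [(1, 2), (2, 4), (3, 6)]:      # old regime: y = 2x
    model.update(x, y)
for x, y in [(4, 9), (5, 11), (6, 13)]:    # new regime: y = 2x + 1
    model.update(x, y)
```

After the second batch the window holds only new-regime samples, so the model has fully adapted and has also fully forgotten the old regime. Both halves of the trade-off are visible in six data points.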

Adaptation also creates a governance hazard that operators feel before engineers do. A model that retrains itself on field data will happily learn from a period when the plant was running off-spec, or when an input sensor was drifting, and bake that bad behaviour in. Automatic adaptation without a gate on which data is allowed to update the model is how a soft sensor launders a measurement fault into a "correction". Any adaptive scheme needs a rule for when not to adapt: hold the model during known upsets, during analyzer maintenance, when inputs fall outside validated ranges. Self-validating designs that pair the estimate with a moving-window quality check exist precisely to make that decision automatically rather than leaving it to whoever is on shift.
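A gate of that kind can be very plain. The sketch below encodes the three hold conditions named above as a single function that an adaptive scheme would call before accepting a new sample. The `PlantState` structure, the tag names, and the range values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PlantState:
    upset_active: bool             # set by operations during known upsets
    analyzer_in_maintenance: bool  # set while the reference analyzer is out
    inputs: dict                   # tag name -> current value

def adaptation_allowed(state, validated_ranges):
    """Return (allowed, reason). Any single failed condition holds the model."""
    if state.upset_active:
        return False, "known upset"
    if state.analyzer_in_maintenance:
        return False, "analyzer maintenance"
    for tag, value in state.inputs.items():
        low, high = validated_ranges[tag]
        if not low <= value <= high:
            return False, f"input {tag} outside validated range"
    return True, "ok"

ranges = {"TI101": (80.0, 120.0), "FI201": (10.0, 60.0)}
normal = PlantState(False, False, {"TI101": 101.5, "FI201": 42.0})
drifted = PlantState(False, False, {"TI101": 135.0, "FI201": 42.0})

adaptation_allowed(normal, ranges)   # (True, "ok")
adaptation_allowed(drifted, ranges)  # (False, "input TI101 outside validated range")
```

Returning the reason alongside the decision matters for the retrain record: a held update with a logged cause is auditable, a silent skip is not.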

The input-sensor problem

An inferential model is only as trustworthy as the measurements feeding it, and this is the part that gets underbuilt. The Frontiers bioprocess review is blunt about it: tolerance to sensor faults is, in its words, the greatest challenge in soft-sensor development, and studies on fault tolerance in their domain are still rare (Brunner et al., 2021). The same point holds across the process industries. Feed a soft sensor a frozen transmitter, a saturated signal, or a thermocouple reading ambient because it has come loose, and it will produce a confident, completely wrong estimate. The model has no way to know its inputs lie unless you give it one.

There is a second, subtler trap the same review names: the smearing effect. When inputs are correlated (which, in process data, they always are), a fault on one sensor spreads its signature across the others in any contribution-based diagnostic, so the obvious "which sensor is bad" plot points at the wrong tag. That means input validation cannot be an afterthought bolted onto the output. It has to run on the inputs in their own right: range and rate-of-change limits, cross-checks between redundant or physically related measurements, and a defined fallback (freeze the estimate, flag it, or hand back to the base controller) when an input fails its checks. A soft sensor that can't say "I don't trust my own inputs right now" is worse than no soft sensor, because it removes the operator's instinct to be suspicious.
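As a sketch of input validation running in its own right, the example below applies range and rate-of-change limits to each input and freezes the estimate at its last good value when any input fails. The tag names, limits, and the toy model are hypothetical, and freezing is only one of the fallbacks listed above.

```python
from dataclasses import dataclass

@dataclass
class InputLimits:
    low: float
    high: float
    max_step: float  # largest credible change between consecutive scans

def check_input(value, previous, limits):
    """Return 'ok' or the name of the first failed check."""
    if value is None:
        return "missing"
    if not limits.low <= value <= limits.high:
        return "out_of_range"
    if previous is not None and abs(value - previous) > limits.max_step:
        return "rate_of_change"
    return "ok"

def guarded_estimate(model, inputs, previous_inputs, limits, last_good_estimate):
    """Run the model only if every input passes; otherwise freeze and report.

    Returns (estimate, failures), where failures maps tag -> failed check.
    """
    failures = {}
    for tag, value in inputs.items():
        result = check_input(value, previous_inputs.get(tag), limits[tag])
        if result != "ok":
            failures[tag] = result
    if failures:
        return last_good_estimate, failures
    return model(inputs), failures

limits = {"TI101": InputLimits(50.0, 150.0, 5.0),
          "FI201": InputLimits(0.0, 100.0, 10.0)}

def toy_model(x):  # stand-in for the real inferential
    return 0.1 * x["TI101"] + 0.05 * x["FI201"]

previous = {"TI101": 99.0, "FI201": 41.0}
good, _ = guarded_estimate(toy_model, {"TI101": 100.0, "FI201": 40.0},
                           previous, limits, last_good_estimate=None)
# A thermocouple that has come loose and reads ambient:
frozen, why = guarded_estimate(toy_model, {"TI101": 25.0, "FI201": 40.0},
                               previous, limits, last_good_estimate=good)
```

The checks never consult the model's output, which is what keeps them immune to the smearing effect: each tag is judged on its own limits before the correlated model sees it.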

The cheapest input check is often physical reasoning the model already implies. If two tray temperatures sit a fixed distance apart in normal operation and that gap collapses, one of them is suspect regardless of what either reads in isolation. Mass and energy balances close, or they don't; a flow that no longer reconciles with the levels it feeds is a flag the model can raise for free. These cross-checks won't catch every fault, but they catch the loud ones, and the loud ones are what produce the confident, badly wrong estimate that loses an operator's trust for good.
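Both of those checks reduce to a comparison against a tolerance. The sketch below is illustrative only: the function names, the expected gap, the vessel area, and the tolerances are made up, and units are assumed consistent (for instance m³/h for flows, m² for area, m/h for level rate).

```python
def gap_check(t_upper, t_lower, expected_gap, tolerance):
    """True if the temperature difference between two trays is near its normal value."""
    gap = t_lower - t_upper
    return abs(gap - expected_gap) <= tolerance

def balance_check(flow_in, flow_out, level_rate, area, tolerance):
    """True if net inflow reconciles with the observed rate of level change.

    Volumetric balance on a vessel of constant cross-section:
        flow_in - flow_out = area * d(level)/dt
    """
    imbalance = (flow_in - flow_out) - area * level_rate
    return abs(imbalance) <= tolerance

# Normal operation: trays sit about 15 degrees apart.
gap_check(80.0, 95.0, expected_gap=15.0, tolerance=3.0)   # True
# Gap has collapsed: one of the two readings is suspect.
gap_check(94.0, 95.0, expected_gap=15.0, tolerance=3.0)   # False

# Net inflow of 2 matches a level rising at 0.5 in a vessel of area 4.
balance_check(10.0, 8.0, level_rate=0.5, area=4.0, tolerance=0.5)  # True
# Same flows but the level is flat: a flow or level reading is wrong.
balance_check(10.0, 8.0, level_rate=0.0, area=4.0, tolerance=0.5)  # False
```

Neither check says which instrument is at fault. It only says the set is inconsistent, which is enough to stop trusting the estimate built on it.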

Running it as a maintained asset, not a project

The deployments that last share one trait: someone owns the model the way someone owns a pump. There is a named custodian, a record of every retrain and what data it used, an expected accuracy band with an alarm when residuals leave it, and a documented procedure for taking the estimate out of closed-loop service when it degrades. Treating a soft sensor as a one-off data-science project that ships and then runs forever is how you end up with a control room full of inferential displays nobody believes.
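The accuracy-band alarm is the easiest of those items to automate. One possible shape, sketched under our own assumptions, is an exponentially weighted average of the residual that must stay outside the band for several consecutive lab samples before it alarms. The class name, smoothing factor, and counts are hypothetical.

```python
class ResidualMonitor:
    """Alarm when the smoothed residual stays outside the expected band."""

    def __init__(self, band, smoothing=0.2, consecutive=3):
        self.band = band                # expected accuracy band, in lab units
        self.smoothing = smoothing      # weight given to the newest residual
        self.consecutive = consecutive  # breaches in a row needed to alarm
        self.ewma = None
        self.breaches = 0

    def update(self, estimate, lab_value):
        """Feed one (estimate, lab value) pair; return True if in alarm."""
        residual = estimate - lab_value
        if self.ewma is None:
            self.ewma = residual
        else:
            self.ewma = (self.smoothing * residual
                         + (1.0 - self.smoothing) * self.ewma)
        if abs(self.ewma) > self.band:
            self.breaches += 1
        else:
            self.breaches = 0
        return self.breaches >= self.consecutive

monitor = ResidualMonitor(band=0.5, smoothing=0.5, consecutive=2)
alarms = [monitor.update(est, lab)
          for est, lab in [(10.0, 10.0), (11.0, 10.0), (11.0, 10.0), (11.0, 10.0)]]
# alarms == [False, False, False, True]
```

Smoothing and the consecutive-breach count keep a single noisy lab result from taking the estimate out of service, at the cost of reacting a few samples late to a genuine step change.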

The surrounding standards help here, and they are worth naming because they give the model a place in the plant's existing discipline. ISA-95 gives you the layering to decide where the estimate lives and what it is allowed to touch. OPC UA gives you a transport that carries data quality and timestamps alongside values, so an input flagged "bad" arrives flagged rather than silently. ISO 9001 quality-management practice expects calibration and measurement records that a soft sensor's reference chain should slot into. And if the estimate touches anything that matters for safety or production integrity, IEC 62443 expects you to think about who can change the model and how, because a model you can edit is an attack surface and an accident surface at once. None of these were written for soft sensors specifically. All of them apply.

This is the part of the work we spend the most time on with operators, and it is the least glamorous: instrumenting the model so that its health is visible next to the process it serves, and wiring the inputs, the residuals, and the adaptation gates into one place an engineer can actually watch. Our edge telemetry and analytics platform exists to make that lifecycle observable rather than implicit, but the principle stands whatever you build it on. The model is not the deliverable. The maintained, validated, fault-aware estimate is.

So before the next inferential sensor goes into closed-loop service, ask the question that decides whether it survives its first feedstock change: who watches it, against what reference, and what happens the day it starts to drift? If there is a clear answer, the model will earn its keep for years. If there is not, it will be switched to manual within a quarter, and the lab loop you were trying to escape will quietly take back over.
