LYKOS DEFENCE

Readiness. Response. Resilience.

OT Incident Response Site Champions: The Missing Link Between IT, OT, and Crisis Response

In an operational technology (OT) incident, the first hour is often messy.

The security operations centre (SOC) is looking at suspicious authentication to an OT jump host. Operators are reporting odd human-machine interface (HMI) behaviour. A vendor session ran earlier in the day. A historian has stopped receiving data. Someone in IT is asking whether they can isolate a subnet, disable remote access, or reboot a server.

At the same time, the site still has to run.

This is where OT cyber incident response plans (IRPs) come under stress. The central security team understands the alert, but not always the process. The operations team understands the process, but not always the cyber risk. The vendor understands one slice of the control environment. Crisis leaders need a clear site picture, but the information reaching them is often partial, delayed, or missing local context.

OT incident response (IR) site champions address that gap. The role gives each site a prepared person, or small group, who can connect cyber response, plant operations, engineering, vendors, safety, and crisis leadership while decisions are still being made.

Treat this as a readiness role, with authority, training, and a place in the response structure.

Need to test whether your OT response model would hold under pressure?

If you’re unsure whether your plans, playbooks, escalation paths, evidence sources, and containment decisions would work together during a real OT incident, explore our Incident Capability Validation. It’s designed to baseline and stress-test response capability before a serious event exposes the gaps.

What is an OT Incident Response Site Champion?

An OT incident response site champion is a pre-designated person, or small team, at an operational site who helps translate plant reality into response decisions during a cyber incident.

The title can vary. Some organisations use site response lead, local OT responder, plant cyber lead, automation lead, or control systems response lead. The name matters less than the function.

Your site champion can answer questions the central response team often can’t answer from a dashboard:

  • Is the process affected, or is this limited to supporting systems?
  • Are operators seeing abnormal alarms, loss of view, unexpected control behaviour, or process instability?
  • Has anyone restarted, isolated, patched, reconfigured, or called a vendor already?
  • Was there recent maintenance, engineering activity, or remote vendor access?
  • What evidence is likely to disappear if we reboot, disconnect, or rebuild?
  • Which containment options are safe now, and which need operational approval?
  • Who needs to be on the call before the organisation makes a decision?

A central incident response plan can list those questions. A site champion helps answer them during the actual response.

The Local Gap in OT Incident Response

OT incident response is different from traditional enterprise/IT response because the consequences are different.

In IT, common containment actions include isolating hosts, disabling accounts, blocking routes, rebuilding servers, and forcing password resets. In OT, the same action can have very different consequences. An engineering workstation might support access to a live process, an operator workstation might provide visibility and control, and a network segment might carry traffic across several process areas using industrial protocols. A remote access change can also cut off the only vendor able to support safe recovery.

Those differences matter during real incidents, where a poorly chosen action can affect production and, at worst, life and safety.

In my experience, the people with cyber authority often don’t have enough process context. The people with process context don’t always know how to support evidence preservation, containment, or escalation. Crisis leaders can be several steps removed from any given site. By the time everyone finds the right contacts, builds a basic picture, and works out who can approve what, the incident has already moved or escalated.

The first hour is often less about detection and more about getting the organisation into a position where it can make a safe, informed decision and then act without making things worse.

What a Site Champion Actually Does

A site champion doesn’t own the whole incident.

The incident still needs a response structure. There still needs to be an incident lead or incident commander. Crisis leadership still owns business decisions. Safety needs a voice when safety could be affected. Legal, communications, vendors, and executives might all need to be involved.

Your champion’s job is to make sure everyone’s working from information that reflects reality instead of assumptions.

In practice, your site champion is usually someone trusted by the operation: a control systems engineer, OT systems lead, production supervisor, automation specialist, reliability engineer, maintenance lead, or another person who knows how the site actually works. In larger environments, the role can sit with a small group instead of one person.

The strongest site champions tend to:

  • Understand the local process: they know which systems are critical, which systems are convenient, and which dependencies non-OT folks usually miss
  • Know how the site works under pressure: they know who can authorise operational actions, who’s available after hours, how vendors are engaged, and which communication paths still work when normal channels are unavailable
  • Follow the response structure: a good site champion doesn’t freelance. They make the response team faster and better informed. They don’t create a second chain of command.

OT incidents already involve enough ambiguity; the site champion should reduce noise, not add more of it.

The Work Starts Before the Incident

Like most things in incident response, the value of a site champion is created before anything happens.

A person nominated during a crisis is just the person everyone happens to know. A champion who is nominated, trained, and exercised before the incident becomes part of your response system.

Before an incident, your champion helps make the OT IRP usable at the facility level. They test whether the contact tree is accurate. They know which vendors support which systems. They understand how to reach local OT support out of hours. They help identify which process areas matter most. They know where useful evidence lives and which recovery actions could destroy it.

One practical way to support the role is to maintain a site response pack.

This should be short, current, and useful during an incident. A good site response pack usually includes:

  • Local escalation contacts
  • Vendor and integrator details
  • Critical process areas
  • Priority OT assets
  • Fallback communication options
  • Known OT and IT dependencies
  • Evidence sources and logging retention notes
  • Recovery dependencies
  • Approved containment paths

The pack should answer the questions people ask under pressure:

  • Who do we call?
  • What is this system connected to?
  • What happens if we isolate it?
  • What evidence should we capture first?
  • Who can approve a shutdown, mode change, or remote access change?
  • What are the safe options if corporate communication tools are unavailable?

If those answers are being discovered for the first time during the incident, the response is already slower than it needs to be.
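The pack contents listed above can be sketched as a simple record with a completeness check. This is a minimal illustration, not a prescribed format: the class name, field names, sample values, and the 180-day review threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class SiteResponsePack:
    """Hypothetical structure; the list fields mirror the pack contents above."""
    site: str
    last_reviewed: date
    escalation_contacts: list[str] = field(default_factory=list)
    vendor_contacts: list[str] = field(default_factory=list)
    critical_process_areas: list[str] = field(default_factory=list)
    priority_assets: list[str] = field(default_factory=list)
    fallback_comms: list[str] = field(default_factory=list)
    it_ot_dependencies: list[str] = field(default_factory=list)
    evidence_sources: list[str] = field(default_factory=list)
    recovery_dependencies: list[str] = field(default_factory=list)
    approved_containment_paths: list[str] = field(default_factory=list)

    def gaps(self, today: date, max_age_days: int = 180) -> list[str]:
        """Return the names of empty sections, plus a flag if the review is overdue.

        The 180-day default is an assumed review cycle, not a recommendation.
        """
        problems = [name for name, value in vars(self).items()
                    if isinstance(value, list) and not value]
        if today - self.last_reviewed > timedelta(days=max_age_days):
            problems.append("review_overdue")
        return problems


# Placeholder values only; a real pack would hold site-specific detail.
pack = SiteResponsePack(
    site="Example Plant",
    last_reviewed=date(2024, 1, 15),
    escalation_contacts=["On-call OT engineer (placeholder)"],
    vendor_contacts=["Control system integrator (placeholder)"],
    critical_process_areas=["Dosing"],
    priority_assets=["Engineering workstation"],
)
print(pack.gaps(today=date(2024, 9, 1)))
```

A check like this only shows that a section is empty or the review is overdue. Whether the entries are still accurate is something the champion confirms through exercises.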

Want to know whether your site-level response pack is enough?

Our Cybersecurity Tabletop Exercises help OT, IT, executive, legal, communications, safety, and vendor stakeholders test site response paths under realistic incident conditions.

The First Hour: Get the Site Picture Right

The first hour of an OT cyber incident shouldn’t become a rush to take the most visible technical action. Fast containment is sometimes necessary. In OT, the same urgency can also create avoidable harm.

Your site champion helps the response team slow down just enough to make the right decision: slow is smooth; smooth is fast.

The first task is to establish the operational picture. Is the site still operating normally? Are operators seeing abnormal alarms, loss of view, unexpected control behaviour, or process instability? Is this a cyber alert with no known process impact, or is there already evidence of an OT consequence? Has production changed state? Has anyone switched modes, restarted systems, or called a vendor?

Those details shape the response.

Case Study: Oldsmar and the Value of Local Observation

In February 2021, the Oldsmar water treatment facility in Florida experienced an incident in which the setpoint for sodium hydroxide (lye) was changed from a normal level of about 100 parts per million (ppm) to a dangerous 11,100 ppm. Initial advisories from CISA and local law enforcement described unauthorised remote access to a water treatment supervisory control and data acquisition (SCADA) system. A plant operator noticed the change and reversed it before the treated water was affected.

(Later reporting noted that the FBI couldn’t confirm the incident was initiated by a targeted cyber intrusion.)

A person at the site who can see what’s happening in the process, recognise that a change matters, understand whether the process is actually affected, and explain the local situation to responders is invaluable. An alert might say “remote access” or “setpoint change”; the site view answers the harder questions: did operators see it, is the process affected, were safeguards in place, what was changed back, and what evidence or operator notes should be captured now?

The second task is to connect the right people quickly. The champion should know when to bring in operations leadership, safety, on-call OT personnel, engineering, facilities, physical security, and vendors. That should follow a path agreed before the incident, not someone’s memory under pressure.

The third task is to preserve information that won’t last. That could include current HMI displays, alarm states, operator notes, engineering workstation activity, remote access records, controller status, shift logs, recent maintenance activity, or photos of relevant screens where appropriate. The champion doesn’t need to perform forensic collection. They do need to know that ordinary recovery actions can alter or destroy useful evidence.

The fourth task is to support containment decisions with local context.

If someone proposes disconnecting a workstation, blocking a route, disabling remote access, or rebooting a server, the champion should help the incident lead understand what that action means on the plant floor. What depends on it? Is there a safe fallback? Would the action affect one process or several areas? Would it overwrite evidence? Who needs to approve it?

That’s the difference between a security action and a response decision.

Evidence Quality Depends on Early Site Behaviour

In OT, evidence can change before a forensic specialist sees the environment.

That usually happens for understandable reasons. Operational teams are trained to restore service, keep production stable, and make the plant safe. Those instincts are critical. During a cyber incident, they need to be balanced with evidence preservation where possible, with life and safety always remaining the first priority.

A reboot can remove volatile data. A rebuild can destroy the clearest indication of what happened. A configuration change can overwrite useful state. A vendor troubleshooting session can alter logs. Even basic operator observations lose value if nobody records when they happened or who saw them.

Site champions help make early capture more deliberate.

They can make sure the response team receives the local facts that are unlikely to appear in a security information and event management (SIEM) platform: what operators saw, what alarms appeared, what the process was doing, which systems were touched, who was connected remotely, what maintenance was planned, and what actions were taken before the incident was formally declared.
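One way to make that early capture deliberate is a simple local incident log that records when something was observed, who saw it, and what was done about it. The sketch below is illustrative only: the `Observation` and `SiteIncidentLog` names, their fields, and the sample entries are assumptions, and a paper log with the same columns would serve the same purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    observed_at: datetime  # when it was seen, not when it was written down
    observer: str
    system: str
    note: str
    action_taken: str = ""  # e.g. "Restarted HMI client"; empty if none


class SiteIncidentLog:
    """Hypothetical in-memory log of local observations and actions."""

    def __init__(self) -> None:
        self._entries: list[Observation] = []

    def record(self, entry: Observation) -> None:
        # Timezone-aware timestamps avoid ambiguity when sites and the
        # central team work in different time zones.
        if entry.observed_at.tzinfo is None:
            raise ValueError("observed_at must be timezone-aware")
        if not entry.observer.strip():
            raise ValueError("observer is required")
        self._entries.append(entry)

    def timeline(self) -> list[Observation]:
        """Entries in the order things happened, not the order they were logged."""
        return sorted(self._entries, key=lambda e: e.observed_at)

    def actions_before(self, declared_at: datetime) -> list[Observation]:
        """Actions taken before the incident was formally declared."""
        return [e for e in self.timeline()
                if e.action_taken and e.observed_at < declared_at]


log = SiteIncidentLog()
log.record(Observation(datetime(2024, 5, 2, 9, 40, tzinfo=timezone.utc),
                       "Shift supervisor", "Historian", "Stopped receiving data"))
log.record(Observation(datetime(2024, 5, 2, 9, 5, tzinfo=timezone.utc),
                       "Operator A", "HMI-2", "Screen froze for about a minute",
                       action_taken="Restarted HMI client"))

declared = datetime(2024, 5, 2, 10, 0, tzinfo=timezone.utc)
for entry in log.actions_before(declared):
    print(entry.observed_at.isoformat(), entry.system, entry.action_taken)
```

Sorting by observation time matters because entries are often written down late and out of order, while responders need the sequence in which things actually happened.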

That information supports more than root cause analysis; it affects containment, impact assessment, regulatory reporting, executive briefings, insurance discussions, recovery confidence, and lessons learned.

It also helps avoid a common OT incident problem: confusing system evidence with process impact.

A suspicious login to an OT jump host matters; it’s not the same as confirmed process manipulation. A disrupted HMI matters; it’s not always proof of malicious control activity. Your site champion helps the response team connect technical evidence to what was actually happening at the facility.

For evidence planning, this role also pairs well with a Collection Management Framework (CMF), because the champion can identify which local evidence sources actually exist and which ones disappear quickly.

Containment Needs Operational Context

Containment in OT is rarely a purely technical choice.

The better question isn’t:

Can we isolate this system?

It’s:

What happens if we isolate it now?

The answer depends on the process state, current production conditions, available manual procedures, vendor support, safety considerations, network design, recovery dependencies, and the quality of evidence already captured. In some cases, immediate isolation is the safest path. Usually, though, staged containment, compensating monitoring, controlled shutdown, or a vendor-supported change is more appropriate.

Your site champion brings the local knowledge needed to assess those options.

Case Study: When Isolation Needs Site Context

Imagine an engineering workstation is flagged for suspicious remote access during an OT incident. From the central security view, immediate isolation looks sensible: cut the connection, preserve the host, and stop any further access. The site champion adds the missing context. That workstation is currently the only machine with the project files needed to diagnose a live process issue, the vendor has been using it under supervision, and the last known-good configuration hasn’t yet been copied elsewhere.

The right answer might still be isolation, but the sequence changes. Capture the active session details, preserve the relevant files, confirm whether another workstation can take over, brief operations on the process impact, and isolate only when the site has a safe fallback. The containment decision improves because, with those facts in hand, the team can move quickly without guessing.

That context preserves urgency while giving the incident lead enough information to choose the right sequence.
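The sequence in this example can be written down as a short precondition checklist. The sketch below is a hypothetical illustration: the class, field names, and wording are assumptions, and the output is a prompt for the incident lead, not an automated decision. It also doesn't model the cases where immediate isolation is the safest path.

```python
from dataclasses import dataclass


@dataclass
class IsolationReadiness:
    """Hypothetical checklist for the engineering workstation example."""
    session_details_captured: bool = False
    relevant_files_preserved: bool = False
    fallback_workstation_confirmed: bool = False
    operations_briefed: bool = False

    def outstanding(self) -> list[str]:
        """Return the steps not yet completed, in the order described above."""
        steps = {
            "session_details_captured": "capture the active session details",
            "relevant_files_preserved": "preserve the relevant files",
            "fallback_workstation_confirmed": "confirm another workstation can take over",
            "operations_briefed": "brief operations on the process impact",
        }
        return [text for name, text in steps.items() if not getattr(self, name)]

    def summary(self) -> str:
        todo = self.outstanding()
        if not todo:
            return "Preconditions met: isolation can be put to the incident lead."
        return "Before isolating: " + "; ".join(todo) + "."


status = IsolationReadiness(session_details_captured=True, operations_briefed=True)
print(status.summary())
```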

The same applies to remote access. Disabling all vendor access can reduce exposure, but it can also remove support needed for safe recovery. Leaving access open can preserve support, but increase risk. Your site champion can help identify which access paths are active, which vendors are relevant, what the business needs, and what temporary controls are realistic.

Good OT containment reduces risk without creating avoidable harm. Sometimes that means immediate isolation. Other times it means a controlled sequence of evidence capture, operational briefing, compensating controls, vendor support, and then isolation.

How the Role Fits into Crisis Response

For the site champion model to work, it needs to be visible in the wider incident structure.

The central incident lead should know who the champion is and how to reach them. The crisis team should understand what information the champion can provide. Operations leadership should know the role is authorised to coordinate site-level information during an incident. Vendors should know how they’re engaged. Safety should know when they’ll be brought into the response.

During a serious incident, the champion should provide concise updates that separate confirmed facts from assumptions. A useful update covers current process status, safety concerns, systems affected, actions already taken, evidence captured, pending decisions, and support required.
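A fixed update format makes it harder to blur confirmed facts and assumptions. The sketch below is one possible layout, not a standard: the `Statement` type, the `render_update` function, the tag wording, and the sample content are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Section names follow the update contents described above.
SECTIONS = (
    "Process status",
    "Safety concerns",
    "Systems affected",
    "Actions already taken",
    "Evidence captured",
    "Pending decisions",
    "Support required",
)


@dataclass(frozen=True)
class Statement:
    section: str
    text: str
    confirmed: bool  # True only if someone at the site has verified it


def render_update(statements: list[Statement]) -> str:
    """Render a plain-text update with every line tagged CONFIRMED or ASSUMED."""
    for s in statements:
        if s.section not in SECTIONS:
            raise ValueError(f"unknown section: {s.section}")
    lines = []
    for section in SECTIONS:
        lines.append(f"{section}:")
        items = [s for s in statements if s.section == section]
        if not items:
            # An explicit gap is more useful to the crisis team than silence.
            lines.append("  - (no information yet)")
        for s in items:
            tag = "CONFIRMED" if s.confirmed else "ASSUMED"
            lines.append(f"  - [{tag}] {s.text}")
    return "\n".join(lines)


update = render_update([
    Statement("Process status", "Plant running normally per control room", True),
    Statement("Systems affected", "Historian not receiving data", True),
    Statement("Systems affected", "HMI issues may be related", False),
    Statement("Pending decisions", "Whether to disable vendor remote access", True),
])
print(update)
```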

In real incidents, that changes the quality of the response. I often see crisis teams receive technical updates that don’t explain operational consequence, or operational updates that don’t explain cyber relevance. Site champions bridge that gap. They give the response team a better view of what’s happening at the facility and give the facility a clearer route into the incident structure.

That’s especially important for critical infrastructure organisations, where poor early communication can complicate regulator engagement, customer messaging, supplier coordination, and public statements.

Test the Model Before it is Needed

A site champion model should be exercised before anyone relies on it.

The best starting point is usually an OT tabletop exercise (TTX). The exercise should include the champion, central cyber responders, operations leadership, safety, crisis management, communications, legal, and vendor representation where appropriate.

The scenario should create realistic uncertainty: suspicious access to an OT jump host, intermittent HMI issues, a recent vendor session, and unclear process impact. The champion should be expected to activate the site contact path, confirm the local picture, identify evidence sources, brief the incident lead, and help assess containment options.

The exercise should also include friction:

  • What if the primary OT engineer is unavailable?
  • What if corporate email and Teams can’t be trusted?
  • What if a vendor is slow to respond?
  • What if the proposed containment step could affect production?
  • What if executives need an impact statement before the site can provide one?

These are the moments that show whether the model works.

The measure of success isn’t attendance or a completed slide deck. The better measure is whether the organisation made better decisions because the role existed. How long did it take to reach the champion? How quickly did the team understand whether safety or production was affected? Was evidence identified before recovery actions began? Did containment follow agreed criteria? Was the local incident log useful? Did lessons from the exercise lead to changes in the plan?
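Some of these questions can be answered from timestamps noted during the exercise. The sketch below assumes a facilitator records a handful of named events. The event names, the function names, and the sample times are hypothetical, and it covers only the measures that reduce to timing.

```python
from datetime import datetime, timedelta
from typing import Optional


def elapsed(events: dict[str, datetime], start: str, end: str) -> Optional[timedelta]:
    """Time between two recorded events, or None if either was never reached."""
    if start not in events or end not in events:
        return None
    return events[end] - events[start]


def exercise_metrics(events: dict[str, datetime]) -> dict[str, object]:
    evidence = events.get("evidence_identified")
    recovery = events.get("first_recovery_action")
    return {
        "time_to_reach_champion": elapsed(events, "alert_raised", "champion_reached"),
        "time_to_understand_impact": elapsed(events, "alert_raised", "impact_understood"),
        # True only if evidence was identified, and identified before any recovery action.
        "evidence_before_recovery": evidence is not None
        and (recovery is None or evidence < recovery),
    }


# Sample timings from an imagined tabletop exercise.
events = {
    "alert_raised": datetime(2024, 6, 11, 9, 0),
    "champion_reached": datetime(2024, 6, 11, 9, 12),
    "impact_understood": datetime(2024, 6, 11, 9, 40),
    "evidence_identified": datetime(2024, 6, 11, 9, 55),
    "first_recovery_action": datetime(2024, 6, 11, 10, 20),
}
metrics = exercise_metrics(events)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Questions such as whether containment followed agreed criteria, or whether the local log was useful, still need a facilitated debrief.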

A TTX should leave the organisation with sharper contact paths, clearer authority, better evidence expectations, and more realistic containment options.

If it only produces a vague action list, it hasn’t done enough.

Common Mistakes When Building the Role

The easiest mistake is to nominate a champion and assume the job is done.

A named person without authority, training, or integration can still be helpful during an incident, but the organisation is relying on goodwill and improvisation.

Another mistake is choosing the most technical person without considering communication and judgement. Technical knowledge matters, but the role also needs calm coordination. A good site champion can brief clearly, avoid speculation, escalate at the right time, and work within the response structure.

Organisations also get into trouble when the role is treated as an OT-only concern. The value comes from its position between OT, IT, and crisis response. If the central security team doesn’t know how to engage the champion, or if crisis leaders don’t understand what the champion can provide, the model will fail when it’s most needed.

The final mistake is neglecting maintenance. Sites change. People move roles. Vendors change contracts. Assets are replaced. Network paths are adjusted. Contact numbers become stale. A site response pack that was accurate last year can be actively misleading during the next incident.

The role needs an owner, a review cycle, and regular exercises.

Start with the Sites Where it Matters Most

This doesn’t need to become a large program on day one.

A sensible starting point is to identify the sites or process areas where an OT incident would have the highest consequence. Appoint champions there first. Define the role in plain language. Build a short site response pack. Run a focused TTX. Capture what didn’t work. Update the plan. Repeat.

From there, build the muscle. Standardise the training, make evidence expectations clearer, include champions in wider crisis exercises, test vendor support paths, and share lessons between sites.

The aim is to reduce uncertainty before the incident forces the organisation to learn under pressure.

For many OT environments, this is one of the most practical readiness improvements available. It doesn’t require a new platform. It starts with people who already understand the site, then gives them the structure, authority, and practice needed to support a serious incident.

The Site is Where the Plan Becomes Real

OT IRPs often look complete on paper. They define roles, escalation, severity levels, communications, technical actions, and recovery phases. The real test comes when cyber uncertainty meets operational consequence.

That test usually happens at the site.

Without someone local translating plant reality into response decisions, teams lose time. Weak early evidence makes the investigation harder. Containment without process context creates avoidable disruption. Crisis leaders also struggle to defend decisions when they can’t get a reliable site picture.

Site champions help prevent that.

They don’t replace monitoring, asset visibility, incident command, tested playbooks, or strong OT architecture. They make those capabilities usable. They give the organisation a local response layer that understands the process, knows the people, preserves early facts, and supports safer decisions.

For organisations responsible for OT, industrial control systems (ICS), or critical infrastructure environments, that’s a practical readiness capability.

Lykos Defence helps organisations assess, design, and validate incident response capability across IT, OT, and crisis environments. A site champion model can be built into OT cyber incident response plans, tested through TTXs, and improved before the next incident makes the gap visible.

If you want to baseline whether your site-level response model would hold under pressure, start with Incident Capability Validation.

If you want to rehearse OT decision-making with operations, cyber responders, safety, executives, legal, communications, and vendors, explore our Cybersecurity Tabletop Exercises.

And if you need a longer-term program to strengthen and regularly validate response capability, see our Incident Response Readiness Program or Incident Response Assurance Program.

OT Incident Response Site Champions FAQ

What is an OT incident response site champion?

An OT incident response site champion is a pre-designated person, or small team, at an operational site who connects cyber response, plant operations, engineering, vendors, safety, and crisis leadership during a cyber incident. The role helps central responders understand what’s happening at the site and what response actions mean for the process.

Why do OT incidents need site champions?

OT incidents need site champions because central security teams often lack local process context. A champion helps confirm operational impact, identify evidence sources, engage the right people, and support safe containment decisions before recovery actions remove useful information or create avoidable disruption.

Who makes a good site champion?

A good site champion is usually someone trusted by the operation, like a control systems engineer, OT systems lead, automation specialist, reliability engineer, production supervisor, or maintenance lead. The person needs local process knowledge, calm communication, and enough authority to coordinate site-level information during an incident.

How should the site champion model be tested?

The best way to test the model is through an OT TTX that includes the site champion, central cyber responders, operations leadership, safety, crisis management, communications, legal, and key vendors. The exercise should test contact paths, evidence capture, operational impact assessment, containment decisions, and executive updates.




Disclaimer: This content may have been edited or refined with assistance from AI tools. All final content, views, and recommendations are our own.