Cybersecurity safeguards on rigs must be non-intrusive, have small footprint and be tailored for industrial control systems
By Nathan Moralez, BP, and Siv Hilde Houmb, Secure-NOK
Once considered rare, cyber-attacks on industrial control systems (ICS) – also referred to as operational technology (OT) systems – are increasing in frequency and sophistication. At the same time, drilling for oil and gas in today’s digital oilfield makes rig OT systems increasingly dependent on real-time data processing over information technology (IT) networks. The increase in cyber-attacks and the integration of OT and IT systems require cyber resilience to be built into rig OT systems. This article examines the challenges of building cyber resilience into rig OT systems and discusses a strategy based on a combination of IT security safeguards and OT tailored safeguards.
In recent years, the drilling industry has increased the efficiency and reliability of drilling operations because of its ability to collate, use and analyze real-time data. This has led to an integration of OT and IT systems. This integration is a major challenge for the drilling industry as it exposes systems that were earlier isolated from cyber-attacks.
OT and IT systems belong to different operational paradigms. OT systems are heterogeneous systems with a mix of old and new hardware, software and applications, which need to be robust, predictable and reliable. Conversely, IT systems run commercial off-the-shelf (COTS) operating systems and applications that rely on continuous updates for cyber protection. These differences are important considerations when building cyber resilience into OT systems and mean that some security safeguards may not be suitable for securing OT systems.
This article discusses the feasibility of various strategies for building cyber-resilience into OT systems. It describes security safeguards that could be applied to the OT environment, as well as the specific challenges that OT environments present to traditional security thinking and solutions.
IT and OT system differences
IT systems are homogeneous systems comprised of network nodes that have COTS operating systems, such as Windows or Linux, and run strictly over TCP/IP. IT networks support the everyday business of a company, such as email, communication, internet and business-critical software applications. Generally, these network nodes are not critical to rig and drilling operations – meaning that, if the network node restarts or goes down for some reason, there may be financial consequences but no impact on health, safety or the environment.
Rig OT systems are comprised of hardware and software that work together to achieve levels of discrete and semi-automated control (Figure 1). All aspects of these rig OT systems are built to execute the operational processes in drilling, completions and interventions for oil and gas wells. This includes logic solvers such as programmable logic controllers (PLCs), soft PLCs and single board computers (SBCs). Modern deployed rig ICS networks also include various types of supervisory control and data acquisition (SCADA) devices. These SCADA PCs and servers generally encompass the human-machine interface (HMI) and graphical user interface (GUI) applications.
Other parts of industrial control networks include network devices, field sensors, remote input/output (RIO) devices, voltage inverters/drives, motors, and proprietary protocols and software. The core objective of these networks and devices is to execute levels of discrete and semi-automated control of mechanized equipment to perform specific drilling-related tasks and processes. This includes control over drilling mechanization, such as the drawworks, top drive, pipe-handling equipment, mud pumps, pipe-torquing mechanism (iron roughneck), mud-processing equipment and blowout preventers (BOPs). These robotic mechanisms are heavy and occupy a large footprint and, therefore, must be safely operated and managed.
Some of these systems are legacy systems that can be 15-25 years old, with legacy or proprietary operating systems designed and engineered to be deterministic and predictable, with a high degree of availability and reliability. In practice, this means that rig OT systems often are a mix of new and legacy systems that work together to achieve the drilling, completion or intervention process.
Challenges in building in cybersecurity resilience
The traditional IT security model is built to protect the confidentiality, integrity and availability (CIA) of information and services in IT systems and networks. The IT security model covers multiple layers of basic security safeguards, such as firewalls, access control, malware protection, antivirus solutions, patch management and network monitoring. Some of these safeguards may be difficult to implement, as they need regular updates to function as designed. Even though they work well in a business environment, they may not work well in an OT environment.
Whereas antivirus and malware protection applications are designed to update as soon as definition files become available, a common model for managing this software on SCADA PCs and servers is to have the definition files downloaded and tested by the SCADA integrator/owner. This model ensures that the definition files work seamlessly with the HMI/GUI application and other supporting applications installed on the SCADA PCs and servers. However, this maintenance model takes time and does not ensure the reliability of the antivirus and malware protection solutions. It is neither complete nor sufficient for securing an OT or IT system. Furthermore, applying antivirus and malware updates could introduce compatibility and stability issues on the SCADA PCs or servers in cases where systems are running unsupported software and hardware.
Operating system (OS) patching or updating is another security method generally deployed in IT systems. MS Windows security bulletins are released on the second Tuesday of each month for all operating systems with active life-cycles. Each bulletin contains detailed information about the affected components, reboot requirements and related Common Vulnerabilities and Exposures (CVE) entries. Deploying MS OS patches has a similar impact on OT SCADA PCs as updating blacklisting applications, such as antivirus. Furthermore, in some cases, OT systems are running obsolete/unsupported software and hardware. In these cases, deploying patches is not an option. For example, a large majority of OT systems are still using MS Windows XP, for which Microsoft has discontinued issuing patches.
Network monitoring solutions, such as deep packet inspection (DPI), use inspection points across the network to filter the data for non-compliance with pre-defined rules and definitions. These types of network monitoring solutions work well for OT networks but have some deficiencies in identifying viruses and other intrusions at the source, especially if the non-compliant malware does not propagate across the network. Malware at the source could impact the PC or server locally and go undetected by these monitoring solutions. Network monitoring solutions, firewalls, access control and some types of malware protection are security safeguards that, with some modifications, could be applied to OT networks.
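As a rough illustration of the rule-based filtering described above, the sketch below checks observed network flows against a pre-defined set of permitted conversations and reports anything that does not comply. The `Flow` type, the IP addresses and the rule set are hypothetical, and the sketch looks only at addresses and ports; an actual DPI product also inspects packet contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One observed conversation on the OT network (hypothetical model)."""
    src: str
    dst: str
    dst_port: int


# Hypothetical rule set: the only conversations expected on this segment.
# Port 502 is the registered port for Modbus/TCP.
ALLOWED_FLOWS = {
    Flow("10.0.1.10", "10.0.1.20", 502),  # HMI to PLC 1
    Flow("10.0.1.10", "10.0.1.21", 502),  # HMI to PLC 2
}


def non_compliant(flows):
    """Return the observed flows that match no pre-defined rule."""
    return [f for f in flows if f not in ALLOWED_FLOWS]


observed = [
    Flow("10.0.1.10", "10.0.1.20", 502),   # expected traffic
    Flow("10.0.1.99", "10.0.1.20", 502),   # unknown host talking to a PLC
    Flow("10.0.1.10", "10.0.1.20", 3389),  # unexpected service on a known pair
]

alerts = non_compliant(observed)
for flow in alerts:
    print(f"ALERT: {flow.src} -> {flow.dst}:{flow.dst_port}")
```

The sketch also makes the stated limitation visible: malware that stays on a single PC and generates no network traffic never appears in `observed`, so an inspection point of this kind has nothing to flag.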
Cyber resilience strategies for rig OT systems
The traditional defense-in-depth strategy works well for OT systems but needs to be tailored to the OT environment (Figure 2). The National Institute of Standards and Technology (NIST) Cybersecurity Framework (CSF) defines cybersecurity in terms of five core functions: identify, protect, detect, respond and recover. As the NIST CSF is considered the de facto, although voluntary, industry standard for the drilling industry, it is important that the rig OT cybersecurity model aligns with the CSF.
The five core functions of the CSF can be interpreted in the following way for rig OT systems:
• “Identify” refers to risk management and assessment and should preferably follow the IADC Cybersecurity Guidelines for Assessing and Managing Cybersecurity Risk to Drilling Assets. Risk assessment is essential to understanding the risks posed to a drilling rig and should be performed regularly as risk exposure might vary based on location of the drilling rig, the type of drilling rig, and the dynamic nature of cybersecurity risks, which evolve as new vulnerabilities are disclosed.
• “Protect” in a rig OT environment includes cybersecurity safeguards, such as perimeter defense using firewalls, physical and logical segmentation of networks and systems, network and computer hardening, and whitelisting.
• “Detect” refers to cybersecurity monitoring and detection solutions. These need to be non-intrusive, have a small footprint on the network and the end-points, and be tailored for OT systems.
• “Respond” concerns the ability to respond to a cyber-attack in an efficient manner to avoid or reduce serious consequences.
• “Recover” refers to processes and procedures to quickly recover from a cyber-attack after the fact. Most often, this requires the development of incident response and recovery plans and procedures that are not only thoroughly documented but also put into practice, with people receiving the training needed to execute the response actions effectively should the situation arise.
With regard to the NIST CSF core functions, security safeguards can be grouped into the following categories:
• Physical and logical segregation;
• Authentication and access control;
• Perimeter defense, such as firewall;
• Network monitoring, such as intrusion detection/prevention systems (IDS/IPS); and
• End-point protection, such as blacklisting, antivirus and whitelisting.
In addition, there are non-technical cybersecurity aspects, such as:
• Security policies and procedures;
• Incident response plans and procedures; and
• Security awareness training.
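One way to use the categories above is to check which CSF core functions a rig's deployed safeguards actually cover. The following is a minimal sketch of such a check. The mapping of safeguards to functions is an assumption made for illustration, loosely following the bullet descriptions earlier in this section; neither the article nor the NIST CSF prescribes this exact mapping, and the function name `uncovered_functions` is hypothetical.

```python
CSF_FUNCTIONS = ["identify", "protect", "detect", "respond", "recover"]

# Assumed mapping for illustration only; adjust it to the rig's own assessment.
SAFEGUARD_TO_FUNCTIONS = {
    "risk assessment": {"identify"},
    "physical and logical segregation": {"protect"},
    "authentication and access control": {"protect"},
    "perimeter firewall": {"protect"},
    "application whitelisting": {"protect"},
    "network monitoring": {"detect"},
    "incident response plan": {"respond", "recover"},
    "security awareness training": {"protect"},
}


def uncovered_functions(deployed):
    """Return the CSF functions that no deployed safeguard addresses."""
    covered = set()
    for name in deployed:
        covered |= SAFEGUARD_TO_FUNCTIONS.get(name, set())
    return [f for f in CSF_FUNCTIONS if f not in covered]


deployed = ["perimeter firewall", "application whitelisting", "network monitoring"]
gaps = uncovered_functions(deployed)
print("CSF functions without a safeguard:", gaps)
```

In this example a rig with only technical barriers in place covers "protect" and "detect" but leaves "identify", "respond" and "recover" unaddressed, which is where the non-technical aspects listed above come in.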
Some, but not all, of the security safeguards could be applied to strengthen rig OT system resilience against cyber-attacks. There is also a need to tailor some of the traditional security safeguards to make them efficient in an OT environment. The cybersecurity requirements for rig OT systems are strict because these systems run in real time. It is, therefore, essential that the cybersecurity safeguards are non-intrusive, have a small footprint on the network and end points and are tailored to run in an OT environment. Note that both IT and OT cybersecurity need to be built on a defense-in-depth strategy by applying layers of cybersecurity (Figure 2).
Most importantly, cybersecurity resilience strategies and solutions should be applied to rig OT systems based on risk and criticality. No one cybersecurity solution fits every control system. Therefore, it is important to have some flexibility in cybersecurity requirements and strategy. Often, the best practice is to consider the loss of any component that would discontinue or disrupt critical drilling processes or tasks. For example: What impact would the loss of a process controller or SCADA PC have on critical drilling or completion processes? If the risk exposure is considered high, then it’s important to apply robust and reliable cybersecurity to these components and networks. It is also important to consider how to recover and go back to a normal workflow.
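The risk-and-criticality reasoning above can be sketched as a simple ranking. The example below scores each component as likelihood multiplied by impact, a common simplification rather than a method prescribed by this article or by the IADC guidelines. The component names, the 1-to-5 scales and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    likelihood: int  # 1 (rare) to 5 (frequent); hypothetical scale
    impact: int      # 1 (negligible) to 5 (halts a critical drilling process)

    @property
    def risk(self) -> int:
        # Simplified risk score: likelihood of loss times operational impact.
        return self.likelihood * self.impact


def rank_by_risk(components):
    """Order components from highest to lowest risk exposure."""
    return sorted(components, key=lambda c: c.risk, reverse=True)


components = [
    Component("data historian", likelihood=3, impact=2),
    Component("drawworks PLC", likelihood=2, impact=5),
    Component("SCADA PC (HMI)", likelihood=4, impact=4),
]

ranked = rank_by_risk(components)
for c in ranked:
    print(f"{c.name}: risk {c.risk}")
```

Components at the top of the list are the ones where robust safeguards and a tested recovery procedure matter most; lower-ranked components may justify lighter requirements.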
Availability and real-time performance are paramount in rig OT systems. These systems must provide reliable control of the industrial processes they support. Therefore, cybersecurity solutions for rig OT systems should not impact the availability and performance of the system. Control system hardware and software are designed for specific functions, and their resources are restricted to those necessary to perform their functions reliably. Indeed, these constraints necessitate a critical requirement for cybersecurity solutions for OT systems: the need for a prudent approach to updates, as any change poses a risk of disrupting the design of the system. ICS cybersecurity practitioners recommend implementation of a defense-in-depth strategy based on several layers of protection, covering as many networks and components as possible. They also emphasize having an effective security policy in place, based on training and the security awareness of the ICS users.
ICS and SCADA PCs should be hardened and made reliable. End users should preferably not use cybersecurity barriers and solutions that require routine maintenance. Access to unused PC ports and drives should be restricted, so as not to leave the door open for human error or malicious attacks, as well as other compounding issues, introduced through routine access. However, access to some drives and ports in certain PCs and servers is necessary. Therefore, a solution should be applied that allows the necessary ports and drives to be accessible on demand. In addition to restricting access to PC ports and drives, locking and securing the PCs and servers in industrial cabinets creates another barrier for onsite attackers to overcome and reduces the risk of human error.
Another solution that is recommended for rig OT systems is application whitelisting (AWL), provided that the AWL is secured. AWL solutions provide good maintenance workflow strategies for ICS and SCADA PCs. AWL-type applications, when configured appropriately to avoid excessive resource consumption, do not require routine maintenance and work well with OT network devices because they can be hardened together with the ICS and SCADA hardware and software. AWL applications protect the operating system and PC from software from unknown sources. AWLs do not use the quarantine and delete strategy used by blacklisting applications, such as antivirus, so there is little risk of deleting critical files. However, when updating the SCADA PC, it is important to update the whitelisting application to ensure correct operation.
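The whitelisting behavior described above can be illustrated with a minimal sketch in which a program is allowed to run only if the SHA-256 hash of its contents is on an approved list. The byte strings stand in for executable files and are hypothetical. Commercial AWL products typically combine several criteria, such as file path or publisher certificate, so hash matching alone is an assumption made to keep the example short.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-ins for executable file contents (hypothetical).
hmi_binary = b"approved HMI application, build 1"
unknown_binary = b"program copied from a USB stick"

# The whitelist is built once, when the SCADA PC is commissioned and hardened.
whitelist = {sha256_of(hmi_binary)}


def may_execute(data: bytes) -> bool:
    """Permit execution only for whitelisted content.

    Anything else is blocked rather than quarantined or deleted.
    """
    return sha256_of(data) in whitelist


hmi_allowed = may_execute(hmi_binary)
unknown_allowed = may_execute(unknown_binary)

# After an update, the new HMI build has a different hash and is blocked
# until the whitelist itself is updated.
updated_hmi_binary = b"approved HMI application, build 2"
blocked_before_update = not may_execute(updated_hmi_binary)
whitelist.add(sha256_of(updated_hmi_binary))
allowed_after_update = may_execute(updated_hmi_binary)

print(hmi_allowed, unknown_allowed, blocked_before_update, allowed_after_update)
```

The last step shows why the whitelist has to be updated whenever the SCADA PC is: a legitimately updated application looks like unknown software until its new hash is approved.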
Figure 3 illustrates an example of an OT cybersecurity strategy applied to a rig OT system, focusing on the defense-in-depth and layered strategy. It is comprised of physical and logical segregation (built-in segmentation), authentication and access control, perimeter defense (such as firewall), network and host-based monitoring, and whitelisting and hardening as end-point protection.
Our recommendation is to adopt a defense-in-depth cybersecurity strategy for rig OT systems that uses barriers and mitigation solutions designed to operate in an OT environment. This can be accomplished by applying a combination of COTS security safeguards and OT-specific tailored cybersecurity safeguards and requires a close collaboration between IT and OT technologists to achieve an effective strategy in practice. There needs to be a considered, flexible and thoughtful approach to building cyber resilience into OT systems that takes into account the risk and impact to the related operational processes, and that is not based on technology alone but on a combination of people, processes and technology.
Author acknowledgements: We are thankful to the following entities and colleagues for their help in improving the clarity and presentation of this article: BP, Secure-NOK and May Akrawi, PhD (BP).