Data centre management part two: operations and maintenance

by | Nov 28, 2017 | Articles, Maintenance & Management

The effective management of any data centre requires a coherent operational philosophy backed up by accurate and current documentation. As detailed in part one of this series, innovative design and construction is only half the story when it comes to data centres being resilient and fit for purpose. The discipline of effective data centre management and maintenance is referred to by various names but operations and maintenance (O&M) is one accepted term. A properly developed and implemented O&M programme is critical in order to reduce risk, lower costs and improve performance.

However, despite its importance, aspects of effective O&M are often overlooked. For example, as an engineering design specialist, Future-tech is often brought in to assess operational facilities as part of due diligence for potential acquisitions. Our assessment process should be made easier by being able to refer to accurate and current O&M information, but all too often the information is inaccurate or out of date. This can make assessment of a facility more challenging and, more importantly, is a potential indicator of the O&M team’s competence and management culture.

Definition
O&M can be difficult to define. Broadly, it is a set of best practices, or ‘operational philosophy’, for the effective management of the facility, together with guidance on how to deal with incidents and emergencies. Specific areas include:
– Health and safety
– Maintenance management
– Asset management
– Change management
– Documentation management
– Quality management
– Energy management

These procedures may also be referred to as standard operating procedures (SoPs), methods of procedure (MoPs) and emergency operating procedures (EoPs). This procedural information is combined with accurate asset information detailed in physical manuals or other forms of documentation. O&M information can also extend to other aspects of facilities management, including staff training and financial planning.

Documents and resources
O&M documents should be created during the design, fit-out and commissioning of a facility. These documents should then be handed off to the operations team to provide detailed guidance for day-to-day as well as strategic management. The O&M information should be continually updated to reflect any changes in the mechanical and electrical (M&E) design and other fundamental alterations to the facility.

The O&M documents should include the following elements:

Basis of design document. This includes crucial information such as the original client’s brief as well as a clear description of the operational plan or philosophy for the site. Other information could include descriptions of power loading and levels of redundancy.

Commissioning records. Records from the commissioning process should verify and record that the design consultant’s design meets the client’s brief and that what has been installed operates as intended.

Drawings and diagrams. These include the electrical single line diagram, plant room layouts, cable routes, rack layouts and final sub-circuits. Inaccurate or missing drawings and layouts can introduce additional time and risk into maintenance procedures. Downtime incidents can often be traced back to poor and inaccurate O&M information. For more detail read Data centre CSI: Investigating facility downtime.

Protection discrimination study. This study calculates and records the settings of the circuit breakers used within the electrical system.

O&M manuals of package suppliers. Suppliers of major components of the electrical system will all provide their own O&M manuals. These will often be generic rather than specific to the site and need to be updated during the commissioning process.

Opportunities and challenges
Maintaining a coherent and effective O&M programme is challenging for a number of reasons:

Not prioritized. O&M information is crucial from a strategic perspective in keeping the site running optimally. However, the myriad tactical and short-term tasks may distract staff from the importance of creating accurate O&M records and keeping them current.

Staff turnover. Although O&M information is meant to be recorded and kept current, in practice a lot of specific maintenance information may not be recorded at all but retained by specific employees or contractors. If those team members leave the organization then crucial O&M information may be lost.

Different teams. The sheer number of different teams and organizations involved in the design, construction and operation of a facility makes creating coherent O&M documents very challenging. O&M information is usually created by the design team then handed off to operations. If communication is poor during this handover then the documents and practices may not be fit for purpose. The design team tasked with implementing ongoing upgrades and changes is often not the team that designed the original site. Having up-to-date records to use as reference, and then updating these to reflect the changes, is vital.

Document formats. A lot of potential O&M information is recorded in formats such as .pdf that can be difficult to keep up to date. Record drawings should ideally be handed over in Revit, AutoCAD or a similar format to allow ongoing updates. Software tools can help with the recording and updating of this information.
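As a rough illustration of how software can help keep records current, the sketch below flags documents that either predate the latest change to the plant they describe or have gone too long without review. It is a minimal example only: the OMDocument fields, the needs_update function, the 365-day review interval and the register entries are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class OMDocument:
    """One entry in a hypothetical O&M document register."""
    title: str
    last_reviewed: date
    last_site_change: date  # most recent change to the plant this document describes


def needs_update(doc: OMDocument, today: date, max_age_days: int = 365) -> bool:
    """Flag a document that predates a site change or is overdue for review."""
    outdated_by_change = doc.last_reviewed < doc.last_site_change
    overdue_review = (today - doc.last_reviewed) > timedelta(days=max_age_days)
    return outdated_by_change or overdue_review


# Illustrative entries only; dates are invented for the example.
register = [
    OMDocument("Electrical single line diagram", date(2017, 3, 1), date(2017, 9, 15)),
    OMDocument("Plant room layout", date(2017, 10, 2), date(2017, 9, 15)),
    OMDocument("Protection discrimination study", date(2015, 6, 1), date(2015, 5, 1)),
]

today = date(2017, 11, 28)
stale = [doc.title for doc in register if needs_update(doc, today)]
print(stale)
```

In this example the single line diagram is flagged because the site changed after its last review, and the discrimination study is flagged because its review is overdue.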

There are a number of approaches that can help ensure O&M best practice and accurate documentation.

Choosing the right design team. Proactive design consultants will create accurate O&M information and then work with the operations team during a handover period to ensure it is fit for purpose and easily updatable.

Certification. Organisations such as the Uptime Institute, with its Management and Operations (M&O) certification scheme, can provide third-party guidance and oversight of O&M procedures and documentation.

Facilities management services. Third-party FM providers are able to provide guidance on O&M best practice to internal facilities teams or if necessary take on the facilities management, in person or remotely, of the entire facility.

Software tools and automation. As described in the first part of this series, a new generation of software tools is emerging that can help to automate some O&M tasks as well as the creation of new documentation. This includes so-called Datacentre Operations Management (DCOM) tools but also cloud-based data centre monitoring software known as data centre management as a service (DMaaS). These tools can enable operators to introduce predictive and preventative maintenance procedures.
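To give a sense of what predictive maintenance can mean in practice, the sketch below fits a straight line to a series of daily readings and estimates how many days remain before an alarm threshold is reached. It is a deliberately simple illustration, not a description of how any DCOM or DMaaS product works; the function name, the readings and the threshold are all assumed for the example.

```python
def days_until_threshold(readings, threshold):
    """Estimate days until a rising daily reading reaches the threshold.

    Fits a least-squares straight line to the readings (one per day).
    Returns None if there are too few readings or the trend is flat or falling.
    """
    n = len(readings)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(readings) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Day index at which the fitted line crosses the threshold,
    # expressed relative to the most recent reading.
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - (n - 1))


# Hypothetical daily fan bearing temperatures in degrees C, with a 35 degree alarm level.
bearing_temps = [30.0, 30.5, 31.0, 31.5, 32.0]
remaining = days_until_threshold(bearing_temps, threshold=35.0)
print(remaining)
```

An estimate like this could be used to schedule an inspection before the alarm level is reached, rather than reacting after the event. Real tools are likely to use far richer models and data.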

Future direction
As detailed in part one of this series, a number of new technologies, regulatory changes and other factors are changing the way facilities are designed and managed. However, improvements in data centre management ultimately depend on better communication and integration between all players involved, from design and operations teams to IT and facilities.

The importance of up-to-date O&M information should not be underestimated. If you would like an independent review of your O&M information, or already know it could be better and requires updating, contact us at info@future-tech.co.uk.