This article is the sixth in a series of Data Centre Best Practice articles provided by Future-tech Ltd.
In this article we seek to highlight those basic best practices which are essential for the cost-effective and energy-efficient management of deployed IT infrastructure within a data centre.
In addition to close communication and liaison between the separate teams responsible for data centre operations and ongoing service continuity, it is essential to have an accurate understanding of the IT equipment installed and how this relates to the management of the environment in which it resides. This is particularly important from a cooling, power supply continuity and ongoing energy efficiency perspective. These best practices should be applied by the facility engineering / operations teams and IT hardware management teams.
IT Infrastructure Environmental and Operational Management
In order to maintain continual improvement in energy efficiency and data centre environmental management, it is worth considering regular internal checks against the EU Code of Conduct for Data Centre Energy Efficiency Best Practice recommendations. This best practice document contains recommendations drawn from experience in a wide variety of data centre sites across Europe, all of which have proven to be effective in reducing energy usage in data centres. The document (2020 version) is free to download from the following location: https://e3p.jrc.ec.europa.eu/publications/2020-best-practice-guidelines-eu-code-conduct-data-centre-energy-efficiency
Beyond the recommendations of the EU Code of Conduct, the next key element to address is the management of Data Centre airflow. The locations of cooling systems and airflow paths are critical, as are the locations of cabinets, cable trays, IT equipment layouts, and partitioning or containment, as well as basic room layout. Management of airflow and the removal of hot air is at least as important as the delivery of cold air. In a data centre it is vital to remove the hot air without allowing it to mix with the incoming cold air in order to maximise cooling (and therefore energy) efficiency. Avoid low Delta T Syndrome (inefficient operation of cooling equipment) by maximising the temperature difference between the supply and return air.
To improve energy efficiency, make sure that room temperature settings are correctly optimised for the equipment installed. Consider increasing the Delta T of the cooling system (the temperature difference between the incoming cooling supply air and the returning heated exhaust air) to more closely match IT equipment specifications. This may allow a reduction in total airflow while meeting the same cooling capacity, which will reduce operational costs by lowering the demands on the cooling system. In general a higher return air temperature is preferable, and it is possible to maximise return air temperature by supplying air effectively to the IT equipment loads. Cooling air should be supplied directly to the IT equipment air intake as recommended by ASHRAE TC9.9. Unlike in office spaces, the average room temperature is not a critical parameter and should not be used either in SLAs or for cooling control.
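The link between Delta T and airflow can be illustrated with the standard sensible heat relation for air (heat load = air density x specific heat x volumetric airflow x Delta T). The following is a minimal sketch only: the function name, the 100 kW example load and the air property constants are illustrative assumptions, not figures taken from this article or from any specific site.

```python
# Sensible heat relation for air: P = rho * cp * V * delta_T
# Both constants are assumed typical values for dry air near sea level.
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0


def required_airflow_m3s(heat_load_kw: float, delta_t_k: float) -> float:
    """Return the airflow (m3/s) needed to remove heat_load_kw at a given Delta T."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    watts = heat_load_kw * 1000.0
    return watts / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)


if __name__ == "__main__":
    # Hypothetical 100 kW data hall at three different Delta T values.
    for delta_t in (8.0, 10.0, 12.0):
        flow = required_airflow_m3s(100.0, delta_t)
        print(f"Delta T {delta_t:4.1f} K -> {flow:5.2f} m3/s")
```

Under these assumptions, raising Delta T from 8 K to 12 K cuts the airflow needed for the same load by one third, which is why a higher return air temperature is generally preferable.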
Temperature sensing and monitoring should be in line with ASHRAE TC9.9 recommendations. Do not allow policy or set points to be dictated by “ambient” temperature / humidity measurements or by data from the Hot Aisle. Hot areas in a Data Centre are not always a bad sign, providing they are in the correct place, i.e. the Hot Aisle. With effective Airflow Management in place, seeing hot air in the Hot Aisle is a sign of a well managed data centre.
With the above in mind, be aware of the potential for Short Cycling from AHU / CRAC units and always be vigilant for Temperature Acceleration and Hot Air Bypass. It is also worth noting that increasing the velocity of cooling air distribution by increasing fan speeds can often reduce cooling effectiveness rather than improve it. Increasing fan speeds also consumes significantly more energy because of the Cube Law: fan power varies with the cube of fan speed, so a small increase in speed requires disproportionately more energy. Conversely, a reduction in fan speed can result in beneficial energy savings.
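The effect of the Cube Law can be shown with a few lines of arithmetic. This is a sketch of the idealised fan affinity relationship only. The function name is illustrative, and real installations will deviate from the ideal because of motor and drive efficiency and system pressure characteristics.

```python
def relative_fan_power(speed_ratio: float) -> float:
    """Idealised Cube Law: power ratio equals the speed ratio cubed."""
    if speed_ratio < 0:
        raise ValueError("speed_ratio must not be negative")
    return speed_ratio ** 3


if __name__ == "__main__":
    # Speed expressed as a fraction of the current operating speed.
    for ratio in (0.8, 0.9, 1.0, 1.1):
        change = (relative_fan_power(ratio) - 1.0) * 100.0
        print(f"speed x{ratio:.1f} -> power change {change:+.1f}%")
```

On this idealised basis a 10% increase in fan speed costs roughly 33% more fan power, while a 20% reduction in speed cuts fan power by almost half.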
Beyond the IT equipment rooms / data halls, the cooling equipment outside of these spaces, such as external condenser and/or chiller units, should also be reviewed for its airflow conditions and operational status. It is not uncommon to find that external units sit in a compound or area with limited airflow. This may allow temperature build-up in these locations such that the installed equipment is required to operate beyond design parameters. This can lead directly to both equipment inefficiencies and a risk of cooling failure.
Consider consulting with cooling specialists for complex implementations and to maximise energy efficiency gains. Intelligent tools are now available to help with cooling and airflow management, not just to highlight problems but also to offer intelligent solutions based on machine learning and built in expertise.
It is also extremely important to conduct regular IT asset / inventory audits and CMDB accuracy checks to verify that capacity and performance data is reliable. This is vital for both capacity planning and the trending of data centre resource utilisation (primarily space, power, cooling). During these audits additional visual checks should include confirmation that blanking plates have been installed in empty spaces within cabinets in order to prevent supply / return air mixing / bypass. Cable management in the rear of cabinets should be examined to identify any restrictions or impediments to effective hot air exhausting.
Checks should also be conducted to ensure that vented floor tiles are not placed where they are not required. If there is no equipment in the immediate vicinity to be cooled, there should be no vented floor tiles present; this includes the Hot Aisle. If issues of this nature are spotted, vented floor tiles should not be relocated without first consulting Data Centre Management and obtaining formal change approval.
Regular checks should highlight any inconsistencies in hot aisle / cold aisle separation. Consider using empty cabinets or partitioning to fill gaps between cabinets and so reinforce the hot aisle / cold aisle layout or containment.
The regular audits should confirm the accuracy of records of room layout, equipment locations and IT equipment details. This should include information on equipment manufacturer, model, installation date, purchase date, manufacture year, power requirements, space requirements, cooling requirements (temperature, humidity and airflow volume), equipment and application owner, warranty information, etc.
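As an illustration of how audit findings might be reconciled against recorded data, the sketch below compares a CMDB extract with the results of a physical walk-round. The data shapes (asset tag mapped to cabinet location), the function name and the example tags are all hypothetical. A real CMDB export would carry many more of the fields listed above.

```python
def reconcile(cmdb: dict[str, str], audited: dict[str, str]) -> dict[str, list[str]]:
    """Compare two asset-tag -> location maps and list the discrepancies."""
    shared = cmdb.keys() & audited.keys()
    return {
        # Recorded in the CMDB but not found during the audit.
        "missing_on_floor": sorted(cmdb.keys() - audited.keys()),
        # Found during the audit but absent from the CMDB.
        "not_in_cmdb": sorted(audited.keys() - cmdb.keys()),
        # Present in both, but recorded in a different location.
        "location_mismatch": sorted(t for t in shared if cmdb[t] != audited[t]),
    }


if __name__ == "__main__":
    cmdb_records = {"SRV-001": "A01-U10", "SRV-002": "A01-U12", "SRV-003": "B04-U20"}
    audit_results = {"SRV-001": "A01-U10", "SRV-002": "A02-U12", "SRV-004": "B04-U22"}
    for finding, tags in reconcile(cmdb_records, audit_results).items():
        print(finding, tags)
```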
Within equipment cabinets, checks should confirm that the load is properly balanced, with the highest power rated equipment at the bottom of the cabinet (assuming cold air supply is via an underfloor plenum). Hardware that consumes less power should be placed at the top of the cabinet, with equipment power requirements increasing moving down the cabinet.
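A simple way to test a cabinet against this rule is to confirm that power draw never increases moving up the cabinet. The following sketch assumes a hypothetical list of (U position, power in watts) pairs, with U position 1 at the bottom. The function name and the example values are illustrative only.

```python
def out_of_order_units(equipment: list[tuple[int, float]]) -> list[int]:
    """Return U positions whose power draw exceeds that of the unit directly below."""
    ordered = sorted(equipment)  # sort by U position, bottom of cabinet first
    flagged = []
    for (_, below_watts), (u_position, watts) in zip(ordered, ordered[1:]):
        if watts > below_watts:
            flagged.append(u_position)
    return flagged


if __name__ == "__main__":
    # Hypothetical cabinet: the 750 W unit at U9 sits above a 600 W unit.
    cabinet = [(1, 800.0), (5, 600.0), (9, 750.0), (13, 200.0)]
    print(out_of_order_units(cabinet))
```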
If no automated monitoring system is in place, checks should also confirm that cabinet and PDU breaker loads are kept within agreed tolerances. More generally, electrical infrastructure loading should be monitored and understood in the context of both overall capacity and the potential impact upon power supply redundancy (e.g. N+1) in the event of a failure.
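Both checks reduce to simple arithmetic, sketched below. The 80% breaker tolerance, the function names and the example figures are assumptions for illustration. The agreed tolerance for a given site should be taken from its own design and operating procedures.

```python
ASSUMED_BREAKER_TOLERANCE = 0.8  # assumed limit as a fraction of breaker rating


def breaker_within_tolerance(load_amps: float, rating_amps: float,
                             tolerance: float = ASSUMED_BREAKER_TOLERANCE) -> bool:
    """Return True if the measured load is within the agreed share of the rating."""
    return load_amps <= rating_amps * tolerance


def survives_single_failure(total_load_kw: float, unit_capacity_kw: float,
                            units_installed: int) -> bool:
    """N+1 check: the remaining units must carry the full load if one unit fails."""
    return total_load_kw <= unit_capacity_kw * (units_installed - 1)


if __name__ == "__main__":
    # Hypothetical 16 A PDU breaker carrying 13.5 A (above 80% of rating).
    print(breaker_within_tolerance(13.5, 16.0))
    # Hypothetical system of three 100 kW modules at two load levels.
    print(survives_single_failure(180.0, 100.0, 3))
    print(survives_single_failure(220.0, 100.0, 3))
```

In the last example the system still has 300 kW of installed capacity against a 220 kW load, yet it no longer has N+1 redundancy, because two modules alone cannot carry that load.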
Insist on a documented process for managing the installation and removal of IT equipment from the computer room, one which anticipates the impact on the power and cooling infrastructure and capacities. Equipment installations or removals should never be rushed through for the sake of expediency or without adherence to standing procedures.
Housekeeping policies should enforce a technical environment free of contaminants such as dirt and debris or stored items including combustibles, cleaning equipment, shipping boxes, paper-based documentation or personal items.
Future-tech has been designing, building and managing business critical data centres since 1982. The experience gained from being involved in the data centre sector from the outset has resulted in Future-tech sites achieving 99.999% uptime during 35+ years of operation. Future-tech has a team of experienced, skilled and highly trained in-house Data Centre Engineers capable of properly maintaining and operating business critical data centre sites of all sizes. For more details please contact Richard Stacey on 0845 900 0127 or at firstname.lastname@example.org