Data centre auditing

It is important to decide what you are trying to achieve with the audit.

Capitoline have audited hundreds of data centres across the world, from small computer rooms to large colocation data centres. If you are specifically interested in Design or Facility Certification to the TIA-942 standard click here, or if you are interested in Design or Facility Certification to the EN 50600 standard click here. Alternatively, please explore auditing further by clicking on the tabs to the left.

TIER, RATING or CLASS COMPLIANCE

If you are specifically interested in Design or Facility Certification to the TIA-942 standard click here, or if you are interested in Design or Facility Certification to the EN 50600 standard click here.

If you are unsure but are considering certification of your data centre or a new data centre design, first ask yourself…

Which standard do you want the design to be certified against?

Some of the key data centre standards referring to Tier, Rating or Class:

  • TIA-942 – The American Data Center Standard
  • EN 50600 – The European Data Centre Standard
  • The Uptime Institute – a privately owned business

Not all standards have the same interpretation of Tier or Class. The main aim of classifying a data centre design in terms of Tier or Class is to establish its resilience to failures. The aim of a higher Tier or Class is to remove single points of failure from the design and to enable maintenance to take place on the infrastructure without the need to shut down all or part of the IT systems (often referred to as concurrent maintainability).

It is worth asking…

Do I really need my data centre certified?

If you are building or operating a colocation data centre then you may want to certify your data centre against the TIA-942 or EN 50600 data centre standard. Provided you use a competent and experienced consultant such as Capitoline to do this, your customers will have confidence in that certification. The certificate can then be used as a marketing tool.

On the other hand, if you are building or managing your own data centre then it is important to understand the vulnerabilities in the design, but the need for a certificate is questionable.

What most data centre owners, operators and users are really interested in understanding is…

Do I have any single points of failure in the design?

Will I have to shut down my IT equipment in order to carry out routine maintenance on the data centre infrastructure?

Whether you are looking for certification against a standard or focussing on removing single points of failure and establishing concurrent maintainability, Capitoline can help you. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

TROUBLESHOOTING and FAILURE ANALYSIS

Reliability is perhaps the most important requirement for a modern data centre because of the dependency organisations have on IT systems. When the IT systems fail it is a major issue and so we must respond appropriately. The first priority is to get the IT systems back online.

IT SYSTEMS OFFLINE!

Why did it happen?

How can we stop it happening again?

The first question may be easy to answer. Perhaps the cause of the IT systems outage was the failure of a component in the power or cooling systems. If so we can fix it and get our systems up and running again but if we do not invest time in answering the second question then we may suffer the same fate again. It may not be the same component that fails but perhaps we have an inherent weakness in our data centre infrastructure. This should be investigated, the weak points identified and remedial work be recommended. The management team can then make an assessment as to whether the cost of the remedial work compares favourably to the reduced risk of another failure.
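The cost-versus-risk comparison described above can be sketched as a simple expected-loss calculation. This is only an illustration: the function names, the yearly outage probabilities, the cost per outage and the evaluation period are all assumed figures, not measured data and not a description of any particular audit method.

```python
def expected_annual_loss(outage_probability_per_year, cost_per_outage):
    """Expected yearly loss from outages: probability multiplied by cost."""
    return outage_probability_per_year * cost_per_outage


def remediation_is_worthwhile(p_before, p_after, cost_per_outage,
                              remediation_cost, years):
    """Return True if the expected saving over `years` exceeds the remedial cost."""
    saving_per_year = (expected_annual_loss(p_before, cost_per_outage)
                       - expected_annual_loss(p_after, cost_per_outage))
    return saving_per_year * years > remediation_cost


# Hypothetical figures: the remedial work cuts the yearly outage probability
# from 20% to 5%, an outage costs 500,000 and the work costs 200,000.
print(remediation_is_worthwhile(0.20, 0.05, 500_000, 200_000, years=5))
```

With these assumed figures the expected saving over five years is larger than the cost of the work. A real assessment would also weigh factors that a single probability does not capture, such as reputational damage.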

It is important to understand that not all failures are caused by the physical infrastructure; many are caused by operational issues. We must therefore also review operational processes and staff experience and training to ensure that we have minimised the risk of a repeat failure due to these factors.

Capitoline has a proven method of analysing both infrastructure and operational failures in data centres and recommending improvements to mitigate future risks to continuity of service. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

ENERGY EFFICIENCY AUDIT

A typical data centre consumes 50 times more energy per square metre than an office building. Data centres are responsible for more than 2% of the worldwide production of carbon dioxide. Carbon dioxide has been linked to global warming and so it follows that we should be trying to reduce energy consumption in data centres.

Concern for our organisation's “green” credentials and being conscious of environmental issues are important, but often the main driving force for reducing energy consumption in data centres is that…

Improving energy efficiency in the data centre saves money!

Not only does improved energy efficiency reduce running costs but it can also increase the data centre's capacity to house IT equipment. A more efficient data centre can make more power available to the IT equipment.
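The capacity point can be illustrated with a small piece of arithmetic. The sketch below assumes the site has a fixed incoming power supply and uses PUE (Power Usage Effectiveness, total facility power divided by IT power) as the efficiency measure. PUE is not mentioned in the text above, and the supply and PUE figures are purely illustrative.

```python
def it_capacity_kw(site_supply_kw, pue):
    """IT power available from a fixed site supply at a given PUE.

    PUE = total facility power / IT power, so IT power = supply / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return site_supply_kw / pue


# Hypothetical site with a fixed 1000 kW supply.
before = it_capacity_kw(1000, 2.0)   # half the supply goes to cooling and losses
after = it_capacity_kw(1000, 1.5)    # same supply after efficiency improvements
print(f"IT capacity before: {before:.0f} kW, after: {after:.0f} kW")
```

Under these assumptions the same incoming supply supports roughly a third more IT load after the improvement, without any upgrade to the supply itself.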

There are many ways in which the efficiency of a data centre can be improved; some are low cost and very simple to implement, others require a long term view to be taken.

We can help you to improve your energy efficiency. Whether you are trying to reduce operating costs, maximise capacity or demonstrate compliance with the European code of conduct on data centres, Capitoline can help. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

BUSINESS CONTINUITY and RISK ASSESSMENT

Data centres play a critical role in supporting the IT systems that our organisations rely on. For the business to keep operating, the data centre has to provide continuity of service. The consequences of a significant data centre failure can be catastrophic…

“Two out of five companies that experience a catastrophe or an extended system outage never resume operations, and of those that do, one-third go out of business within two years” Gartner Group

Planning for business continuity starts with assessing the risks to the business. Failure of the data centre is a key risk to the business and so it follows that we should assess the risks in the data centre and document how we intend to mitigate them. This means assessing the external environmental risks and the internal risks, in particular the risks to the infrastructure and the personnel, and the operational risks.

Even if we have the most resilient design and industry leading operational processes things may still go wrong!

Even with the best laid plans things can go wrong, and so it is important to have a disaster recovery plan. Documenting your disaster recovery plan in advance of a disaster can dramatically reduce recovery time. Many industry sectors, including banking and government, require adherence to standards relating to security and business continuity such as ISO 27000, and those standards require formal risk assessment and mitigation.

Capitoline can help you to assess the risks to your data centre and propose mitigation through upgrades to infrastructure or improved operational processes. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

OPERATIONAL POLICIES, PROCESSES and PROCEDURES

It is becoming increasingly recognised that solely focussing on data centre design to achieve high reliability in data centres is a mistake…

Data centre operations management is just as important as design!

To prevent data centre failures many organisations spend millions on redundant equipment and alternative power supplies. However, if you do not have the right documentation, operational processes and suitably trained staff, that money can be wasted.

A Tier 4 data centre with poor operations management can behave like a Tier 1 site in terms of reliability!

Good operations management is not just about putting maintenance contracts in place. We must also have…

  • Up to date documentation
  • Appropriate organisation structure
  • Trained staff
  • Appropriate policies
  • Appropriate operating procedures
  • Planned and corrective maintenance
  • Monitoring systems
  • Measurement and feedback of KPIs (Key Performance Indicators)

If you want to know whether you have everything you should have to prevent operational failures and maintain an efficient, secure and professionally run data centre, we can help you. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

SINGLE POINT of FAILURE ANALYSIS

The concept of Tiers in data centre design can cause confusion and is often misinterpreted.

What many data centre owners and users want to understand is whether there are any single points of failure (SPoF) in their data centre infrastructure. In other words, is there a component, pathway or system supporting the IT equipment in my data centre which, if it failed, would leave my IT equipment offline until the faulty item was fixed?

A data centre operator may be willing to tolerate the occasional failure for a short period of time provided they can switch to an alternative system whilst the faulty item is repaired. This is referred to as concurrent maintainability.

Alternatively, if the services are critical to the organisation then they may require the alternative system to be continuously active and the switchover to be instantaneous and automatic so that there is continuity of service. Such a system is said to be fault tolerant.

In order to establish whether a data centre is concurrently maintainable and/or fault tolerant…

It is important to establish that there are no single points of failure.

In a data centre this normally means performing a detailed review of the power and cooling infrastructure to ensure that the failure of a single item can be isolated in such a way that the power and cooling requirements of the IT systems can be maintained whilst the faulty item is replaced.

The power and cooling capacity should not be reduced below that required by the IT systems by a single item failure, or we will have to shut down some of the systems.
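The review described above can be pictured as a reachability check on a diagram of the infrastructure. The sketch below is a simplified illustration, not an audit method: it models a hypothetical power chain as a directed graph, fails one component at a time, and reports any component whose loss cuts every path from the supplies to the IT load. The component names and topology are invented, and the check covers connectivity only, not capacity.

```python
from collections import deque


def load_is_supplied(links, sources, load, failed=None):
    """Return True if any source still reaches the load with one component failed."""
    queue = deque(s for s in sources if s != failed)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        if node == load:
            return True
        for nxt in links.get(node, ()):
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, sources, load):
    """List every component whose failure alone leaves the load unsupplied."""
    components = set(links) | {n for targets in links.values() for n in targets}
    components.discard(load)  # the load itself is not an infrastructure item
    return sorted(c for c in components
                  if not load_is_supplied(links, sources, load, failed=c))


# Hypothetical topology: two utility feeds and two UPS units, but one shared PDU.
example_links = {
    "utility_A": ["ups_A"],
    "utility_B": ["ups_B"],
    "ups_A": ["pdu_1"],
    "ups_B": ["pdu_1"],
    "pdu_1": ["rack"],
}
example_sources = ["utility_A", "utility_B"]

print(single_points_of_failure(example_links, example_sources, "rack"))
# prints ['pdu_1']
```

In this invented example the duplicated feeds and UPS units do not help, because both paths converge on a single PDU. The same check could be run on a separate graph for the cooling system.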

The principle of identifying single points of failure has been applied for many years in critical engineering facilities, not just in data centres. In fact it has also been applied extensively in critical business processes such as supply chain management and in IT network design.

If you would like to know whether you have any single points of failure in your data centre or in other engineering infrastructures we can help. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

CONCURRENT MAINTAINABILITY

The key design requirement for a Tier 3 data centre is concurrent maintainability.

The infrastructure of the data centre is concurrently maintainable if we can maintain any item in that infrastructure without the need to shut down all or part of the IT systems being supported.

A data centre which is concurrently maintainable is not necessarily fault tolerant. A data centre operator may be willing to tolerate the occasional failure for a short period of time provided they can switch to an alternative system whilst the faulty item is repaired. This is referred to as concurrent maintainability but the system is not fault tolerant. A fault tolerant system would be able to switch over to an alternative supply automatically and instantaneously so that there is continuity of service.

In order to establish whether a data centre is concurrently maintainable we must perform a detailed review of the power and cooling infrastructure to ensure that each item in the infrastructure can be isolated in such a way that the power and cooling requirements of the IT systems can be maintained whilst that item is maintained or replaced.

The power and cooling capacity should not be reduced below that required by the IT systems whilst maintenance is being carried out, or we will have to shut down some of the systems.
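The capacity rule above can be sketched as a simple check: take each unit out of service in turn and confirm that the remaining units still cover the IT demand. The unit names and kW figures below are hypothetical, and the sketch looks at capacity only; a real review would also confirm that each unit can be physically isolated (valves, switchgear, bypasses).

```python
def units_blocking_maintenance(unit_capacities_kw, it_demand_kw):
    """Return the units that cannot be taken out of service without a shortfall."""
    total = sum(unit_capacities_kw.values())
    return sorted(name for name, capacity in unit_capacities_kw.items()
                  if total - capacity < it_demand_kw)


# Hypothetical cooling plant: three units of 100 kW each.
cooling_units = {"crac_1": 100, "crac_2": 100, "crac_3": 100}

# With 180 kW of IT load, any one unit can be maintained (200 kW remains).
print(units_blocking_maintenance(cooling_units, it_demand_kw=180))

# With 250 kW of IT load, removing any unit leaves a shortfall.
print(units_blocking_maintenance(cooling_units, it_demand_kw=250))
```

An empty result means every unit can be maintained without reducing capacity below the demand; any names returned are the items that would force a partial shutdown.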

We can tell you whether your infrastructure or your design is concurrently maintainable. We can combine this with any of the other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

THERMAL IMAGING

Thermal imaging uses the infra-red radiation emitted by warm objects to show their temperature in an easy-to-comprehend visual format. Thermal imaging, also known as thermography, is especially good at showing small temperature differences between adjacent objects and so is widely used in engineering, medical and military investigative work.

Taking the thermal photograph is easy. It is the interpretation of the image that is important.

In a data centre, thermal imaging can be used to:

  • Identify overheating IT equipment
  • Identify overheating electrical cables and switchgear
  • Get a ‘snapshot’ of hot and cold aisle temperatures
  • Find cold air leakages through floor tiles
  • Observe solar thermal gains through walls and ceilings
  • Discover failed cooling fans
  • Find IT equipment placed ‘the wrong way round’

Capitoline use thermal imaging to quickly identify problems which are not always obvious to the naked eye. We combine this with other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

RADIO and ELECTRO-MAGNETIC FREQUENCY INTERFERENCE

There are two main reasons for conducting an electromagnetic and radio frequency survey or audit of a facility:

  1. To ensure the levels are safe for people.
  2. To ensure the reliable operation of information technology equipment.

1. Human exposure requirements
Humans can be exposed to electromagnetic fields from the following:

  • High voltage and/or high current cables
  • High current transformers
  • Microwave ovens
  • Radar transmitters
  • Medical magnetic imaging
  • Mobile/cellular telephones and transmitter masts
  • TV and radio transmitters

Regulations and recommendations
The organisations that have most to say on exposure levels to electromagnetic fields are the World Health Organization (WHO), the European Union and the American Occupational Safety and Health Administration (OSHA). In turn they often look to the International Commission on Non-Ionizing Radiation Protection (ICNIRP) and the Institute of Electrical and Electronics Engineers (IEEE) for detailed guidance.

The appropriate levels can be measured to ensure that you are meeting the appropriate regulations and recommendations for the safety of your staff and visitors to your facility.

2. Requirements for the reliable operation of information technology equipment

Information technology (IT) equipment covers everything from computers and servers to local area networks, storage and telecommunications equipment. It is designed to tolerate a certain level of electrostatic and electromagnetic fields. Once these levels have been exceeded, however, it will malfunction; this will be manifested as lost and corrupted data, a slowing down of LAN traffic (as the error correcting codes keep asking for data to be re-transmitted) and terminal equipment logging on and off.

Most IT equipment is labelled with the American UL and FCC marks and the European Union CE mark to demonstrate that it can tolerate certain levels of electromagnetic fields and that it in turn will not radiate more than a certain amount of electromagnetic radiation.

If the amount of electromagnetic interference is likely to cause a problem, e.g. if the electric field is greater than 3 V/m, then the following remedies can be applied:

  • Measure the local electromagnetic environment
  • Use screened/shielded copper data cables
  • Use optical fibre connections
  • Place all IT equipment in steel racks/cabinets that are correctly earthed
  • Ensure the building has an IT grade earthing system e.g. EN 50310 or TIA-607
  • Screen the room e.g. with copper foil
  • Use a steel framed-steel clad building

Capitoline use specialist measurement equipment to determine the levels of electromagnetic radiation across a wide range of frequencies to determine which, if any, of the above precautions are necessary for your data centre. We combine this with other aspects of data centre auditing to provide a comprehensive review of your data centre infrastructure and operations management. If you would like to know more please…

Please ask us a question

SITE SELECTION

Perhaps you are thinking of building a new data centre but are not sure which is the best site, or you have a shortlist of sites and want to know which is the best. Obviously there is a financial decision to be taken, but you should also be considering the risk profile of the site.

How much would you be willing to invest without understanding the risks to your investment?

A data centre is a significant investment and perhaps more importantly your business will depend on the service continuity it can provide. Anything which may disrupt that continuity of operation should be identified and the risk assessed before making a decision on location.

This can be determined through a site location assessment. Guidance is provided in a number of standards including:

  • TIA-942
  • BICSI 002
  • EN 50600

A site location assessment includes the assessment of risk from a number of items including…

  • Road
  • Rail
  • Waterways
  • Flood zones
  • Airports/flight paths
  • Oil/chemical/industrial plants
  • Interference from Power lines
  • Interference from Mobile phone masts
  • Interference from Radar/radio/TV transmitters

The assessment of interference requires specialist RF spectrum analysers, which are used to perform a full frequency scan to decide whether the site meets the EMC/EMI requirements of TIA-942, EN 55024 and other relevant standards.

Capitoline offer a site location assessment service. We assess each proposed site against the recommendations of the appropriate standards and produce a report with a definitive statement as to the suitability of each site. Please note that we also offer a service to carry out a detailed assessment of colocation facilities if you are considering using their services. If you would like to know more please…

Please ask us a question