One night in September, a power transformer shut down in one of our Parisian data centers. While we were writing this article, it happened again, for the third time in ten years. As on the two previous occasions, our two backup power sources kept the power lineup running while our team rallied to bring the situation back to normal. Read on to find out what happened during this tense night.
We equip all of our data centers with a Scaleway-made building management system tool called SiMA. Thanks to this tool, we can monitor and analyze hundreds of thousands of real-time data points from our equipment. This gives us a complete overview of our infrastructure at all times and lets us tune it as closely as possible to customers’ demands.
We build our own software and hardware to monitor our equipment because manufacturers’ products do not offer the level of technical detail we require.
It is common to see building management system tools exceed one million euros in our business.
So, we built our own and integrated it as an internal chatbot. Thanks to SiMA, we started receiving notifications at 05:09 AM, alerting us that one of our power lineups was no longer being supplied by the grid. Our technicians immediately checked the programmable logic controller and confirmed what we feared: SiMA was right, and we had a long night ahead of us. The automatic switchover to our generators had happened as soon as the failure occurred.
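To make the alerting step concrete, here is a minimal sketch of a rule that turns a power reading into a chat notification. It is not SiMA's actual code: the `LineupReading` fields, the `grid_loss_alert` function, the voltage threshold, the lineup name, and the date are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class LineupReading:
    """One sample for a power lineup (all field names are hypothetical)."""
    lineup: str
    timestamp: datetime
    grid_voltage_v: float      # voltage measured on the grid side
    generators_running: bool   # True once the backup generators have started


def grid_loss_alert(reading: LineupReading,
                    min_grid_voltage_v: float = 1000.0) -> Optional[str]:
    """Return an alert message when the grid no longer feeds the lineup, else None.

    The threshold is an assumed value, not a figure from the article.
    """
    if reading.grid_voltage_v >= min_grid_voltage_v:
        return None
    source = "generators" if reading.generators_running else "batteries"
    return (f"[{reading.timestamp:%H:%M}] {reading.lineup}: grid supply lost, "
            f"now running on {source}")


# Placeholder date; only the 05:09 time comes from the article.
reading = LineupReading("lineup-A", datetime(2021, 9, 1, 5, 9), 0.0, True)
print(grid_loss_alert(reading))
```

A real building management system would likely debounce readings and cross-check them against the programmable logic controller before paging anyone, which is what our technicians did by hand here.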
First step: synchronize and assess the situation
We synced with our on-call engineers and board members, and notified our clients. At this point, we had an autonomy of five days on fuel oil, and 20 minutes on battery. The issue was likely caused by insufficient oil in the transformer itself. Our team quickly went to the site and found an oil leak at a transformer component called the Buchholz relay. This is a protection relay that acts as a sensor to monitor the temperature, oil level, and gas discharge of the transformer.
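The autonomy figures above translate directly into deadlines. The sketch below is a simplification: it assumes both autonomies are counted from the moment the grid was lost and that the load stays constant. The function name and the date are placeholders; only the 05:09 time, the 20 minutes, and the five days come from the article.

```python
from datetime import datetime, timedelta

# Autonomy figures quoted in the article.
BATTERY_AUTONOMY = timedelta(minutes=20)
FUEL_AUTONOMY = timedelta(days=5)


def autonomy_deadlines(grid_lost_at: datetime) -> dict:
    """Return the time at which each backup source would run out.

    Simplifying assumption: both sources are counted from the instant the
    grid was lost, under constant load.
    """
    return {
        "battery": grid_lost_at + BATTERY_AUTONOMY,
        "fuel": grid_lost_at + FUEL_AUTONOMY,
    }


# Placeholder date; only the 05:09 time comes from the article.
deadlines = autonomy_deadlines(datetime(2021, 9, 1, 5, 9))
for source, deadline in deadlines.items():
    print(f"{source}: {deadline:%Y-%m-%d %H:%M}")
```

In practice the batteries only have to bridge the gap until the generators take over, so the battery deadline is a worst case and not a real constraint once the generators are running.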
Safe working conditions - isolate the high-tension unit
The fault in the Buchholz relay had caused the oil shortfall, but luckily, we were only a few liters short. We began by creating a safe working environment, isolating the high-tension unit from the power transformer, while other members of the team searched for vegetable oil to stock up on. This proved to be quite a mission in itself, as the incident occurred in the middle of the night.
We use vegetable oil instead of other types of oil to insulate and cool our transformers, mainly for environmental and safety reasons. The oil we use has a fire point of over 300°C, which makes it barely flammable. It is also bio-sourced, easily biodegradable, and non-toxic. Unfortunately, so far, our experience with vegetable oil has been pretty bad.
The power lineup continued to be fed by its two electric generators, supervised by our engineers. The company that handles the maintenance came to the site, too, ready to assist us. Even with two electric generators, you can never be too cautious. The faulty Buchholz relay was dismantled and checked to diagnose and understand what went wrong, and learn from there. The new relay was calibrated and then installed.
We have now been relying on our electric generators for eight hours.
We secured a shipment of 90 L of vegetable oil and a brand-new Buchholz relay from our supplier, due to arrive at 9 AM.
After installing our new Buchholz relay, we added 30 L of oil, and we then needed to purge any air from the system. There’s always a risk of fire with transformers, as with batteries or power inverters. That is why we installed the transformer in a fire-resistant retention tank, partitioned by a fire-proof wall. The tank also collects oil leaks: if the oil catches fire, the fire is naturally suppressed by the tank. Even though the vegetable oil is pretty harmless for the environment, we ensure any residue and leaks are collected.
Following the system purge, the Buchholz relay was ready to be pushed into production. Our team prepared to put the transformer live again, and switch the system over to stop using the generators.
The transformer sings at 50 Hertz again
…and now all we had to do was to test it!
To monitor and test the new installation live, we had to close its MV circuit breaker (20,000 V). A special gas, SF6, is used to prevent the formation of an electric arc, which would otherwise be inevitable in this voltage range. The operator must use the appropriate PPE: gloves, an anti-UV helmet, an insulating mat, and a stool.
The Buchholz relay is a fire safety element, so we had to test it carefully: three engineers coordinating by radio were needed to validate that the alarms were reported and that the circuit breaker would trip correctly.
After 10 hours of intervention, dozens of people involved, and zero customers impacted, the issue was resolved. In 10 years, this exact same incident has occurred three times.
Transparency is one of our core values, and that is why SiMA also provides live data like our real-time PUE reports for each of our data centers, as you can see, for instance, here for DC3.
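For readers unfamiliar with the metric, PUE (Power Usage Effectiveness) is the ratio of the total energy drawn by the facility to the energy consumed by the IT equipment alone, so a value of 1.0 would mean no overhead at all. The sketch below only illustrates the formula; the function name is ours and the input figures are invented, not real measurements from any of our data centers.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Illustrative figures only: 1,200 kWh drawn by the site, 1,000 kWh used by IT.
print(round(pue(1200.0, 1000.0), 2))
```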