The adage “data is the new oil” may be verging on the prehistoric; and yet data has never been more essential to success in all business sectors, and in tech in particular. Just ask Google, Facebook & co…
In recent years, as our usage of digital tools has skyrocketed, so has the sheer production of data. Let’s not forget that in 2018, IDC predicted that worldwide data would grow by 61%, to 175 zettabytes by 2025 (a zettabyte is a trillion gigabytes). And that was before the recent ChatGPT-fuelled AI explosion, which is currently generating countless times more data than the gazillions of web pages and images GPT-3 and 4 ingested during their training.
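To get a feel for that order of magnitude, here is a quick back-of-the-envelope check of the unit conversion. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes; the function name is purely illustrative.

```python
# Decimal (SI) units are assumed: 1 GB = 10**9 bytes, 1 ZB = 10**21 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes (illustrative helper)."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


gb_per_zb = zettabytes_to_gigabytes(1)      # one zettabyte in gigabytes
forecast_gb = zettabytes_to_gigabytes(175)  # the 175 ZB forecast in gigabytes

print(f"1 ZB   = {gb_per_zb:,} GB")
print(f"175 ZB = {forecast_gb:,} GB")
```

In other words, one zettabyte is indeed a trillion (10^12) gigabytes, and the forecast works out to 175 trillion gigabytes.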
The key difference, of course, is that whilst most oil extracted has intrinsic value, the same cannot be said for data. A raw stream of information is useless if you don’t know what to do with it. Such knowledge is now critical. According to Dell’s “Unlocking the Value of Data with Data Innovation Acceleration” report, better data management can improve product quality, application availability and predictability, customer service, productivity and more.
To take just one industrial example, global steel leader ArcelorMittal has “one of France’s biggest data lakes, because we have several sites and production lines, with countless sensors generating data each 10 milliseconds,” David Glijer, the company’s Digital Transformation Director, explained at France Digitale’s FD3 event in Paris on March 29. Making the most of this flood of information is precisely why ArcelorMittal has “a department whose goal is to create added value from data, and works with other teams to do that.”
So, why is it so important to optimize data usage?
“We produce 1 steel coil every 10 minutes. If we have the slightest quality deviation, we need to be able to react efficiently. So we need the data to be very close to the machine,” said Glijer, adding that the steel giant uses a combination of on-premise, edge and cloud technologies in order to achieve this.
Cloud technologies are, of course, key to efficiency. “The cloud is frugal by design; it’s about sharing infrastructure and optimizing space,” Scaleway’s Chief Operating Officer Albane Bruyas told the same panel. Central to that frugality, she explained, is using the cloud properly.
“Most of our Compute machines are made to be shared. When I joined Scaleway, I was surprised to discover how many people use these products like dedicated servers. It’s important that the cloud be used how it was supposed to be used,” said Bruyas. This is particularly crucial considering most cloud instances are at rest 93% of the time.
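To put that 93% figure in perspective, here is a rough worked example. The percentage comes from the paragraph above; the 30-day month is an assumption made purely for illustration.

```python
# Illustrative arithmetic only: the 93% figure is quoted in the article,
# while the 30-day month is an assumption made for this example.
IDLE_PERCENT = 93
HOURS_IN_MONTH = 30 * 24  # 720 hours

idle_hours = HOURS_IN_MONTH * IDLE_PERCENT / 100
active_hours = HOURS_IN_MONTH - idle_hours

print(f"Idle: {idle_hours:.1f} h, active: {active_hours:.1f} h per 30-day month")
```

Under those assumptions, an instance used like a dedicated server would do useful work for only around 50 hours a month, which is the kind of waste shared infrastructure is meant to avoid.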
So frugality isn’t just good for electricity bills, it’s also good for the planet. “We improve the impact of our data centers by analyzing where the biggest impact is,” said Bruyas. “We discovered the two most important parts of our impact are hardware and power. So we’re working to make sure that frugality is not only what we ask of our customers, but what we apply to what we do too.”
The notion of responsibility is another key to data efficiency, Bruyas added. “You’re very sensible when you’re on premise, but when you move to the cloud, you forget to ask where the data is. And yet it’s crucial, because it’s about sovereignty and environmental impact.”
“If your cloud provider doesn’t give you that information, then there’s an issue, because the cloud is a strategic resource,” Bruyas affirmed. “This information will enable you to control your risks. A CTO should ask for information on power consumption, products, costs (now and in the long term), environmental impact… It’s really a client responsibility.”
GDPR is another responsibility all companies handling data (i.e. all companies anywhere) have to deal with. But data constraints can also bring opportunities, as ‘deep data’ startup XXII reminded us at FD3.
The company, which uses AI with surveillance cameras to detect specific cases like traffic jams, garbage in the street, or even fires, has to focus on using synthetic data — information that’s artificially generated, rather than produced by real-world events — principally for privacy reasons.
“Our main challenge is GDPR,” said Dam Mulhem, XXII’s Chief Data Officer. “All cameras we use are on the street, or in private entities. So there are issues of privacy, personal data, how we can collect data and how we can give it to the R&D team to use. We don’t want to use biometric data like facial recognition, nor to recognize gender.”
“We need thousands of images to train our model,” he continued. “Using synthetic data, we’re able to scale quicker, and attain operational effectiveness faster. Furthermore, with a sovereign cloud, we can provide that platform directly to our clients, so it’s easy to deploy.”
Mastering the data this way, explained Mulhem, is also positive for XXII’s future prospects. “We build our own software internally, we input the datasets we already have, we have all the metadata, so we can generate on each image the scale, the field of view, the whole chromatic operation with great balance. To generate data, as we already have an API internally, we can say, for example, ‘I need 10,000 fires’, then they can have the full dataset in just a few hours.”
So once efficiency and privacy are covered, how exactly do you go about generating value from data? Values are, fittingly, key to this equation. If prospects know their values are aligned with those of your company, they’re more likely to become customers.
This is one of the reasons XXII has an ethical committee, said Mulhem, “so whenever we get a new request from a client, we discuss it first. Even if we know how to do facial recognition, we don’t want to provide it to [just] anyone. In airports, we’ve all agreed to have our biometric data available, through our passports. [But elsewhere] it’s not a good idea.”
Glijer concurred that “it’s very important for us to have partners that share our values. It makes no sense to work with people who aren’t in favor of decarbonization, for example [ArcelorMittal has pledged to be carbon neutral by 2050]. So we need to create a big ecosystem and a strong will in society, for example to use more ‘green IT’ computers. It needs to be a big movement that we want to continue.”
Data is equally key to generating future value for ArcelorMittal, said Glijer. “We want to be an Industry 4.0 leader in Europe. Next year, we’re starting our first digital-native plant, with new IT systems, automation and 5G by design.” All of which will be impossible without the right data.
Still not convinced of data’s true value? A recent study on Measuring the Effectiveness of Data from the University of Texas found that if the median Fortune 1000 business increased the usability of its data by just 10%, it would translate to an increase of $2.01 billion in total revenue every year. And according to Dell’s aforementioned report, the companies that innovate the most when it comes to data increased their revenue by 19% and released around five more products than those just making a standard effort.
In a nutshell, investing in data always pays off. Especially with AI propelling us into a brave new world as we speak…