DeepMind AI slashes cooling costs at Google's data centers

Google's DeepMind machine learning system has helped cut the cooling costs of the company's data centers

Deep learning AI has been put to work in intelligent drones, sequencing genomes, learning the tactics of the ancient Chinese board game Go, and even keeping cats off the lawn. Now, Google has set its DeepMind system loose on its massive data centers, and drastically cut the cost of cooling these facilities in the process.

Running Gmail, YouTube, and the all-knowing Google Search guzzles a tremendous amount of power, and while Google has invested heavily in making its servers, cooling systems and energy sources as efficient and green as possible, there's always room for improvement – especially since industrial-scale cooling systems are difficult to run efficiently, given the complex interactions between equipment, environment and staff in a data center.

To account for all those factors that a human operator or traditional formula-based engineering might miss, the team put DeepMind to work on the problem, and the result was a drastic reduction in power consumption for the center's cooling systems.

Efficiency was measured using Power Usage Effectiveness (PUE) – the ratio of the building's total energy usage to the energy consumed by its IT equipment, meaning a lower figure indicates less overhead. DeepMind's neural networks were fed historical data, including temperatures, power draw and pump speeds, and trained to predict the average future PUE, while additional networks forecast how factors like temperature and pressure would change over the next hour, so the cooling systems could be adjusted accordingly.
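
DeepMind hasn't published the architecture of those networks, so the sketch below is only illustrative: a minimal Python example of training a regressor on historical sensor snapshots to predict the average PUE an hour ahead, then sweeping candidate pump speeds for the lowest predicted value. The use of scikit-learn, the feature names and all the figures are invented for the example.

```python
# Hypothetical sketch of the prediction-and-control step described above.
# DeepMind has not published its model details, so the library choice,
# features and numbers here are all illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Pretend historical sensor snapshots: intake temperature (C), IT load (kW)
# and cooling pump speed (RPM), one row per five-minute interval.
n = 2000
temps = rng.normal(24.0, 2.0, n)
it_load = rng.normal(5000.0, 400.0, n)
pump_rpm = rng.normal(1200.0, 150.0, n)
X = np.column_stack([temps, it_load, pump_rpm])

# Synthetic training target: average PUE over the following hour. In the
# real system this would come from the facility's power meters.
pue_future = (1.1 + 0.004 * (temps - 24.0) + 0.00005 * pump_rpm
              + rng.normal(0.0, 0.005, n))

# Train a small neural network to predict future PUE from current conditions.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                     random_state=0)
model.fit(X, pue_future)

# A controller could then sweep candidate pump speeds and pick the setting
# with the lowest predicted PUE, subject to safety constraints.
candidates = np.column_stack([
    np.full(5, 25.0),           # current temperature
    np.full(5, 5200.0),         # current IT load
    np.linspace(900, 1500, 5),  # candidate pump speeds
])
best = candidates[np.argmin(model.predict(candidates))]
print("lowest predicted PUE at pump speed:", best[2])
```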

As seen in the graph, the data center's power consumption was cut by 40 percent when the machine learning controls were switched on

With the PUE plotted out, DeepMind's effectiveness is pretty clear: when the machine learning controls were turned on, the site saw a consistent 40 percent reduction in the energy used for cooling, which translated into a 15 percent reduction in overall PUE overhead (after accounting for electrical losses and other non-cooling inefficiencies), and a new record for the lowest PUE the center had ever achieved.
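
As a quick sanity check on those two figures, the arithmetic below shows how a 40 percent cut in cooling energy can correspond to a 15 percent drop in PUE overhead: it works out if cooling accounts for a bit over a third of the overhead. The energy split is invented to match the published percentages; Google hasn't released the site's absolute numbers.

```python
# Back-of-the-envelope check: a 40 percent cooling cut showing up as a
# 15 percent reduction in PUE overhead. The split between cooling and
# other overhead is reverse-engineered to match those two figures.
it_energy = 100.0   # energy drawn by the IT equipment (arbitrary units)
cooling = 4.5       # energy drawn by cooling
other = 7.5         # electrical losses and other non-cooling overhead

pue_before = (it_energy + cooling + other) / it_energy        # 1.12
pue_after = (it_energy + cooling * 0.6 + other) / it_energy   # 1.102

overhead_cut = (pue_before - pue_after) / (pue_before - 1.0)
print(f"cooling cut: 40%, PUE overhead cut: {overhead_cut:.0%}")  # 15%
```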

Google plans to expand the system more broadly across its own facilities, as well as share the nitty-gritty of how it achieved the energy savings to help other data centers and industrial system operators reduce their energy consumption and environmental footprint.

Source: Google DeepMind

5 comments
chann94501
So what is the scale of that graph? Without any axes and with an offset zero, how is that supposed to be anything but a wiggly line that indicates something got slightly lower?
ezeflyer
Now try it for governing with maximum benefit for all people and our environment.
Dan_Linder
@chann94501 - This is Google - I expect the scale of the graph to be reported in units related to Kelvin. :)
From the Wikipedia page about PUE, it states "In October 2008, Google's Data center was noted to have a ratio of 1.21 PUE across all 6 of its centers, which at the time was considered as close to perfect as possible. Right behind Google, was Microsoft, which had another notable PUE ratio of 1.22 [13]"
This doesn't shed light on the graph as displayed, but I would assume the low end would be "1.0" and the high end near the "1.21". If that's true, even if the low end is at the "1.15" point, it's quite substantial.
But then, this is all a guess without numbers... :(
timmyd89
The graph purports to show that PUE improves (moves toward 1) when AI is activated. PUE is the ratio of total facility energy usage to the IT equipment energy usage; a lower PUE is better. What I want to know is precisely how the AI is attaining this: which inputs is it using, and how is it calculating and performing a response to those inputs? How does this control loop differ from standard digital or analog control? What inputs are needed, where are they needed, and how is the programming done? The real challenge would be maintaining a low PUE at all times under all conditions, e.g. PUE 1.1 +/- .05. (P.S. a data center cannot have a PUE lower than 1 without some onsite renewable power generation.) It has a lot of promise, and if actually repeatable, it should be the standard for savvy DC operators.
Bob Flint
AI, or not...just monitor the logged on user's queries, time of day, location, news streams, etc. then throttle up or down...as traffic requires.