AI applications such as large language models (LLMs) have become an integral part of our daily lives. The computing, storage and transmission capacity they require is provided by data centers that consume enormous amounts of energy. In Germany alone, data centers consumed around 16 billion kWh in 2020, or about 1% of the country's total energy consumption. This figure is expected to rise to 22 billion kWh by 2025.
The new method is 100 times faster with comparable accuracy
As more complex AI applications arrive in the coming years, demand for data center capacity will increase dramatically. These applications consume vast amounts of energy training neural networks. To counter this trend, researchers have developed a training method that is 100 times faster while achieving accuracy comparable to existing procedures. This significantly reduces the energy required for training.
The functionality of neural networks used in AI for tasks such as image recognition and language processing is inspired by the way the human brain works. These networks are made up of interconnected nodes called artificial neurons. Each incoming signal is weighted with specific parameters and then summed. If a defined threshold is exceeded, the signal is passed on to the next node. To train the network, the parameter values are usually initialized at random, for example using a normal distribution, and then adjusted step by step to gradually improve the network's predictions. This training is computationally intensive and consumes a great deal of power, since many iterations are required.
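To make this concrete, the toy sketch below (in Python/NumPy, with made-up values, not code from the study) shows a single artificial neuron: inputs are weighted, summed, and passed on only if a threshold is exceeded, with the weights drawn at random as in conventional training.

```python
# Minimal sketch of the forward pass described above: inputs are weighted,
# summed, and the signal is passed on only if a threshold is exceeded.
# All names and values here are illustrative, not taken from the study.
import numpy as np

rng = np.random.default_rng(0)

def neuron(x, w, b, threshold=0.0):
    """Weighted sum of the inputs; the neuron 'fires' only above the threshold."""
    s = np.dot(w, x) + b
    return s if s > threshold else 0.0

# Parameters are initially drawn at random, e.g. from a normal distribution,
# and would then be adjusted over many iterations during conventional training.
x = np.array([0.2, -1.3, 0.7])   # input signal
w = rng.normal(size=x.shape)     # randomly initialized weights
b = rng.normal()                 # randomly initialized bias
print(neuron(x, w, b))
```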
Parameters selected according to probability
Felix Dietrich, a professor of physics-enhanced machine learning, and his team have developed a new method. Instead of iteratively determining the parameters between the nodes, their approach uses probabilities. The stochastic method relies on the targeted use of values at critical locations in the training data, where large and rapid changes in values occur. The aim of the current study is to use this approach to learn energy-conserving dynamical systems from data. Such systems change over time according to certain rules and are found, for example, in climate models and financial markets.
“Our methods allow us to determine the required parameters with minimal computing power. This makes neural networks much faster and therefore more energy efficient,” says Felix Dietrich. “In addition, we found that the accuracy of the new method is comparable to that of networks trained iteratively.”
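The papers listed below describe the exact procedure; the sketch that follows is only a rough, simplified illustration of such a sampling idea for a single hidden layer. Hidden-layer weights and biases are constructed from pairs of training points where the target changes steeply, and only the linear output layer is then fitted in a single least-squares step, so no iterative backpropagation is needed. The pair-selection heuristic and all numerical values are illustrative assumptions, not the authors' implementation.

```python
# Rough sketch of a sampling-based training approach for one hidden layer:
# hidden weights are built from pairs of training points with steep changes in
# the target, and only the linear output layer is fitted by least squares.
# Illustrative simplification, not the exact procedure from the cited papers.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D regression data
X = np.linspace(-3, 3, 400).reshape(-1, 1)
y = np.sin(3 * X[:, 0]) + 0.1 * X[:, 0] ** 2

def sample_hidden_layer(X, y, n_hidden=200):
    """Construct hidden weights/biases from data point pairs with large output changes."""
    n = len(X)
    # Draw random candidate pairs and keep those with the steepest target change
    i, j = rng.integers(0, n, size=(2, 4 * n_hidden))
    keep = i != j
    i, j = i[keep], j[keep]
    steep = np.abs(y[j] - y[i]) / (np.linalg.norm(X[j] - X[i], axis=1) + 1e-12)
    order = np.argsort(steep)[::-1][:n_hidden]
    i, j = i[order], j[order]
    d = X[j] - X[i]
    norms = np.linalg.norm(d, axis=1, keepdims=True) ** 2 + 1e-12
    W = d / norms                       # weight vectors point from one sample to the other
    b = -np.sum(W * X[i], axis=1)       # bias places the activation between the two samples
    return W, b

W, b = sample_hidden_layer(X, y)
H = np.tanh(X @ W.T + b)                # hidden activations; these weights stay fixed
# Only the output layer is computed, in one ordinary least-squares step
coef, *_ = np.linalg.lstsq(np.column_stack([H, np.ones(len(H))]), y, rcond=None)
pred = np.column_stack([H, np.ones(len(H))]) @ coef
print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```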
Rahma, Atamert, Chinmay Datar, and Felix Dietrich. 2024. “Training Hamiltonian Neural Networks without Backpropagation.” Machine Learning and the Physical Sciences Workshop at the 38th Conference on Neural Information Processing Systems (NeurIPS). https://neurips.cc/virtual/2024/9994
Bolager, Erik L., Iryna Burak, Chinmay Datar, Qing Sun, and Felix Dietrich. 2023. “Sampling Weights of Deep Neural Networks.” Advances in Neural Information Processing Systems 36: 63075–116. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2023/hash/c7201deff8d507a8fe2e86d34094e154-abstract-conference.html