Technical Article
Sustainable AI Starts with Efficient AI

Sara Han Díaz
DevRel Engineer

Bertrand Charpentier
Cofounder, President & Chief Scientist

AI is no longer a side experiment. It is already part of products, workflows, and day-to-day operations. So the question is not whether companies should use AI, but how to use it responsibly at scale. From our perspective, it starts with efficiency: getting the same or better results with less compute, less energy, and a lower environmental footprint.
The way models are chosen affects energy use and emissions, and as expectations around transparency grow, it's important to have a more solid way to explain those choices. The good news is that reducing AI’s footprint does not mean slowing innovation. It means running better systems: smaller models, faster inference, and clearer measurement.
How Big Is the AI Sustainability Impact?
Before we go further, let’s start with a few facts about AI and sustainability.
3-40 Wh: Energy consumed by a single ChatGPT query, from short to long (Source, 2025)
2 nuclear plants: Number of nuclear plants that would need to run continuously to supply enough energy if 80M people each generated 5 pages per day (Source, 2025)
61,848x: Difference between the highest and lowest energy use on the energy leaderboard for AI models (Source, 2025)
+160%: Expected increase in data center power consumption by 2030 (Source)
How Does AI Impact the Environment?
These figures are clear. Behind powerful AI tools, there is an environmental cost. AI systems require large amounts of natural resources and contribute to greenhouse gas emissions. Understanding how AI impacts the environment is important because the choices made now will shape whether AI becomes a tool for sustainability or a source of greater environmental pressure.
When discussing environmental impact, it is easy to overlook which parts of an AI model's lifecycle affect the environment. The focus is usually placed on the impact of using AI tools, especially the energy required by large data centers. However, environmental effects occur across every stage of the AI lifecycle.
For example, impact arises not only during the training of AI models, but also during deployment and in everything required to make AI possible, such as material extraction, equipment manufacturing, cooling, networking, and storage.
The environmental costs of AI can take different forms, including the use of natural resources such as energy, water, and minerals, as well as greenhouse gas emissions.
How to Measure the AI Sustainability Impact?
Now that we understand how AI can impact the environment, the next question is how we can measure that impact. To do so, we first need to understand that there is no single, fixed footprint for an "AI request."
Input footprint varies: It is tempting to think of AI usage in simple units: one prompt in, one answer out, one measurable footprint. In reality, the same prompt can have very different impacts depending on model size, input and output length, inference settings, and the serving environment. Even the same model may behave differently across deployments, meaning its footprint can change and often be improved through better engineering decisions.
Not all AI workloads are equal: Text generation, image generation, and video generation sit at very different points on the compute spectrum. Video, for example, is typically far more compute-intensive than text. That means two teams may both be “using AI” while generating very different levels of environmental impact. Understanding those differences helps organizations identify and prioritize the areas where optimization can have the greatest effect.


Source: https://arxiv.org/pdf/2311.16863
Deployment shapes impact: Hardware type, region, serving setup, and runtime choices all affect the environmental footprint. That is exactly why efficiency matters: once companies understand what drives impact, they can start reducing it through smarter models, better optimization, and more efficient infrastructure.
Perfectly measuring every aspect of sustainability is not possible. Still, it is worth tracking what can be measured and making the necessary updates as better data becomes available. So, it is time to look at the numbers.

The energy formula calculates the energy consumption of query i at the lower and upper GPU utilization bounds. First, it calculates the total inference time in hours, denoted as Ti. Then, it multiplies this time by the effective power drawn by the hardware, which is the sum of two terms. GPU power multiplied by the lower or upper GPU utilization represents the GPU power draw under the corresponding utilization assumption. Non-GPU power multiplied by non-GPU utilization represents the power draw from other components such as the CPU, memory, networking, and storage. Finally, the result is multiplied by PUE (power usage effectiveness), which accounts for additional data center overhead such as cooling and power distribution.
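As a minimal sketch of the energy calculation described above, the snippet below multiplies inference time in hours by effective power and by PUE, once for each GPU utilization bound. The class name, field names, and all numeric values are assumptions made for illustration; they are not measurements and do not come from a specific deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingProfile:
    """Hypothetical description of the hardware serving a query."""

    gpu_power_kw: float       # rated power of the GPUs serving the query
    non_gpu_power_kw: float   # rated power of CPU, memory, networking, storage
    gpu_util_lower: float     # lower bound on GPU utilization, between 0 and 1
    gpu_util_upper: float     # upper bound on GPU utilization, between 0 and 1
    non_gpu_util: float       # utilization of the non-GPU components
    pue: float                # power usage effectiveness of the data center


def query_energy_kwh(inference_seconds: float, profile: ServingProfile) -> tuple[float, float]:
    """Return the (lower, upper) energy estimate for one query, in kWh."""
    hours = inference_seconds / 3600.0  # T_i, expressed in hours

    def energy_at(gpu_util: float) -> float:
        # Effective power = GPU draw at this utilization + non-GPU draw.
        effective_power_kw = (
            profile.gpu_power_kw * gpu_util
            + profile.non_gpu_power_kw * profile.non_gpu_util
        )
        # PUE scales the IT energy up to include data center overhead.
        return hours * effective_power_kw * profile.pue

    return energy_at(profile.gpu_util_lower), energy_at(profile.gpu_util_upper)


# Placeholder values for illustration only, not measurements.
example = ServingProfile(
    gpu_power_kw=5.6,
    non_gpu_power_kw=4.6,
    gpu_util_lower=0.05,
    gpu_util_upper=0.10,
    non_gpu_util=0.05,
    pue=1.2,
)
lower, upper = query_energy_kwh(inference_seconds=2.0, profile=example)
print(f"Estimated energy: {lower * 1000:.3f} Wh to {upper * 1000:.3f} Wh")
```

Returning both bounds rather than a single number keeps the uncertainty in GPU utilization visible in every downstream estimate.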

The water formula calculates the water consumption of a query in liters by separating the impact into on-site and off-site components. The first part estimates the water used on-site at the data center, mainly for cooling. It does this by isolating the IT energy consumed by the computing equipment and multiplying it by the data center's on-site water usage effectiveness, measured in liters per kilowatt-hour. The second part estimates the off-site water consumption associated with generating the electricity used by the query. This is calculated by multiplying the query's total energy consumption by the water intensity of the electricity source. Adding both components gives the total estimated water consumption for the query.
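The two water components described above can be sketched as follows. One assumption is made explicit here: the IT energy is isolated by dividing the total query energy by PUE, which is a common reading of that step but is not spelled out in the text. The function name, parameter names, and example values are hypothetical.

```python
def query_water_liters(
    energy_kwh: float,
    pue: float,
    wue_onsite_l_per_kwh: float,
    wue_offsite_l_per_kwh: float,
) -> float:
    """Estimate the water consumption of one query, in liters."""
    if pue < 1.0:
        raise ValueError("PUE is a ratio of total to IT energy and cannot be below 1")

    # Assumption: total energy includes overhead, so IT energy = total / PUE.
    it_energy_kwh = energy_kwh / pue

    # On-site water, mainly cooling, scales with the IT energy.
    onsite_liters = it_energy_kwh * wue_onsite_l_per_kwh

    # Off-site water, from electricity generation, scales with total energy.
    offsite_liters = energy_kwh * wue_offsite_l_per_kwh

    return onsite_liters + offsite_liters


# Placeholder values for illustration only, not measurements.
liters = query_water_liters(
    energy_kwh=0.0012,
    pue=1.2,
    wue_onsite_l_per_kwh=0.3,
    wue_offsite_l_per_kwh=4.0,
)
print(f"Estimated water use: {liters * 1000:.2f} mL")
```

Keeping the on-site and off-site terms separate makes it easier to see whether the cooling system or the electricity mix dominates for a given site.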

The carbon formula calculates the carbon emissions of a query in kilograms of carbon dioxide equivalent. The query energy represents the amount of energy consumed by the query, and the carbon intensity factor represents the emissions intensity of the electricity supply, usually expressed in kilograms of carbon dioxide equivalent per kilowatt-hour. By multiplying these values, the formula estimates the amount of greenhouse gas emissions associated with running that query.
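The carbon estimate is a single multiplication, so the sketch below applies it to the same query energy under two hypothetical electricity mixes, which illustrates why the deployment region matters. The function name, the grid labels, and the intensity values are placeholders, not data for any real region.

```python
def query_carbon_kg(energy_kwh: float, carbon_intensity_kg_per_kwh: float) -> float:
    """Estimate the emissions of one query, in kg of CO2 equivalent."""
    if energy_kwh < 0 or carbon_intensity_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * carbon_intensity_kg_per_kwh


# Placeholder carbon intensities (kg CO2e per kWh), for illustration only.
hypothetical_grids = {
    "low-carbon grid": 0.05,
    "high-carbon grid": 0.60,
}

query_energy_kwh = 0.002  # placeholder energy for one query

for grid_name, intensity in hypothetical_grids.items():
    grams = query_carbon_kg(query_energy_kwh, intensity) * 1000
    print(f"{grid_name}: {grams:.2f} g CO2e per query")
```

Because the estimate is linear in both inputs, halving the energy per query or moving to a grid with half the carbon intensity has the same effect on the result.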
Tools for individual testing:
- https://huggingface.co/spaces/optimum/llm-perf-leaderboard
- https://huggingface.co/spaces/genai-impact/ecologits-calculator
- https://huggingface.co/spaces/AIEnergyScore/Leaderboard
- https://dashboard.codecarbon.io/
How Can Efficiency Improve AI Sustainability?
The formulas above help quantify part of AI’s environmental impact, but they also raise a broader question: how does energy use affect the performance of AI systems themselves?
On the one hand, using more energy can improve the quality of AI outputs. This relationship has been widely studied through scaling laws, which show that increasing compute during training and, in some cases, during inference can lead to better model quality. Larger models, longer training runs, and more complex inference strategies can all improve the accuracy, reliability, or usefulness of predictions.
However, more energy does not automatically mean better performance. A system that produces high-quality results but requires more time, larger hardware, higher compute costs, and greater energy consumption may not be efficient overall. Higher energy use can also increase the environmental impact of AI by requiring more resources to build, run, and cool the servers that support these systems.
Performance is the combination of quality and efficiency.
At a practical level, efficient AI means achieving the same results with fewer resources. Even when sustainability is not your main priority, optimizing energy use remains important because it directly affects overall system performance, including cost, speed, scalability, and hardware requirements.
Efficiency reduces environmental impact without requiring users or developers to do less. This shows that sustainable AI is not only about limiting usage, but also about designing systems that are faster, more scalable, and less resource-intensive. In this sense, improving efficiency can benefit both the environment and the performance of AI systems, making it a practical and necessary direction for the future.
This is one of our key motivations behind sustainable AI: aligning environmental goals with broader performance incentives so that better engineering choices lead to lower impact.
What Does Pruna Do for AI Sustainability?
There is no single, one-size-fits-all approach to reducing the environmental impact of AI models. At Pruna, we believe that sustainable AI starts with efficient AI, and we work across several areas to make this possible.
Performance Models
At Pruna, we offer highly optimized models through our P-models family. They are smaller, faster, and more energy-efficient than many other released models, while still maintaining strong quality. This includes P-Image, P-Image-Edit, and P-Video, among others, which are 3 to 6 times more energy efficient than other models for the same tasks.
In addition, we provide optimized endpoints for models such as Wan 2.2 and Flux 2 through our API and through other vendors, making the models more lightweight and easier to integrate into different environments. This reduces hardware requirements and energy consumption without compromising usability.
Open Source AI Efficiency
If none of the provided models meets your needs, we also offer tools to make your preferred model smaller and more efficient. The open-source Pruna package is a model optimization framework that helps developers build faster and more efficient models with minimal overhead. It provides a comprehensive suite of compression techniques (caching, quantization, pruning, distillation, compilation, kernels, and recoverers) that can be combined without complex manual integration.
Events and Challenges
We also collaborate with different initiatives and communities to promote AI efficiency beyond our own work.
For instance, we have been running AI efficiency meetups and webinars where we discuss this topic with pruners, as well as with invited speakers from the broader AI and sustainability community.
In addition, we have collaborated with other organizations. For instance, we hosted a community event with CodeCarbon and EcoLogits, where participants could learn, exchange ideas, and discuss practical ways to measure and reduce the environmental impact of AI. We also supported the 1st International Challenge on Compression of AI Models, aiming to contribute to sustainable AI by encouraging participants to optimize models.

Our Metrics
To measure the environmental impact, we integrated our runs with CodeCarbon and used their dashboard to track the results. We also estimated the energy use and CO₂ emissions avoided by comparing our optimized models with their base versions: what would have been consumed without optimization versus what was actually required when using Pruna.
These are the results we achieved over the past year for a single provider.
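The comparison behind this kind of estimate, baseline consumption versus optimized consumption over the same number of runs, can be sketched as below. This is a simplified illustration under stated assumptions, not Pruna's actual accounting: the function name and every number are hypothetical placeholders and are not the results referred to above.

```python
def avoided_impact(
    baseline_kwh_per_run: float,
    optimized_kwh_per_run: float,
    runs: int,
    carbon_intensity_kg_per_kwh: float,
) -> tuple[float, float]:
    """Return (avoided energy in kWh, avoided emissions in kg CO2e)."""
    if runs < 0:
        raise ValueError("runs must be non-negative")

    # Energy that would have been used without optimization, minus actual use.
    avoided_kwh = (baseline_kwh_per_run - optimized_kwh_per_run) * runs

    # Convert the avoided energy into avoided emissions.
    avoided_kg = avoided_kwh * carbon_intensity_kg_per_kwh
    return avoided_kwh, avoided_kg


# Placeholder values for illustration only, not reported results.
kwh, kg = avoided_impact(
    baseline_kwh_per_run=0.006,
    optimized_kwh_per_run=0.002,
    runs=1000,
    carbon_intensity_kg_per_kwh=0.4,
)
print(f"Avoided: {kwh:.2f} kWh and {kg:.2f} kg CO2e")
```

A negative result would indicate that the optimized model consumed more than the baseline, so the sign is deliberately left unclamped.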

A quick disclaimer: making AI more efficient is only one part of sustainable AI. Efficiency improvements can sometimes lead to more overall usage, known as the rebound effect. We should also ask whether AI is needed for every task, because in many cases, simpler solutions may be enough.
Conclusions
In this blog, we analyzed how AI impacts the environment, the stages where this impact occurs, and the main costs associated with it. We then explored how to measure this impact, showing that although results can vary depending on the prompt, task, deployment setup, and other factors, existing formulas can still help provide useful estimates. Finally, we presented what we are doing at Pruna through our efficient models, open-source package, and community events, and shared some of the results we have achieved.
Make Your AI Workloads More Efficient and Sustainable!
Run our efficient models from the API. Sign up here!
Compress your own models with Pruna and give us a ⭐️ to bring you many more algos!
Stay up to date with the latest AI efficiency research on our blog, explore our materials collection, or dive into our courses.
Join the conversation and stay updated in our Discord community.
——
References
Falk, S., Ekchajzer, D., Pirson, T., Lees-Perasso, E., Wattiez, A., Biber-Freudenberger, L., Luccioni, S., & van Wynsberghe, A. (2025). More than Carbon: Cradle-to-Grave environmental impacts of GenAI training on the Nvidia A100 GPU. arXiv. https://doi.org/10.48550/arXiv.2509.00093
Jegham, N., Abdelatti, M., Koh, C. Y., Elmoubarki, L., & Hendawi, A. (2025). How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference. arXiv. https://doi.org/10.48550/arXiv.2505.09598
Luccioni, S., Trevelin, B., & Mitchell, M. (2024). The Environmental Impacts of AI — Policy Primer. Hugging Face Blog. https://doi.org/10.57967/hf/3004
Luccioni, S., Jernite, Y., & Strubell, E. (2023). Power Hungry Processing: Watts Driving the Cost of AI Deployment? arXiv. https://doi.org/10.48550/arXiv.2311.16863

