
It is though, because only like 10% of people in any one area are going to be using them at any one time. Data centers are running at high capacity 100% of the time. Maybe by sheer math it uses fewer resources overall, but it’s using more in a concentrated area, which is why communities living near them are dealing with way higher electric bills and occasionally losing access to running water.
Those concentrated areas, though, are also the areas with the most capacity to serve all that electricity. In a world where everyone uses local models and that load is distributed more evenly, those problems could be amplified even more in places that aren’t prepared for it. Anyone running local models that actually compare to Sora or Veo in quality probably has a pretty intense setup.
And when I say running the models locally is more energy intensive than in a data center, it’s not just a little bit more. It’s probably a 10x to 100x factor of waste. I’m not saying the current setup is perfect, but I just don’t think everyone running models locally is a viable solution.
As much as I hate to admit it, I learned a lot about basic coding, scripting tricks, and how to fix things I don’t understand in Linux specifically because of LLMs. Like I can’t code something by myself yet, but I can manually fix certain JavaScript issues now, whereas I didn’t even know what JavaScript was before. Also it’s pretty handy at speeding up my internet searches sometimes. Supposedly it’s a great language learning tool as well, but I haven’t personally tested that.
But not everyone is doing it though… idk man, whenever I run something locally, it doesn’t take up much more power than playing a modern AAA game or rendering something in Blender. I think it would be way better to just spread everything out. Plus, as our actual computing technology improves, we’re not going to need as much power as we currently do. Like NASA needed a whole room and massive generators to get to the moon, and now we can do that on a laptop with like 20% charge.
I mean yeah, that kinda checks out. One inference of Sora probably uses like 25x less electricity than playing COD on your PC for 10 minutes. But the scale at which people are requesting inference is absurdly large, and if it were all distributed everywhere, it would be far more wasteful than doing it in a data center. Data centers and the cloud existed before generative AI was mainstream, and this is one of the main reasons why: it’s cheaper for AWS to run a server for me than for me to do it myself.
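For what it’s worth, the 25x figure is easy to sanity-check with a back-of-envelope calculation. Every number here is an assumption for illustration (a gaming PC drawing ~300 W under load, a hypothetical ~2 Wh per data-center video inference), not a measurement:

```python
# Rough sanity check of the "25x" comparison.
# All figures below are assumptions, not measured values.

GAMING_WATTS = 300      # assumed draw of a gaming PC under load
GAMING_MINUTES = 10     # length of the COD session in the comparison
INFERENCE_WH = 2        # assumed energy per data-center video inference (hypothetical)

# Energy used by the gaming session, in watt-hours
gaming_wh = GAMING_WATTS * GAMING_MINUTES / 60

ratio = gaming_wh / INFERENCE_WH

print(f"Gaming session: {gaming_wh:.0f} Wh")
print(f"Inference:      {INFERENCE_WH} Wh")
print(f"Ratio:          {ratio:.0f}x")
```

Under those assumed numbers the session comes out to 50 Wh, so a 2 Wh inference would indeed be about 25x cheaper; the real ratio obviously swings with whatever wattage and per-inference figures you plug in.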