Why Rural America is the Training Ground for AI

Episode ID S3E05
May 8, 2024

The “compute” phase of generative artificial intelligence — when computers are learning from other computers — doesn’t need to be near where the application is used. But it does take a lot of energy. In this episode of All Day Digital, data center investor Alexey Teplukhin explains why rural markets are ideal for this function.

Transcript

Alexey Teplukhin: One stat that I think is very interesting is, in the last two years, more data has been created than in all of human history combined.

Jeff Johnston: That was Alexey Teplukhin, managing director of capital markets for IPI Partners, on the unprecedented amount of data generative AI is creating.

Hi, I’m Jeff Johnston and welcome to the All Day Digital podcast where we talk to industry executives and thought leaders to get their perspective on a wide range of factors shaping the communications industry. This podcast is brought to you by CoBank’s Knowledge Exchange group.

Generative AI is upending the data center market and wreaking havoc on the energy complex. Hyperscale operators are being forced to seek out new locations for future data centers, with rural markets playing a critical role in how these networks are architected. And electric distributors are being asked to make significant capital investments that are taking them well outside of their comfort zone. And the crazy thing is, we’re still in the very early innings of all of this.

IPI Partners is actively involved in financing many of these large data center builds, and Alexey sits at the center of these investments, which makes him an ideal guest to talk about generative AI and how it's impacting the energy and digital infrastructure markets.

So, without any further ado, pitter patter, let’s hear what Alexey has to say.

Johnston: Alexey, welcome to the podcast. It's an absolute pleasure to have you here with us today. Thanks for joining us.

Teplukhin: Thanks very much for having me, Jeff. Very excited to be on and discuss a topic near and dear to my heart.

Johnston: No doubt. Well, I was thinking about data centers, and we've had some interactions in the past, and you guys, yourself and your firm, are clearly at the epicenter of a lot of what's going on here. I was pumped when you agreed to join us here today.

Alexey, maybe we can just start off with AI. We hear about it all the time, and I think it's really the cornerstone or the key underlying driver of why we're seeing this explosion in the data center space. Maybe you could just help listeners understand what's the difference in today's generative AI that's driving all of this data center demand, and how is it different from previous iterations of AI?

Teplukhin: Yes, that's a great question, Jeff, and one that we spend a lot of time thinking about: why now is such an exciting time for artificial intelligence, and how its overlap with data centers differs from before. The evolution really happened in three stages.

The first stage is when computers were being trained by individual people. You would have one person who would program something to run, and the computer would execute the program. That developed and kept on progressing until we got to a point where computers could learn from many different iterations of programs functioning together.

Think about it not just as one person training or telling a computer what to do, but many people simultaneously telling a computer what to do. A great example of that was when computers learned to play chess and defeated humans at chess. That was many, many different chess masters and grandmasters teaching a computer various combinations of moves, which allowed that computer to evolve and go to another stage of artificial intelligence.

The breakthrough that's happened recently, really over the last couple of years, is when computers started learning from other computers. That was the great demarcation point of why artificial intelligence is exciting now, because there was a natural limitation before: there are only so many hours in the day, and only so many people who can teach a computer to do something. Once you have a computer interacting with another computer, that's when you get this phrase, generative AI, where you see an explosion in the requirement for data capacity, but also an explosion in the output and in the ability of the AI to do things for you.

One stat that I think is very interesting is, in the last two years, more data has been created than in all of human history, combined.

Johnston: That's insane.

Teplukhin: That's the concept of exponents and exponential data creation, but it really goes to show that once you have unlocked the ability of computers to learn from other computers, you create this virtuous circle: information gets used, computers train and learn on that information, and they provide better, more refined outputs for humans to use in whatever task they want the computer's help with.
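That statistic is actually a natural property of exponential growth. Here is a minimal sketch, assuming, purely for illustration and not as a figure from the episode, that the amount of data created doubles every year:

```python
# If data creation doubles every year, the most recent years always dwarf
# everything that came before. Annual doubling is an illustrative
# assumption, not a measured growth rate.

years = 50
created = [2**k for k in range(years)]  # data created in each year

last_two = sum(created[-2:])
all_before = sum(created[:-2])

print(f"Last two years:  {last_two}")
print(f"All prior years: {all_before}")
print(f"Ratio: {last_two / all_before:.1f}x")
# With annual doubling, the last two years produce about 3x all prior
# history combined: 2^(n-1) + 2^(n-2) versus 2^(n-2) - 1.
```

Under any steady doubling schedule, the most recent couple of periods outweigh the entire prior sum, which is why the statistic, however startling, is exactly what exponential data creation predicts.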

Johnston: Yes. It's just truly mind-blowing, and it's so hard to comprehend what this looks like three, four, five years from now given the backdrop. Wow. Let's talk a little bit about what all of this means from an infrastructure perspective. Clearly, the exponential growth in data creation and computation is putting a lot of stress on the energy complex right now. It's difficult to find access, or a clear path, to energy in some of the core markets.

There's an evolution of large language model computation versus inference and things like that, and then we hear more about these data centers getting pushed into secondary markets and maybe rural markets. How do you think about all of this from an infrastructure perspective, be it power, locations, inferencing, or large language models?

Teplukhin: Yes, it's a good question because it touches on what we just discussed, this exponential growth of data and data consumption. Ultimately, there's a very important, fundamental relationship: in order for a computer to calculate something, it requires power.

Unfortunately, the reality is that the expected power requirements to run artificial intelligence call for additional power infrastructure upgrades across the world: across the U.S., across Europe, across APAC, across every continent. Historically, power companies have invested in power growth roughly in proportion to population growth, because that's who used the power. Power is used by people. More people come into a town, they need more homes, and they need more electricity in those homes. Power demand grows with the population.

Now you've seen this recent boom in computing. There's a computer in almost every device: your actual computer, your phone, your television, your car. All of these are small computers, and all of these computers are now talking to each other, which requires a much higher rate of power growth than we currently have.

What will it take to keep up with this projected growth? It will require investment in new power generation, investment in augmenting existing generation with new transmission, and, frankly, investment in more areas that can host data centers than before. Historically, data centers have always been close to population centers.

What I think is the interesting trend in artificial intelligence concerns how computers learn, or train. You mentioned large language models. A large language model is a computer trying to take in all of the information available on a particular topic to effectively replicate a knowledge base on that topic. However, for that to happen, you don't actually need the data center to be right next to a population center, because the data center is using its computing power to train a model that will eventually be used by people. That's the difference between training artificial intelligence and inference, which is using that artificial intelligence.

What you're seeing is continued growth of data centers near population centers, where people are using AI programs, but the training of those programs can happen somewhere less populated and less developed, where power may be more readily available. In those places, I think you're going to see a tremendous amount of investment in power procurement, generation, and transmission. A city has many other uses for its power, but in more rural areas with fewer competing uses, you may well see data centers set up for training artificial intelligence, while the data centers close to population centers are used for inference.

Ultimately, it's a virtuous cycle. You train the artificial intelligence, you use it in the population center, and you get feedback from those users: what they like, what they don't like, what they're using it for, use cases you didn't even think about. Then you go off and train it again to specialize in those particular use cases.
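The reason training can sit far from users while inference cannot comes down largely to latency. As a back-of-the-envelope sketch, consider pure propagation delay; the fiber-speed figure below is a common rule of thumb (light travels through optical fiber at roughly two-thirds its vacuum speed), not a number from the episode:

```python
# Idealized round-trip propagation delay between a user and a data center,
# ignoring routing, queuing, and processing time. The ~200 km/ms figure is
# a standard approximation for light traveling through optical fiber.

FIBER_SPEED_KM_PER_MS = 200.0  # kilometers covered per millisecond, one way

def round_trip_ms(distance_km: float) -> float:
    """Best-case network round trip for a request and its response."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

for km in (50, 500, 2000):
    print(f"{km:>5} km -> ~{round_trip_ms(km):4.1f} ms round trip")

# 50 km (metro site):     ~0.5 ms, negligible for an interactive request.
# 2,000 km (remote site): ~20 ms before any compute happens, a meaningful
# slice of an interactive latency budget, but irrelevant to a training job
# that runs for days or weeks.
```

A training run that lasts weeks simply does not care about an extra 20 milliseconds per round trip, which is what frees it to chase cheap, plentiful power in rural locations.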

Johnston: That’s super interesting, because I think the narrative out there is that we're starting to see data centers pushed into secondary markets, and eventually maybe more rural-type markets, just because of power, which certainly is part of the equation. What maybe some people don't appreciate is this bifurcation of training versus inference, and the latency requirements for inference, such that you've got to locate those applications closer to where they're being used, whereas training can be done in a more remote area. So that's really interesting. I would imagine these models are just going to get better and better, and is there even a ceiling for that?

Teplukhin: Yes, think about how even Windows has evolved. The first edition of Windows used by the general public was probably Windows 95, and now we're on Windows 10 or 11 or 12, I don't know; it's almost 20 iterations of Windows, and that's a base program that many people use. Artificial intelligence will have that exact same evolution, just at a much bigger scale and a much quicker pace.

I would be very surprised if it turned out otherwise, and I think Microsoft and Amazon, two of the largest users and creators of cloud computing, have said the same: artificial intelligence is not a one-and-done thing. Artificial intelligence is meant to be another base program that humans can use for computational tasks, just as you may use Windows on your computer or Apple's iOS. If you have new software, it will interact with artificial intelligence, and as that artificial intelligence grows, learns, and optimizes, your software will get better.

Johnston: Yes. It's going to be so much fun to watch. This is not like the Google Glass fad back in the day.

Teplukhin: No, you're exactly right.

Johnston: Let's move on to the energy part of the conversation, and again think about how these data center locations will evolve over time, with more and more getting pushed into secondary, and presumably eventually rural, markets, if that's not already happening.

I think there's some concern out there on the part of energy cooperatives and G&Ts because they've had a taste of this before. They've dealt with crypto miners, and that sometimes didn't go as planned. They've had experience with data center operators where the demand profile didn't really play out, and that exposes some of these electric distributors to risk.

Maybe, Alexey, you can walk us through how, with generative AI and the hyperscale data center model, and I think it is different in this context, some of those concerns, risks, and issues that electric distributors have had in the past may not be so applicable to what we're talking about here today.

Teplukhin: Yes, absolutely. That's a great question. It's a risk and a conversation that we face very frequently. Every time we go to power companies, we try to give them the highest level of comfort that what we say we're going to do is what will actually happen.

I think, stepping back, it's important to consider how the market is structured and why hyperscale, and on top of that AI, has changed it a little bit. Over the last 20 years, multi-tenant or colocation data centers have existed. What that means is a data center has had multiple tenants, maybe 10 or 20 or 100, and each individual tenant is a big company, a large corporate in the area, or a municipal government that needs computing capacity.

Now, what a hyperscale customer is, is one where a single tenant, and the biggest hyperscale customers are Amazon, Microsoft, and Google, takes the entire capacity of the data center and then on-leases that capacity to the corporate tenants in the area. Think of it a little bit like this: instead of a big company renting data center space directly from the data center, that big company rents computing capacity from Amazon. Amazon rents out computing capacity not just to 10 or 100 customers, but to thousands or tens of thousands across the region, consolidating all of that demand for computing capacity and then renting a big data center itself to provide it.

I think that market structure change is a really critical one, because if you present this power usage to electric cooperatives and power companies and say, “We're a data center developer. We're going to lease it out to ten companies. Maybe they'll come, maybe they won't. Maybe they'll come next year, maybe in five years. Who knows if they'll exist in ten years, or if they'll renew?”, there are a lot of uncertainties there.

However, if you present it to them as, “We're a data center developer. We're going to lease it out to one company, one of the highest-rated credits in the world, Amazon, Microsoft, or Google, all tremendous credits. And they are going to take this capacity and, using their cloud computing software, lease it out to tens or hundreds of thousands of different users,” the conversation changes.

What you're really saying to the power company is: you are providing computing capacity for the internet, technology, and software use of that entire region. It doesn't really matter whether any one company uses that software or not, or whether it renews or not, because your reach is everyone in the region who can use cloud computing. It becomes much more of a bet on whether people will use the internet.

But it makes the investment much less susceptible to idiosyncratic risks of one tenant renewing or not, because you're providing it to a much broader, diversified set of users.

Johnston: Great way to explain that. Correct me if I'm wrong, Alexey, but these hyperscale operators have network virtualization tools that let them dynamically move capacity and computing resources among their different data centers to keep the network running at optimal capacity and utilization rates, which I think would also benefit the power companies, right?

Teplukhin: Absolutely. That's a really good point. Data centers often get criticized for the fact that they use a lot of power, and people worry about how much power data centers use. But when you take a step back, you have to remember that the data center isn't using the power for itself. The people using the power are the end users, the consumers of data and of computing.

What a data center does, especially a hyperscale data center, is consolidate all of that demand into one building and therefore run it more efficiently. A great example of this: historically, before hyperscale data centers existed, every company might rent data center space or even keep a small set of computer servers in its own office.

If you think about the basements of many office buildings in America, there's often a server room, and it's not particularly well used or well maintained. But the important concept is that everyone overbuilt the capacity they needed. What I mean by that is, if you're a technician and you think you need 1 megawatt of power, that's your average consumption. You can't just build a computer room that supports exactly 1 megawatt, because what happens if you need 1.2 megawatts one day?

What happens if you just have a day with a lot of utilization? Suddenly your power requirement exceeds what you have, and your computer services go down. That's a disaster. So what did people do? They said, “Okay, on average, I need 1 megawatt of power. I'm going to reserve, or build, 2 megawatts of power.”

Now multiply that problem across 100 different companies, and suddenly you have 100 megawatts of power reserved but probably never used. What does a hyperscale data center do? It consolidates all of that into one facility, takes a look, and asks, “How likely is it that all 100 companies will need 1.2 or 1.5 megawatts of power at the same time?”

It's unlikely. There's natural diversification. What they're able to do is reduce how much everyone has to over-reserve and overbuild, which actually makes the grid more efficient. Fewer megawatts are wasted on reservations for a rainy day or a peak-usage day.
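The diversification argument can be made concrete with a small simulation. This is a minimal sketch: the 1 MW average and 2 MW individual reservation come from Alexey's example, while the demand distribution and spike frequency below are illustrative assumptions:

```python
# Compare 100 standalone server rooms (each reserving 2 MW against a 1 MW
# average) with one consolidated facility sized to the aggregate peak.
# The demand model below is an assumption for illustration only.
import random

random.seed(42)
N_COMPANIES, N_DAYS = 100, 365

def daily_demand_mw() -> float:
    # Most days hover near 1 MW; roughly 1 day in 20 spikes toward 1.5 MW.
    base = random.gauss(1.0, 0.1)
    spike = 0.5 if random.random() < 0.05 else 0.0
    return max(0.0, base + spike)

standalone_reservation = N_COMPANIES * 2.0  # everyone overbuilds to 2 MW

aggregate_peak = max(
    sum(daily_demand_mw() for _ in range(N_COMPANIES))
    for _ in range(N_DAYS)
)

print(f"Sum of standalone reservations: {standalone_reservation:.0f} MW")
print(f"Observed aggregate peak demand: {aggregate_peak:.1f} MW")
# The aggregate peak lands in the neighborhood of 105-110 MW, far below
# the 200 MW that 100 standalone rooms would reserve, because individual
# spikes rarely line up on the same day.
```

The exact numbers depend entirely on the assumed demand model, but the shape of the result does not: as long as the companies' peaks are not perfectly correlated, the peak of the sum is well below the sum of the peaks, which is the efficiency a consolidated facility captures.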

Then, of course, you've got the other factor: economies of scale. It is cheaper to build one very large building than a hundred small ones. That cost reduction is ultimately passed through to the users of the data center, and they get a better product.

I think that's a really important point, and you asked about this virtualization. The next step hyperscale data centers have taken is to say, “Okay, we're going to optimize our data centers even further by connecting, often, three data centers together, so that if one data center fails or has a power issue, it can fail over its data and workloads to another data center.”

That's something that didn't exist when people kept servers in the server rooms of their offices, or even in data centers with third-party operators, because it would have been cost-prohibitive to maintain a carbon copy of your main data center, let alone a second and a third.

It's a better product at a cheaper cost that ultimately reduces the amount of wasted reserved energy that older generations of data centers used.

Johnston: I've got one more question before we wrap up with final thoughts. This has been a great conversation. Continuing along the lines of what's happening in the energy complex, and some of the hesitation that energy companies have: I think you've done a really good job outlining how today differs from previous data center business models on the colocation side. That makes a lot of sense.

But then there's this question of economic justification, of capital investments justified through local economic growth. Some of the energy companies are saying, “Yes, this looks good for us, but how many jobs will it create versus, ideally, a new factory coming to town?” It's not completely up to them to decide, I get that, but from an employment-growth perspective they'd rather have a factory coming to town than a data center. How would you respond to those economic growth-related concerns from G&Ts and distributors saying, “I don't know if I want to invest all this capital just to support a data center”?

Teplukhin: Yes, it's a great question, and ultimately it's one that we in the data center industry have to face and grapple with: frankly, we have to persuade local communities as to why data centers are right for their particular region.

What we typically like to point out is that a data center requires a tremendous amount of capital to construct, and these are high-skill jobs: complex electrical work, installation, mechanical engineering, and plumbing-type work that brings high-skilled workers to the area.

But the other side of it, and I think an advantage versus an industrial facility or a warehouse, is that a data center is actually quite low touch from a maintenance perspective. It doesn't bring a huge amount of increased traffic. You don't get large trucks constantly dropping off and picking up packages.

That's a big benefit to the local population, because you're not taxing the roads or other resources around the area. The other thing, and I think a critical one, is property and sales tax on equipment. The good thing about data centers is that, from a community standpoint, a data center is effectively a very sophisticated power plug.

You're creating a very sophisticated power socket, and what uses that power? Computer servers. Now, the useful life of a computer server right now is between two and four years. Every two to four years the servers are refreshed, which means new computers are being bought.

Every time new computers are bought, sales tax is paid in that region, and staff are paid to install them, remove the old ones, and keep the data center running and humming along.
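To put rough numbers on that recurring revenue: the two-to-four-year refresh cycle is Alexey's figure, but every dollar amount and the tax rate in this sketch are hypothetical assumptions chosen only to show the shape of the math:

```python
# Illustrative arithmetic on the recurring tax base from server refreshes.
# The 2-4 year refresh cycle comes from the episode; the equipment cost
# and sales tax rate below are hypothetical placeholders.

it_equipment_cost = 500_000_000  # assumed server fleet cost for a large campus, $
refresh_years = 3                # midpoint of the 2-4 year refresh cycle
sales_tax_rate = 0.06            # assumed combined state/local sales tax

annual_equipment_spend = it_equipment_cost / refresh_years
annual_sales_tax = annual_equipment_spend * sales_tax_rate

print(f"Annualized equipment spend: ${annual_equipment_spend:,.0f}")
print(f"Recurring sales tax / year: ${annual_sales_tax:,.0f}")
# Under these assumptions: roughly $167M of equipment purchases and $10M
# of sales tax per year, recurring for as long as the facility keeps
# refreshing its servers.
```

The point is not the specific dollar figures, which vary by jurisdiction and facility size, but that the refresh cycle turns a one-time construction project into a repeating stream of taxable purchases.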

From a tax-base perspective, data centers generate natural, frequently recurring revenue for the local municipality or area. In many cases, bringing in highly skilled labor and setting up a quasi-annuity-like tax base is a very, very good use of land that, especially in some of these secondary or tertiary markets, might otherwise sit rural and underdeveloped. That's why a lot of communities right now are thinking, “Okay, we want to establish ourselves as a data center-friendly location. What are the right ways to do it?” Interacting with my firm and our portfolio companies is one of the ways we provide value both for the ultimate users, like a Microsoft, Amazon, or Google, and for the communities, marrying those two things together.

Johnston: That's great perspective. I don't think the nuances of the recurring aspect of a data center are fully appreciated yet. I'm glad you were able to walk us through that, because those are clearly very important points that are lost on some folks. And look, as you mentioned, these are long-life assets.

Nobody knows what things will look like 20 or 25 years from now, it's tough to say, but given the backdrop we opened this conversation with, the way things are changing with AI and the exponential growth in data creation and computation, the infrastructure being put in place today and over the next several years, I don't know, you tell me, I can't see that going anywhere.

Teplukhin: Look, I think that's certainly the perspective that I share, and that many people in the industry share. Nothing in life is without risk. There's always going to be some chance of something happening. Ultimately, what you are fundamentally betting on is that people continue to use computers. That is the fundamental bet.

As long as people continue to use computers, continue to consume data, and continue to produce data, data centers will be required. That's really the decision point. Can you think of a world 20, 30, 40 years from now where people are using fewer computers and less computing power than they are today? I think that's a hard state of the world to credibly imagine.

Every time the software gets better, people consume more of it. I agree with you. In our opinion, these are long-life assets, not assets that are great for 10 or 15 years and then go away. I think there are data centers being built now that will be in continuous operation 50 years into the future, if not longer. I think that's a very realistic outcome.

Johnston: Yes, I would agree. Envisioning a life where we're using less computing technology and software versus today just doesn't even seem realistic. I'm with you there. Alexey, this has been fantastic. Really appreciate your thoughts here. Before we wrap it up, I just want to give you an opportunity to share any closing thoughts or comments. The stage is yours.

Teplukhin: Yes, thank you very much. Look, I think we touched on a lot of the key topics for data centers. The one I would reiterate, because it's hard to grasp but is a really important driver, is the point I mentioned right at the beginning: more data has been created in the last two years than in all of human history. It's extremely hard to comprehend, but that is the reality. That is the statistic. Because of that, data centers are not a flash in the pan. Data centers are here to stay, and computing usage is here to stay. Investing in them, and having them actually benefit local communities, cities, rural areas, and secondary markets, is absolutely a trend we expect to continue in the near, medium, and long term. That's why I'm very excited about it, and I very much appreciate the questions today. Thanks for having me on the show.

Johnston: Absolutely. Well, let's leave it there. Alexey, you clearly know your stuff. This has been enlightening and a real pleasure. Thank you very much for being on the podcast today.

Teplukhin: Thank you.

Johnston: A special thanks goes out to Alexey for being on the podcast today. It feels like this whole generative AI phenomenon came out of left field and caught the data center and energy markets flat-footed. The projected amount of data creation and capital spend over the next few years is nothing short of mind-blowing, and the current supply/demand imbalance in the energy markets looks like it's going to get worse before it gets better. Rural markets are being asked to play a key role in all of this. I understand the hesitation on the part of rural electric distributors and G&Ts; they've been burned before by some colocation operators, and the capital investments we're talking about are significant. But the hyperscale model is different and appears to carry fewer risks than previous data center business models.

Hey, thanks for joining me today, and a special thanks to my fellow CoBank associates Christina Pope and Tyler Herron, who make this podcast possible. Watch out for the next episode of the All Day Digital podcast.
 

Disclaimer: The information provided in this podcast is not intended to be investment, tax, or legal advice and should not be relied upon by listeners for such purposes. The information contained in this podcast has been compiled from what CoBank regards as reliable sources. However, CoBank does not make any representation or warranty regarding the content, and disclaims any responsibility for the information, materials, third-party opinions, and data included in this podcast. In no event will CoBank be liable for any decision made or actions taken by any person or persons relying on the information contained in this podcast.
