2023 was generative AI's breakout year, when organizations started looking into how to integrate AI into every aspect of their tech stacks and operations.
But as companies look more closely at their AI deployments over the latter half of 2024, the most important question won't be what they can do with the technology, but how much it will all cost. Because there is no single blanket strategy for developing AI, there is often confusion about the overall price.
By understanding the type of AI you're training, its latency requirements, how much training data it needs, and what third-party help you'll bring in, you can ensure that your company innovates without breaking the bank.
Understanding the type of AI you’re training
Knowing how complex a problem you want your AI to solve has a huge impact on the computing resources needed, and on cost, in both the training and implementation phases. Given the wide range of AI projects, from training chatbots to self-driving cars, understanding the models you're working with and the resources they require is vital to matching costs to expectations.
AI tasks are hungry in every way: they need a lot of processing power, storage capacity, and specialized hardware. As you scale up the complexity of the task, you can rack up huge bills sourcing the most coveted components; the Nvidia A100, for example, runs about $10,000 per chip. You'll also need to decide whether your project requires building a brand-new model or fine-tuning an existing open source one; the two approaches come with radically different budgets.
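To make that budget gap concrete, here is a minimal back-of-the-envelope sketch in Python. The GPU counts, run times, and hourly rate are illustrative assumptions, not vendor quotes; plug in your own numbers.

```python
# Rough compute-cost estimator. Every figure below is an illustrative
# assumption -- substitute quotes from your own cloud or hardware vendor.

def training_cost(num_gpus: int, hours: float, usd_per_gpu_hour: float) -> float:
    """Estimate compute cost as GPUs x wall-clock hours x hourly rate."""
    return num_gpus * hours * usd_per_gpu_hour

# Fine-tuning an existing open source model: small cluster, short run.
fine_tune = training_cost(num_gpus=8, hours=24, usd_per_gpu_hour=2.50)

# Pretraining a new model from scratch: large cluster, weeks of compute.
from_scratch = training_cost(num_gpus=512, hours=24 * 21, usd_per_gpu_hour=2.50)

print(f"Fine-tuning estimate:  ${fine_tune:,.0f}")     # $480
print(f"From-scratch estimate: ${from_scratch:,.0f}")  # $645,120
```

Even with these toy numbers, the gap between fine-tuning and pretraining spans three orders of magnitude, which is why the build-versus-adapt decision belongs at the top of any AI budget.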
Storing training data
AI training requires a ton of data. Exact figures are hard to come by, but a large AI model will need a minimum of tens of gigabytes of data and, at a maximum, petabytes. For example, GPT-3 is estimated to have been trained on roughly 570GB of filtered text, distilled from about 45TB of raw data (OpenAI considers the actual dataset details proprietary). How large a dataset you need is a hot area of research at the moment, as is the number of parameters and hyperparameters. The general rule of thumb is that you need roughly 10 times more training examples than parameters. As with all things AI, your use case heavily influences how much data you need, how many parameters and hyperparameters you include, and how those factors interact over time.
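As a quick illustration of that rule of thumb, the sketch below turns a parameter count into a minimum example count. The model sizes are hypothetical, and the 10x multiplier is only the rough heuristic mentioned above, not a law.

```python
# Back-of-the-envelope data sizing using the "10x examples per parameter"
# rule of thumb. Model sizes here are hypothetical placeholders.

def min_training_examples(num_parameters: int, multiplier: int = 10) -> int:
    """Rule-of-thumb floor: roughly 10 training examples per parameter."""
    return num_parameters * multiplier

for name, params in [("small model", 10_000_000), ("mid-size model", 1_000_000_000)]:
    print(f"{name}: ~{min_training_examples(params):,} examples")
# small model: ~100,000,000 examples
# mid-size model: ~10,000,000,000 examples
```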
Latency requirements
When considering the overall cost of AI creation, it's essential to account for both durable and temporary storage. Throughout the training process, the primary dataset is constantly being transformed and split into subsets, and each of those subsets needs to be stored separately. Even when you're running inference on an already trained model, which will be the primary use of your model once deployed, response time is affected by caching, processing, and latency.
The physical location of your data storage makes a difference in how quickly tasks can be completed. One way to address this is to create temporary storage on the same chips as the processor doing the work. Another is to keep the whole processing and storage cluster co-located in a data center close to the end user, as UC San Diego does with TritonGPT.
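To see why caching matters so much for inference latency, here is a minimal sketch. The model call is a hypothetical stand-in (a sleep simulating processing plus network latency), not a real inference API.

```python
import time
from functools import lru_cache

def run_inference(prompt: str) -> str:
    """Stand-in for a real model call; the 0.5s sleep simulates
    processing time plus network latency to a remote GPU cluster."""
    time.sleep(0.5)
    return f"response to: {prompt}"

@lru_cache(maxsize=1024)
def cached_inference(prompt: str) -> str:
    # Repeated prompts are served from memory instead of re-running the model.
    return run_inference(prompt)

for attempt in ("cold", "warm"):
    start = time.perf_counter()
    cached_inference("What is our refund policy?")
    print(f"{attempt} request: {time.perf_counter() - start:.3f}s")
# cold request: ~0.500s (hits the model)
# warm request: ~0.000s (served from cache)
```

Production systems cache at many layers (on-chip memory, co-located storage, response caches), but the principle is the same: the less often a request has to travel to the model and back, the cheaper and faster it is.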
Bringing in third-party assistance
After determining the specific needs of your AI project, ask yourself whether you need to bring in outside help. Many vendors offer pre-existing models or managed services that can deliver the results you expect at a fraction of the price of striking out on your own.
A good place to start is the open source community Hugging Face, to see whether its wide variety of models, datasets, and no-code tools can help you out. On the hardware side, specialized services like CoreWeave offer easy access to advanced GPUs at a much lower cost than the legacy vendors, or than building your own infrastructure from scratch.
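As a sketch of how little code the "adapt, don't build" route can take, the snippet below pulls a pretrained sentiment model from the Hugging Face Hub with the transformers library. The model checkpoint named here is just one example of many available.

```python
# Requires: pip install transformers torch
from transformers import pipeline

# Download a pretrained model from the Hugging Face Hub instead of
# training one from scratch; this checkpoint is one example of many.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("The new storage tier cut our training costs in half."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

A few lines like these can replace an entire training pipeline when an off-the-shelf model is good enough for your use case.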
Saving on AI expenses can add up
Keeping up with the ever-changing AI industry doesn't have to be difficult. But as with past hype cycles around the cloud and big data, investing without a clear understanding or direction can lead to overspending.
While it's exciting to speculate about when the industry will reach artificial general intelligence (AGI) or how to get access to the most powerful chips, don't forget that deployment costs will be just as important in determining how the industry evolves. Looking into the most cost-effective options for developing AI solutions now will help you budget more resources for AI innovation in the long run.
About the Author
Chris Opat joined Backblaze as the senior vice president of cloud operations in 2023. Before joining Backblaze, he served as senior vice president of platform engineering and operations at StackPath, a specialized provider in edge technology and content delivery. He brings a passion for building teams of experienced technologists who push the envelope to create a best-in-class experience for Backblaze customers. Chris has over 25 years of experience in building teams and technology at startup and scale-up companies. He also held leadership roles at CyrusOne, CompuCom, Cloudreach, and Bear Stearns/JPMorgan. Chris earned his B.S. in Television & Digital Media Production at Ithaca College.