The process of building and deploying AI systems often bears no resemblance to traditional IT. Alegion explores three key strategies to being AI-ready.
Media coverage of AI would have us believe that it is being driven at full speed by expert teams according to well-understood practices and policies. Most of you know that the reality is quite different. It’s really still the early days for AI, and in our experience, many companies aren’t yet AI-ready.
Note that this isn’t a statement about data scientists or algorithms. Data scientists know what they are doing, and most organizations have no cause to worry about the soundness of their machine learning (ML) algorithms.
Where AI readiness typically lags is in other parts of the process. In most organizations today, the process of building, deploying and maintaining AI systems bears no resemblance to traditional IT.
For organizations pursuing their first production ML project, the lack of AI readiness isn’t always apparent at first. In the enterprise organizations we serve, it’s quite common for a team of data scientists to conduct an internal proof of concept (POC) that applies machine learning to a pressing business problem. When successful, the POC (often built on an off-the-shelf algorithm and open source training data) demonstrates an opportunity to save money, increase efficiency, or improve customer experience using AI.
POCs are heady experiences, full of promise. But they do not typically expose an organization’s lack of AI readiness.
It’s when data science teams get the green light to build a production system that life gets complicated. It’s at this point that data scientists encounter three non-obvious barriers to AI readiness:
1. Their data house isn’t in order
POCs don’t need a lot of data. But getting an ML system into production, at a confidence level high enough to deliver ROI, requires enormous volumes of labeled and annotated data.
It’s at this point that the data science team learns that the organization’s data is messier than they thought. All of a sudden, they’re looking at an enterprise data cleanse just to have a usable dataset.
And then they have to turn that data into training data. Data scientists fully understand the volume of data that they need to train their algorithm. What they don’t always understand is that no one else in the organization knows that. Which means that the data science team is on the hook to prepare training data. All of it.
2. Their team is incomplete
The small team of data scientists that stood up the POC quickly discovers that building a production system requires skills they don’t have. They need project management skills, people who can design and execute training data preparation, people who can curate and manage the workforces doing the labeling and annotation, the workforce members themselves, ML software engineers who can build real production software, and a host of others. And it helps to have someone on board who knows how to navigate the rest of the enterprise, because that’s where HR, Finance, Purchasing and Legal live.
3. Their budget doesn’t reflect the enterprise’s enthusiasm for AI
Every year Gartner asks CIOs in every industry about their priorities. AI is in the top three on everyone’s list. Then they’re asked where they plan to spend money. AI invariably falls outside the top 10.
In first-time ML projects, data science teams end up doing everything because the organization hasn’t budgeted properly. In part this reflects the current reality that “everyone wants AI, but no one wants to pay for it.” After hiring hard-to-find and very expensive data scientists, the organization is in sticker shock.
But there’s another reason that ML project budgets are inadequate: no one on the data science team knows how to navigate the enterprise budgeting process. We spend as much time coaching clients on how budgets work as we do discussing training data preparation and ensuring they are AI-ready.
With your POC approved, budget secured, data cleaned up and consolidated, and your team hired, you have bridged the AI-readiness gap. Now it’s time to start training your algorithm.
We encounter a number of data science teams that prefer to prepare their training datasets internally. They may have concerns about data security. They may be convinced that their needs are unique and unlikely to be met by an outside vendor. Or they may have skipped the budget step above and find themselves unable to procure outside help.
If you are considering preparing ML training data in-house, we’ve put together a How-to Guide to Training Data Prep. It describes the resources you’ll need to do the job and includes a checklist for measuring your readiness.
This guest post is from Alegion — training data labeling for machine learning.