Effective Ways to Overcome Project Challenges in AI
Henrik P. Nyman, PhD
Building a business model on the idea that companies outsource their AI projects is fairly novel, and so managing AI projects is also a relatively new undertaking. Project management in software development, on the other hand, is a well-established discipline. I am Henrik P. Nyman, Project Manager at Silo.AI. In this blog post I will compare these two fields from a project management perspective and discuss some of the main challenges we face in our projects.
In many ways, project management in AI is the same as in any software development project. Agile and waterfall approaches to development can both be used successfully in AI projects, and we use many of the same tools: Trello, Git, and Jira, to name a few. However, in AI projects you are likely to face a number of new challenges that are not present in traditional software development projects.
1. Setting the Right Expectations
Customer expectation management is very important in AI projects. There is a widespread misconception about what can and cannot be achieved using AI. For instance, given a training data set of numbers, text, images, or similar, an AI model can be trained to detect anomalies, classify new data points, and answer emails, to name only a few use cases. What cannot be done with an AI model, at least not yet, is to formulate a generic query, give the model access to the internet, and expect it to learn and complete generic tasks in a fully automated fashion. This kind of misconception is understandable, since it is the picture we are fed on a daily basis by TV shows and the like, but it constitutes a challenge for us. When a project starts, or preferably much earlier, we need to align the customer’s expectations with what can realistically be achieved.
2. Data: Quality vs Quantity
High-quality data is of utmost importance in AI projects. When we talk about data in software development projects, we are usually limited to customer stories, descriptions of workflows, and the odd database integration. While these are all relevant for AI projects as well, in an AI project data is also a non-negotiable prerequisite. It is an often-heard mantra in AI that you can never have too much data. However, quantity is not the only thing that matters. Arguably, the quality of the data is even more important. Quality is determined by a range of factors. Many of the data sources we encounter contain missing or erroneous data. For instance, we will find text in a field that should logically contain a number. If there is only a moderate number of such occurrences, pre-processing the data can solve the issue. If a large portion of the data is corrupted in this way, creating a reliable AI model may be infeasible.
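As a rough illustration of the kind of check described above, here is a minimal Python sketch that measures how much of a field that should be numeric is missing or unparseable, and then keeps only the usable values. The column contents, the function names, and the 20 % threshold are all hypothetical and chosen only for this example; what counts as a "moderate" amount of bad data depends on the project.

```python
import math


def is_number(value):
    """Return True if value parses as a finite number."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        # None raises TypeError, free text such as "fifty" raises ValueError.
        return False


def corrupted_fraction(values):
    """Fraction of entries that are missing or not numeric."""
    if not values:
        return 0.0
    bad = sum(1 for v in values if not is_number(v))
    return bad / len(values)


def clean_numeric(values):
    """Pre-processing step: keep only the numeric entries, as floats."""
    return [float(v) for v in values if is_number(v)]


# Hypothetical raw column: a field that should logically contain a number.
raw_ages = ["34", "41", "n/a", "29", None, "fifty", "38", "45"]

# Assumed, arbitrary cut-off between "moderate" and "too much" corruption.
MAX_CORRUPTED_FRACTION = 0.2

fraction = corrupted_fraction(raw_ages)
cleaned = clean_numeric(raw_ages)
needs_closer_look = fraction > MAX_CORRUPTED_FRACTION

print(f"{fraction:.1%} of entries are unusable, {len(cleaned)} remain")
```

Simply dropping the bad rows is only one possible pre-processing choice. Depending on the data, it may make more sense to fix the values at the source or to fill them in some other way.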
Another important aspect of the data, and of its value, is whether or not it is labeled. A label is a tag associated with a data point that is useful for training an AI model. For example, when training a model to tell whether an image contains a dog or a cat, each image in the training data could be labeled as a dog image or a cat image. These kinds of labels enable the use of so-called supervised models. If no such labels exist, unsupervised models need to be used. Compared to supervised models, the results produced by unsupervised models can be less predictable. While this by no means implies that the results are not useful, it will likely mean that more human input is required to interpret them.
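The difference between the two kinds of models can be sketched with a toy example in Python. This is only an illustration under strong assumptions: instead of images, each animal is described by a single hypothetical feature (its weight in kg), the supervised model is a simple nearest-centroid classifier, and the unsupervised model is a one-dimensional k-means with two clusters. None of this reflects how a real image model is built.

```python
def train_centroids(points, labels):
    """Supervised: average the points that share the same label."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Return the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(points, iterations=10):
    """Unsupervised: split points into two anonymous clusters (1-D k-means)."""
    centers = [min(points), max(points)]
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        assignment = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            for x in points
        ]
        # Move each center to the mean of the points assigned to it.
        for k in (0, 1):
            members = [x for x, a in zip(points, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignment


# Hypothetical training data: animal weights in kg and their labels.
weights = [3.5, 4.0, 4.2, 22.0, 25.0, 30.0]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

centroids = train_centroids(weights, labels)
prediction = predict(centroids, 5.0)  # a named answer: "cat"
clusters = two_means(weights)         # only cluster ids: [0, 0, 0, 1, 1, 1]

print(prediction, clusters)
```

With labels, the model answers in the customer's own terms. Without them, it only reports that the data falls into group 0 and group 1, and a person still has to work out what those groups mean.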
3. Data Access Rights
Another issue concerning data is actually getting access to it. This can be a much more cumbersome task than the customer originally estimated, and it can throw a project timeline way off from the get-go. A common reason for delayed data delivery is privacy issues. A company might only have the right to use data internally, or only have permission to use it for certain purposes that do not include training an AI model. In these cases, companies need to acquire special permission to share the data with us. The EU’s General Data Protection Regulation (GDPR) also adds a new set of necessary restrictions on how data can be used. For an interesting take on GDPR, please see the blog post on the matter by Silo.AI’s own Erlin Gulbenkoglu.
4. Uncertain Results
The fundamental difference between a software project and an AI project is that positive results are not always assured. Before we have had a chance to examine the data and perform an initial feasibility study, it is impossible to know whether or not an AI model can be developed. Sometimes the study shows that the data cannot support a model. The best we can do in these cases is to advise the customer on how they can improve the way they collect and store data. While this is obviously not the most desired result, it can be a very valuable realization in itself. It will likely lead to the customer being able to make a minor adjustment to their workflow that enables the creation of a useful AI model in the near future.
There are not always sure-fire solutions available in projects. Sometimes the customer’s expectations are too high. Sometimes the data does not meet the required standard. The best way to deal with these situations is, as is usually the case, communication. Frequent communication helps manage expectations and makes it easier to escalate possible issues related to, for instance, the data. In AI projects, as in software development projects, keeping the customer in the loop can often be the key to a successfully completed project.