Eventually, all organizations will have to leverage machine learning to stay competitive. As with the move to web applications, mobile applications, and the cloud, machine learning will become the norm, not the exception, in a modern technology stack.
While there is certainly a lot of hype surrounding artificial intelligence and machine learning technology, there is also a very promising reality of augmenting conventional software development with software that learns from activities and can respond appropriately, and in ways not feasible with standard coding. However, the process involved in building effective machine learning technology is quite a bit different from regular software development and, as such, requires a different approach and mindset. This mindset requires acknowledging that machine learning development is largely a scientific process instead of a software development process. I call this the AI mindset.
Artificial Intelligence vs. Machine Learning: What are the differences?
For the record, my view on terminology is that Artificial Intelligence (AI) is “aspirational,” and Machine Learning (ML) is “practical.” That is, most of what everyone is calling AI today is really ML. The hope is that further progress in ML research and techniques may get us closer to software that resembles true human intelligence, but whether or not we will eventually achieve this is highly debatable. Nonetheless, initiatives that fall under the banner of AI are real and can help businesses improve customer experiences and streamline mission-critical processes.
All stakeholders in an ML project need to understand the AI mindset so that there is a common understanding of the resources required, the measurements of progress, and the realities of what is practical with current ML techniques. There are reasons why academia and industry exist separately and serve different purposes. Early on, large companies such as IBM, Xerox, and DEC saw the value of the scientific approach and funded mostly autonomous research labs. Today, most businesses cannot fund a fully separate research lab in the hope that some of the results will be productized. But the potential benefits of ML, and the consequences of not embracing ML, require businesses to approach this technology differently.
What is an AI mindset?
As alluded to above, the AI mindset bridges industry and academic perspectives. Having straddled the worlds of academia and the private sector, I can speak well to employing a hybrid approach. For the last 20 years, I’ve researched and taught computer science at the University of San Francisco (USF). I’m also the Chief Scientist at SnapLogic, where I work on forward-looking platform research and lead the machine learning program.
My first project at SnapLogic in the summer of 2010 was a machine learning project in which I developed statistical techniques to learn field name mappings so that future field mappings could be predicted. I spent most of my time understanding the data, specifying a representation, and conducting thousands of experiments to validate the model. I generated hundreds of reports and data visualizations so that I could convince myself and others that the approach had merit. The research project was successful and subsequently added to the product.
Fast forward five years, and I started exploring new areas in the product where ML could be applied. I worked with USF graduate students to explore recommendations and sentiment analysis. Ultimately, we were successful in recommendations, which resulted in our Iris Integration Assistant technology. Today, the ML team continues to find ways to improve the user experience through the application of ML technology.
More than just software
My dual role as an academic and chief scientist has allowed me to apply a research mentality in an industry setting.
One of the most important aspects of the AI mindset is the recognition that developing effective ML technology is not like implementing a software feature. While we have pretty good strategies for estimating the cost of conventional coding, the uncertainty in developing ML makes it harder to estimate the time required to achieve success, or whether success is achievable at all. While ML is ultimately just software, the way the software is constructed is based on learning from observed past data. The trick is finding the right historical data, the right prepared version of the data, and the right ML algorithm. However, finding the right combination requires an iterative process that spans data profiling, data preparation, algorithm selection, algorithm configuration, experimentation, and evaluation.
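To make the iterative process concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the approach used at SnapLogic: the toy data set, the two preparation steps, the two algorithms, and every function name below are hypothetical. The point is only the shape of the loop, which tries each combination of data preparation, algorithm, and configuration, then evaluates it on held-out data.

```python
import itertools
import random


def make_toy_data(n=200, seed=0):
    """Hypothetical stand-in for historical data: two features, binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.uniform(0, 1)
        x2 = rng.uniform(0, 1000)  # deliberately on a much larger scale than x1
        rows.append(((x1, x2), 1 if x1 > 0.5 else 0))
    return rows


# --- Data preparation candidates: each is fit on training data only ---
def fit_identity(train):
    return lambda features: features


def fit_minmax(train):
    columns = list(zip(*(features for features, _ in train)))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return lambda features: tuple(
        (v - lo) / span for v, lo, span in zip(features, lows, spans)
    )


# --- Algorithm candidates: each returns a predict(features) function ---
def train_majority(train):
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority


def train_knn(train, k=1):
    def predict(features):
        def distance(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], features))

        votes = [label for _, label in sorted(train, key=distance)[:k]]
        return max(set(votes), key=votes.count)

    return predict


def run_experiments(rows, seed=0):
    """Evaluate every preparation x algorithm x configuration combination."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    train, test = shuffled[:cut], shuffled[cut:]

    preparations = {"raw": fit_identity, "minmax": fit_minmax}
    algorithms = {
        "majority": (train_majority, [{}]),
        "knn": (train_knn, [{"k": 1}, {"k": 5}]),
    }

    results = []
    for (prep_name, fit_prep), (algo_name, (trainer, configs)) in itertools.product(
        preparations.items(), algorithms.items()
    ):
        transform = fit_prep(train)
        prepared_train = [(transform(f), label) for f, label in train]
        for config in configs:
            model = trainer(prepared_train, **config)
            correct = sum(model(transform(f)) == label for f, label in test)
            results.append(
                {
                    "prep": prep_name,
                    "algo": algo_name,
                    "config": config,
                    "accuracy": correct / len(test),
                }
            )
    # Best combination first
    return sorted(results, key=lambda r: r["accuracy"], reverse=True)


if __name__ == "__main__":
    for result in run_experiments(make_toy_data()):
        print(result)
```

Even in this toy setting, the outcome depends on the combination rather than on any single choice, which is why the effort is hard to estimate up front. A real project would add data profiling before this loop and would likely use a library rather than hand-written algorithms.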
This iterative process is essentially an application of the scientific method, and everyone involved in an ML project needs to understand the uncertainty in this process. When starting out, consider establishing milestones based on the data sets used, the features selected, and the ML algorithms employed. While this won’t give you a better time bound, it will help everyone understand the progress. It may be the case that a particular ML problem just requires too much time and effort, or is currently not technically tractable.
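One way to track such milestones is a simple experiment log. The sketch below is a hypothetical illustration: the `Milestone` and `ProgressLog` names, their fields, and the sample values in the usage example are assumptions made for this example, not part of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Milestone:
    """One completed experiment, described by what was tried."""

    dataset: str
    features: tuple
    algorithm: str
    metric: float  # e.g. held-out accuracy; higher is assumed to be better
    note: str = ""


@dataclass
class ProgressLog:
    milestones: list = field(default_factory=list)

    def record(self, milestone):
        self.milestones.append(milestone)

    def best_so_far(self):
        """Running best metric after each milestone, in the order recorded."""
        best, running = float("-inf"), []
        for milestone in self.milestones:
            best = max(best, milestone.metric)
            running.append(best)
        return running

    def coverage(self):
        """What has been explored so far, independent of the results."""
        return {
            "datasets": sorted({m.dataset for m in self.milestones}),
            "feature_sets": sorted({m.features for m in self.milestones}),
            "algorithms": sorted({m.algorithm for m in self.milestones}),
        }


if __name__ == "__main__":
    # Placeholder values for illustration only; not real results.
    log = ProgressLog()
    log.record(Milestone("sample_v1", ("name",), "baseline", 0.50))
    log.record(Milestone("sample_v1", ("name", "type"), "knn", 0.40, "regressed"))
    log.record(Milestone("sample_v2", ("name", "type"), "knn", 0.70))
    print(log.best_so_far())
    print(log.coverage())
```

Reporting coverage alongside the running best lets stakeholders see progress even when the metric itself has not improved, since a milestone that rules out a data set or algorithm is still a result.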
Once you have gone through the process a few times, you and your team may be able to better estimate the ML development effort if a new initiative shares some aspects of your first projects. However, new applications of ML will require the same experimental process.
Where the industry perspective comes in
While I have argued that industry needs to embrace a more academic approach to machine learning, there are industry practices that can be leveraged to help make ML projects successful. Most important is an understanding of the problem, the “use case.” Academics often have trouble establishing a problem, or they end up solving problems that aren’t useful. ML projects can greatly benefit from well-defined goals or outcomes. In the case of ML, the outcomes may be initially out of reach, but having a good target in the first place will help get a project moving in the right direction. The original outcome can be revised as you gain knowledge about the limitations of the data and algorithms.
Putting a time bound on a project is reasonable and required in an industry setting. Some academics can spend years on a specific research problem. Most companies do not have the luxury of supporting fully unbounded research initiatives. So, a time bound can limit costs, but it cannot guarantee that a project will succeed. Instead, an ML project should have additional outcomes that can be useful to the organization and future ML initiatives. If you can define these additional outcomes upfront, an ML project can still be successful without a complete ML implementation. For example, a lot of operational data is not well documented. Part of the ML process could be to provide an explanation of the current operational schema and how it is used.
The best of both worlds
I’ve been very lucky to apply my academic mentality and sensibility in an industry setting at SnapLogic. I encourage other companies and professors to create similar collaborations that allow faculty to work part-time and to involve students in the research. Especially in the world of ML, industry is uniquely positioned to provide large-scale data and interesting problems that are difficult to recreate in the lab.
Harnessing this AI mindset can help drive innovation and ultimately improve the human-computer interface.
For more information, see the blog post, “IP Expo Europe 2018 recap: AI and machine learning on display.”