The premise of Keith’s discussion is that deep learning has changed since 2012, but little else in AI/ML has changed in the past 20 years.
What is AI?
- Natural language = NLU (understanding), NLG (generation), text mining (existing)
- Robotics = driverless cars and automation (new)
- Computer perception = visual recognition (new)
- Machine learning (ML) = deep learning (DL), reinforcement learning (new), supervised and unsupervised (existing)
- Expert systems = knowledge engineering (new), business rules (existing)
He was intrigued by Bank of America’s “Erica” virtual financial assistant and showed a video in which customers:
- Paid bills from anywhere, anytime
- Sent money to friends
- Talked, typed, and tapped
He had a great question for BoA data analysts, “Is the data from this app being captured and is that data being included in the models they are building along with all other customer interaction data?”
Traditional text mining extracts context from the text that gets fed into the predictive model — this has been done for more than 20 years.
Driverless cars are trained through reinforcement learning, and people say, “The model is not great now but will learn from its mistakes.” Supervised ML, by contrast, drives most of what we do, but its models degrade over time.
Visual recognition is making huge gains, and speech is being driven by DL, which differs from traditional ML. DL requires different raw material and greater volume. It’s not like churn models and predictive maintenance. DL is focused on speech, text, and visuals.
What’s different is that all of this is being done without programming. Neural nets were sleepy. Today they are more popular.
DL is exciting in what it can do but requires a new focus.
With ML, the models are generated by the algorithm. Why is the term “cognitive” hot now?
ML is a broad term that generally refers to presenting carefully curated data to computer algorithms that find patterns and systematically generate models (formulae and rule sets). While the algorithms are explicitly programmed, the models are not.
Carefully curated data is sent to an algorithm that was built by a human.
Supervised ML is given a dataset with a “target variable” and “input variables.” A modeling algorithm automatically generates a model (a formula or ruleset), which establishes a relationship between the target and some or all of the input variables. There are anywhere from one to many algorithms per use case.
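To make the split between a programmed algorithm and a generated model concrete, here is a minimal sketch. It assumes a one-rule decision stump as the modeling algorithm, which is a stand-in chosen for brevity and not a method named in the talk. The churn records and the variable names (`tenure_months`, `support_calls`, `churned`) are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]
Model = Tuple[str, float, int]  # (input variable, threshold, label predicted above threshold)


def fit_stump(rows: List[Record], inputs: List[str], target: str) -> Model:
    """The hand-written algorithm: try every input variable and threshold,
    and keep the single rule that makes the fewest errors on the history."""
    best = None  # (errors, variable, threshold, label_above)
    for var in inputs:
        for threshold in sorted({r[var] for r in rows}):
            for label_above in (0, 1):
                errors = sum(
                    1
                    for r in rows
                    if (label_above if r[var] > threshold else 1 - label_above) != r[target]
                )
                if best is None or errors < best[0]:
                    best = (errors, var, threshold, label_above)
    _, var, threshold, label_above = best
    return var, threshold, label_above


def predict(model: Model, row: Record) -> int:
    """Apply the generated rule to one record."""
    var, threshold, label_above = model
    return label_above if row[var] > threshold else 1 - label_above


# Hypothetical history: "churned" is the target variable, the rest are inputs.
history = [
    {"tenure_months": 2, "support_calls": 5, "churned": 1},
    {"tenure_months": 3, "support_calls": 4, "churned": 1},
    {"tenure_months": 30, "support_calls": 1, "churned": 0},
    {"tenure_months": 24, "support_calls": 0, "churned": 0},
    {"tenure_months": 5, "support_calls": 1, "churned": 1},
    {"tenure_months": 18, "support_calls": 4, "churned": 0},
]

# Nobody typed in the rule; the algorithm produced it from the data.
model = fit_stump(history, ["tenure_months", "support_calls"], "churned")
```

On this toy data the generated rule uses only `tenure_months`, which illustrates the point that a model may relate the target to some, not all, of the input variables.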
1,000 driverless cars are all hooked up so learning from one gets communicated to the others.
If we have three years of fraud data, we should present it all to the algorithm. However, we need to remember that a supervised model built on historical data is only going to degrade over time.
Virtually everything becomes supervised learning with binary classification because we want to make better decisions. It’s harder to come up with an intervention strategy for a predicted next medical diagnosis code than for a yes/no outcome such as admit/readmit in healthcare. This is why problem definition is important. A simple decision tree or regression can be good because a black box model may not be suitable.
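One way to watch for that degradation is to compare predictions with actual outcomes period by period. The sketch below assumes a prediction log of `(period, predicted, actual)` rows; the log format and its values are hypothetical, made up purely to illustrate the bookkeeping.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def accuracy_by_period(log: List[Tuple[str, int, int]]) -> Dict[str, float]:
    """Share of correct predictions per period; rows are (period, predicted, actual)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, predicted, actual in log:
        totals[period] += 1
        hits[period] += int(predicted == actual)
    return {period: hits[period] / totals[period] for period in sorted(totals)}


# Hypothetical log for a fraud model built once and never refreshed.
prediction_log = [
    ("2024-01", 1, 1), ("2024-01", 0, 0), ("2024-01", 1, 1), ("2024-01", 0, 0),
    ("2024-06", 1, 1), ("2024-06", 0, 1), ("2024-06", 0, 0), ("2024-06", 1, 1),
    ("2024-12", 1, 0), ("2024-12", 0, 1), ("2024-12", 0, 0), ("2024-12", 1, 1),
]

trend = accuracy_by_period(prediction_log)
```

A falling trend across periods is a signal to rebuild the model on more recent data.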
Then what is unsupervised learning? Not having a target is not sufficient. What makes it unsupervised is when the whole model being right or wrong does not apply.
Unsupervised learning comes down to finding natural groupings and determining whether they are common or rare.
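A rough sketch of that idea on a single variable: split sorted values wherever there is a large gap, then label each group by its share of the data. The gap-based grouping is a deliberately simple stand-in for a real clustering algorithm, and the amounts, the `max_gap` value, and the 25% cutoff for "common" are all assumptions for illustration.

```python
from typing import List


def natural_groups(values: List[float], max_gap: float) -> List[List[float]]:
    """Split sorted values into groups wherever neighbours are more than max_gap apart."""
    if not values:
        return []
    ordered = sorted(values)
    groups = [[ordered[0]]]
    for value in ordered[1:]:
        if value - groups[-1][-1] > max_gap:
            groups.append([value])  # large gap: start a new group
        else:
            groups[-1].append(value)
    return groups


# Hypothetical transaction amounts. Note there is no target variable.
amounts = [12, 15, 11, 14, 13, 16, 12, 95, 102, 98, 5000]

groups = natural_groups(amounts, max_gap=20)
shares = [len(group) / len(amounts) for group in groups]
labels = ["common" if share >= 0.25 else "rare" for share in shares]
```

There is no notion of the grouping being right or wrong here; the output is simply which groupings exist and how large each one is.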
Watch Malcolm Gladwell’s TED Talk on spaghetti sauce. One-third of Americans prefer chunks of vegetables in their spaghetti sauce. They discovered something they were not looking for.
Computers “teaching themselves”? Google Brain has gone from “diagonal line mode” to “cat mode” to “face mode.” Data is getting more granular.
If I have three billion transactions, should I sample some of them? How can you randomly select transactions when you don’t know which customers they belong to? Should you restrict the sample to active customers from the previous year? You cannot have too many rows; typically you have too few.
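One reading of the sampling question is that the unit to sample is the customer, not the transaction, so that every sampled customer keeps a complete history. The sketch below follows that reading; it is an interpretation, not something spelled out in the talk, and the `customer_id` field and the toy data are hypothetical.

```python
import random
from typing import Dict, List


def sample_by_customer(transactions: List[Dict], fraction: float, seed: int = 0) -> List[Dict]:
    """Randomly pick a fraction of customers and keep every transaction they made."""
    customer_ids = sorted({t["customer_id"] for t in transactions})
    rng = random.Random(seed)  # seeded so the sample is reproducible
    n_keep = max(1, round(len(customer_ids) * fraction))
    chosen = set(rng.sample(customer_ids, n_keep))
    return [t for t in transactions if t["customer_id"] in chosen]


# Hypothetical data: 10 customers with 3 transactions each.
transactions = [
    {"customer_id": customer, "amount": 10 * customer + visit}
    for customer in range(10)
    for visit in range(3)
]

sample = sample_by_customer(transactions, fraction=0.3)
sampled_customers = {t["customer_id"] for t in sample}
```

Sampling rows directly would instead leave most customers with partial histories, which matters when features are built per customer.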
Deep learning tells different stories with tens or hundreds of millions of rows of data.
All roads lead to binary classification. Most solutions are ultimately deployed as classification models.
Real-world deployed solutions are not one model; they are a series of models.
Predictive analytics is the selection and analysis of historical data, accumulated during the normal course of doing business. Predictive models are built by finding and confirming previously unknown patterns, deploying models, and scoring the current data to make measurably better data-driven decisions.
Now HR and middle managers are emphasizing the programming side with R, Python, and other tools.
How do we use point-of-sale (POS), healthcare, and transactional data that has not been perfectly defined? We need to be thoughtful about what data is relevant or applies. More is not necessarily better if it includes cases that will not be recorded at closing (i.e., death). Features need to be dynamic in nature.
Translate into a predictive analytics problem. Most predictive analytics models produce a propensity score for a specific decision. It determines which outcome is most likely.
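As an illustration of a propensity score feeding a decision, the sketch below uses a logistic-regression style formula. The coefficients, variable names, and the 0.5 cutoff are hypothetical; in practice the coefficients would come from a modeling algorithm and the cutoff from the business case.

```python
import math
from typing import Dict

# Hypothetical coefficients, as if produced earlier by a modeling algorithm.
INTERCEPT = -1.0
WEIGHTS = {"support_calls": 0.8, "tenure_years": -0.5}


def propensity(record: Dict[str, float]) -> float:
    """Logistic formula: turns the weighted inputs into a score between 0 and 1."""
    z = INTERCEPT + sum(weight * record[name] for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def decide(record: Dict[str, float], threshold: float = 0.5) -> str:
    """Turn the score into the specific yes/no decision the model was built for."""
    return "intervene" if propensity(record) >= threshold else "no action"


# Scoring works on a single record; no batch is required.
at_risk = {"support_calls": 4, "tenure_years": 1}
loyal = {"support_calls": 0, "tenure_years": 6}

at_risk_decision = decide(at_risk)
loyal_decision = decide(loyal)
```

Keeping the score and the decision rule separate means the cutoff can change with the business case without rebuilding the model.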
You need a plausible scenario to deploy the model. Start with plausible deployment scenarios. If you get insights on the side, that’s great.
Data -> Models -> Scores = the goals
Know how you’re going to use your findings upfront.
Know that the minimum number of records to score is one.
Model building is not computationally complex.
How often you run the model depends on how much the variables are changing.
Decisions are driven by data and scores, equations, and rule sets.
PMML stands for Predictive Model Markup Language and has been around for about 20 years. Most analytics tools and languages can export or consume PMML.
Landline churn in the 1980s was one of the first big uses of data mining.
After the model is built, how much is the process like the scientific method? There is no statement of the hypothesis. There are mostly yes/no questions in statistics. Revisit the KPIs of the business that drove the project in the first place.
Perform an initial cost-benefit analysis:
- Start with what you know.
- Estimate the annual dollar cost of the problem.
- Is the potential solution a big enough project to justify a team’s effort?
If you don’t have the opportunity to save the company at least $1 million, it’s not worth pursuing an AI/ML solution because that’s how much time and money will go into building an AI/ML team.
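The cost-benefit check above reduces to simple arithmetic. In this sketch the $1 million threshold is the figure from the talk, while the annual loss amounts and the expected reduction rate are hypothetical inputs you would estimate yourself.

```python
TEAM_COST_THRESHOLD = 1_000_000  # the $1 million figure cited above


def potential_savings(annual_problem_cost: float, expected_reduction: float) -> float:
    """Annual dollars saved if the solution removes the given share of the problem."""
    return annual_problem_cost * expected_reduction


def worth_pursuing(annual_problem_cost: float, expected_reduction: float) -> bool:
    """True when the estimated savings at least match the cost of building a team."""
    return potential_savings(annual_problem_cost, expected_reduction) >= TEAM_COST_THRESHOLD


# Hypothetical estimates: annual fraud losses and a 20% expected reduction.
large_problem = worth_pursuing(annual_problem_cost=8_000_000, expected_reduction=0.20)
small_problem = worth_pursuing(annual_problem_cost=2_000_000, expected_reduction=0.20)
```

With these assumed numbers, the $8 million problem clears the bar and the $2 million problem does not.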