Analytical Skills for AI and Data Science

🚀 The Book in 3 Sentences

This book strives to bridge the gap between business and data science. It is about understanding the business value of analytics and deciding what is important to work on.

🎨 Impressions

The most important skills of a data scientist are business savvy and communication mastery. Understanding which actions to take is what drives the value proposition.

✍️ My Top Quotes

A third group, the practitioners, actually dislike the term and prefer to use the less sexy machine learning (ML) label to describe what they do.
Not so long ago, the queen of tech headlines was big data, and hardly anyone talked about AI.
The first pillar involved the now well-known three Vs: volume, variety, and velocity. The internet transformation had provided companies with ever-increasing volumes of data.
“By analytics we mean the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.”
“Analytical skills are, simply put, problem-solving skills. They are characteristics and abilities that allow you to approach problems in a logical, rational manner in an effort to sort out the best solution.”
In this book I will define analytical reasoning as the ability to translate
Take Tyrion Lannister’s quote in the Game of Thrones “The Dance of Dragons” episode: “It’s easy to confuse what is with what ought to be, especially when what is has worked out in your favor” (my emphasis). Tyrion seems to be claiming that we have the tendency to confuse the descriptive and prescriptive when things turn out well, in what may well be a form of confirmation bias.
How should we value our customers? One approach is to assign the current value derived from each one of them. The problem with this short-term view is that companies invest in their customers all the time, from acquisition to retention, marketing, etc., so to value those investments we also need the long-run view from the revenues side.
The CLV measures the discounted present value of all profits obtained from a relationship with one customer along their expected duration with the company.
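A minimal sketch of the CLV definition above, assuming a constant profit per period, a constant retention rate, and a constant discount rate. None of these simplifications, nor the function and parameter names, come from the book; they are only here to make the "discounted present value" idea concrete.

```python
def customer_lifetime_value(profit_per_period, retention_rate, discount_rate, periods):
    """Discounted present value of expected profits from one customer.

    Assumes (hypothetically) constant profit, retention, and discount rate.
    """
    clv = 0.0
    for t in range(periods):
        survival = retention_rate ** t         # chance the customer is still active in period t
        discount = (1.0 + discount_rate) ** t  # factor that converts period-t money to today
        clv += profit_per_period * survival / discount
    return clv


# Illustrative numbers only: 100 per period, 80% retention, 10% discount rate, 36 periods.
clv = customer_lifetime_value(100.0, 0.8, 0.1, 36)
print(round(clv, 2))
```

With retention below 1 and a positive discount rate, later periods contribute less and less, which is why the long-run view can still be summarized in a single number.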
Starting from the right, it is useful to repeat one more time that we always start with the business. If your objective is unclear or fuzzy, most likely the decision shouldn’t be made at all.
Companies tend to have a bias for action, so fruitless decisions are sometimes made. This may not only have unintended negative consequences on the business side; it could also take a toll on employees’ energy and morale.
Action: offer a discount
Consequence: customers increase their demand for our product
Outcome: revenues increase
One of my favorite quotes—commonly ascribed to Albert Einstein—is that “everything should be made as simple as possible. But not simpler.” In the same vein, statistician George Box famously said that “all models are wrong, but some are useful.” Models are simplifications, metaphors that help us understand the workings of the highly complex world we live in.
You may check Thomas Davenport’s now classic Competing on Analytics or
Mastering ‘Metrics’: The Path from Cause to Effect
Most companies sell more than one product or offer more than one service. Economists call the natural advantage that a company may have when offering products that can benefit from similar production processes economies of scope. It is thus logical for most of us to look for ways to deepen our relationship with our customers by trying to do some cross-selling. In consulting jargon, it has been relabeled as the now-famous next-best offer, which already takes us to the prescriptive terrain.
Business objectives are usually already defined: but we must learn to ask the right business questions to achieve these objectives.
Always start with the business objective and move backward: for any decision you’re planning or have already made, think about the business objective you want to achieve. You can then move backward to figure out the set of possible levers and how these create consequences that affect the business.
Check Foster Provost and Tom Fawcett’s Data Science for Business (O’Reilly).
In my opinion, the literature has at least two shortcomings: most data scientists rarely care about solving the prescriptive problem and would rather focus on providing high-quality predictive solutions. Also, the literature directed to business people hasn’t been able to provide end-to-end views of decision problems that can be tackled with AI and analytical thinking.
One word of caution: to find levers, we need to know our business. This is not to say that you must have spent many years in one specific industry. That might help, as you must’ve developed strong intuitions about why things work and when they don’t. But it is also true that many times having a non-expert, even naive, view can help us think out of the box and expand our menu of options.
In general, we can divide levers into two types: those that depend mostly on the rules of the physical world to create consequences and those that arise from human behavior.
Thanks to Henry Ford’s assembly line, for instance, the production of cars was greatly improved. It only took a complete redesign of the production process, but once you pulled that “lever” you were able to produce more cars in less time, with the consequent reduction in production costs.
The difficulty comes from what economists call the “Law of Demand”: when we increase our price, our sales generally fall.
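To see why the Law of Demand makes pricing a tricky lever, here is a small sketch that assumes a linear demand curve. The curve, its coefficients, and the function names are my own illustrative assumptions, not something taken from the book.

```python
def quantity_demanded(price, intercept=100.0, slope=5.0):
    """Hypothetical linear demand: units sold fall as price rises."""
    return max(intercept - slope * price, 0.0)


def revenue(price):
    """Revenue is price times the quantity demanded at that price."""
    return price * quantity_demanded(price)


for price in (8.0, 10.0, 12.0):
    print(price, quantity_demanded(price), revenue(price))
```

Under these assumed numbers, raising the price from 8 to 10 increases revenue, but raising it again from 10 to 12 lowers it, because the drop in units sold starts to outweigh the higher price per unit.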
The second type has to do with the other side of the network. Think about Uber: if more drivers join, it is easier for passengers to find rides, so now more customers join. But the larger demand also makes joining more profitable for the drivers: you can now see why two-sided platforms generate these huge positive feedback loops. It is common to refer to these as “strategic effects,” since our behavior depends on the choices of others, and vice versa.
Most certainly: one of the most popular levers for two-sided markets is to subsidize the side of the market that is most price sensitive by way of discounts or lower fees.
Once we define a business objective, we must consider whether it’s actionable: most times our problems are actionable, but we may have to think out of the box.
Hypotheses often fail, but we should embrace the learning process: many times we start with a theory about causes and consequences only to see it fail during testing. That’s fine. It’s part of the process. Embrace it and guarantee your team learns from these failures.
One good practice is to start considering only first-order effects with the objective of getting the “sign” or “direction” of the effect right. At this point we may not care about the second-order effects that affect the curvature of the outcome.
A good rule of thumb of when to stop can be found by way of standard cost-benefit analysis: stop when the incremental cost of another iteration exceeds the benefits you expect to get from it.
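The stopping rule above can be sketched in a few lines, assuming we can put a rough number on the expected benefit of each further iteration and that every iteration costs the same. Both assumptions, and the function name, are mine and purely illustrative.

```python
def iterations_worth_running(expected_benefits, cost_per_iteration):
    """Count iterations until the incremental cost exceeds the expected benefit."""
    count = 0
    for benefit in expected_benefits:
        if cost_per_iteration > benefit:
            break  # another iteration costs more than it is expected to return
        count += 1
    return count


# Illustrative, diminishing benefits from each extra round of analysis.
print(iterations_worth_running([10.0, 6.0, 3.0, 1.0], cost_per_iteration=4.0))
```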
We impact our business objectives by making decisions: as analytical thinkers, it is our job to find, test, and enrich the set of actions or levers we can pull to achieve our business objectives.
According to one school of thought—the frequentist school—we can think of an experiment as one that takes place many times under the same conditions.
As it is commonly said, uncertainty unravels once we’ve made a decision, at which point we might regret our choice if the realization was not satisfactory.
Bernoulli proposed a solution: we shouldn’t value each prize at face value, rather, we should be using a utility function that displays the diminishing value that each extra dollar represents to us.
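A small sketch of Bernoulli's idea, assuming a logarithmic utility function (the form usually attributed to him) and a made-up 50/50 lottery over final wealth. The lottery values and helper names are hypothetical.

```python
import math


def expected_value(lottery):
    """Probability-weighted average of the outcomes at face value."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery, utility=math.log):
    """Probability-weighted average of the utility of each outcome."""
    return sum(p * utility(x) for p, x in lottery)


def certainty_equivalent(lottery):
    """Sure amount with the same log utility as the lottery (inverse of log is exp)."""
    return math.exp(expected_utility(lottery))


# Hypothetical lottery: final wealth of 50 or 150, each with probability 0.5.
lottery = [(0.5, 50.0), (0.5, 150.0)]
print(expected_value(lottery), round(certainty_equivalent(lottery), 2))
```

Because each extra dollar is worth less under a concave utility, the certainty equivalent comes out below the expected value: this decision maker would accept a sure amount smaller than the lottery's face-value average.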
Bandit problems are a class of sequential decision problems where we must make a choice that repeats over time, and as time goes on we are learning the workings of the underlying uncertainty, either by improving our probability estimates or the expected values themselves.
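One simple way to make the bandit idea concrete is an epsilon-greedy rule: most of the time pick the option with the best estimate so far, and occasionally try a random one to keep learning. This is my own choice of strategy for illustration, not necessarily the one the book uses, and the success rates below are made up.

```python
import random


def epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate a bandit with 0/1 rewards; return pulls and estimated rates per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit best estimate
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        # Running average: nudge the estimate toward the newly observed reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls, estimates


# Hypothetical success rates that are unknown to the decision maker.
pulls, estimates = epsilon_greedy([0.2, 0.5, 0.8])
print(pulls, [round(e, 2) for e in estimates])
```

Each repeated choice both earns a reward and improves the estimate for that option, which is the "learning the workings of the underlying uncertainty" part of the quote.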
Before going on, let’s discuss several points. First, you may be wondering if it’s really necessary to write down everything mathematically and be very explicit about uncertainty. The answer is that the vast majority of practitioners do not go through the trouble of formalizing everything. I think it’s a good practice because it forces you to think really hard about the sources of uncertainty and how to model each of them, as well as the simplifying assumptions made.
A second point has to do with making the model more realistic: here we assumed that customers care only about quality, but I argued that they also care (and trade-off) about price and customer experience.