Model performance metrics
Determine important attributes
Identify the most influential rows
Sometimes a model overfits the training data and therefore underperforms on real data. To reduce model complexity, we use regularization.
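As a minimal sketch of the idea, assume a one-feature model y ≈ w * x with no intercept and an L2 (ridge) penalty. The function name, the data, and the penalty value are all made up for illustration. The penalty shrinks the fitted weight toward zero, which is one common way regularization limits complexity.

```python
def fit_ridge_1d(xs, ys, l2_penalty):
    """Fit y ~ w * x by minimizing sum((y - w*x)**2) + l2_penalty * w**2.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + l2_penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2_penalty)


# Hypothetical toy data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit_ridge_1d(xs, ys, l2_penalty=0.0)   # ordinary least squares
w_ridge = fit_ridge_1d(xs, ys, l2_penalty=10.0)  # penalized, shrunk toward zero
print(w_plain, w_ridge)
```

A larger penalty gives a smaller weight. In practice the penalty strength would be chosen on held-out data rather than fixed by hand.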
Derive new features
Identify frequently occurring patterns and rules
Decision trees are versatile tools that can be employed for both Regression and Classification tasks.
When developing a decision tree, it’s crucial to ask a series of targeted questions in order to zero in on the most accurate label. The objective is to construct an efficient tree that minimizes the number of splits necessary to divide the data. This process continues until no further splits can be made.
It’s important to find the right balance: shallow trees tend to underfit the data, while deep trees can overfit, with each example ending up as its own leaf node.
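One "targeted question" in a tree can be sketched as a threshold on a single numeric feature. The sketch below assumes Gini impurity as the quality measure (one common choice, not necessarily the one intended here). The helper names and the data are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' question.

    Returns (None, impurity_of_all_labels) if no split improves on not splitting.
    """
    n = len(values)
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between neighbouring distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: small values are "no", large values are "yes".
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(values, labels)
print(threshold, impurity)
```

A full tree would apply `best_split` recursively to each side. A depth limit or a minimum leaf size is the usual guard against the one-example-per-leaf overfitting described above.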
Predict numeric values
Linear Regression builds a relationship between inputs (features) and outputs (targets). It is a simple yet effective model to start with to get a benchmark.
Polynomial Regression: Linear Regression can also be used to model a non-linear relationship. To achieve this, we apply a non-linear transformation to an input feature (for example, squaring it) and use the result as a new input feature in the model.
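A small sketch of that transformation, using synthetic data generated from y = 3x² + 1 (an assumption made purely for illustration). A straight line fitted on the raw feature finds no trend, while the same linear fit on the squared feature recovers the relationship.

```python
def fit_line(zs, ys):
    """Ordinary least squares for y ~ intercept + slope * z (one feature)."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    cov = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    var = sum((z - mean_z) ** 2 for z in zs)
    slope = cov / var
    intercept = mean_y - slope * mean_z
    return intercept, slope


# Synthetic, noise-free data from y = 3 * x**2 + 1 (illustrative only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3 * x ** 2 + 1 for x in xs]

# Linear fit on the raw feature: the symmetric curve gives a flat line.
raw_intercept, raw_slope = fit_line(xs, ys)

# Same linear model, but on the derived feature z = x**2.
squared = [x ** 2 for x in xs]
intercept, slope = fit_line(squared, ys)
print(raw_slope, intercept, slope)
```

The model is still linear in its parameters. Only the input has been transformed.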
Chinese proverb: One chopstick is fragile, but a group of chopsticks is robust.
A big challenge with ML models is overfitting to the training data. Ensemble models address this by averaging multiple models into a meta model, which reduces variance.
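The variance reduction can be illustrated with a small simulation. It assumes each model equals the true function plus its own independent random offset, which is an idealization: real models trained on overlapping data have correlated errors, so the gain is usually smaller. All names and numbers are hypothetical.

```python
import random
import statistics


def ensemble_predict(models, x):
    """Meta model: the plain average of the individual predictions."""
    return sum(model(x) for model in models) / len(models)


def make_noisy_model(rng, noise_sd=1.0):
    """A stand-in 'trained model': the true function 2*x plus a random offset."""
    offset = rng.gauss(0.0, noise_sd)
    return lambda x: 2.0 * x + offset


rng = random.Random(0)  # fixed seed so the run is repeatable
truth = 2.0 * 3.0
single_errors, ensemble_errors = [], []
for _ in range(2000):
    models = [make_noisy_model(rng) for _ in range(10)]
    single_errors.append(models[0](3.0) - truth)
    ensemble_errors.append(ensemble_predict(models, 3.0) - truth)

single_var = statistics.pvariance(single_errors)
ensemble_var = statistics.pvariance(ensemble_errors)
print(single_var, ensemble_var)
```

With independent errors, averaging ten models cuts the error variance to roughly a tenth of a single model's.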
Predict likely categories
Clustering is an unsupervised technique for grouping data into categories without labeled examples.
Identify unusual cases
Chinese proverb: Make a mistake and learn a lesson
Similar to unsupervised learning, reinforcement learning does not need labeled data. The model learns from interactions (trial and error) and hence takes time to perfect. Initially the algorithm acts randomly, exploring the environment, but it learns with time by being rewarded when it makes the right choices.
It is best suited for use cases that require achieving a goal, for example, autonomous vehicles, headline testing or personalizing a website.
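Headline testing is often framed as a multi-armed bandit, a simple reinforcement learning setting. The sketch below uses an epsilon-greedy strategy, which is an assumed choice among several. The click rates, round count, and function name are hypothetical. The click rates are hidden from the algorithm, which only sees rewards (clicks).

```python
import random


def run_epsilon_greedy(click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate headline testing with an epsilon-greedy bandit.

    click_rates are the true (hidden) click probabilities, used only to
    simulate user behaviour. Returns (shows, clicks) per headline.
    """
    rng = random.Random(seed)
    n = len(click_rates)
    shows = [0] * n
    clicks = [0] * n
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in shows:
            # Explore: pick a random headline (always, until each was shown once).
            arm = rng.randrange(n)
        else:
            # Exploit: pick the headline with the best observed click rate.
            arm = max(range(n), key=lambda i: clicks[i] / shows[i])
        reward = 1 if rng.random() < click_rates[arm] else 0
        shows[arm] += 1
        clicks[arm] += reward
    return shows, clicks


shows, clicks = run_epsilon_greedy([0.02, 0.05, 0.20])
print(shows, clicks)
```

Early rounds are mostly exploration. As rewards accumulate, most impressions shift to the headline with the highest observed click rate.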
Drive top-of-the-funnel traffic from various channels.
Best practice: Use UTM parameters appropriately so you can attribute traffic to each channel.
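A minimal sketch of tagging a link with the three most common UTM parameters (`utm_source`, `utm_medium`, `utm_campaign`). The helper name `add_utm`, the URL, and the values are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium, campaign):
    """Append utm_source, utm_medium and utm_campaign, keeping existing query params."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # where the traffic comes from
        ("utm_medium", medium),      # the channel type, e.g. email or cpc
        ("utm_campaign", campaign),  # the specific campaign
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = add_utm(
    "https://example.com/pricing?ref=home",
    source="newsletter",
    medium="email",
    campaign="spring_launch",
)
print(tagged)
```

Keeping the values consistent (same casing and spelling for each channel) matters as much as adding them, since analytics tools usually treat `Email` and `email` as different mediums.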
When retention fails, you churn, i.e., lose customers.
Why it matters: In any business you want to make sure that you are adding customers faster than you are losing them and at the right Customer Acquisition Cost.
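As a rough sketch of the three numbers involved, assume the common definitions: churn rate as customers lost divided by customers at the start of the period, and Customer Acquisition Cost as acquisition spend divided by new customers. The function names and figures are hypothetical.

```python
def net_customer_growth(new_customers, lost_customers):
    """Positive when you add customers faster than you lose them."""
    return new_customers - lost_customers


def churn_rate(customers_at_start, lost_customers):
    """Share of the starting customer base lost during the period."""
    if customers_at_start == 0:
        raise ValueError("no customers at start of period")
    return lost_customers / customers_at_start


def customer_acquisition_cost(acquisition_spend, new_customers):
    """Average spend needed to win one new customer."""
    if new_customers == 0:
        raise ValueError("no new customers in period")
    return acquisition_spend / new_customers


# Hypothetical month: 1000 customers at start, 120 gained, 80 lost, 6000 spent.
growth = net_customer_growth(new_customers=120, lost_customers=80)
churn = churn_rate(customers_at_start=1000, lost_customers=80)
cac = customer_acquisition_cost(acquisition_spend=6000.0, new_customers=120)
print(growth, churn, cac)
```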
Types: E-commerce businesses will measure transaction churn. In contrast, Subscription businesses will measure two forms of churn:
Challenges: Churn is a delayed, backward-looking metric, so you cannot get a live number.
Examples. Everyone has faced challenges with churn:
Any new and better way of doing things is technology.
Why it matters: Technology allows you to do more with less.
How: Technology has never been an automatic feature of history. We have to build it.
Cost Per Acquisition
You can get referral traffic from earned media — press coverage (PR), word-of-mouth, endorsements from celebrities, etc.
Once someone is on one of your internet products, always push them to consume one more piece of content. Pages Per Session captures how interested audiences are in exploring your website beyond the landing page.
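Pages Per Session is typically computed as total pageviews divided by total sessions. A minimal sketch, assuming pageviews have already been counted per session (the session IDs and counts are invented):

```python
def pages_per_session(pageviews_by_session):
    """Average number of pages viewed per session.

    pageviews_by_session maps a session id to its pageview count.
    """
    if not pageviews_by_session:
        return 0.0
    return sum(pageviews_by_session.values()) / len(pageviews_by_session)


# Hypothetical sessions: two visitors left after the landing page, one kept reading.
sessions = {"s1": 1, "s2": 4, "s3": 1}
pps = pages_per_session(sessions)
print(pps)
```

A value close to 1 suggests most visitors leave after the landing page.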
You can do this by capturing the audience’s intent in coming to your platform and serving them based on that intent using:
To understand their intent, you can rely on the piece of content they are consuming right now or the organic keywords they searched for before landing on your platform.
Users who use the product at least once a week.
Why it matters: Segmentation can reveal interesting patterns. For example, a certain segment of users visits every Monday morning to see their weekly horoscope and doesn’t return otherwise.