Decision trees are versatile tools that can be employed for both Regression and Classification tasks.
When developing a decision tree, it’s crucial to ask a series of targeted questions in order to zero in on the most accurate label. The objective is to construct an efficient tree that minimizes the number of splits necessary to divide the data. This process continues until no further splits can be made.
It’s important to find the right balance, as shallow trees tend to under-fit the data, while deep trees can lead to overfitting, with each example ending up as its own leaf node.
Linear Regression builds relationship between inputs (features) and outputs (targets). It is a simple yet effective model to start with to get a benchmark.
Using training data, the goal is to minimize the loss.
You can measure loss function using SSE
Polynomial Regression: Linear Regression can also be used to model a non-linear relationship. To achieve this, we transform an input feature by a non-linear transformation function and create a new feature that can be used as input in the model.
Chinese proverb: Make a mistake and learn a lesson
Similar to unsupervised learning, it does not need labeled data. The model learns from interactions (trial and error)and hence takes time to perfect. Initially the algorithm acts randomly, exploring the environment, but it learns with time by being rewarded when it makes the right choices.
It is best suited for use cases that require achieving a goal, for example, autonomous vehicles, headline testing or personalizing a website.
Why it matters: In any business you want to make sure that you are adding customers faster than you are losing them and at the right Customer Acquisition Cost.
Sometimes churn is good because your product isn’t for everyone! It is better to let them go than to keep supporting them or serving their needs.
Types: E-commerce businesses will measure transaction churn. In contrast, Subscription businesses will measure two forms of churn:
Customer Churn in a specific period (say a month) is the number of customers lost divided by the number of customers you had at the start of the period. Use this if you have only one Pricing Tier. However, if you’ve more than one Pricing Tier, then you should calculate revenue churn or visualize churn by customer cohorts.
Revenue Churn in a specific period (say a month) is the total revenue on the first day of the period minus the revenue from the same customers at the end of the period minus any value from Upsell. We need to subtract the upsells because they hide loss from churned customers.
Challenges:Churn is a delayed backward looking and you cannot get a live number.
Examples. Everyone has faced challenges with churn:
The Wall Street Journal worries about aging subscribers.
Once someone is on your Internet Products, always push them to consume one more piece of content. Pages Per Session captures how interested are audiences in exploring your website beyond the landing page.
You can do this by capturing the audience’s intent of coming to your platform and serving them basis that intent using: