This is Part 3 of my decision tree series. This time around we are going to code a decision tree in Python. I’m going to try to make this code as understandable as possible, but if you are not familiar with object-oriented programming (OOP) or recursion, you might have a tougher time.
This is Part 2 of my decision tree series. Here we will see how we can build a decision tree algorithmically using the CART algorithm from Leo Breiman, one of the big, big names in decision trees.
The first subject I want to tackle on this page is decision trees. What are they? How do they work? How can I make one?
I am planning to make a small series, ranging from explaining the concept to implementing a decision tree inference algorithm and, hopefully, all the way up to implementing Random Forests.
All right, let’s get started.
OK, so as we saw in previous parts, the CART algorithm allows us to build decision trees. Up till now we have grown these trees until all leaves are pure, meaning each leaf contains examples of only one class (for classification trees). However, this can lead to overfitting the training data, which decreases the generalizability of our model and therefore its usefulness. This is where cost-complexity pruning comes into play.
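To make the trade-off concrete, here is a minimal sketch of the cost-complexity measure that this kind of pruning is usually built on: the training error of the tree plus a penalty `alpha` for every leaf. The function name `cost_complexity`, the representation of a tree as a list of per-leaf error counts, and the two example trees are all assumptions made up for illustration, not code from the series.

```python
def cost_complexity(leaf_errors, n_samples, alpha):
    """Return R(T) + alpha * (number of leaves).

    leaf_errors: misclassified training examples in each leaf (hypothetical
                 flat representation of a tree, one entry per leaf).
    n_samples:   total number of training examples.
    alpha:       penalty paid for each leaf.
    """
    training_error = sum(leaf_errors) / n_samples
    return training_error + alpha * len(leaf_errors)


# Two made-up trees fitted on the same 100 training examples.
full_tree = [0] * 20        # 20 pure leaves, no training errors
pruned_tree = [2, 1, 3, 0]  # 4 leaves, 6 misclassified examples in total

for alpha in (0.0, 0.01):
    full_cost = cost_complexity(full_tree, 100, alpha)
    pruned_cost = cost_complexity(pruned_tree, 100, alpha)
    better = "full" if full_cost < pruned_cost else "pruned"
    print(f"alpha={alpha}: the {better} tree has the lower cost")
```

With no penalty, the fully grown tree always wins because its training error is zero. Once each leaf costs something, the smaller tree can come out ahead even though it makes a few mistakes on the training data, which is exactly the behaviour we want from pruning.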