Stephen Cheng

Decision Trees Implementation in R and Python

 

Intro

For both R and Python users, a decision tree is quite easy to implement. Let's take a quick look at the code that can get you started with this algorithm. For ease of use, I've shared standard snippets; you only need to replace the dataset name and variables to get started.

R

For R users, there are several packages available for building decision trees, such as rpart, tree, and party (which provides ctree).

library(rpart)

# Combine predictors and target into one training data frame
x <- cbind(x_train, y_train)

# Grow the tree
fit <- rpart(y_train ~ ., data = x, method = "class")
summary(fit)

# Predict output (class labels)
predicted <- predict(fit, x_test, type = "class")

In the code above:

  • y_train – represents the dependent (target) variable.
  • x_train – represents the independent variables (predictors).
  • x – represents the training data.

Python

For Python users, below is the equivalent code using scikit-learn:

# Import necessary libraries (e.g. pandas, numpy) as needed
from sklearn import tree

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the tree object for classification; the split criterion can be
# 'gini' (the default) or 'entropy' (information gain)
model = tree.DecisionTreeClassifier(criterion='gini')
# model = tree.DecisionTreeRegressor() for regression

# Train the model on the training set and check its score
model.fit(X, y)
model.score(X, y)

# Predict output
predicted = model.predict(x_test)
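
For a more concrete, end-to-end sketch, the snippet below trains a classification tree on scikit-learn's built-in iris dataset. The dataset, the train/test split, and the max_depth and random_state settings are illustrative assumptions added here, not part of the original snippet; replace them with your own data and tuning.

# A minimal, self-contained sketch assuming scikit-learn's bundled iris data
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load a toy dataset and hold out a test set (illustrative choices)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit a classification tree; switch criterion to 'entropy' for information gain
model = tree.DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Accuracy on the training set and on the held-out test set
print("Train accuracy:", model.score(X_train, y_train))
print("Test accuracy:", model.score(X_test, y_test))

# Predicted class labels for the test set
predicted = model.predict(X_test)

For a regression problem, tree.DecisionTreeRegressor() can be trained and evaluated in the same way on a numeric target.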

Aug 7, 2018
