Machine Learning Project: Bank Marketing and Machine Learning Analysis
Added on 2022/08/12
AI Summary
This machine learning project analyzes a bank's marketing campaign data to predict which clients are most likely to subscribe to a term deposit. The analysis begins with exploratory data analysis, including univariate and bivariate analysis to understand the data's characteristics, identify outliers, and assess relationships between variables. Several machine learning models, including DecisionTreeClassifier, RandomForestClassifier, BaggingClassifier, GradientBoostingClassifier, and AdaBoostClassifier, are implemented and evaluated. The GradientBoostingClassifier achieved the highest accuracy of 89.5%. The project concludes with recommendations for future marketing campaigns, such as focusing on the month of March, and suggests further exploration with other machine learning algorithms and data analysis techniques. The project highlights the importance of data preprocessing, model selection, and evaluation in achieving accurate predictions. The project is a student submission for the website Desklib, which provides AI-based study tools for students.

Running head: Machine Learning
Machine Learning
Student Name:
Student Id:
University Code:

Table of Contents
Executive Summary
Introduction
Discussion
Conclusion
References

Executive Summary
Machine learning is a branch of artificial intelligence that gives machines the ability to learn
automatically from past instances and to carry out tasks without human intervention. Machine
learning is widely used in many sectors, including hospitality and banking, and more
organizations are adopting the technology for its benefits. In this analysis, several exploratory
analyses and visualizations were performed, and five different models were built and compared
on accuracy. The report closes with conclusions on future scope and
improvements.

Introduction
Recent advances in technology have given machine learning a large impact across
industries. Many industries use the technology for business benefit and to improve customer
satisfaction. Machine learning algorithms are used to predict future instances and to forecast
data.
The dataset comes from a bank and was collected during a marketing campaign.
The main goal of the analysis is to predict whether a client will subscribe to a term deposit (variable y).
Five different machine learning algorithms were implemented to help the
marketing team identify potential customers who are relatively more likely to subscribe to the
term deposit, which increases the hit ratio.
Discussion
The analysis concerns term deposits. A term deposit is a money deposit that banks offer
at a fixed rate of interest, with the money returned to the customer at a specified
maturity date.
Exploratory analysis was performed, and it shows that the data has no null or
missing values. Table 1 gives a description of the data for a proper statistical
understanding of the data and its data types.
                 age        balance            day       duration       campaign          pdays       previous
count   45211.000000   45211.000000   45211.000000   45211.000000   45211.000000   45211.000000   45211.000000
mean       40.936210    1362.272058      15.806419     258.163080       2.763841      40.197828       0.580323
std        10.618762    3044.765829       8.322476     257.527812       3.098021     100.128746       2.303441
min        18.000000   -8019.000000       1.000000       0.000000       1.000000      -1.000000       0.000000
25%        33.000000      72.000000       8.000000     103.000000       1.000000      -1.000000       0.000000
50%        39.000000     448.000000      16.000000     180.000000       2.000000      -1.000000       0.000000
75%        48.000000    1428.000000      21.000000     319.000000       3.000000      -1.000000       0.000000
max        95.000000  102127.000000      31.000000    4918.000000      63.000000     871.000000     275.000000
Table 1: Description of data
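A summary like Table 1 and the missing-value check can be produced with pandas. The sketch below is a minimal illustration, not the report's own code: the data frame is a synthetic stand-in (only the column names are taken from Table 1), since the original data file is not part of this document.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the campaign data: three numeric columns
# named after those in Table 1, filled with synthetic values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 96, size=200),
    "balance": rng.normal(1362, 3045, size=200).round(),
    "duration": rng.exponential(258, size=200).round(),
})

# Count missing values per column; all zeros means no nulls.
missing_per_column = df.isnull().sum()

# Summary statistics per numeric column:
# count, mean, std, min, quartiles and max, as in Table 1.
summary = df.describe()

print(missing_per_column)
print(summary)
```

With the real data, the frame would be loaded from the campaign file instead of being generated, and the two calls would stay the same.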

The data contains many outliers. Attributes with outliers include
age, campaign and duration. The numerical columns in particular have outliers, and the distribution
of every numerical variable other than age is highly skewed. The analysis also shows
that duration is the attribute with the strongest effect on the target attribute.
This attribute therefore needs to be removed from the dataset to obtain better predictive models,
most likely because a call's duration is known only after the call has ended and so is not
available when choosing whom to contact.
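The outlier and skewness checks described above can be sketched as follows. The report does not say which outlier rule was used, so the 1.5 x IQR rule here is an assumption, and the data is synthetic; only the column names come from the report.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the distributions are illustrative assumptions.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "age": rng.normal(41, 10, size=500).clip(18, 95),
    "campaign": rng.geometric(0.35, size=500),
    "duration": rng.exponential(258, size=500),
})


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Number of flagged values per column.
outlier_counts = {col: int(iqr_outlier_mask(df[col]).sum()) for col in df.columns}

# Sample skewness per column; values far from 0 indicate a skewed distribution.
skewness = df.skew()

# Drop duration before modelling, as the report recommends.
features = df.drop(columns=["duration"])

print(outlier_counts)
print(skewness)
```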
A few interesting findings from the analysis are:
The mean balance is higher for customers who subscribed to the term deposit
than for those who did not. The number of contacts performed before the campaign
was also higher for customers who subscribed.
Together, these facts indicate that customers with a higher balance and those who were
contacted frequently before the campaign tend to subscribe to the term deposit.
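A group-wise comparison of this kind can be computed with a pandas groupby. In the sketch below the rows are invented purely to show the calculation and are not results from the bank data; it also assumes that the `previous` column of Table 1 holds the number of contacts made before the campaign.

```python
import pandas as pd

# Invented rows for illustration only; not taken from the bank data.
df = pd.DataFrame({
    "balance": [1200, 300, 2500, 150, 1800, 90],
    "previous": [2, 0, 3, 0, 1, 0],
    "y": ["yes", "no", "yes", "no", "yes", "no"],
})

# Mean balance and mean number of earlier contacts per outcome.
group_means = df.groupby("y")[["balance", "previous"]].mean()

print(group_means)
```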
The models that were built are:
DecisionTreeClassifier
RandomForestClassifier
BaggingClassifier
GradientBoostingClassifier
AdaBoostClassifier
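The five classifiers listed above are all available in scikit-learn. The sketch below shows one way to fit them and compare test accuracy. The data comes from `make_classification`, and the split ratio, random seeds and default hyperparameters are assumptions, because the report does not state its settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the bank dataset.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The five models named in the report, with default settings (an assumption).
models = {
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "BaggingClassifier": BaggingClassifier(random_state=0),
    "GradientBoostingClassifier": GradientBoostingClassifier(random_state=0),
    "AdaBoostClassifier": AdaBoostClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

The accuracies printed here describe the synthetic data only and say nothing about the figures reported for the bank data.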
According to the analysis, GradientBoostingClassifier achieved the highest accuracy
of the five models, 89.5%. Gradient boosting often gives strong predictive accuracy,
although no algorithm is guaranteed to outperform all others on every dataset. The algorithm
can also optimize different loss functions and provides several hyperparameter tuning options.
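The tuning options mentioned above can be explored with a grid search. This is a minimal sketch on synthetic data, and the grid values are arbitrary assumptions rather than settings from the report. The `loss` parameter is left at its default because its accepted names differ between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the bank dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Small illustrative grid over three common gradient boosting hyperparameters.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# 3-fold cross-validated search, scored on accuracy.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```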
Conclusion
From the analysis it can be concluded that the upcoming marketing campaign
should be conducted during the month of March. Other machine learning
algorithms should also be explored to obtain more accurate
classification. Data from other campaigns should be
analyzed to develop an in-depth understanding of the data and to explore further patterns through
analysis and visualization.
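A month-level recommendation like this one is typically backed by the subscription rate per contact month. The sketch below shows that calculation on invented rows. It assumes a categorical `month` column alongside the target `y`; that column does not appear in Table 1, which lists numeric attributes only.

```python
import pandas as pd

# Invented rows for illustration only; not taken from the bank data.
df = pd.DataFrame({
    "month": ["mar", "mar", "may", "may", "may", "jun", "jun", "jun"],
    "y": ["yes", "no", "no", "no", "yes", "no", "no", "no"],
})

# Share of contacts in each month that ended in a subscription, highest first.
rate_by_month = (
    df["y"].eq("yes").groupby(df["month"]).mean().sort_values(ascending=False)
)

print(rate_by_month)
```

A rate based on very few contacts is unreliable, so the number of contacts per month is worth reporting next to the rate.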

References
Bowles, Michael. Machine learning in Python: essential techniques for predictive analysis. John
Wiley & Sons, 2015.
Garreta, Raul, and Guillermo Moncecchi. Learning scikit-learn: machine learning in python.
Packt Publishing Ltd, 2013.
Kumar, Ashish. Learning predictive analytics with Python. Packt Publishing Ltd, 2016.
Lakshmi, J. V. N. "Stochastic gradient descent using linear regression with python."
International Journal on Advanced Engineering Research and Applications 2, no. 7
(2016): 519-524.