
Titanic Survival Analytics

Predicting Titanic Survivors in a Business Analytics Project Report


Added on  2023-06-03

About This Document

This paper develops a predictive model to estimate the probability that an individual would have survived the accident, given the different factors that affected the victims differently. It explores and prepares the data, develops a logistic model to predict survivability, validates the model, and predicts the results.

TITANIC SURVIVAL ANALYTICS
Name
Course Number
Date
Faculty Name
Titanic Survivability Prediction
Introduction
The Titanic tragedy was one of the deadliest and most devastating events in modern history.
Prediction models have been developed to estimate the probability of survival among the
passengers on the liner, based on factors such as class, gender and age, among others. Many
machine learning and predictive methods have been tried in the search for a model with the
highest power to predict survivability in the incident.
1. Defining Business Objectives
This paper develops a predictive model to estimate the probability that an individual would
have survived the accident, given the different factors that affected the victims differently.
The passenger liner was divided into 3 classes: first class at the top, second class in the
middle and third class at the bottom. This layout suggests that people in the third class were
more likely to die than those in the other classes. However, this hypothesis must be tested
against the data before it can support our ideas and theories.
It has been documented that most people died because there were not enough lifeboats, so
many who could otherwise have survived were lost. Because women and children were given
priority, this scarcity exposed men to greater risk than the other groups. In addition, this
effect would have varied with class. It could be hypothesized that men in the first class were
more chivalrous than those in the second and third classes. Therefore, the trends of
survivability would differ between classes for men and women. In an ideal situation, men and
women in the third class would have struggled equally to save their lives.
It is possible to predict survivability based on the dynamics of the catastrophe. Although
survival would have been partly due to chance, these dynamics can explain it with some level
of confidence. Exploratory data analysis will be conducted to identify the variables that
predict survivability. A model will then be developed to explain the probability of survival
using the provided variables, which are described in the data dictionary below.
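As a rough illustration of how a logistic model, the kind this report sets out to develop, turns passenger variables into a survival probability, consider the minimal Python sketch below. The function name, the coefficients and the intercept are all hypothetical and chosen only to show the mechanics; they are not estimates from the Titanic data, and pclass is treated as a plain number here for brevity.

```python
import math


def survival_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, for illustration only.
# They are NOT fitted to the Titanic data.
coefficients = {"sex": -2.5, "pclass": -1.0, "age": -0.03}
intercept = 4.0

# sex follows the data dictionary coding (0 = female, 1 = male).
woman_first_class = {"sex": 0, "pclass": 1, "age": 30}
man_third_class = {"sex": 1, "pclass": 3, "age": 30}

p_woman = survival_probability(woman_first_class, coefficients, intercept)
p_man = survival_probability(man_third_class, coefficients, intercept)
print(f"{p_woman:.3f} {p_man:.3f}")
```

With negative coefficients on sex and pclass, the sketch assigns a lower probability to the third-class man than to the first-class woman, which is the direction the hypotheses above anticipate. Whether the real data support that direction is what the fitted model has to show.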
Methods
2. Preparing Data
Survival, ticket class and port of embarkation were recoded as categorical variables
using the factor() function for ease of analysis. Family size was calculated from the number
of siblings/spouses and the number of parents/children. A large family was defined as one
with more than three individuals. Passengers' titles were extracted to generate further
categorical variables that could contribute to the model development. For instance, men were
differentiated from male children by extracting the 'Mr.' title. Subsets of the data were
created to analyse the data effectively for insights at the model development stage.
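The preparation steps above can be sketched as follows. This is a minimal Python illustration rather than the report's own code, which used R's factor(). The rows are made up, the name column and its "Surname, Title. Given names" format are assumptions, and so is the choice to count the passenger in the family size (the + 1).

```python
import re

# Made-up rows for illustration; the name column and its format
# ("Surname, Title. Given names") are assumptions.
passengers = [
    {"name": "Doe, Mr. John", "sibsp": 1, "parch": 0},
    {"name": "Doe, Master. Tom", "sibsp": 3, "parch": 2},
    {"name": "Roe, Miss. Ann", "sibsp": 0, "parch": 0},
]

# Captures the text between the comma and the first period.
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name):
    """Return the title found in a name, e.g. 'Mr', or 'Unknown'."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"


def add_features(row):
    """Return a copy of the row with family_size, large_family and title."""
    # Assumption: family size counts the passenger too (hence the + 1).
    family_size = row["sibsp"] + row["parch"] + 1
    return {
        **row,
        "family_size": family_size,
        "large_family": family_size > 3,  # "more than three individuals"
        "title": extract_title(row["name"]),
    }


prepared = [add_features(row) for row in passengers]
for row in prepared:
    print(row["title"], row["family_size"], row["large_family"])
```

Separating 'Mr' from 'Master' in this way lets adult men and boys enter the model as different categories even though both are coded as male in the sex variable.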
Table 1: Data dictionary
Variable   Definition                                      Key
survival   Survival                                        0 = No, 1 = Yes
pclass     Ticket class                                    1 = 1st (Upper), 2 = 2nd (Middle), 3 = 3rd (Lower)
sex        Sex                                             0 = female, 1 = male
Age        Age in years
sibsp      Number of siblings/spouses aboard the Titanic
parch      Number of parents/children aboard the Titanic
fare       Passenger fare
embarked   Port of Embarkation                             C = Cherbourg, Q = Queenstown, S = Southampton
3. Exploratory Data Analysis
According to our data set, 62.3% of passengers died and 37.7% survived. Among the males,
87.1% died, compared with 17.4% of the females. On average, those who survived had paid twice
as much fare as those who died.
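Summaries of this kind can be computed along the lines of the sketch below, a minimal Python illustration with hypothetical helper names. The records are made up and do not reproduce the percentages reported above; the coding of survived and sex follows the data dictionary.

```python
from collections import defaultdict

# Made-up records for illustration only; they do not reproduce the
# percentages reported in the text.
records = [
    {"survived": 1, "sex": 0, "fare": 80.0},
    {"survived": 1, "sex": 0, "fare": 30.0},
    {"survived": 0, "sex": 0, "fare": 10.0},
    {"survived": 1, "sex": 1, "fare": 50.0},
    {"survived": 0, "sex": 1, "fare": 8.0},
    {"survived": 0, "sex": 1, "fare": 12.0},
]


def death_rate_by(rows, key):
    """Share of passengers who died within each level of `key`."""
    died = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row[key]] += 1
        died[row[key]] += 1 - row["survived"]
    return {level: died[level] / total[level] for level in total}


def mean_fare_by_outcome(rows):
    """Average fare for those who died (0) and those who survived (1)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["survived"]] += row["fare"]
        counts[row["survived"]] += 1
    return {outcome: sums[outcome] / counts[outcome] for outcome in counts}


death_rates = death_rate_by(records, "sex")
mean_fares = mean_fare_by_outcome(records)
print(death_rates, mean_fares)
```

The same death_rate_by helper can be called with another key, such as a ticket class column, to compare groups before any model is fitted.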