Data Mining - Desklib

   

Added on  2023-04-24

Running head: 1
SUMMARY
The data set contains 60 observations extracted from the Titanic dataset and includes both
continuous and categorical variables. The variables are: passengerId, the index of the
passenger; survived, where 1 means the passenger survived and 0 means the passenger did not;
passengerClass, the class in which the passenger travelled; sex, the passenger's sex; age, the
passenger's age at the time of the ship's departure; siblingSpouse, the number of siblings or
spouses the passenger had on board; and parentChild, the number of parents or children the
passenger had on board (Matsuoka et al., 2014).
The response variable is survived, and the explanatory variables are passengerClass, sex, age,
siblingSpouse and parentChild.
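As a minimal sketch of how one observation and the response/explanatory split could be represented, the record type below mirrors the variables listed above. The class name `Passenger`, the snake_case field names, and the helper `split_xy` are illustrative assumptions, not part of the original study.

```python
from dataclasses import dataclass


@dataclass
class Passenger:
    """One observation; field names are snake_case versions of the variables above."""
    passenger_id: int
    survived: int         # response: 1 = survived, 0 = did not survive
    passenger_class: int
    sex: str
    age: float
    sibling_spouse: int   # siblings or spouses on board
    parent_child: int     # parents or children on board


# Explanatory variables named in the summary (passengerId is only an index).
EXPLANATORY = ("passenger_class", "sex", "age", "sibling_spouse", "parent_child")


def split_xy(passenger):
    """Return (explanatory values, response) for one passenger."""
    features = tuple(getattr(passenger, name) for name in EXPLANATORY)
    return features, passenger.survived


# First row of the data table: id 1, did not survive, class 3, male, age 22.
features, target = split_xy(Passenger(1, 0, 3, "male", 22, 1, 0))
```

Keeping passengerId out of `EXPLANATORY` reflects that it is described only as an index rather than as a predictor.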
QUESTION 1
The aim of the study is to use the given data to train decision tree models that predict
whether a given passenger on the Titanic survived. During data cleaning and pre-processing,
one missing value was found in the variable siblingSpouse; it was filled in manually with the
attribute mean (Guruler, Istanbullu, & Karahasan, 2010). The attribute age contains an
outlier: a male passenger with age = 221, which probably resulted from a typing error. To
correct this, the outlier was replaced with the median age, so 26 replaces 221.
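The two cleaning steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the plausibility cut-off of 100 years, and the choice to take the median over the remaining plausible ages are not given in the text, and the sample values are made up for the example rather than taken from the study's records.

```python
from statistics import mean, median


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def replace_outliers_with_median(ages, max_plausible_age=100):
    """Replace ages above the cut-off with the median of the plausible ages.

    The cut-off of 100 is an assumption made for this sketch.
    """
    plausible = [a for a in ages if a <= max_plausible_age]
    mid = median(plausible)
    return [mid if a > max_plausible_age else a for a in ages]


# Illustrative values only, not the study's records.
sibling_spouse = [1, 0, None, 3, 0]
ages = [22, 38, 221, 26, 14]

clean_sibling_spouse = fill_missing_with_mean(sibling_spouse)
clean_ages = replace_outliers_with_median(ages)
```

The median is used for age because a single extreme value such as 221 would pull the mean upward, whereas the median is barely affected by it.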
PassengerId  Survived  PassengerClass  Sex     Age  Age category  SiblingSpouse  sp category  ParentChild  pc cat
1            0         3               male    22   Adult         1              non-zero     0            zero
2            1         1               female  38   Adult         1              non-zero     0            zero
3            1         3               female  26   Adult         0              zero         0            zero
4            1         1               female  35   Adult         1              non-zero     0            zero
5            0         3               male    35   Adult         0              zero         0            zero
6            0         1               male    54   Adult         0              zero         0            zero
7            0         3               male    2    Child         3              non-zero     1            non-zero
8            1         3               female  27   Adult         0              zero         2            non-zero
9            1         2               female  14   Teenage       1              non-zero     0            zero
10           1         3               female  4    Child         1              non-zero     1            non-zero
11           1         1               female  58   Adult         0              zero         0            zero
12           0         3               male    20   Adult         0              zero         0            zero
13           0         3               male    39   Adult         1              non-zero     5            non-zero
14           0         3               female  14   Teenage       0              zero         0            zero
15           1         2               female  55   Adult         0              zero         0            zero
16           0         3               male    2    Child         4              non-zero     1            non-zero
17           0         3               female  31   Adult         1              non-zero     0            zero
18           0         2               male    35   Adult         0              zero         0            zero
19           1         2               male    34   Adult         0              zero         0            zero
20           1         3               female  15   Teenage       0              zero         0            zero
21           1         1               male    28   Adult         0              zero         0            zero
22           0         3               female  8    Child         3              non-zero     1            non-zero
23           1         3               female  38   Adult         1              non-zero     5            non-zero
24           0         1               male    19   Teenage       3              non-zero     2            non-zero
25           0         1               male    40   Adult         0              zero         0            zero
26           0         2               male    66   Adult         0              zero         0            zero
27           0         1               male    28   Adult         1              non-zero     0            zero
28           0         1               male    42   Adult         1              non-zero     0            zero
29           0         3               male    21   Adult         0              zero         0            zero
30           0         3               female  18   Teenage       2              non-zero     0            zero
31           1         3               female  14   Teenage       1              zero         0            zero
32           0         3               female  40   Adult         1              zero         0            zero
33           0         2               female  27   Adult         1              zero         0            zero
34           1         2               female  3    Child         1              zero         2            non-zero
35           1         3               female  19   Teenage       0              zero         0            zero
36           0         3               female  18   Teenage       1              non-zero     0            zero
37           0         3               male    7    Child         4              non-zero     1            non-zero
38           0         3               male    21   Adult         0              zero         0            zero
39           1         1               female  49   Adult         1              non-zero     0            zero
40           1         2               female  29   Adult         1              non-zero     0            zero
41           0         1               male    65   Adult         0              zero         1            non-zero
42           1         2               female  21   Adult         0              zero         0            zero
43           0         3               male    29   Adult         0              zero         0            zero
44           1         2               female  5    Child         1              non-zero     2            non-zero
45           0         3               male    11   Child         5              non-zero     2            non-zero
46           0         3               male    22   Adult         0              zero         0            zero
47           0         1               male    45   Adult         1              non-zero     0            zero
48           0         3               male    4    Child         3              non-zero     2            non-zero
49           1         2               female  29   Adult         0              zero         0            zero
50           0         3               male    19   Teenage       0              zero         0            zero
51           1         3               female  17   Teenage       4              non-zero     2            non-zero
52           0         3               male    26   Adult         2              non-zero     0            zero
53           0         2               male    32   Adult         0              zero         0            zero
54           0         3               female  16   Teenage       5              non-zero     2            non-zero
55           0         2               male    26   Adult         0              zero         0            zero
56           0         3               male    26   Adult         1              non-zero     0            zero
57           1         3               male    32   Adult         0              zero         0            zero
58           0         3               male    25   Adult         0              zero         0            zero
59           1         2               male    1    Child         0              zero         2            non-zero
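The derived category columns in the table, and the way a decision tree might compare them as candidate splits, can be sketched as below. Several points are assumptions: the age cut-offs (under 13 for Child, 13 to 19 for Teenage, 20 and over for Adult) are inferred from the table rows and are not stated in the text; Gini impurity is a common splitting criterion for decision trees but the text does not say which criterion the study used; and all function names are illustrative.

```python
from collections import Counter


def age_category(age):
    """Bin an age; cut-offs are inferred from the table, not stated in the text."""
    if age < 13:
        return "Child"
    if age < 20:
        return "Teenage"
    return "Adult"


def count_category(count):
    """Collapse a relative count into a zero / non-zero flag."""
    return "zero" if count == 0 else "non-zero"


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(rows, feature):
    """Weighted Gini impurity after splitting rows on one categorical feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row["survived"])
    total = len(rows)
    return sum(len(g) / total * gini(g) for g in groups.values())


# First five rows of the table: (survived, sex, age, siblingSpouse).
raw = [(0, "male", 22, 1), (1, "female", 38, 1), (1, "female", 26, 0),
       (1, "female", 35, 1), (0, "male", 35, 0)]
rows = [{"survived": s, "sex": sex,
         "age_cat": age_category(a), "sp_cat": count_category(sp)}
        for s, sex, a, sp in raw]

# Impurity before splitting, then after splitting on each candidate feature.
parent_impurity = gini([r["survived"] for r in rows])
split_scores = {f: weighted_gini(rows, f) for f in ("sex", "age_cat", "sp_cat")}
```

A lower weighted impurity means a cleaner split, so in this five-row excerpt sex separates survivors from non-survivors better than the other two columns. Five rows are far too few to draw any conclusion about the full data set.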
