BUS5PA - Predictive Analytics- Assignment

La Trobe University

   

Predictive Analytics (BUS5PA)

   

Added on  2020-02-24

About This Document

BUS5PA - Decision tree induction is a method based mainly on two principles. The first is the divide-and-conquer principle: at every step, the data is split into two or more parts and the algorithm continues recursively on the individual parts (Barros, Basgalupp, Carvalho, ...). The second is the greedy principle: each split is chosen using only limited, local information. Decision trees are mainly used for prediction, classification, and description. A decision tree is a classifier with very high capacity, but in real scenarios decision trees tend to overfit.
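The two principles can be illustrated with a minimal Python sketch. It assumes a single numeric input, a 0/1 target, and Gini impurity as the split criterion; none of these choices come from the assignment, and the toy data and function names (`gini`, `best_split`, `grow`) are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Greedy step: pick the threshold with the lowest weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # (threshold, impurity); None means "do not split"
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2.0, score)
    return best


def grow(values, labels, depth=0, max_depth=2):
    """Divide step: split, then recurse on each part until pure or too deep."""
    threshold, _ = best_split(values, labels)
    if threshold is None or depth >= max_depth:
        return {"leaf": True, "p_buy": sum(labels) / len(labels)}
    left = [(v, y) for v, y in zip(values, labels) if v < threshold]
    right = [(v, y) for v, y in zip(values, labels) if v >= threshold]
    return {
        "leaf": False,
        "threshold": threshold,
        "left": grow([v for v, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([v for v, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


# Hypothetical toy data: ages and a 0/1 purchase flag (not the ORGANICS data).
ages = [23, 31, 38, 42, 47, 52, 60, 68]
bought = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, impurity = best_split(ages, bought)
tree = grow(ages, bought)
print(threshold, impurity)
print(tree)
```

Each call to `best_split` looks only at the data in the current node, which is the greedy part. The `max_depth` limit is one simple way to restrain the overfitting mentioned above.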

Assignment 1 – Building and Evaluating Predictive Models
(BUS5PA Predictive Analytics – Semester 2, 2017)
By <Student Name> (18815197)
La Trobe Business School
Melbourne, Australia
Table of Contents
1. Setting up the project and exploratory analysis
2. Decision tree based modeling and analysis
3. Regression based modeling and analysis
4. Open ended discussion
5. Extending current knowledge with additional reading
References
Appendix
List of Figures
Figure 1 Creation of project: BUS5PA_Assignment1_18815197
Figure 2 Creation of Library
Figure 3 Roles of variables
Figure 4 Distribution of Organics purchase indicator
Figure 5 Organics data source
Figure 6 Organics data source in Organics diagram workspace
Figure 7 Addition of Data partition
Figure 8 Data set Allocations
Figure 9 Decision Tree Addition
Figure 10 Interactive method has not been selected
Figure 11 Use average square error as Assessment measure
Figure 12 Subtree Assessment Plot
Figure 13 Decision Tree Model
Figure 14 Decision Tree after adding Tree 2
Figure 15 Three-way Split
Figure 16 Assessment Measure
Figure 17 Average square error for the model with Tree 2
Figure 18 StatExplore tool with ORGANICS data source
Figure 19 Default input method of class and interval variables
Figure 20 Imputation indicators for all imputed inputs
Figure 21 Addition of Regression node
Figure 22 Model Selection
Figure 23 Regression Result
Figure 24 Summary of Stepwise Selection
Figure 25 Odds Ratio Estimates
Figure 26 Average squared error (ASE)
Figure 27 Model Comparison Process
Figure 28 Model Comparison Result
Figure 29 ROC Chart
Figure 30 Cumulative Lift
Figure 31 Fit Statistics
List of Tables
Table 1 Model performance comparison
1. Setting up the project and exploratory analysis

a. A project named BUS5PA_Assignment1_18815197 has been created, as shown in Figure 1.
1) A SAS library named Assign has been created, and a data source has been created from the SAS dataset ORGANICS, as shown in Figure 2 and Figure 5.
2) Roles have been set as specified in the business case; Figure 3 shows all the roles defined for the data source ORGANICS.
3) TargetBuy is defined as the target variable. 24.77% of individuals have purchased organic products and the remaining 75.23% have not, as depicted in Figure 4.
4) As shown in Figure 3, Demcluster has been set to rejected.
5) The data source named organics has been defined, as shown in Figure 5.
6) The data source ORGANICS has been added to the Organics diagram workspace, as shown in Figure 6.

b. TargetAmt cannot be used as an input for a model that predicts TargetBuy. TargetBuy indicates whether an individual has purchased organic products, whereas TargetAmt indicates the amount of organic products bought. TargetAmt is only recorded for those who have purchased organic products, i.e. when TargetBuy is Yes, so its value would reveal the outcome being predicted. Hence, TargetAmt can never be a predictor of TargetBuy. In this business case, as part of an initial buyer incentive plan, management's objective is to develop a loyalty model based on whether customers have purchased any of the organic products. TargetBuy is therefore well suited as the target variable.

2. Decision tree based modeling and analysis

a. A Data Partition node has been added to the Organics diagram workspace from the Sample tab and connected to the data source node, ORGANICS. 50% of the data has been assigned to training and the remaining 50% to validation, as depicted in Figure 7 and Figure 8.
b. A Decision Tree node has been added to the Organics diagram workspace and connected to the Data Partition node, as depicted in Figure 9.
c. The decision tree has been built autonomously, not interactively, and average square error has been chosen as the subtree assessment measure, as shown in Figure 10 and Figure 11.
1) Based on average square error, the optimal tree has 29 leaves; the subtree assessment plot is shown in Figure 12.
2) Age has been used for the first split, partitioning the training data into two subsets. The first subset contains ages less than 44.5, where TargetBuy = 1 has a higher than average concentration. The second subset contains ages greater than or equal to 44.5, where TargetBuy = 0 has a higher than average concentration. The autonomously created decision tree model, assessed using average square error, is depicted in Figure 13.
d. A second Decision Tree node has been added to the diagram and connected to the Data Partition node (shown in Figure 14).
1) The maximum number of branches has been set to 3 to allow three-way splits, as shown in Figure 15.
2) The creation of the decision tree model using average square error is shown in Figure 16.
3) Based on average square error, the optimal tree has 33 leaves; the subtree assessment plot is shown in Figure 17. In part c, the optimal tree had 29 leaves. The misclassification rate of Tree 2 (Train: 0.1848) is very marginally lower than that of Tree 1 (Train: 0.1851), while the average square error of Tree 1 (Train: 0.1329) is lower than that of Tree 2 (Train: 0.1330). Hence the tree with 29 leaves performs marginally better in terms of average square error, and the tree with 33 leaves performs marginally better in terms of misclassification rate. However, complexity increases with the number of leaves, so the less complex tree may be the more appropriate and reliable choice.
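As a minimal sketch of why the two assessment measures compared above can rank models differently, the following Python example computes both for two sets of predicted probabilities. It assumes the common definitions (average square error as the mean of squared differences between the 0/1 outcome and the predicted probability, and misclassification rate at a 0.5 cutoff). The data, model names, and function names are hypothetical and are not taken from the ORGANICS results.

```python
def average_squared_error(y_true, p_hat):
    """Mean of (outcome - predicted probability)^2 over all cases."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def misclassification_rate(y_true, p_hat, cutoff=0.5):
    """Share of cases whose predicted class (p >= cutoff) differs from the outcome."""
    wrong = sum(1 for y, p in zip(y_true, p_hat) if (p >= cutoff) != (y == 1))
    return wrong / len(y_true)


# Hypothetical outcomes and predicted purchase probabilities (illustration only).
y = [1, 0, 1, 0]
model_a = [0.55, 0.45, 0.55, 0.45]  # always on the right side of 0.5, but not confident
model_b = [0.95, 0.05, 0.95, 0.55]  # confident and mostly right, one case misclassified

ase_a = average_squared_error(y, model_a)
ase_b = average_squared_error(y, model_b)
mis_a = misclassification_rate(y, model_a)
mis_b = misclassification_rate(y, model_b)

print("Model A: ASE", round(ase_a, 4), "misclassification", mis_a)
print("Model B: ASE", round(ase_b, 4), "misclassification", mis_b)
```

In this toy case model B has the lower average square error while model A has the lower misclassification rate. Average square error rewards well-calibrated probabilities, whereas misclassification rate only looks at which side of the cutoff each prediction falls on.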