ISYS3374 - Business Analytics Assignment


Added on  2021-06-02

RMIT University

Course/Unit code: ISYS3374
Assignment number: Assignment 3
Assignment due date: 31 May 2020
Group/Session name (if applicable):
Course/Unit name: Business Analytics (2010)
Program title: Master Business of Information Technology
Lecturer/Teacher’s name: Dr Babak Abbasi
Tutor/Marker’s name (if applicable): Dr Joerin Motavallian

This statement should be completed and signed by the student(s) participating in preparation of the assignment.

Declaration and statement of authorship:
1. I/we hold a copy of this assignment, which can be produced if the original is lost/damaged.
2. This assignment is my/our original work and no part of it has been copied from any other student’s work or from any other source except where due acknowledgment is made.
3. No part of this assignment has been written for me/us by any other person except where such collaboration has been authorised by the lecturer/teacher concerned and is clearly acknowledged in the assignment.
4. I/we have not previously submitted or currently submitting this work for any other course/unit.
5. This work may be reproduced and/or communicated for the purpose of detecting plagiarism.
6. I/we give permission for a copy of my/our marked work to be retained by the School for review by external examiners.
7. I/we understand that plagiarism is the presentation of the work, idea or creation of another person as though it is your own. It is a form of cheating and is a very serious academic offence that may lead to expulsion from the University. Plagiarised material can be drawn from, and presented in, written, graphic and visual form, including electronic data, and oral presentations. Plagiarism occurs when the origin of the material used is not appropriately cited.
8. Enabling plagiarism is the act of assisting or allowing another person to plagiarise or to copy your work.

Family name: LY
Given name: TRONG TIEN
Student number: S3790425
Student signature:
Date: 31 May 2020

Further information relating to the penalties for plagiarism, which range from a notation on your student file to expulsion from the University, is contained in Regulation 6.1.1 ‘Student Discipline’ www.rmit.edu.au/browse;ID=11jgnnjgg70y and Academic Policy: ‘Plagiarism’ www.rmit.edu.au/browse;ID=sg4yfqzod48g1.

Assessor’s comments / Grade / School date stamp (Office use only)
SECTION A:

Question 1:
Overfitting occurs when a model corresponds too closely or exactly to a particular sample dataset, so that it fails to fit other datasets and therefore produces inaccurate predictions. It can be reduced with the following techniques:
- Using only independent variables that have a close and meaningful relationship with the dependent variable, to lessen the risk of overfitting.
- Evaluating candidate models, such as linear regression or quadratic models, on a different set of data to test whether they still predict accurately. A large drop in performance on the unseen data indicates that the model has been fitted too closely to the sample.
- Adding more data to the sample dataset to increase the accuracy of testing.

Question 2:
Predictive analytics has been deployed in many industries, especially retailing, where it is considered one of the most useful tools for forecasting stock and improving the customer experience. Retail businesses can plan their stock better to avoid over-stock and out-of-stock problems. For instance, in peak seasons such as Christmas or Black Friday, when shopping demand increases, predictive analytics can use historical data to predict the quantity of goods likely to be sold and so avoid stores running out of stock. Predictive analytics also provides better insight into customer behavior by analyzing shopping preferences and buying history in order to predict new opportunities to engage with customers. For a new marketing or sales campaign, it can help build a better personalized shopping experience.

Question 3:
A value is missing not at random (MNAR) when the probability that it is missing is related to the missing value itself. There are many methods for dealing with MNAR, and a popular one is to use multiple regression analysis to estimate the missing value. This technique can be used to estimate the missing SUS scores: regression substitution predicts each missing value from the other values recorded for the same category. Because the missingness depends on the unobserved value, such estimates can still be biased and should be interpreted with care. An example of MNAR is an income survey in which people with higher incomes tend to hide their true income or decline to answer, which causes values to be missing not at random in the final report.

Question 4:
To develop the logistic regression model, the variable X1 can be replaced by two dummy variables, each of which corresponds to one of the levels of X1 and takes a binary value of one or zero. For instance, X1A and X1B can be used for X1. When X1A is one and X1B is zero, the category is low; when X1A is zero and X1B is one, the category is average; and when both are zero, the category is high. The same rule is applied to X2 with three dummy variables. Based on that, the logistic regression model can be developed with five coefficients (two for X1 and three for X2), in addition to the intercept.
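The dummy coding described in Question 4 can be sketched in a few lines of Python. This is only an illustration: the X2 level names and the function names `encode` and `predict_probability` are hypothetical, and the assignment does not state any fitted coefficients, so the caller has to supply them.

```python
import math

# Dummy coding from Question 4: "high" is the reference level (both dummies are 0).
X1_CODES = {"low": (1, 0), "average": (0, 1), "high": (0, 0)}

# Hypothetical level names: the text only says X2 needs three dummy variables,
# which implies four levels, with the last one used as the reference level.
X2_LEVELS = ["level_1", "level_2", "level_3", "level_4"]


def encode(x1, x2):
    """Return the five dummy values [X1A, X1B, X2A, X2B, X2C]."""
    x1a, x1b = X1_CODES[x1]
    idx = X2_LEVELS.index(x2)
    x2_dummies = [1 if idx == i else 0 for i in range(len(X2_LEVELS) - 1)]
    return [x1a, x1b] + x2_dummies


def predict_probability(intercept, coefs, x1, x2):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_i * dummy_i)))."""
    z = intercept + sum(b * d for b, d in zip(coefs, encode(x1, x2)))
    return 1.0 / (1.0 + math.exp(-z))
```

Using one fewer dummy variable than the number of levels avoids perfect collinearity with the intercept, which is why the reference level is coded as all zeros.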
SECTION B:

Question 5:
Question 5 - Part a:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.060745
R Square             0.00369
Adjusted R Square    -0.00436
Standard Error       295.8267
Observations         500

ANOVA
              df     SS          MS          F          Significance F
Regression    4      160435.8    40108.94    0.458317   0.766334
Residual      495    43319164    87513.46
Total         499    43479600

              Coefficients  Standard Error  t Stat   P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept     1164.20       48.642          23.93    1.12E-84  1068.6     1259.77    1068.63      1259.77
Age           0.84          0.7672          1.092    0.275     -0.6693    2.345      -0.669       2.345
Gender        -7.65         26.511          -0.288   0.772     -59.740    44.436     -59.74       44.436
Family size   -6.04         7.7080          -0.783   0.433     -21.186    9.102      -21.186      9.102
Membership    -1.37         26.510          -0.051   0.958     -53.458    50.715     -53.458      50.715

Variable notes: Age (blanks mean we do not know their age); Gender (Male is 1); Membership (with membership is 1).

Spent amount = 1,164.2 + 0.84*Age – 7.65*Gender – 6.04*Family size – 1.37*Membership

1,164.2 is the constant: the predicted amount spent when all of the variables are zero. Each coefficient is the change in predicted spending for a one-unit increase in that variable, holding the others constant. For the binary variables (Gender and Membership), it is the difference in spending when the variable is 1 rather than 0.
The model’s accuracy is low: R Square is only 0.00369, and Significance F is 0.766, well above 0.05, so the model is not statistically significant.

Question 5 - Part b:
This model should use more quantitative variables instead of qualitative variables to increase accuracy. Variables that are not significant should be removed so that the P-values of the remaining variables fall within the accepted range.
The charts for this question indicate that there is no relationship between the spending amount and either product type or discount card type, so these variables can be removed from the model without changing its accuracy.
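As a cross-check on the summary output for Question 5, the regression statistics can be recomputed from the ANOVA sums of squares, and the fitted equation can be evaluated for a single customer. The sketch below uses the values from the ANOVA and coefficient tables (Age coefficient 0.84); the variable names and the function name `predicted_spend` are assumptions made for illustration.

```python
import math

# Values taken from the ANOVA table in the summary output.
n = 500                    # observations
k = 4                      # predictors (regression df)
ss_regression = 160435.8
ss_residual = 43319164.0
ss_total = ss_regression + ss_residual

# R Square and Adjusted R Square.
r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# F statistic and standard error of the regression.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)
f_stat = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)


def predicted_spend(age, gender, family_size, membership):
    """Evaluate the fitted equation using the coefficient table values."""
    return (1164.2 + 0.84 * age - 7.65 * gender
            - 6.04 * family_size - 1.37 * membership)
```

For a hypothetical 30-year-old male member with a family of three, `predicted_spend(30, 1, 3, 1)` gives about 1,162.26, which is almost the same as the intercept. That is consistent with the very low R Square: the four variables shift the prediction very little.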
Question 6:
Question 6 – Part a:
The share of total repair time for each repair person is calculated as in Figure 1, which shows that Bob has the highest share of repair time at 56%, John is in second place with 35%, and James has the least with 9%.

[Figure 1: Repair time by repair person: Bob 56%, James 9%, John 35%]

Services carried out in the morning exceed those carried out in the afternoon by 11%. The period is unknown for 8% of the services because the time of service was not recorded.