
Business Analytics Assignment Sample

   

Added on  2021-02-21

27 Pages | 3864 Words | 76 Views

TABLE OF CONTENTS

INTRODUCTION
SECTION A: DISCUSSION QUESTIONS
  1. Explaining Confusion Matrix in the Classification Methods along with example
  2. Defining two practical examples on applications of classification methods with explanations
  3. Over Sampling Partitioning before building the model
  4. Explaining how explanatory and categorical variables can be used in logistic regression
SECTION B: QUANTITATIVE QUESTIONS
  5.
    a. Explaining steps of KNN model in relation to making predictions about customers who will spend more than $1000
    b. Developing a predictive model for making assessment about new female customer
  6.
    a. Analysing data set and giving recommendations
    b. Building a model to predict the repair time for a future booking service
    c. Giving recommendations in relation to adding variables in dataset for better assessment
  7.
    a. Explaining approach in relation to filling missing values in excel sheet
    b. Calculating average total protein for each blood type
    c. Computing range of total protein and explaining approach for the same
    d. Presenting the extent to which protein is declined by age
    e. Presenting two best visualisation tool for which is highly relevant to the current data set
      a.
      b.
  8.
    a. Assessment of annual investments and returns
    b. If Matthew aims to gain $1,500,000 at the end of the 30th year, what percentage of his salary he should put in the investment annually
  9. Linear programming assessment
    a. Writing linear optimization model for company in order to make best decision
    b. Presenting and interpreting results
    c. Rewriting model when discounting strategy is applied
REFERENCES

INTRODUCTION

Business analytics is the methodical exploration of a company's data, with a focus on statistical analysis. Organisations use it to make data-driven decisions. It provides insights that support sound business decisions, and it is most useful for optimising and automating business processes. It is broadly divided into two segments: business intelligence and statistical analysis. The present study covers various aspects of business analytics. It also answers a set of quantitative questions and describes the models used to answer them.

SECTION A: DISCUSSION QUESTIONS

1. Explaining Confusion Matrix in the Classification Methods along with example.

A confusion matrix, also known as an error matrix, is used in machine learning for problems of statistical classification. It is a specific table layout that allows the performance of an algorithm to be visualised. Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class, or vice versa (Salamon and Bello, 2017).

It is a special type of contingency table with two dimensions, 'actual' and 'predicted', and the same set of classes in both dimensions. Each combination of dimension and class is a variable in the contingency table. For example, from a sample of 27 animals (8 cats, 6 dogs and 13 rabbits), the confusion matrix will be as follows:

                       Actual Class
Predicted Class     Cat     Dog     Rabbit
Cat                   5       2        0
Dog                   3       3        2
Rabbit                0       1       11

In this confusion matrix, of the 8 actual cats, the system predicted that 3 were dogs. Of the 6 actual dogs, it predicted that 1 was a rabbit and 2 were cats. The matrix shows that the system has trouble distinguishing between dogs and cats, whereas it distinguishes rabbits from the other animals fairly easily.

2. Defining two practical examples on applications of classification methods with explanations.

Classification is a data mining process used to predict the group membership of data instances. Its main role is to assign categories to a collection of data so that analysis and prediction become more accurate. Applications of classification methods are as follows:

Decision Tree Technique – This technique produces a sequence of rules from the given data attributes and their classes, and these rules are then used to classify data. It is easy to understand, interpret and visualise, and needs little data preparation. It can handle both numerical and categorical data (Deng et al., 2016).

K-Nearest Neighbor Technique – In this technique, a sample data point is classified according to its nearest neighbors, where the value of k defines how many nearest neighbors are assessed to decide the class of the point. It is used for microarray data classification, short-term traffic flow forecasting, agarwood oil quality grading, face recognition etc. It is simple to implement and is considered effective when the training data is large or noisy.
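The confusion matrix from question 1 can also be summarised numerically. The sketch below is only an illustration and is not part of the original assignment: it takes the 27-animal matrix as given (rows as predicted classes, columns as actual classes) and derives overall accuracy, per-class recall and per-class precision. The function names are assumptions chosen for this example.

```python
# Confusion matrix from the 27-animal example.
# Rows are predicted classes, columns are actual classes.
classes = ["Cat", "Dog", "Rabbit"]
matrix = [
    [5, 2, 0],    # predicted Cat
    [3, 3, 2],    # predicted Dog
    [0, 1, 11],   # predicted Rabbit
]


def accuracy(m):
    """Share of all instances that lie on the diagonal (correct predictions)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def recall(m, k):
    """Of the instances that are actually class k, the share predicted as k."""
    actual_total = sum(row[k] for row in m)   # column sum
    return m[k][k] / actual_total


def precision(m, k):
    """Of the instances predicted as class k, the share that really are k."""
    predicted_total = sum(m[k])               # row sum
    return m[k][k] / predicted_total


print(f"accuracy: {accuracy(matrix):.3f}")
for k, name in enumerate(classes):
    print(f"{name}: recall={recall(matrix, k):.3f}, "
          f"precision={precision(matrix, k):.3f}")
```

The low recall for dogs (3 of 6) and the high recall for rabbits (11 of 13) match the interpretation given in question 1: the system confuses cats and dogs far more often than it confuses rabbits with either.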

3. Over Sampling Partitioning before building the model.

Oversampling, referred to here as Over Sampling Partitioning, is a statistical technique used in data analysis to adjust the class distribution of a data set, that is, the ratio between the different classes or categories represented.

The technique is used in statistical sampling, survey design methodology and machine learning. Oversampling introduces a deliberate bias so that more samples are selected from one class than from another (Ebadi, Antignac and Sands, 2016). It compensates for an imbalance that is either already present in the data or likely to develop if a purely random sample were taken. Oversampling techniques for classification problems are as follows:

Random oversampling – This technique supplements the training data with multiple copies of samples from the minority classes. Oversampling can be applied more than once (2x, 3x, 5x etc.). Instead of duplicating every sample in the minority class, some of them can be chosen randomly with replacement.

ADASYN – The adaptive synthetic sampling approach, or ADASYN algorithm, builds on the methodology of SMOTE by shifting the importance of the classification boundary to those minority class examples that are difficult to learn (Slagter, Hsu and Chung, 2015). It uses a weighted distribution over the minority class examples according to their level of difficulty in learning, so that more synthetic data is generated for the minority class examples that are harder to learn.

4. Explaining how explanatory and categorical variables can be used in logistic regression

A logistic regression model is an appropriate technique for describing data and for establishing the relationship between a dependent variable and the independent variables. It is also a form of predictive analysis for evaluating an outcome. In this model the outcome is measured with a dichotomous variable. It is analysis that
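As a rough illustration of how a categorical explanatory variable can enter a logistic regression alongside a numeric one, the sketch below dummy-codes the categorical variable against a reference level and passes the resulting log-odds through the logistic function to get a probability for the dichotomous outcome. Everything in it is an assumption made for this example: the variable names (income, region), the region levels and all coefficient values are hypothetical and are not estimated from any data in this assignment.

```python
import math

# Hypothetical categorical explanatory variable and its reference level.
REGION_LEVELS = ["north", "south", "west"]
REFERENCE_LEVEL = "north"   # baseline level, receives no dummy variable

# Hypothetical coefficients on the log-odds scale (not fitted to any data).
INTERCEPT = -3.0
COEF_INCOME = 0.5                            # numeric variable, income in $10,000s
COEF_REGION = {"south": 0.8, "west": -0.5}   # one coefficient per dummy


def dummy_encode(value, levels=REGION_LEVELS, reference=REFERENCE_LEVEL):
    """Turn one categorical value into 0/1 indicators, one per non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {level: int(value == level) for level in levels if level != reference}


def predict_probability(income, region):
    """Probability that the dichotomous outcome equals 1 for the given inputs."""
    dummies = dummy_encode(region)
    log_odds = INTERCEPT + COEF_INCOME * income
    log_odds += sum(COEF_REGION[level] * flag for level, flag in dummies.items())
    return 1.0 / (1.0 + math.exp(-log_odds))   # logistic (sigmoid) function


for region in REGION_LEVELS:
    print(region, round(predict_probability(6.0, region), 3))
```

A variable with three levels needs only two dummies, because the reference level is represented by all dummies being 0. Each dummy coefficient is then read as the change in log-odds relative to that reference level.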
