
Data Management for Decision Support.

   

Added on  2022-09-07

DATA MANAGEMENT FOR DECISION SUPPORT

Table of Contents
1. Introduction
2. Introduction of Dataset, Problems, and Methods
2.1 Dataset Description
2.2 Description of Dataset Variables
2.2.1 Dependent and Independent Variables
2.3 Case Description
2.4 Software Requirement
3. Analytic Models
3.1 Linear Regression
3.1.1 Objectives Linear Regression
3.1.2 Multiple Linear Regression Terminologies
3.1.3 Strength and Weaknesses of Linear Regression
3.2 Artificial Neural Networks (ANN)
3.2.1 Learning Process
3.2.2 Objectives of ANN
3.2.3 Concept of ANN
3.2.4 Feed Forward Neural Network Description of Components
3.2.5 Description of Components: FFNN
3.2.6 Strengths and Weaknesses of ANN
4. Data Analysis and Results
4.1 Data Randomization
4.2 Splitting and Training
4.3 Multiple Linear Regression with Weka (MLR)
4.4 Artificial Neural Network with Weka (ANN)
4.5 Observation of ANN Training Results
5. Evaluation
5.1 Correlation Coefficient
5.2 Root Mean Squared Error (RMSE)
5.3 Mean Absolute Error (MAE)
5.4 Summary Result for both MLR and ANN
6. Decision Support
7. Conclusion
References

1. Introduction
An artificial neural network (ANN) is a computational model whose structure and
function are modeled on those of a biological neural network (What is Artificial
Neural Network - Structure, Working, Applications, 2018). In a data-driven Decision
Support System, neural networks can be employed as data analysis tools that forecast
and predict from historical data. Neural networks can also be viewed as quantitative
models for use in a model-driven Decision Support System.
The main objective of this project is to apply data mining techniques to a real
dataset and evaluate the results. The project uses the BATS BRITISH AMERICAN
TOBACCO PLC ORD 25P dataset and applies linear regression and an artificial neural
network, both run in the Weka data mining tool. The data analysis is used to explain
and predict the selected real data and the possible phenomena behind it. Finally, the
results are analyzed and summarized to provide effective decision support and a
conclusion.
2. Introduction of Dataset, Problems, and Methods
2.1 Dataset Description
This project uses the dataset of British American Tobacco PLC, a holding company,
covering the last five years (British American Tobacco p.l.c. (BATS.L), 2020). The
dataset is shown in the figure below.

2.2 Description of Dataset Variables
The provided dataset contains the following variables:
Date
Open
High
Low
Close
Adj. Close
Volume
2.2.1 Dependent and Independent Variables
The dataset contains the following dependent and independent variables:
Dependent variable: Volume
Independent variables: Date, Open, High, Low, Close, and Adj. Close.
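The split into dependent and independent variables can be sketched in a few lines of
Python. This is a minimal illustration, not the project's Weka workflow: the header
spelling `Adj Close`, the helper name `split_features_target`, and the two sample rows
are assumptions, and the numbers are invented placeholders rather than real BATS.L
quotes. The source does not say how Date is encoded, so the sketch simply keeps the
dates in a separate list.

```python
import csv
import io

# Hypothetical rows in the column layout listed above; the values are
# invented placeholders, not real market data.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,100.0,102.0,99.0,101.0,95.0,1500
2020-01-03,101.0,103.0,100.0,102.0,96.0,1800
"""

FEATURES = ["Open", "High", "Low", "Close", "Adj Close"]  # numeric independent variables
TARGET = "Volume"                                          # dependent variable


def split_features_target(csv_text):
    """Return (dates, X, y): dates, numeric feature rows, and target values."""
    dates, X, y = [], [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        dates.append(row["Date"])
        X.append([float(row[name]) for name in FEATURES])
        y.append(float(row[TARGET]))
    return dates, X, y


dates, X, y = split_features_target(SAMPLE_CSV)
print(dates[0], X[0], y[0])
```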
2.3 Case Description
British American Tobacco PLC is a multi-category consumer goods company that
provides nicotine and tobacco products. This project analyzes the company’s data for
the last five years with the help of linear regression and an artificial neural
network. The results are evaluated using the Weka data mining tool (Han, Kamber and
Pei, 2012). The tool is used to explain and predict the company’s data, and the results
are analyzed and summarized to provide effective decision support for improving the
company’s trading view.
2.4 Software Requirement
This project uses Weka, a data mining tool, to analyze the selected data. Weka is a
collection of machine learning algorithms for solving real-world data mining problems
(Witten, Frank, Hall and Pal, 2017). It is written in Java and runs on almost any
platform. The algorithms can either be applied directly to a dataset or called from
your own Java code. Weka provides tools for data pre-processing, implementations of
various machine learning algorithms, and visualization tools for developing machine
learning techniques and applying them to real-world data mining problems (Stahlbock,
Abou-Nasr and Weiss, 2018).
3. Analytic Models
3.1 Linear Regression
3.1.1 Objectives Linear Regression
Linear regression models are used for both inference and prediction. When the goal
is prediction, the model’s performance is evaluated on a validation set using
predictive metrics (Xanthopoulos, Pardalos and Trafalis, 2013).
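As a rough illustration of what evaluating on a validation set with predictive metrics
involves, the sketch below computes the three metrics named in the table of contents
(correlation coefficient, RMSE, and MAE) in plain Python. The function names and the
four sample values are assumptions made for the example; they are not results from
this project.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more than MAE does."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Invented validation-set values, used only to exercise the functions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

print(mae(actual, predicted), rmse(actual, predicted), correlation(actual, predicted))
```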
3.1.2 Multiple Linear Regression Terminologies
Multiple linear regression (MLR), also called multiple regression, is a statistical
technique that uses several explanatory variables to predict the outcome of a response
variable. The goal of MLR is to model the linear relationship between the explanatory
(independent) variables and the response (dependent) variable (Modern Machine
Learning Algorithms: Strengths and Weaknesses, 2019).
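One way to make the definition concrete: MLR fits coefficients b0, b1, ..., bk so that
y is approximated by b0 + b1*x1 + ... + bk*xk. The sketch below fits them by ordinary
least squares through the normal equations, using only the standard library. It is an
illustration under stated assumptions, not Weka's implementation: the helper names
`solve_linear_system` and `fit_mlr` are hypothetical, and the data is synthetic,
generated from known coefficients so the fit can be checked.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_mlr(X, y):
    """Return [b0, b1, ..., bk] minimizing the squared error of b0 + sum(bi * xi)."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 is the intercept column
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Synthetic data generated from y = 1 + 2*x1 + 3*x2 (an assumption for the demo).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in X]

coefficients = fit_mlr(X, y)
print(coefficients)
```

The normal equations are chosen here only because they fit in a few lines; a
production tool would typically use a more numerically stable solver.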
A simple linear regression is a function that allows an analyst or statistician to
make predictions about one variable based on the information known about another
variable (Hastie, Friedman and Tibshirani, 2017). Simple linear regression can be used
only when there are two continuous variables: an independent variable and a dependent
variable. The independent variable is the parameter used to calculate the dependent
variable, or outcome.
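For the two-variable case, the least-squares line has a closed form: the slope is the
covariance of x and y divided by the variance of x, and the intercept follows from the
means. The sketch below shows that calculation; the function name
`fit_simple_regression` and the sample points are assumptions chosen for illustration.

```python
def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented points lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_regression(x, y)
prediction = intercept + slope * 5.0  # predict the dependent variable at x = 5
print(intercept, slope, prediction)
```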
3.1.3 Strength and Weaknesses of Linear Regression
Strengths:
Linear regression can be regularized to avoid overfitting.
Linear models can be updated easily with new data using stochastic gradient
descent.
Weaknesses:
Linear regression performs poorly when the relationships are non-linear.
Linear regression lacks the natural flexibility to capture highly complicated patterns.
Adding the right interaction terms or polynomials is tricky and time-consuming.
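The two strengths listed above can be illustrated together. The sketch below performs
one stochastic gradient descent step for a linear model on a single new observation,
with an optional L2 penalty that shrinks the weights as a simple form of
regularization. The function name `sgd_update`, the parameter names, the learning
rate, and the toy data are all assumptions for this example.

```python
def sgd_update(weights, bias, x, y, learning_rate=0.01, l2=0.0):
    """One SGD step on squared error for a single observation (x, y).

    l2 > 0 adds a penalty that pulls the weights toward zero (the bias is
    left unpenalized in this sketch).
    """
    prediction = bias + sum(w * xi for w, xi in zip(weights, x))
    error = prediction - y
    new_weights = [w - learning_rate * (error * xi + l2 * w)
                   for w, xi in zip(weights, x)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


# Toy observations lying exactly on y = 1 + 2x.
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]

weights, bias = [0.0], 0.0
for _ in range(2000):                 # repeated passes over the data
    for x, y in data:
        weights, bias = sgd_update(weights, bias, x, y, learning_rate=0.05)

print(weights, bias)
```

When a new observation arrives, updating the model is one more call to `sgd_update`
rather than a full refit.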
3.2 Artificial Neural Networks (ANN)
3.2.1 Learning Process
The learning rule, or learning process, of an artificial neural network is a method,
mathematical logic, or algorithm that improves the network’s performance and training
time. The rule is applied repeatedly to the network, updating its weights and biases
while the network is simulated in a particular data environment.
A learning rule takes the present state of the network, i.e., its weights and biases,
and compares the expected results with the actual network results to provide new and
improved values for the weights and biases (Introduction to Learning Rules in Neural
Network, 2018). The learning rule determines how quickly and how accurately the
network can be trained. Three main machine learning models are used to develop a
network (Perner, 2015):
Unsupervised learning
Supervised learning
Reinforcement learning
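As a concrete supervised example of a learning rule, the sketch below uses the
classic perceptron rule on a single neuron: it compares the expected output with the
actual output and nudges the weights and bias by the difference. The choice of the
perceptron rule, the logical AND task, the learning rate of 0.5, and the function
names are assumptions for illustration; the source does not name a specific rule.

```python
def predict(weights, bias, inputs):
    """Step-activation neuron: outputs 1 if the weighted sum is positive, else 0."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if z > 0 else 0


def train_perceptron(samples, learning_rate=0.5, epochs=20):
    """Apply the perceptron learning rule repeatedly over the training samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, expected in samples:
            actual = predict(weights, bias, inputs)
            error = expected - actual          # expected result vs. actual result
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Logical AND: a small, linearly separable task chosen for the demo.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(and_samples)
outputs = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(weights, bias, outputs)
```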
3.2.2 Objectives of ANN
An Artificial Neural Network, abbreviated as ANN, is a computational model.
The following are the objectives of ANN: