Data Handling And Business Intelligence Assessment 2

Added on  2023/06/18

AI Summary
This assessment covers the use of Excel for pre-processing the data, data analysis and data visualization, determination of decline in sales and profit using chart and graphs, conjunction with SPSS with screenshots and findings of K-means, most common data mining methods used in business with real world examples, and advantages/disadvantages of SPSS over Excel.

DATA HANDLING AND
BUSINESS INTELLIGENCE

TABLE OF CONTENTS
PART 1
Use of Excel for pre-processing the data, data analysis and data visualization. Determination
of decline in sales and profit using chart and graphs
PART 2
2.1 Conjunction with SPSS with screenshots and findings of K-means
2.2 Most common data mining methods used in business with real world examples
2.3 Discuss advantages/disadvantages of SPSS over Excel with theoretical arguments and
practical arguments
REFERENCES
PART 1
Use of Excel for pre-processing the data, data analysis and data visualization. Determination of
decline in sales and profit using chart and graphs
Excel is one of the most important tools for organizations because it helps turn raw data
into meaningful information that can support purposes such as decision making. It can be used
to evaluate raw data so that the data is easy to process and understand and the required
information can be extracted from it (Martino, 2019). One of the main reasons organizations use
Excel is pre-processing, in which data is entered into Excel in a form that can be understood
properly. Excel provides rows and columns in which data can be arranged so that users can read
and use it easily. Pre-processing also helps in isolating the useful part of the dataset, which
in turn supports further evaluation and the extraction of related information.
Once the data has been pre-processed, it is analysed in order to evaluate it and extract
the important information. Data analysis is another important feature of Excel (Kajáti, Miškuf
and Papcun, 2017). For this purpose Excel provides various in-built formulas and functions,
such as calculating the average (mean) or the mode and searching for a value. Users can apply
these features as their requirements dictate. To make the analysis more presentable and easier
to understand, Excel also provides data visualization. Using this feature, users can create
graphs, charts and histograms of the analysed data so that the results can be understood more
accurately and effectively.
For analysing and visualizing data, Excel provides further features that make the
extraction and interpretation of data easier, and one of those features is Pivot tables and
charts (Becker and Gould, 2019). This feature helps users identify and select the data they
want to analyse and place it in a Pivot table, so that they can reshape the data, present any
part of it and make changes to it in whatever manner they want without affecting the actual
and final data.
Excel is one of the most effective tools for pre-processing data, but it has some
drawbacks as well. Values that are left blank can act as a barrier when it comes to analysis
of data, and if any value is entered wrongly during pre-processing then all of the analysis can
go wrong, because Excel has no feature that flags that an entered value is wrong (Kajáti,
Miškuf and Papcun, 2017). Despite this, Excel is one of the most appropriate tools for
beginners when it comes to data storage, analysis and visualization.
To analyse the decline in the overall sales and profit of the superstore over the years,
the superstore data is analysed using MS Excel. The data stored in Excel is analysed with
in-built Excel functions, and charts or graphs of the results are then developed so that the
results can be understood more clearly and the decline can be identified (Kajáti, Miškuf and
Papcun, 2017). The step-by-step process below helps in understanding the year-by-year decline
in sales and profit accurately.
The first step is to apply the Filter option, one of the in-built features of Excel, to the
first row of the dataset so that the relevant data can be selected and analysed appropriately.
Then, to select the appropriate sales data, a further filter can be applied to select the data
year by year, so that sales and profit can be analysed and evaluated for each year from the
large dataset.
After selecting the sales and profit data individually for 2009, 2010, 2011 and 2012, the SUM
formula can be applied to the sales and profit data individually. This step is repeated for
all four years and for both sales and profit, which gives the total sales and the total profit
year by year. Then, to analyse this data, it is represented visually by developing charts so
that the appropriate information can be observed and the relationship between sales and profit
can be analysed and evaluated.
After the charts and graphs of both sales and profit have been developed, both sets of
visualized data are analysed. For the data below, the analysis shows that there is a
relationship between profit and sales, i.e. an increase or decrease in sales has directly
resulted in an increase or decrease in profitability. From the graph it has been identified
that the sales of the superstore reduced from 2009 to 2010 and, as a result, the profitability
of the superstore also reduced from 2009 to 2010. So, in order to increase the profitability of
the store, the overall sales of the superstore need to increase, and it can also be said that
the superstore should focus on adopting different tactics to enhance its overall sales.
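The filter-and-sum steps above can be sketched in a few lines of Python. This is only an illustration: the rows below are hypothetical sample values, not figures from the superstore dataset, and the function name `yearly_totals` is an assumption introduced here.

```python
from collections import defaultdict

# Hypothetical sample rows (year, sales, profit); the real superstore
# dataset is not reproduced in this report.
rows = [
    (2009, 1200.0, 300.0),
    (2009, 800.0, 150.0),
    (2010, 900.0, 180.0),
    (2010, 700.0, 120.0),
    (2011, 650.0, 90.0),
    (2011, 600.0, 80.0),
    (2012, 1000.0, 210.0),
    (2012, 750.0, 140.0),
]

def yearly_totals(records):
    """Sum sales and profit per year, like filtering by year and applying SUM."""
    totals = defaultdict(lambda: {"sales": 0.0, "profit": 0.0})
    for year, sales, profit in records:
        totals[year]["sales"] += sales
        totals[year]["profit"] += profit
    return dict(sorted(totals.items()))

totals = yearly_totals(rows)
for year, t in totals.items():
    print(year, t["sales"], t["profit"])
```

The per-year totals produced this way are the values that would then be plotted as the sales chart and the profit chart.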
Sales
From the above graph it has been analysed that the sales of the superstore initially
decreased from 2009 to 2010 to 2011, but in 2012 they started increasing again, which clearly
indicates that the sales values of the superstore change year by year. This change in sales
directly impacted the overall profitability of the superstore and resulted in a change in its
profitability values as well.
Profit
In the above profit graph it has been analysed that profitability initially decreases and
then increases in the following year, which is similar to the sales graph. The fluctuation in
the profit graph is due to the fluctuation in the sales graph and data. However, the
profitability data varies from year to year and is slightly different from the sales data,
which is attributed to the discount provided to customers. If the discount provided to
customers is increased, profitability is reduced, and the variation in profitability is due to
this. This data shows that a change in sales directly affects the profit of the superstore, so
sales need to improve in order to bring a positive change in profitability. It can also be said
that, along with increasing sales, it is important to reduce the discount provided so that the
profitability of the superstore can also be increased.
This data directly helps in understanding that there is a relationship between sales and
profitability, but that this relationship can vary with the discount provided to customers.
This is supported by the literature, which identifies a relationship between sales, profit and
discount. According to Li, Yada and Zennyo (2019), a discount is a reduction in the sales price
when a product or service is purchased by customers. If no discount is provided, an increase in
sales can directly increase overall profitability and a reduction in sales can reduce it. This
relationship is weakened if a discount is provided on the sale of products or services. A
discount can directly help in increasing overall sales, but it can reduce the overall
profitability of the organization. It is therefore important for companies to understand how
much discount should be provided, because an excessive discount can directly reduce overall
profitability, while if no discount is provided and sales are already decreasing, profitability
can fall along with sales. Li, Yada and Zennyo (2019) further elaborate that sales generate the
profit and income needed to run a business successfully, so it is important to understand the
percentage up to which a discount can be provided while still covering operational costs,
maintenance costs and all other expenses and still generating a profit. This shows that an
increase in sales can increase profitability but an increase in discount can reduce the
profitability of an organization.
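The trade-off between sales volume and discount can be made concrete with a small worked calculation. The sketch below assumes a deliberately simple model, profit = units sold x (discounted price - unit cost), and every figure in it is hypothetical; the function names are introduced here for illustration only.

```python
def profit(units_sold, list_price, discount_rate, unit_cost):
    """Profit when every unit is sold at the same percentage discount."""
    net_price = list_price * (1 - discount_rate)
    return units_sold * (net_price - unit_cost)

def break_even_discount(list_price, unit_cost):
    """Largest discount rate at which a sale still covers its unit cost."""
    return 1 - unit_cost / list_price

# Hypothetical figures: the discount lifts unit sales from 100 to 130
# but cuts the margin per unit from 20 to 10.
no_discount = profit(units_sold=100, list_price=50.0, discount_rate=0.0, unit_cost=30.0)
with_discount = profit(units_sold=130, list_price=50.0, discount_rate=0.2, unit_cost=30.0)

print(f"profit without discount: {no_discount:.2f}")
print(f"profit with 20% discount: {with_discount:.2f}")
print(f"break-even discount: {break_even_discount(50.0, 30.0):.0%}")
```

Under these assumed numbers, sales rise while profit falls, which is the pattern the paragraph above describes: a discount helps only if the extra volume outweighs the lost margin.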
PART 2
Customers of the Smile Clinic do eat rice
Rice
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   No          40        40.0        40.0               40.0
        Yes         60        60.0        60.0              100.0
        Total      100       100.0       100.0
Interpretation: From the above table it has been analysed that 60 percent of customers of Smile
Clinic do eat rice.
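A frequency table of this kind can be reproduced outside SPSS with a short script. The sketch below rebuilds the responses from the counts reported in the table (40 "No", 60 "Yes"); the function name `frequency_table` is an assumption made for this example.

```python
from collections import Counter

# Responses rebuilt from the counts in the table: 40 "No" and 60 "Yes".
responses = ["No"] * 40 + ["Yes"] * 60

def frequency_table(values):
    """Return (category, frequency, percent, cumulative percent) rows."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, cumulative = [], 0.0
    for category in sorted(counts):
        percent = 100.0 * counts[category] / total
        cumulative += percent
        rows.append((category, counts[category], percent, cumulative))
    return rows

table = frequency_table(responses)
for category, freq, pct, cum in table:
    print(f"{category:<5}{freq:>5}{pct:>8.1f}{cum:>8.1f}")
```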
Customers are Male and Female
Gender
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male        50        50.0        50.0               50.0
        Female      50        50.0        50.0              100.0
        Total      100       100.0       100.0
Interpretation: From the above table it has been interpreted that 50 percent of the customers
of Smile Clinic are male and 50 percent are female.
Mean and Median of the ages
Age
N        Valid      100
         Missing      0
Mean              20.35
Median            19.00
Interpretation: From the above table it has been interpreted that the mean age of customers of
Smile Clinic is 20.35 and the median age is 19.
Mean & Median of participants that do eat rice
Rice
N        Valid      100
         Missing      0
Mean                .60
Median             1.00
Interpretation: From the above table it has been summarized that the mean of the rice variable
is 0.60, which means that 60 percent of customers eat rice when the variable is coded 1 for yes
and 0 for no. The median is 1.00.
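Why the mean of a yes/no variable reads as a proportion can be checked directly. The sketch below assumes the coding 0 = No and 1 = Yes, which is consistent with the reported mean of .60 for 60 "Yes" answers out of 100 but is not stated in the SPSS output itself.

```python
import statistics

# Assumed coding: 0 = No (40 customers), 1 = Yes (60 customers).
rice = [0] * 40 + [1] * 60

mean_rice = statistics.mean(rice)      # proportion of customers coded 1
median_rice = statistics.median(rice)  # middle value of the sorted codes

print(f"mean = {mean_rice:.2f}, share eating rice = {mean_rice * 100:.0f}%")
print(f"median = {median_rice:.2f}")
```

Because more than half of the codes are 1, the median is 1, matching the SPSS statistics table.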
2.1 Conjunction with SPSS with screenshots and findings of K-means
Step 1: The first step is to create the variables in SPSS and enter the data in those variable
columns in Data View, as shown below:
Step 2: Once the data is entered, the Analyze menu is clicked and then Classify is selected.
From the dropdown menu that appears, K-Means Cluster is selected, as shown in the image below.
Step 3: Select the variables for which K-means is to be calculated and add those variables to
the other side. When everything is done, ‘Iterate and classify’ is selected.
Step 4: By following the above steps, K-means of the entered data can be calculated easily, and
the results specified below are obtained.
Initial Cluster Centers
           Cluster
           1      2
Gender     1      2
Age       13     26
Rice       1      0

Iteration History(a)
             Change in Cluster Centers
Iteration    1        2
1            4.489    2.731
2             .228     .384
3             .000     .000
a. Convergence achieved due to no or small change in cluster centers. The maximum absolute
coordinate change for any center is .000. The current iteration is 3. The minimum distance
between initial centers is 13.077.

Final Cluster Centers
           Cluster
           1      2
Gender     1      2
Age       18     24
Rice       1      1

Number of Cases in each Cluster
Cluster    1     56.000
           2     44.000
Valid           100.000
Missing            .000
Interpretation: K-means is a clustering algorithm used in unsupervised learning to group
similar cases. In the above tables each of the 100 cases has been assigned to one of two
clusters: cluster 1 holds 56 cases, with a final centre at age 18, and cluster 2 holds 44
cases, with a final centre at age 24. Convergence was achieved at the third iteration.
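The assign-and-update loop behind these tables can be sketched in plain Python. This is a minimal textbook version of k-means, not a reproduction of SPSS's exact procedure. The two starting centres are taken from the Initial Cluster Centers table, but the eight data rows (gender, age, rice) are hypothetical, and the function names are assumptions for this example.

```python
import math

def assign(points, centers):
    """Assignment step: put each point in the cluster of its nearest centre."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centers, max_iter=10):
    """Repeat assignment and centre updates until no centre moves."""
    centers = [tuple(float(v) for v in c) for c in centers]
    for _ in range(max_iter):
        clusters = assign(points, centers)
        # Update step: each centre moves to the mean of its members.
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for members, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # converged: no change in cluster centres
            break
        centers = new_centers
    return centers, assign(points, centers)

# Hypothetical (gender, age, rice) rows; the report's 100 cases are not listed.
points = [(1, 13, 1), (1, 17, 1), (1, 19, 1), (2, 18, 0),
          (2, 23, 1), (2, 25, 0), (1, 24, 1), (2, 26, 0)]
initial = [(1, 13, 1), (2, 26, 0)]  # initial centres from the SPSS table

centers, clusters = k_means(points, initial)
print("final centres:", centers)
print("cases per cluster:", [len(c) for c in clusters])
```

Because the variables are not standardised, age dominates the distance calculation, which is also why the final centres in the SPSS output differ mainly in age.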
2.2 Most common data mining methods used in business with real world examples
Data mining can be defined as a method used by organizations to identify or extract useful
information or patterns that support important decisions and conclusions. There are many
different data mining methods that organizations can use for different purposes. Some of the
most commonly used methods are explained below with real world examples:
Cluster analysis: it is one of the most common data mining methods used by business
organizations. Clusters are formed from the data on the basis of the similarities and
dependencies between data items. Different kinds of clusters can be formed depending on the
type of dataset used by the organization (Atluri, Karpatne and Kumar, 2018). Cluster analysis
is also known as data segmentation because a large dataset can be segmented into small
clusters as the organization requires. For example, banks can use this technique to separate
customers who are at high credit risk from customers who are at low credit risk.
Decision trees: it is another commonly used data mining method, which organizations use to
classify items or information so that the classified information can be used to take
appropriate and effective decisions. It segregates data into categories and sub-categories
(Alasadi and Bhaya, 2017). For example, a government can use a decision tree to identify
people who are 18 or older and are eligible for a driving licence, as opposed to people who
are below 18 and are not eligible.
Neural network: it is used to define the relationship between inputs and outputs in order to
analyse expected behaviour or an expected pattern. For example, retail stores can use this
method to identify the expected behaviour of their customers by studying their past sales and
behaviour patterns.
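The driving licence example can be turned into the simplest possible decision tree, a single split on age. The sketch below learns that split from labelled records by trying each observed age as a threshold and keeping the one with the fewest misclassifications. The records are hypothetical and the function name `best_age_threshold` is an assumption; full decision tree algorithms repeat this kind of search across many variables and levels.

```python
def best_age_threshold(ages, eligible):
    """Find the single split 'age >= t' that misclassifies the fewest records."""
    best_t, best_errors = None, len(ages) + 1
    for t in sorted(set(ages)):
        errors = sum((age >= t) != label for age, label in zip(ages, eligible))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors

# Hypothetical labelled records: age and whether the person is licence-eligible.
ages = [15, 16, 17, 18, 19, 21, 30, 45]
eligible = [False, False, False, True, True, True, True, True]

threshold, errors = best_age_threshold(ages, eligible)
print(f"split: age >= {threshold}, misclassified records: {errors}")
```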
2.3 Discuss advantages/disadvantages of SPSS over Excel with theoretical arguments and
practical arguments
SPSS has a number of advantages and disadvantages compared with MS Excel. Some of them are
as follows:
Advantages that SPSS has over MS Excel
It is easier to carry out statistical work in SPSS than in MS Excel because SPSS has in-built
statistical analysis functions that users can apply directly, whereas in MS Excel statistical
calculations have to be built up manually using in-built formulas (Cleff, 2019).
In SPSS a statistical test can be run simply by using in-built features, whereas to run a
statistical test in Excel it is necessary to learn and understand the step-by-step process of
conducting the test.
In SPSS missing values are less of a problem, because data can still be analysed when a value
is missing, whereas in Excel a missing value makes analysis difficult until it is filled in.
Disadvantages of SPSS over MS Excel
SPSS is costly compared with Excel, because SPSS needs a separate licence whereas Excel is
included in the MS Office package (Abellanosa, Ander and Gowing, 2018).
The chart and graph features in SPSS are less flexible than those in Excel, where many
different kinds of charts and graphs can be created and customised easily.
REFERENCES
Books and Journals
Abellanosa, C.I.D.G., Ander, E.L. and Gowing, C.J.B., 2018. Validation for the transition of
SPSS QI Analyst to the SPC for Excel program for quality control charting.
Alasadi, S.A. and Bhaya, W.S., 2017. Review of data preprocessing techniques in data
mining. Journal of Engineering and Applied Sciences, 12(16), pp.4102-4107.
Atluri, G., Karpatne, A. and Kumar, V., 2018. Spatio-temporal data mining: A survey of
problems and methods. ACM Computing Surveys (CSUR), 51(4), pp.1-41.
Becker, L.T. and Gould, E.M., 2019. Microsoft power BI: extending excel to manipulate,
analyze, and visualize diverse data. Serials Review, 45(3), pp.184-188.
Cleff, T., 2019. Applied statistics and multivariate data analysis for business and economics: A
modern approach using SPSS, Stata, and Excel. Springer.
Kajáti, E., Miškuf, M. and Papcun, P., 2017, January. Advanced analysis of manufacturing data
in Excel and its Add-ins. In 2017 IEEE 15th International Symposium on Applied
Machine Intelligence and Informatics (SAMI) (pp. 000491-000496). IEEE.
Li, Z., Yada, K. and Zennyo, Y., 2019. Duration of Price Promotion and Retail Profit: An In-
depth Study Based on Point-of-Sale Data.
Martino, J.C.R., 2019. Hands-On Machine Learning with Microsoft Excel 2019: Build complete
data analysis flows, from data collection to visualization. Packt Publishing Ltd.