ProductsLogo
LogoStudy Documents
LogoAI Grader
LogoAI Answer
LogoAI Code Checker
LogoPlagiarism Checker
LogoAI Paraphraser
LogoAI Quiz
LogoAI Detector
PricingBlogAbout Us
logo

Data Handling & Business Intelligence 2 - Desklib

Verified

Added on  2023/06/18

|19
|3196
|406
AI Summary
This report provides an insight into the usage of the large data and the application of excel as a tool for the purpose of analysing the data along with practical example of the same. In addition to this, it involves analysing the usage of SPSS which is a data mining software in the data analysis.

Contribute Materials

Your contribution can guide someone’s learning journey. Share your documents today.
Document Page
Data Handling &Business
Intelligence 2

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
TABLE OF CONTENTS
INTRODUCTION......................................................................................................................3
Part 1......................................................................................................................................3
TASK 2......................................................................................................................................7
Presenting screen shorts and explaining the interpretation...................................................7
Explaining data mining methods..........................................................................................14
Explaining the advantages & disadvantages of SPSS over excel.......................................15
CONCLUSION........................................................................................................................16
REFERENCES.........................................................................................................................17
Document Page
INTRODUCTION
Data handling mainly accounts for the process which helps in ensuring that the
research data is being stored in a secure manner. It also involves complying with the policies
and procedures pertaining to managing the large data sets. On the other side, the business
intelligence accounts for the procedural and technical infrastructure which involves
collection, storage and then analysing the collected data through the application of analytical
tools and techniques. This report provides an insight into the usage of the large data and the
application of excel as a tool for the purpose of analysing the data along with practical
example of the same. In addition to this, it involves analysing the usage of SPSS which is a
data mining software in the data analysis.
Part 1
Evaluating the use of Excel for pre-processing, analysing and visualising the data
Excel is having a number of options or features which can be utilized by the
individuals and the firms for the purpose of effectively analysing the large data sets. Excel
helps in displaying data analysis report in a number of ways which helps in getting better
insights into the data collected and in deriving meaningful information from the same. In
excel, data can be sort out using the filter option helps in only viewing the information which
is necessary. In excel data visualization can be carried out using the charts which supports in
graphical representation of the data (Raubenheimer, 2017). This is depicted in the form of
bar, pie or line charts and many others. It offers the user with the number of chart types which
one can choose from and customise their charts as per requirement and better understanding.
In addition to this, the most widely used function of excel is Pivot charts which shows data
series and axes in the same way as the normal standards charts but it provides additional
filtering options and controls on the charts. In addition to this, it involves effectively making
use of the Pivot table as it can incorporate huge data along with various complex worksheet
data which also involves numbers and text. This helps in better analysing the data. Along
with this, Pivot chart can be created with the filter options which results into effectively
accomplishing the desired results and undertaking various business and strategic decisions.
Excel is having bulk of formulas and functions which results into making it easy for
the user to effectively analyse the data as per the requirement. In this, various types of charts
can be created with the help of wide range such as the clustered chart, stacked chart, gauge
chart, pie chart, Venn diagram, scatter chart, bullet chart, funnel chart etc. In excel in simple
terms, large data set can be easily visualized through effective classification and
Document Page
categorization of it. Finding out the relationship between the two, understanding the
composition, distribution and overlapping of the data (Prodromou, 2017). It also assists in
determining any patterns, trends within the given data. In addition to this, it also supports in
detecting outliers along with other anomalies within the given data. Through this, the user can
also carry out the prediction about the future trends which results into providing menacing
full and engaging insights about the data which consequently leads to undertaking a better
decision about the future.
Practical Application
The below Pivot table shows the decline in profits and sales over the year from 2009 to 2012.
The table underneath provides the detail about the profits and sales with respect to region
across all the 4 years.
Row Labels
Sum of
Profit
Sum of
Sales
2009 434096.02
4209896.84
6
2010 364917.33
3560087.04
5
2011 380310.5
3429944.98
1
2012 342444.13
3715671.95
3
Grand Total 1521767.98
14915600.8
2
Row Labels
Sum of
Profit
Sum of
Sales
Atlantic 238960.66
2014248.20
4
2009 81948.62 668393.28
2010 55233.32
508428.088
5
2011 47012.14 357271.574
2012 54766.58 480155.261
North Carolina 2841.11
116376.483
5
2009 -1282.19 20337.03
2010 3188.56 42098.744
2011 347.42 32028.2
2012 587.32 21912.5095
Northwest Territories 8307.05 83817.746
2009 4510.35 22145.2305
2010 1616.28 16484.4905
2011 3625 26332.5095
2012 -1444.58 18855.5155

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
Ontario 439214.57
3780242.06
3
2009 140766.88
1073872.02
5
2010 121434.93
956311.965
5
2011 81471.4
871108.212
5
2012 95541.36 878949.86
Prarie 321160.12
2837304.60
2
2009 87684.7
782526.417
5
2010 48849.4 545009.823
2011 98162.47
711560.601
5
2012 86463.55
798207.759
5
Quebec 140426.65 1510195.08
2009 47563.22
370323.019
5
2010 29331.69
369738.958
5
2011 17382.68 319538.288
2012 46149.06 450594.814
West 297008.61
3597549.27
6
2009 59188.71
1045843.40
9
2010 78920.24
851790.426
5
2011 103338.27 840788.285
2012 55561.39
859127.155
5
Yukon 73849.21 975867.371
2009 13715.73
226456.434
5
2010 26342.91 270224.548
2011 28971.12
271317.310
5
2012 4819.45 207869.078
Grand Total 1521767.98
14915600.8
2
The below table states about the sales in different years with respect to the various product
categories and sub-categories.
Sum of Sales
Colum
n
Labels
Row Labels <01- 2009 2010 2011 2012 Grand
Document Page
01-
2009 Total
Furniture 1472671.724
1252518
.416
1268656.
078
11847
44.324
5178590
.542
Bookcases 253942.12
263974.
29
140925.4
3
16381
0.2
822652.
04
Chairs & Chairmats 518943.44
391712.
18
457000.3
5
39418
0.58
1761836
.55
Office Furnishings 220564.95
165111.
94
163254.2
8
14916
2.64
698093.
81
Tables 479221.214
431720.
006
507476.0
18
47759
0.904
1896008
.142
Office Supplies 1035399.64
910359.
95
796383.7
9
10106
18.72
3752762
.1
Appliances 170657.15
216047.
3 169051.1
18123
5.99
736991.
54
Binders and Binder
Accessories 360193.28
178343.
3
184452.7
8
29996
8.23
1022957
.59
Envelopes 31376.02
55584.1
6 32754.12
54371.
5
174085.
8
Labels 10363.59 7392.61 12919.18
8306.1
7
38981.5
5
Paper 118290.23
116873.
31 99929.73
11135
9.59
446452.
86
Pens & Art Supplies 42421.39
46866.6
5 42603.72
35215.
46
167107.
22
Rubber Bands 3188.07 3629.32 4219.94 3969.3
15006.6
3
Scissors, Rulers and
Trimmers 16653.41 6877.25 10644.95
46820.
7
80996.3
1
Storage & Organization 282256.5
278746.
05
239808.2
7
26937
1.78
1070182
.6
Technology 1701825.482
1397208
.679
1364905.
113
15203
08.909
5984248
.182
Computer Peripherals 215565.07
175393.
28
190112.3
8
21480
5.21
795875.
94
Copiers and Fax 273286.86
339712.
51
280821.1
2
23654
0.81
1130361
.3
Office Machines 732505.45
446780.
14
426103.4
1
56330
8.14
2168697
.14
Telephones and
Communication 480468.1015
435322.
7485
467868.2
03
50565
4.749
1889313
.802
(blank)
14915
600.82
1491560
0.82
(blank)
14915
600.82
1491560
0.82
Grand Total
14915
600.82 4209896.846
3560087
.045
3429944.
981
37156
71.953
2983120
1.65
This Pivot table is also similar to the above but it depicts the profits generated across various
products in the 4 years.
Sum of Profit Column
Document Page
Labels
Row Labels
<01-01-
2009 2009 2010 2011 2012
Grand
Total
Furniture 61804.53 9397.4 50422.45 -4191.35 117433.03
Bookcases -3737.69 -732.77 -5164.59
-
23947.08 -33582.13
Chairs & Chairmats 54375.94 20433.28 40215.4 34625.11 149649.73
Office Furnishings 27455.59 29418.19 25389.12 18165.03 100427.93
Tables
-
16289.31 -39721.3
-
10017.48
-
33034.41 -99062.5
Office Supplies
177646.2
7
118143.2
4 86960.01
135271.9
1 518021.43
Appliances 19849.29 32646.78 21918.29 22743.7 97158.06
Binders and Binder Accessories
123917.0
8 44151.03 45016.67 94328.61 307413.39
Envelopes 7450.6 16324.61 9727.04 14680.35 48182.6
Labels 2623.09 2361.01 5702.02 2991.05 13677.17
Paper 16968.01 12405.84 4460.37 11428.98 45263.2
Pens & Art Supplies 2116.79 2504.73 1980.6 962.66 7564.78
Rubber Bands -18.23 -203.12 -144.42 263.1 -102.67
Scissors, Rulers and Trimmers -2906.06 -1340.33 -1821.91 -1730.95 -7799.25
Storage & Organization 7645.7 9292.69 121.35
-
10395.59 6664.15
Technology
194645.2
2
237376.6
9
242928.0
4
211363.5
7 886313.52
Computer Peripherals 21279.68 15639.9 29195.9 28172 94287.48
Copiers and Fax 33389.3 50521.2 58896.48 24554.51 167361.49
Office Machines 60494.52
104221.5
7 76732.38 66264.46 307712.93
Telephones and
Communication 79481.72 66994.02 78103.28 92372.6 316951.62
(blank)
1521767.9
8
1521767.9
8
(blank)
1521767.9
8
1521767.9
8
Grand Total
1521767.9
8
434096.0
2
364917.3
3 380310.5
342444.1
3
3043535.9
6
The below table depicts the profits and sales pertaining to the order priority across the years
which helps in determining the actual cause of decline in sales and profits.
Row Labels
Sum of
Profit
Sum of
Sales
Critical 182876.87
2724798.67
3
2009 55447.4 868731.628
2010 19145.96
468198.869
5
2011 60035.6 691329.078
2012 48247.91 696539.097
High 406526.84
3256531.11
1

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
2009 73407.89 745281.054
2010 94086.76
814946.946
5
2011 113536.8
786069.581
5
2012 125495.39 910233.529
Low 386471.81
3283367.45
6
2009 107661.33
859238.083
5
2010 122967.85
922895.397
5
2011 84703.38 712179.826
2012 71139.25 789054.149
Medium 327098.44
2862696.68
9
2009 108855.72 971522.288
2010 103241.9 746696.749
2011 50022.82
563189.568
5
2012 64978
581288.083
5
Not Specified 218794.02
2788206.89
6
2009 88723.68 765123.792
2010 25474.86 607349.082
2011 72011.9 677176.927
2012 32583.58
738557.094
5
(blank) 1521767.98
14915600.8
2
<01-01-2009 1521767.98
14915600.8
2
Grand Total 3043535.96
29831201.6
5
TASK 2
Presenting screen shorts and explaining the interpretation
K Means
Document Page
Document Page

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
Initial
Cluster
Centers
Cluster
1 2
Gender 1 2
Age 13 26
Rice 1 0
Document Page
Iteration
Historya
Iteration Change in
Cluster Centers
1 2
1 4.450 2.659
2 .241 .420
3 .000 .000
a.
Convergence
achieved due
to no or small
change in
cluster
centers. The
maximum
absolute
coordinate
change for
any center
is .000. The
current
iteration is 3.
The minimum
distance
between
initial centers
is 13.077.
Final
Cluster
Centers
Cluster
1 2
Gender 1 2
Age 18 24
Rice 1 1
Document Page
Number
of Cases
in each
Cluster
Cluster
1 54.000
2 41.000
Valid
9
5
.
0
0
0
Missing
.
0
0
0
From the evaluation of above presented cluster analysis it can be interpreted that in
final center's category 1 & 2 includes gender ,age and rice. The age related data is seen more
in category 2 and other parts are similar. The value derived for valid is 95 and no missing
number si found.

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
Discrip[tive
Gender Age R
i
c
e
N Valid 95 95 95
Missing 0 0 0
Mean 1.53 20.33
.
5
9
Median 2.00 19.00
1
.
0
0
Mode 2 19a 1
Sum 145 1931 5
6
Document Page
a.
Multiple
modes
exist.
The
smallest
value is
shown
From the evaluation of above illustrated table it can be said that mean, mode and
median are computed. On the basis of age it can be articulated that rice are more eaten by
this category.
Explaining data mining methods
There are different types of data mining methods utilized by several organization in turn
better productivity by accomplishing objective of conducting appropriate analysis. It
comprises associations, classification, clustering analysis, prediction, sequential pattern ,
decision trees, outlier and neural network. Each method has different steps for meeting the
business objective of mining data.
ď‚· The association method is used to identify the relationship between two variables via
finding correlation (Dutt, Ismail and Herawan, 2017). The main purpose of utilizing
this particular method is to accomplish the objective of business to study market
basket to analyses in order to predict the behavior of customers. This is based on two
rules which includes single & multiple dimensional association rule.
ď‚· Classification is utilized to distinct items in the data set by categorizing them as
groups. It helps the user to make appropriate prediction of behavior of entities within
group accurately. It si basically exerted by conducting two steps which includes
training & classification. In training algorithm builds the classifiers by evaluating a
training set & then classifying the data used to estimate the accuracy or precision of
related rules. For instance- banking sector uses this to identify at low , medium and
high credit financial risk .
ď‚· Decision tree is structure based method that present the data in proper hierarchy
with help of internal node, branches, terminal nodes, etc. This is usually taken into
consideration by those authorities that want to present data in simple &
understandably manner to take appropriate decision (Kavakiotis, and et.al.,2017).
Document Page
ď‚· Cluster analysis are made depending upon the homogeneous nature of data which is
as well known as data segmentation according to similarities. It has different
approaches like hierarchical agglomerate, grid , partitioning, model & density based
methods for distinction data according to similarities. For example- risk classification
is exerted on the basis of client salary so that credit bearing capacity can be identified.
ď‚· Sequential patter is used to recognize the pattern prevailing from the frequent period.
For example- fashion organization 's sales manager evaluates the winter clothes are
in inclining trend just before the season. It can be exerted by analyzing the prevailing
pattern in sales of company.
ď‚· Prediction is taken into consideration to predict future based on trends prevailing.
This is widely utilized to combine the different methods such as classification, trend
analysis, etc. For essence the the sales & marketing manger predict the potential
revenue by analyzing the past trends for making better efforts to incline outcomes.
For this purpose regression analysis is used in turn higher compatibility can be
obtained.
ď‚· Neural network is collection of neurons like processing units with weighted
connections between them. It is stable on three pillars which includes model, learning
algorithm and activation function. It is used to set relationship between in and
outputs for the purpose of classifications, regression analysis, etc. In addition this,
these are the methods that can be sued for data mining.
Explaining the advantages & disadvantages of SPSS over excel
There are various areas in which SPSS is found to be better than excel as it allows
the user to get several advantages. The statistical analysis in SPSS can be exerted effective
as compared to excel as it allows to have proper access to purely statistical aspects by
building in functions for data analysis (Regin,Rajest and Singh, 2021). The reason behind this
is concerned with speed & feasibility which is more in SPSS as compared to MS excel. The
statistical tests are easier in SPSS as it comprises the package equipped with complete sets
(Satyahadewi and Perdana, 2021,). For example_ creation of pivot table is excellent
in SPSS which includes the stronger built in function & add in tools in data analysis where as
in excel the user does not obtain such options for carrying froward proper evaluation.
In excel it is not possible to have detailed from of charts present & reflecting deeper insights
about data chosen to for formulating the same. The quality of presentation & visual graph si
more better in SPSS as compared to excel. SPSS highly gives focus on presenting the

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
statistical data in quality manner. Creation of table in specified tool is easier as compared to
excel which provides ability to generate report by utilization built in functions.
The other kinds of advantages that user can derive by utilizing the SPSS over excel
which involves converting codes in values easily. In addition to this, SPSS allows getting
missing figures of statistical data in accurate manner that prevents data relate errors. On the
other side, there are certain disadvantages as well that user can obtain from utilizing SPSS in
place of excel for the purpose of data analysis (What are the advantages of Excel over SPSS?
2021). It comprises the biggest limitation over excel is that this is expensive. There are
certain complications t hat users get while focusing on SPSS such as making graph is
critical which is simple to formulate in excel. Large data analysis in SPSS creates
complications as compare to excel as in MS tool it is easier to draw conclusion on more
data sets.
CONCLUSION
It can be concluded from the above that the excel is a very effective tool which can
be utilized for the purpose of carrying out the data analysis and visualization. It is having
number of functions, formulas and the features which can be utilized for the purpose of
extracting relevant data from the large data set and applying the right function to generate
useful information. In terms of practical application of it, the Pivot tables is being created
which determines the decline in sales and profits in a multiple formats and perspectives and
angles over the four years. On account of data mining there are various methods which are
being sed by the organizations for the purpose of conducting appropriate analysis such as
association method, classification method, decision tree, cluster analysis, sequential patter,
prediction, neutral network and so forth. These are few methods which are used under
different situations and conditions. SPSS is another statistical tool used in place of excel but it
is having certain pros and cons over excel. Thus, both the applications are good at their place
and by effectively analyzing the pros and cons would result into taking better decision about
which tool to be used.
Document Page
REFERENCES
Books and Journals
Dutt, A., Ismail, M.A. and Herawan, T., 2017. A systematic review on educational data
mining.
Ieee Access.
5. pp.15991-16005.
Kavakiotis, I. And et.al.,2017. Machine learning and data mining methods in diabetes
research.
Computational and structural biotechnology journal. 15. pp.104-116.
Prodromou, T. ed., 2017.
Data visualization and statistical literacy for open and big data. IGI
Global.
Raubenheimer, J., 2017. Excel-lence in Data Visualization?: The Use of Microsoft Excel for
Data Visualization and the Analysis of Big Data. In
Data visualization and statistical
literacy for open and big data (pp. 153-193). IGI Global.
Regin, R., Rajest, S. S. and Singh, B., 2021. Spatial Data Mining Methods Databases and
Statistics Point of Views.
Innovations in Information and Communication
Technology Series. pp.103-109.
Satyahadewi, N. and Perdana, H., 2021, May. Web Application Development for Inferential
Statistics using R Shiny. In
1st International Conference on Mathematics and
Mathematics Education (ICMMEd 2020) (Vol. 50, pp. 425-429). Atlantis Press.
OnlineWhat are the advantages of Excel over SPSS? 2021. [Online]. Available through:<
https://www.quora.com/What-are-the-advantages-of-Excel-over-SPSS>.
Document Page
1 out of 19
[object Object]

Your All-in-One AI-Powered Toolkit for Academic Success.

Available 24*7 on WhatsApp / Email

[object Object]