Data Handling & Business Intelligence: Excel for Pre-processing, SPSS Analysis, and Common Data Mining Methods in Business

Added on  2023/06/18

AI Summary
This report covers the use of Excel for pre-processing data, SPSS analysis, and common data mining methods in business for effective data handling and business intelligence. It includes k-means clustering, advantages/disadvantages of SPSS, and more. The subject covers various aspects of data analysis and is relevant for courses in business intelligence, data handling, and related fields.
Data Handling & Business Intelligence
TABLE OF CONTENTS
INTRODUCTION
MAIN BODY
PART 1
The decline in sales/profits over the years and the use of Excel for pre-processing the data
PART 2
The smile_clinic.csv provided in conjunction with SPSS
2.1 k means
The most common data mining methods used in business
Advantages/disadvantages of SPSS
CONCLUSION
REFERENCES
INTRODUCTION
Data handling and business intelligence is the process of using technology and strategies to
analyze historical and current data. This report examines the superstore data set to explain
the decline in its sales and profits, and demonstrates how Excel functions can be used to
pre-process the data. It then analyzes the smile_clinic.csv data set in SPSS and reviews the
data mining methods most commonly used in business. Finally, it sets out the advantages and
disadvantages of SPSS in a business context.
MAIN BODY
PART 1
The decline in sales/profits over the years and the use of Excel for pre-processing the data
Raw data contains missing values, noise and errors, so it cannot be used in that form in
machine learning models. Better data pre-processing increases the accuracy of a model, which
makes pre-processing especially important for machine learning and deep learning models. All
the necessary packages are imported so that the data can be handled and calculated more easily
(Hamoud et al., 2021). They are used for the operations performed on the data sets. Excel is
also important at this stage, because it helps to confirm that the data have been collected
properly from their sources and are complete.
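One common way to deal with the missing values mentioned above is to replace them with the mean of the values that are present. The sketch below is a minimal illustration in Python rather than the method used in this report. The function name `fill_missing_with_mean` and the sales figures are hypothetical, and `None` is assumed to mark a missing cell.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]


# Hypothetical sales column with one missing entry (not taken from the superstore data).
sales = [120.0, None, 80.0, 100.0]
filled = fill_missing_with_mean(sales)
print(filled)  # [120.0, 100.0, 80.0, 100.0]
```

Mean imputation keeps the column average unchanged, but it reduces the spread of the data, so it is best suited to columns with only a few gaps.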
The graph above shows which sources account for the highest sales and profits. It also
indicates how the organization serves its customers and how the superstore could perform
better. Sales and profit data are essential for understanding how the organization operates
and manages its work under various terms and conditions. With accurate sales figures for its
products and services, the organization can consider strategies that could help it increase
its sales volume by 76% in a year (Ali, Ramli and Awalin, 2020). Many values in the data set
are stored as strings, but a machine learning model accepts only numerical values, so the
strings are converted into numerical values using encoding methods. The data also show the
sales volume per year, which the organization can use to make sure that its sales volume
grows. Sales of 34.56 were recorded in the most active locations, which reflect the terms and
conditions under which those sources operate. Profits can also be increased by keeping
employees engaged in the organization's activities. The superstore could move 45% to 89% of
its products more easily with more advanced technology and processes, which would also help
it understand its market strategy.
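The conversion of string values into numerical values described above can be sketched as a simple label encoding. This is a minimal Python illustration under stated assumptions: the function name `label_encode` and the category values are hypothetical, since the report does not list the actual text columns of the superstore data set.

```python
def label_encode(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)  # next unused integer
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical text column; the real superstore columns are not shown in the report.
categories = ["Furniture", "Technology", "Furniture", "Office Supplies"]
encoded, codes = label_encode(categories)
print(encoded)  # [0, 1, 0, 2]
print(codes)    # {'Furniture': 0, 'Technology': 1, 'Office Supplies': 2}
```

Integer codes imply an order between categories, so this kind of encoding is safest for models that do not treat the codes as magnitudes.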
From the above it can be seen that the organization needs to work with its employees to
understand its customers better. It also needs to identify the areas in which its goals and
objectives can be met, so that sales volumes return to the level seen in 2009. Employee
motivation and rewards are required as well, so that employees work more effectively towards
those goals. Understanding customer demand is also essential for keeping the business running
(Pasmawati et al., 2020).
Data visualization is the process of putting data into a chart or other visual format that
informs analysis and interpretation. Data visuals present the analyzed data in ways that are
accessible and engaging to a variety of stakeholders. Interpretation attaches meaning to the
data. It requires making judgments about generalization, correlation and causation, and is
intended to address key learning questions about the project. Descriptive analysis includes
frequencies or counts and measures of central tendency, including the mode. It also includes
measures of dispersion, which describe the variety of responses. Quantitative analysis
produces the frequencies, averages and levels of difference that exist in the data.
Qualitative analysis identifies themes and patterns. The team and other key stakeholders then
interpret the data set by giving meaning to it. The resulting story is used to make project
decisions and to share results with others, which is why it is important to examine the
activities that matter most to the organization.
This suggests that the organization needs efficient working systems in its most effective
locations. To build on a sales volume of 75% in 2010, the organization needs to maintain
growth of about 85% in 2011 so that its sales volume continues to grow. To run the superstore,
the organization needs to earn high profits in each year. It also needs to provide a good
process for customers and identify the locations that attract them most. In short, the
organization needs to work towards the higher profits shown in the data set. Planning
documents, especially the performance management plan, indicate when to analyze the data
further (Janani et al., 2020). In general, data analysis and interpretation happen before a
report is produced or as part of a routine monitoring or evaluation exercise. However, experts
suggest that making these processes part of regular review efforts will improve project
learning and help with adaptive management, and so improve overall project implementation. It
is also suggested that key stakeholders be involved in interpreting results, to ensure that
the data are used and that decisions are appropriate to the local context. It is always useful
to validate or test the themes and recommendations that emerge from analysis and
interpretation. This validation should be done with a number of stakeholders, including the
project participants themselves.
The whole data set is divided into segments of equal size, and different methods are then
applied to each one. All the values in a segment can be replaced by the segment mean, or the
boundary values can be used instead. Data mining techniques are used to handle very large
amounts of data, and analysis becomes harder as the volume grows. Reducing the data aims to
increase storage efficiency and lower data storage and analysis costs. Because data is often
taken from multiple sources that are not very reliable and that use different formats, most
of the time spent on a machine learning problem goes into dealing with data quality issues. It
is simply unrealistic to expect the data to be perfect. There may be problems due to human
error, limitations of measuring devices, or flaws in the data collection process. Real-world
data generally contains noise and may be in an unusable format that cannot be used directly
in different kinds of models. Data pre-processing is therefore a required task for cleaning
the data and making it suitable for a model, which also increases accuracy and efficiency
(Andriansyah and Nulhakim, 2020). Data sets come in different formats for different purposes.
For example, a data set built for a business purpose will differ from one needed for liver
patients, so each data set is different from every other. Libraries are used to perform
specific pre-processing jobs, and three specific libraries are used for data pre-processing.
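Replacing every value in an equal-size segment with the segment mean, as described in this paragraph, can be sketched as follows. This is a minimal Python illustration, not the procedure used in the report. The function name `smooth_by_bin_means`, the segment size of 3 and the price values are all hypothetical.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, cut them into equal-size segments and
    replace every value by the mean of its segment."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        segment = ordered[start:start + bin_size]
        segment_mean = sum(segment) / len(segment)
        smoothed.extend([segment_mean] * len(segment))
    return smoothed


# Hypothetical values, not taken from the superstore data set.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)  # [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Using boundary values instead would replace each value with the nearer of the segment minimum or maximum. The mean version shown here removes more of the noise, at the cost of more detail.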
PART 2
The smile_clinic.csv provided in conjunction with SPSS
Gender and age analysis
Gender
          Frequency   Percent   Valid Percent   Cumulative Percent
Male      50          50.0      50.0            50.0
Female    50          50.0      50.0            100.0
Total     100         100.0     100.0
Among the 100 participants, 50% are male and the remaining 50% are female. The mean of the
gender variable, which is coded 1 and 2, is 1.5, which shows that the sample is divided
equally between the two groups.
Statistics
             Gender     Age
N Valid      100        100
N Missing    0          0
Mean         1.5000     20.3500
Median       1.5000     19.0000
Mode         1.00 (a)   22.00
a. Multiple modes exist. The smallest value is shown.
The average age of the customers is about 20 years (mean 20.35). The median age is 19 years
and the mode is 22 years.
Age
         Frequency   Percent   Valid Percent   Cumulative Percent
13.00    5           5.0       5.0             5.0
15.00    5           5.0       5.0             10.0
17.00    8           8.0       8.0             18.0
18.00    13          13.0      13.0            31.0
19.00    20          20.0      20.0            51.0
20.00    5           5.0       5.0             56.0
22.00    21          21.0      21.0            77.0
23.00    1           1.0       1.0             78.0
25.00    12          12.0      12.0            90.0
26.00    10          10.0      10.0            100.0
Total    100         100.0     100.0
The most common age among the customers is 22 years, which accounts for 21% of the total
number of customers. Customers aged 18 years (13%) and 25 years (12%) also make up a good
proportion of the population.
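The mean, median and mode reported for Age can be checked directly from the Age frequency table. The sketch below is a minimal Python check using only the standard library. It assumes the frequency table is complete and rebuilds one age value per customer from it. The variable names are illustrative.

```python
from statistics import mean, median, mode

# Age frequency table from the SPSS output (age -> number of customers).
AGE_COUNTS = {13: 5, 15: 5, 17: 8, 18: 13, 19: 20,
              20: 5, 22: 21, 23: 1, 25: 12, 26: 10}

# Expand the table back into one age per customer (100 values).
ages = [age for age, count in AGE_COUNTS.items() for _ in range(count)]

age_mean = mean(ages)      # 20.35
age_median = median(ages)  # 19.0
age_mode = mode(ages)      # 22
mode_share = 100 * AGE_COUNTS[age_mode] / len(ages)  # 21.0 percent of customers

print(age_mean, age_median, age_mode, mode_share)
```

The values agree with the Statistics table: the mean is 20.35, the median is 19 and the mode is 22, which is the age of 21% of the customers.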
From the graph it can be clearly seen that the largest groups of customers are aged 19 and 22
years, and that the smallest group is aged 23 years.
Number of customers eating rice
Statistics
Customerseatingriceornot
N Valid      100
N Missing    0
Mean         .6000
Median       1.0000
Mode         1.00
Customerseatingriceornot
         Frequency   Percent   Valid Percent   Cumulative Percent
No       40          40.0      40.0            40.0
Yes      60          60.0      60.0            100.0
Total    100         100.0     100.0
Out of 100 participants, 60% of customers prefer rice and have it in their daily meal. By
contrast, 40 people do not include rice in their meal. This is also indicated by the mean
value of 0.60.
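The mean of 0.60 follows from the coding of the variable: the reported mean and mode are consistent with Yes being stored as 1 and No as 0, so the mean equals the proportion of customers who answered Yes. The sketch below is a minimal Python illustration that rebuilds the responses from the frequency table. The function name `frequency_table` is hypothetical.

```python
from collections import Counter

# 60 "Yes" answers coded 1 and 40 "No" answers coded 0, as in the frequency table.
responses = [1] * 60 + [0] * 40


def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows, cumulative = [], 0.0
    for value in sorted(counts):
        percent = 100 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


rice_mean = sum(responses) / len(responses)
rows = frequency_table(responses)
print(rice_mean)  # 0.6
print(rows)       # [(0, 40, 40.0, 40.0), (1, 60, 60.0, 100.0)]
```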
2.1 k means
To run k-means clustering in SPSS, the Classify option is first selected in the Analyze menu,
followed by K-Means Cluster. A dialogue box then opens in which all the variables are selected
and the method is set to Iterate and classify.
Clicking OK opens the output window, where the results can be interpreted.
Quick Cluster
Initial Cluster Centers
                            Cluster 1   Cluster 2
Gender                      1.00        2.00
Age                         13.00       26.00
Customerseatingriceornot    1.00        .00
Iteration History (a)
             Change in Cluster Centers
Iteration    Cluster 1   Cluster 2
1            4.489       2.731
2            .228        .384
3            .000        .000
a. Convergence achieved due to no or small change in cluster centers. The maximum absolute
coordinate change for any center is .000. The current iteration is 3. The minimum distance
between initial centers is 13.077.
Final Cluster Centers
                            Cluster 1   Cluster 2
Gender                      1.46        1.55
Age                         17.68       23.75
Customerseatingriceornot    .64         .55
Number of Cases in each Cluster
Cluster 1    56.000
Cluster 2    44.000
Valid        100.000
Missing      .000
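The clustering reported above can be approximated outside SPSS. The sketch below is a minimal pure-Python k-means, not the SPSS Quick Cluster procedure itself. As a simplifying assumption it clusters on Age alone, because the raw three-variable records are not reproduced in this report. The ages are rebuilt from the Age frequency table, the starting centers 13 and 26 are the Age values from the Initial Cluster Centers table, and the function name `kmeans_1d` is hypothetical.

```python
# Age frequency table from the SPSS output (age -> number of customers).
AGE_COUNTS = {13: 5, 15: 5, 17: 8, 18: 13, 19: 20,
              20: 5, 22: 21, 23: 1, 25: 12, 26: 10}
ages = [age for age, count in AGE_COUNTS.items() for _ in range(count)]


def kmeans_1d(values, centers, max_iter=10):
    """k-means on a single variable.
    Returns (final centers, cluster label per value, iterations used)."""
    centers = list(centers)
    labels = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, label in zip(values, labels) if label == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:  # no center moved, so the run has converged
            return centers, labels, iteration
        centers = new_centers
    return centers, labels, max_iter


centers, labels, iterations = kmeans_1d(ages, [13, 26])
sizes = [labels.count(0), labels.count(1)]
print([round(c, 2) for c in centers])  # [17.68, 23.75]
print(sizes)                           # [56, 44]
print(iterations)                      # 3
```

Under these assumptions the sketch gives the same final Age centers (17.68 and 23.75), the same cluster sizes (56 and 44) and the same number of iterations (3) as the SPSS output. A likely reason is that Age varies far more than the two binary variables, so it dominates the distance calculation. In effect the two clusters separate younger customers (up to 20 years) from older ones (22 years and above).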
The most common data mining methods used in business
Data mining has opened a world of opportunity for business. This field of computational
statistics examines very large numbers of individual pieces of data and is used by
organizations to detect and predict consumer behavior. Different methods such as regression
analysis, clustering, classification and outlier analysis are applied to the data to obtain
useful results. These techniques use software and backend algorithms that analyze the data
and reveal patterns (Van Nguyen et al., 2020). Data mining is applied in areas such as
marketing, retailing, banking and medicine. It helps a business understand its key concepts
and the areas in which it can have the greatest impact. Data mining is widely used by
organizations to build marketing strategies and, in e-commerce, to cross-sell products through
their websites, among many other uses. It also supports business growth and competitiveness
by improving the estimates on which important decisions rest, and as a result risk can be
reduced (Nalepa et al., 2021). In retailing, it helps a supermarket communicate effectively
about its most effective sources. It can also guide decisions on offering more products and
services, raising demand and setting discounts, so that sales and profits increase.
Maintaining the data set therefore supports high sales for the organization.
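Of the methods named above, outlier analysis is the simplest to illustrate. The sketch below is a minimal Python example of one common rule, the 1.5 x IQR fences, and is not a method used in this report. It is applied to the customer ages from the Age frequency table in Part 2. The function name `iqr_outliers` and the extra age of 60 are hypothetical.

```python
from statistics import quantiles

# Age frequency table from Part 2 (age -> number of customers).
AGE_COUNTS = {13: 5, 15: 5, 17: 8, 18: 13, 19: 20,
              20: 5, 22: 21, 23: 1, 25: 12, 26: 10}
ages = [age for age, count in AGE_COUNTS.items() for _ in range(count)]


def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    q1, _, q3 = quantiles(values, n=4)  # first quartile, median, third quartile
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


no_outliers = iqr_outliers(ages)
with_outlier = iqr_outliers(ages + [60])  # 60 is a hypothetical extra customer
print(no_outliers)   # []
print(with_outlier)  # [60]
```

For these ages the quartiles are 18 and 22, so the fences fall at 12 and 28 and no customer is flagged. An age such as 60 would be.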
Advantages/disadvantages of SPSS
SPSS is software used for quantitative data analysis. It is operated mainly by point and
click rather than through a command line, and it looks somewhat like Microsoft Excel. Although
it looks a lot like Excel, it can deal with larger data sets faster and with very little
effort. One of the main complaints about SPSS is that it is proprietary and expensive to use,
with individual packages (Popa et al., 2020). Most leading research agencies use SPSS to
analyze survey data so that they can get the most out of their research projects. It can also
help non-technical users learn more about the system and make sound estimates from their
sources. Its advantages are that it is quick and easy to learn, has a good user interface and
can easily handle large amounts of data. Its disadvantages are that it is expensive, has
limited functionality and is very similar to Excel. Users can learn many new things with it,
but doing so comes at a higher cost. It can nevertheless be useful in business for working
with many factors (Kumari, Singh and Patel, 2020). The information it provides can also make
learning easier, which is necessary for developing the business further.
CONCLUSION
From the above report it can be concluded that charts and graphs are practical tools that can
lead to higher business sales. The evaluation also showed how the customers are divided
between males and females. In addition, measures such as the mean and median were calculated
and illustrated with screenshots. The report also reviewed the data mining methods used in
business, together with the advantages and disadvantages of SPSS and how it can be used in
business and other activities.
REFERENCES
Books and journals
Ali, R., Ramli, N.A. and Awalin, L.J., 2020, October. Analysis on Power Outage by using Big Data Analytics. In 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI) (pp. 1-6). IEEE.
Andriansyah, D. and Nulhakim, L., 2020, November. The Application of Power Business Intelligence in Analyzing the Availability of Rental Units. In Journal of Physics: Conference Series (Vol. 1641, No. 1, p. 012019). IOP Publishing.
Hamoud, A.K. et al., 2021. Implementing data-driven decision support system based on independent educational data mart. International Journal of Electrical & Computer Engineering (2088-8708). 11(6).
Janani, V. et al., 2020. Dengue Prediction Using (MLP) Multilayer Perceptron-A Machine Learning Approach (No. 2444). EasyChair.
Kumari, M., Singh, M. and Patel, T., 2020. Cytorich fixative system-A new modality in haemorrhagic fine needle aspiration smears.
Nalepa, G.J. et al., 2021. Semantic Data Mining in Ubiquitous Sensing: A Survey. Sensors. 21(13). p.4322.
Pasmawati, Y. et al., 2020, December. Exploiting online customer reviews for product design. In IOP Conference Series: Materials Science and Engineering (Vol. 909, No. 1, p. 012080). IOP Publishing.
Popa, D. et al., 2020. Using mixed methods to understand teaching and learning in Covid 19 times. Sustainability. 12(20). p.8726.
Van Nguyen, T. et al., 2020. Predicting customer demand for remanufactured products: A data-mining approach. European Journal of Operational Research. 281(3). pp.543-558.