Superstore Data Analysis and Visualization Report using Excel
VerifiedAdded on 2023/06/05
|33
|4485
|128
Report
AI Summary
This report provides a comprehensive analysis and visualization of data using Microsoft Excel, focusing on preprocessing techniques, descriptive statistics, and graphical representations. The initial data preprocessing involves using Excel's filtering and pivot table functions to organize and summarize the data, specifically examining a superstore dataset. Key functions such as SUMIFS are employed to determine sales and profit figures over several years, along with calculations for changes in sales and profits. The analysis includes descriptive statistics to identify trends, such as average sales and profits, standard deviation, and maximum/minimum values. Data visualization techniques, including charts and graphs, are used to represent sales and profit data based on product category and yearly performance. The report also touches on cluster analysis and compares the advantages and disadvantages of using SPSS and Excel for data analysis. This student-contributed assignment is available on Desklib, a platform offering AI-based study tools and a wide range of solved assignments for students.

Analysis and
Visualization
Visualization
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Table of Contents
PART 1............................................................................................................................................1
Pre – processing of data in Excel.................................................................................................1
Analyzing the data.......................................................................................................................8
Descriptive statistics..................................................................................................................10
Visualization of data..................................................................................................................11
PART 2..........................................................................................................................................13
1. How many students like vanilla flavour of ice cream...........................................................13
2. How many students are male and female?............................................................................14
3. Mean and Median of participants who like chocolate and strawberry flavor ice cream.......14
2.4 Cluster analysis example- K means clustering....................................................................16
2.5 Most common data mining and text mining methods.........................................................22
2.6 Advantages and disadvantages of using SPSS and Excel...................................................23
REFERENCES..............................................................................................................................25
APPENDIX....................................................................................................................................26
PART 1............................................................................................................................................1
Pre – processing of data in Excel.................................................................................................1
Analyzing the data.......................................................................................................................8
Descriptive statistics..................................................................................................................10
Visualization of data..................................................................................................................11
PART 2..........................................................................................................................................13
1. How many students like vanilla flavour of ice cream...........................................................13
2. How many students are male and female?............................................................................14
3. Mean and Median of participants who like chocolate and strawberry flavor ice cream.......14
2.4 Cluster analysis example- K means clustering....................................................................16
2.5 Most common data mining and text mining methods.........................................................22
2.6 Advantages and disadvantages of using SPSS and Excel...................................................23
REFERENCES..............................................................................................................................25
APPENDIX....................................................................................................................................26

PART 1
Pre – processing of data in Excel
Microsoft – Excel is considered to be a great tool for the purpose of pre – processing &
handling of structured data. Pre – processing of data is one of the step in data analysis and data
mining process which allows for transforming raw data into a format which can be analyzed and
understood through machine learning & computers (Kandpal, 2021). It is one of the component
of data preparation where processing of raw data is performed to make is suitable for several
another procedures pertaining to data processing. In other words, data preprocessing involves
data manipulation prior to using it for any other processes or procedures in order to enhance the
performance of data analysis tasks.
For instance, in the case of superstore, the application of filter in excel facilitates grouping of
sales and profit data on the basis of years from which they are relevant to. With the application
of filter on the order data column, selection criteria is as the “year” like 2009, 2010, 2011 and
2012 in the present case and accordingly, the excel gives the results associated with the particular
year selected. It facilitates summing up of the profits and sales for that particular year and
accordingly, the same procedure has been used for rest of the year and the results of total profits
and sales are obtained as displayed below.
Year Numbe
r of
order
placed
Sales Profit
2009 897 175406
1
152252
2010 816 131886
7
132154.
9
2011 856 147335
5
161414.
1
2012 879 160155
2
130967
Filter function in excel: Through filter function, the analyst can filter the set of data on the basis
of defined criteria (López-Zambrano and et.al., 2018). For the application of filter over a range
of data, the following steps has been used with respect to the superstore.
1
Pre – processing of data in Excel
Microsoft – Excel is considered to be a great tool for the purpose of pre – processing &
handling of structured data. Pre – processing of data is one of the step in data analysis and data
mining process which allows for transforming raw data into a format which can be analyzed and
understood through machine learning & computers (Kandpal, 2021). It is one of the component
of data preparation where processing of raw data is performed to make is suitable for several
another procedures pertaining to data processing. In other words, data preprocessing involves
data manipulation prior to using it for any other processes or procedures in order to enhance the
performance of data analysis tasks.
For instance, in the case of superstore, the application of filter in excel facilitates grouping of
sales and profit data on the basis of years from which they are relevant to. With the application
of filter on the order data column, selection criteria is as the “year” like 2009, 2010, 2011 and
2012 in the present case and accordingly, the excel gives the results associated with the particular
year selected. It facilitates summing up of the profits and sales for that particular year and
accordingly, the same procedure has been used for rest of the year and the results of total profits
and sales are obtained as displayed below.
Year Numbe
r of
order
placed
Sales Profit
2009 897 175406
1
152252
2010 816 131886
7
132154.
9
2011 856 147335
5
161414.
1
2012 879 160155
2
130967
Filter function in excel: Through filter function, the analyst can filter the set of data on the basis
of defined criteria (López-Zambrano and et.al., 2018). For the application of filter over a range
of data, the following steps has been used with respect to the superstore.
1
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

1. select the column of the data which is needed to be filtered. Like in case of superstore data set,
the column headed as “order date” has been selected.
2. Then on the Home tab itself, in the editing group, click on the Sort & Filter option and from
the drop down list, the filter option has been selected which results in application of filter on all
the column head of the sheet.
3. On the column head, the down arrow is clicked to select the criteria for the required data set.
In case of superstore data set, the year 2009 is set for getting the sales, profit and number of
orders placed data for the year. Accordingly, the sum of the sales and profits relevant to 2009
was entered in the table corresponding to the year 2009.
While conducting preprocessing of data, filter function allows displaying of that data meeting the
specified criteria. Therefore, preprocessing of data facilitates gaining of precise knowledge of the
data set prior to conducting any analysis and drawing conclusions about it.
Further, Pivot table has been used as a tool for preprocessing of data in order to gain
understanding of the movement in the sales and profits figure over the years with respect to the
superstore.
Pivot table: It is usually regarded as the most interactive way for the quick summarization of
large amount of data. It is used for the analysis of numerical data in a detailed manner which
helps in answering several questions regarding the business performance in the number of years
2
the column headed as “order date” has been selected.
2. Then on the Home tab itself, in the editing group, click on the Sort & Filter option and from
the drop down list, the filter option has been selected which results in application of filter on all
the column head of the sheet.
3. On the column head, the down arrow is clicked to select the criteria for the required data set.
In case of superstore data set, the year 2009 is set for getting the sales, profit and number of
orders placed data for the year. Accordingly, the sum of the sales and profits relevant to 2009
was entered in the table corresponding to the year 2009.
While conducting preprocessing of data, filter function allows displaying of that data meeting the
specified criteria. Therefore, preprocessing of data facilitates gaining of precise knowledge of the
data set prior to conducting any analysis and drawing conclusions about it.
Further, Pivot table has been used as a tool for preprocessing of data in order to gain
understanding of the movement in the sales and profits figure over the years with respect to the
superstore.
Pivot table: It is usually regarded as the most interactive way for the quick summarization of
large amount of data. It is used for the analysis of numerical data in a detailed manner which
helps in answering several questions regarding the business performance in the number of years
2
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

of different basis and aspects (Guerrero, 2018). Also, it is a very powerful tool meant for
analyzing, calculating and summarizing the data patterns & trends can be identified and
comparisons can be established among different year’s sales, profit and product performance. It
works differently which is based on the type of platform where it has been used in excel.
The steps to be followed for creating the pivot table for the superstore data set are as follows:
1. Selection of all the columns and rows for which the pivot table is required to be created.
2. By clicking on insert tab and in the tables group, there would be a pivot table option.
3. Click on the pivot table option and the dialog box would be displayed showing the table range
as the cell reference for the entire data set.
3
analyzing, calculating and summarizing the data patterns & trends can be identified and
comparisons can be established among different year’s sales, profit and product performance. It
works differently which is based on the type of platform where it has been used in excel.
The steps to be followed for creating the pivot table for the superstore data set are as follows:
1. Selection of all the columns and rows for which the pivot table is required to be created.
2. By clicking on insert tab and in the tables group, there would be a pivot table option.
3. Click on the pivot table option and the dialog box would be displayed showing the table range
as the cell reference for the entire data set.
3

4. Click on new worksheet to get the pivot table results on the new sheet and then click OK. This
would create a pivot table template on the new worksheet.
5. On the pivot table fields at the right hand side, drag the order date in filters quadrant, sales and
profits in columns quadrant and product category in row quadrant.
4
would create a pivot table template on the new worksheet.
5. On the pivot table fields at the right hand side, drag the order date in filters quadrant, sales and
profits in columns quadrant and product category in row quadrant.
4
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

6. To see the sales and profits with respect to the product category sold in 2009, the down arrow
on order date option would be clicked and 2009 would be set as the criteria, which in turn will
display the required results.
5
on order date option would be clicked and 2009 would be set as the criteria, which in turn will
display the required results.
5
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Accordingly, the following results for the year 2009 have been obtained.
6
6

By selecting all in the order date, the following results have been generated with respect to the
sales and profits generated from all the three product categories over the years.
Order Date (All)
Row Labels
Sum of
Sales
Sum of
Profit
Furniture
5178590.54
2 117433.03
Office
Supplies 3752762.1 518021.43
Technology
5984248.18
2 886313.52
Grand
Total
14915600.8
2
1521767.9
8
Another function in excel which has been used for preprocessing of data is SUMIFS.
SUMIFS: This is one of the best function in excel through which values can be added up to meet
multiple criteria. Through this function, sum of values can be obtained if the corresponding cells
satisfy the criteria specified in terms of numbers, text and dates (Cleff, 2019). The syntax used
for applying this function is as follows:
=SUMIFS (sum_range, range1, criteria1, [range2], [criteria2], ...)
7
sales and profits generated from all the three product categories over the years.
Order Date (All)
Row Labels
Sum of
Sales
Sum of
Profit
Furniture
5178590.54
2 117433.03
Office
Supplies 3752762.1 518021.43
Technology
5984248.18
2 886313.52
Grand
Total
14915600.8
2
1521767.9
8
Another function in excel which has been used for preprocessing of data is SUMIFS.
SUMIFS: This is one of the best function in excel through which values can be added up to meet
multiple criteria. Through this function, sum of values can be obtained if the corresponding cells
satisfy the criteria specified in terms of numbers, text and dates (Cleff, 2019). The syntax used
for applying this function is as follows:
=SUMIFS (sum_range, range1, criteria1, [range2], [criteria2], ...)
7
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

In the case of superstore data, the SUMIFS function has been applied to determine the sales and
profits figures over the years with respect to the superstore.
The formula used for the same is as follows:
= SUMIFS (Sales data, order date range, “>=” &DATE (year, month, date), order date range,
“<=” &DATE (year, month, date)
Further, for the determination of reduction in sales and profits over the years the formula used is
as follows:
= (Sales for the current year – Sales of the previous year) / Sales for the current year * 100
Accordingly, the results for the change in sales and profits over the years generated are as
follows:
Year Sales Profit change in sales Change in profits
2009 1754061 152253
2010 1318867 132154.9 -24.81% -13.20%
2011 1473355 161414.1 11.71% 22.14%
2012 1601552 130967 8.70% -18.86%
8
profits figures over the years with respect to the superstore.
The formula used for the same is as follows:
= SUMIFS (Sales data, order date range, “>=” &DATE (year, month, date), order date range,
“<=” &DATE (year, month, date)
Further, for the determination of reduction in sales and profits over the years the formula used is
as follows:
= (Sales for the current year – Sales of the previous year) / Sales for the current year * 100
Accordingly, the results for the change in sales and profits over the years generated are as
follows:
Year Sales Profit change in sales Change in profits
2009 1754061 152253
2010 1318867 132154.9 -24.81% -13.20%
2011 1473355 161414.1 11.71% 22.14%
2012 1601552 130967 8.70% -18.86%
8
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Analyzing the data
Microsoft excel is regarded as one of the best application for conducting analysis of data (Kaula
and Kaula, 2018). This is possible through several tools equipped within it such functions, pivot
table, etc. One of such function is filtering option which is being to obtain the data based on the
specified conditions. Accordingly, the sales, profits and number of orders generated data for the
four years have been identified in this report, which is as follows:
Through the above table, it can be analyzed that number of orders placed has been declined from
2009 to 2010 while the same has continuously increased in the next two years that is, 2011 and
2012. The same is true for sales and profit where from 2009 to 2010, the sales have declined
significantly and then again have increased in 2011and 2012 while profits have increased from
2010 to 2011 and then again decreases in 2012.
SUMIFS: Further, the SUMIFS function have generate the following results with respect to sales
and profit of the superstore:
Through the above table, it can be seen that sales have declined from 2009 to 2010 by 24.81%
and then increased by 11.71% and 8.70% respectively in 2011 and 2012. Further, on looking at
the profits, it has been analyzed that it has been decline by 13.20% from 2009 to 2010 while
subsequently in 2011, it has increased by 22.14% and in 2012 decreased by 18.86%.
9
Microsoft excel is regarded as one of the best application for conducting analysis of data (Kaula
and Kaula, 2018). This is possible through several tools equipped within it such functions, pivot
table, etc. One of such function is filtering option which is being to obtain the data based on the
specified conditions. Accordingly, the sales, profits and number of orders generated data for the
four years have been identified in this report, which is as follows:
Through the above table, it can be analyzed that number of orders placed has been declined from
2009 to 2010 while the same has continuously increased in the next two years that is, 2011 and
2012. The same is true for sales and profit where from 2009 to 2010, the sales have declined
significantly and then again have increased in 2011and 2012 while profits have increased from
2010 to 2011 and then again decreases in 2012.
SUMIFS: Further, the SUMIFS function have generate the following results with respect to sales
and profit of the superstore:
Through the above table, it can be seen that sales have declined from 2009 to 2010 by 24.81%
and then increased by 11.71% and 8.70% respectively in 2011 and 2012. Further, on looking at
the profits, it has been analyzed that it has been decline by 13.20% from 2009 to 2010 while
subsequently in 2011, it has increased by 22.14% and in 2012 decreased by 18.86%.
9

Pivot table: Through this tool, the sales and profits with respect to different product category
have been determined with respect to four years, which are as follows:
Through the above table, it has been analyzed that the sales of technology is highest among all
the products while the sales of office supplies is lowest among all the product category. Further,
the highly profit generating product category is identified as technology while the lowest profits
have been generated from furniture over the four years.
Descriptive statistics
Sales Profit
Mean 1775.878 Mean 181.1844
Standard Deviation 3585.051 Standard Deviation 1196.653
Minimum 2.24 Minimum -14140.7
Maximum 89061.05 Maximum 27220.69
Sum 14915601 Sum 1521768
Through the above table, it has been analyzed that average sales and profits over the four years
was 1775 and 181 respectively and standard deviation is identified as 3585 and 1196
respectively. Due to a very high standard deviation, it has been analyzed that there is a huge
difference among the sales order placed during the four-year period (Evans and Evans, 2020).
The maximum and minimum sales generated over the years is 89061 and 2.24 respectively while
the maximum and minimum profits for the four – year period is identified as -14140 and 27220.
The negative profits indicate that superstore has losses to the extent of 14140. The total sales and
profits generated over the years are identified as 14915601 and 1521768 respectively.
Regression Statistics
10
have been determined with respect to four years, which are as follows:
Through the above table, it has been analyzed that the sales of technology is highest among all
the products while the sales of office supplies is lowest among all the product category. Further,
the highly profit generating product category is identified as technology while the lowest profits
have been generated from furniture over the four years.
Descriptive statistics
Sales Profit
Mean 1775.878 Mean 181.1844
Standard Deviation 3585.051 Standard Deviation 1196.653
Minimum 2.24 Minimum -14140.7
Maximum 89061.05 Maximum 27220.69
Sum 14915601 Sum 1521768
Through the above table, it has been analyzed that average sales and profits over the four years
was 1775 and 181 respectively and standard deviation is identified as 3585 and 1196
respectively. Due to a very high standard deviation, it has been analyzed that there is a huge
difference among the sales order placed during the four-year period (Evans and Evans, 2020).
The maximum and minimum sales generated over the years is 89061 and 2.24 respectively while
the maximum and minimum profits for the four – year period is identified as -14140 and 27220.
The negative profits indicate that superstore has losses to the extent of 14140. The total sales and
profits generated over the years are identified as 14915601 and 1521768 respectively.
Regression Statistics
10
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide
1 out of 33
Related Documents

Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
Copyright © 2020–2025 A2Z Services. All Rights Reserved. Developed and managed by ZUCOL.