
Data Analysis & Visualisation - Pre-processing, Analysis, Descriptive Statistics, Visualization

   

Added on  2023-06-05

Analysis and
Visualization

Table of Contents
PART 1
Pre-processing of data in Excel
Analyzing the data
Descriptive statistics
Visualization of data
PART 2
1. How many students like vanilla flavour of ice cream
2. How many students are male and female?
3. Mean and Median of participants who like chocolate and strawberry flavor ice cream
2.4 Cluster analysis example- K means clustering
2.5 Most common data mining and text mining methods
2.6 Advantages and disadvantages of using SPSS and Excel
REFERENCES
APPENDIX

PART 1
Pre-processing of data in Excel
Microsoft Excel is a useful tool for pre-processing and handling structured data. Data
pre-processing is a step in the data analysis and data mining process that transforms raw data
into a format that can be analyzed and understood by computers and machine learning methods
(Kandpal, 2021). It is a component of data preparation in which raw data is processed to make
it suitable for subsequent data processing procedures. In other words, data pre-processing
involves manipulating data before it is used in any other process, in order to improve the
performance of data analysis tasks.
For instance, in the case of the superstore, applying a filter in Excel makes it possible to
group the sales and profit data by the year to which they relate. With a filter applied to the
order date column, the selection criterion is the year (2009, 2010, 2011 or 2012 in the present
case), and Excel returns the records associated with the selected year. The sales and profits
for that year can then be summed. The same procedure was repeated for the remaining years, and
the resulting totals of sales and profits are displayed below.
Year    Number of orders placed    Sales      Profit
2009    897                        1754061    152252
2010    816                        1318867    132154.9
2011    856                        1473355    161414.1
2012    879                        1601552    130967
Filter function in Excel: Through the filter function, the analyst can filter a data set on the
basis of defined criteria (López-Zambrano et al., 2018). To apply a filter over a range of data,
the following steps were used for the superstore data set.
1. Select the column of data to be filtered. In the superstore data set, the column headed
“order date” was selected.
2. On the Home tab, in the Editing group, click Sort & Filter and select Filter from the
drop-down list. This applies a filter to every column heading in the sheet.
3. Click the down arrow on the column heading to select the criteria for the required data. In
the superstore data set, the year 2009 was selected to obtain the sales, profit and number of
orders placed for that year. The sums of the sales and profits for 2009 were then entered in the
table row for 2009.
During pre-processing, the filter function displays only the data that meets the specified
criteria. Pre-processing therefore gives the analyst precise knowledge of the data set before
any analysis is conducted or conclusions are drawn about it.
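The filter-and-sum procedure described above can also be sketched outside Excel. The following is a minimal Python illustration, not the method used in this report: the rows in `orders`, the field names and the helper functions `filter_by_year` and `summarise_year` are all hypothetical, and the figures are made up for the example rather than taken from the superstore data set.

```python
from datetime import date

# Hypothetical sample rows; the real superstore data set is not reproduced here.
orders = [
    {"order_date": date(2009, 1, 15), "sales": 200.0, "profit": 30.0},
    {"order_date": date(2009, 6, 2), "sales": 100.0, "profit": -5.0},
    {"order_date": date(2010, 3, 9), "sales": 400.0, "profit": 80.0},
]


def filter_by_year(rows, year):
    """Keep only the rows whose order date falls in the given year (the filter step)."""
    return [row for row in rows if row["order_date"].year == year]


def summarise_year(rows, year):
    """Count the filtered orders and sum their sales and profit (the summing step)."""
    selected = filter_by_year(rows, year)
    return {
        "year": year,
        "orders": len(selected),
        "sales": sum(row["sales"] for row in selected),
        "profit": sum(row["profit"] for row in selected),
    }


summary_2009 = summarise_year(orders, 2009)
print(summary_2009)
```

Calling `summarise_year` once per year would yield one row of a year-wise summary table of the kind shown earlier.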
Further, a pivot table has been used as a pre-processing tool in order to understand the
movement in the superstore's sales and profit figures over the years.
Pivot table: It is usually regarded as the most interactive way to quickly summarize large
amounts of data. It is used for detailed analysis of numerical data, which helps in answering
several questions about business performance over a number of years on different bases and
aspects (Guerrero, 2018). It is also a very powerful tool for analyzing, calculating and
summarizing data, so that patterns and trends can be identified and comparisons can be drawn
among different years' sales, profit and product performance. Its behavior differs depending on
the platform on which Excel is used.
The steps followed to create the pivot table for the superstore data set are as follows:
1. Select all the columns and rows for which the pivot table is to be created.
2. Click the Insert tab; the PivotTable option is in the Tables group.
3. Click the PivotTable option. A dialog box is displayed showing the table range as the cell
reference for the entire data set.
4. Select New Worksheet to place the pivot table on a new sheet, then click OK. This creates a
pivot table template on the new worksheet.
5. In the PivotTable Fields pane on the right-hand side, drag order date into the Filters
quadrant, sales and profits into the Columns quadrant and product category into the Rows quadrant.
6. To see the sales and profits by product category for 2009, click the down arrow on the order
date filter and set 2009 as the criterion. The required results are then displayed.
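The pivot-table steps above amount to filtering by year, grouping by product category and summing sales and profit within each group. A minimal Python sketch of that logic follows. It is an illustration under stated assumptions rather than a reproduction of the Excel output: the rows, the category names and the helper `pivot_by_category` are hypothetical, and the figures are invented for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; category names and figures are invented for illustration.
orders = [
    {"order_date": date(2009, 2, 1), "category": "Furniture", "sales": 500.0, "profit": 50.0},
    {"order_date": date(2009, 7, 19), "category": "Technology", "sales": 900.0, "profit": 120.0},
    {"order_date": date(2009, 11, 3), "category": "Furniture", "sales": 250.0, "profit": -20.0},
    {"order_date": date(2010, 4, 8), "category": "Technology", "sales": 700.0, "profit": 90.0},
]


def pivot_by_category(rows, year):
    """Sum sales and profit per product category for one year.

    The year plays the role of the pivot table's filter field and the
    category plays the role of its row field.
    """
    totals = defaultdict(lambda: {"sales": 0.0, "profit": 0.0})
    for row in rows:
        if row["order_date"].year != year:
            continue  # excluded by the year filter
        totals[row["category"]]["sales"] += row["sales"]
        totals[row["category"]]["profit"] += row["profit"]
    return dict(totals)


pivot_2009 = pivot_by_category(orders, 2009)
for category, values in sorted(pivot_2009.items()):
    print(category, values["sales"], values["profit"])
```

Changing the `year` argument corresponds to choosing a different value in the order date filter of the pivot table.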
Accordingly, the following results for the year 2009 have been obtained.