This article discusses the role of Excel in pre-processing, analyzing and visualizing data. It also explores the importance of charts and graphs in data analysis. Additionally, it covers data mining techniques like classification, clustering, prediction and sequential pattern.
DATA HANDLING AND BUSINESS INTELLIGENCE
Table of Contents
PART-1
    Use of Excel for pre-processing, analysing and visualizing the data
PART-2
    Descriptive statistics
    2.1 Presentation of screenshots of steps for clustering
    2.2 Data mining method
    2.3 SPSS vs Excel
REFERENCES
PART-1
Use of Excel for pre-processing, analysing and visualizing the data
Data is usually available in raw form, which is difficult to understand, and its volume is often large. Excel plays an important role in pre-processing, that is, in converting the data into an understandable form. This is usually done by building tables and arranging the data in rows and columns so that it is easy for the user to read and so that relationships between variables can be established (Kaur and Garg, 2019). This is the basic and most important step, in which raw data is made understandable by presenting it in rows and columns.
Along with pre-processing, Excel also plays an important role in analysing data through its formulas and calculation features, including regression, pivot tables and others. These features support a more effective analysis of outcomes and results. Excel's graphs and charts also allow data to be presented in a form that makes analysis easier. Because pre-processing is about presenting data understandably, Excel's tables and graphs not only summarize the data but also improve its interpretation and understanding (Ellis and Leek, 2018). Charts and graphs enable the user to analyse trends and to determine what the data means.
This is closely related to data visualization: once raw data is arranged in tables and charts, it becomes visible and presentable (Sandnes et al., 2020). A look at the prepared charts and tables is then enough to grasp the meaning of the data. Thus, it is fair to say that Excel plays an important role in the pre-processing, analysis and visualization of data.
On the other hand, although Excel is regarded as one of the best tools for pre-processing, it carries various risks and loopholes. Entering data into rows and columns can introduce errors such as missing data, typing mistakes, other manual errors and difficulty in handling big data (Abasova et al., 2018). Because of these loopholes, it would not be wrong to say that although Excel is well suited to data pre-processing, it involves a high rate of errors, which can lead to misinterpretation or a wrong understanding of the data.
For the sales and profit data of Superstore, Excel's charts and graphs play an important role, because they summarize big data in an easy and presentable form so that the trends and information in the data can be analysed adequately. The following steps were taken to present the Superstore data with charts and graphs:
• The first step is to use Excel's filter function to arrange and extract the relevant data.
• After applying the filter, the sales range is selected so that the relevant sales data is separated from the large volume of available data.
• Next, the range for a specific year is selected and the SUM formula is applied to obtain the sales of that year. In this way the total sales figures for 2009, 2010, 2011 and 2012 are determined.
• The same steps, from applying the filter to applying the SUM formula, give the total profit figures for 2009, 2010, 2011 and 2012.
• At a later stage, the data range is selected and the charts and graphs function is opened. A suitable graph is then chosen so that the data is presented informatively.
• From the chart prepared, it is observed that there is a positive relationship between the sales and profit of Superstore: as sales rise, profit also rises.
• Following the above steps and presenting the data, it can be concluded that there is a positive relationship between sales and profit, so Superstore needs to focus on raising product sales so that profit also increases.
Graphical representation:
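The filter, sum and chart steps listed above can also be sketched outside Excel. The following Python sketch is an illustration only: the order records are hypothetical placeholders (not the Superstore data), the function names are invented for this example, and the text bars merely stand in for an Excel chart.

```python
from collections import defaultdict

# Hypothetical order records (year, sales, profit); NOT the Superstore data.
orders = [
    (2009, 1200.0, 150.0),
    (2009, 800.0, 90.0),
    (2010, 950.0, 60.0),
    (2011, 1100.0, 140.0),
    (2011, 400.0, 35.0),
    (2012, 1300.0, 120.0),
]

def yearly_totals(records):
    """Sum sales and profit per year (the filter + SUM steps)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for year, sales, profit in records:
        totals[year][0] += sales
        totals[year][1] += profit
    return {year: tuple(values) for year, values in sorted(totals.items())}

def text_bar(value, scale, width=40):
    """Render a value as a row of '#' characters, scaled to `width`."""
    return "#" * round(width * value / scale)

totals = yearly_totals(orders)
max_sales = max(sales for sales, _ in totals.values())

# A crude stand-in for the chart step: one bar per year for each measure.
for year, (sales, profit) in totals.items():
    print(f"{year} sales  {text_bar(sales, max_sales)} {sales:.2f}")
    print(f"{year} profit {text_bar(profit, max_sales)} {profit:.2f}")
```

Grouping by year replaces the repeated filter-and-sum passes with a single pass over the records, which is the main practical difference from the manual Excel procedure.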
From the analysis of the above graph, it can be interpreted that sales fluctuated across 2009, 2010, 2011 and 2012. Sales were 1754061.196 in 2009 and declined to 1318867.415 in 2010. They then showed an upward trend, rising to 1473354.591 in 2011 and 1601552.126 in 2012. In other words, sales first declined and then moved upward.
Profit also shows a fluctuating trend. It was 152252.99 in 2009 and declined to 132154.9 in 2010. It then rose to 161414.13 in 2011 before falling again to 130966.97 in 2012. This is a clear fluctuating trend in which profit first declined, then rose and then declined again. From these observations it would not be wrong to say that there is a positive and direct relationship between sales and profit: when sales fell from 2009 to 2010, profit also declined, and when sales rose in 2011, profit increased again. The 2012 figures are an exception, since profit fell although sales rose, so the relationship holds for most but not all of the years observed.
The existence of a positive relationship between sales and profit is also supported by the literature. As Tenucci and Supino (2020) state, earning profit is the main aim of every organization. Every firm performs functions that produce products and services. When the company sells those products and renders those services, it generates earnings, and when the cost of operations is deducted from those earnings, profit arises. This means that no profit can be generated without selling products, which clearly shows a direct and positive relation between the two. Likewise, a company generates income only when its products are sold, so sales lead to income and thereby to profit. Bhattacharya, Morgan and Rego (2021) also state that profit increases along with rising sales, which means that a firm with high sales will earn a higher profit.
Because the sales price consists of cost plus a profit margin, profit rises with every unit of a commodity sold. In addition, profit is the part of earnings left over after deducting all expenses and operating costs. This means that product sales relate both to the coverage of expenses and to profitability: as sales rise and costs and other expenses are covered, profitability also rises (Williams Jr et al., 2018). It should also be noted that a company with no product sales cannot operate adequately, which in turn affects its profitability.
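The yearly totals reported above are enough to check the claimed relationship directly. The sketch below, a rough illustration rather than part of the original analysis, compares the direction of the year-on-year changes in sales and profit and computes a Pearson correlation coefficient from its definition; the function names are invented for this example.

```python
import math

years = [2009, 2010, 2011, 2012]
# Yearly totals as reported in the text.
sales = [1754061.196, 1318867.415, 1473354.591, 1601552.126]
profit = [152252.99, 132154.9, 161414.13, 130966.97]

def yearly_changes(values):
    """Difference between each year and the previous one."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

sales_changes = yearly_changes(sales)
profit_changes = yearly_changes(profit)

# True where sales and profit moved in the same direction.
same_direction = [(s > 0) == (p > 0)
                  for s, p in zip(sales_changes, profit_changes)]

for year, s, p, same in zip(years[1:], sales_changes, profit_changes,
                            same_direction):
    print(f"{year}: sales {s:+.2f}, profit {p:+.2f}, same direction: {same}")

r = pearson(sales, profit)
print(f"Pearson r over the four years: {r:.2f}")
```

With only four yearly observations, any such coefficient is indicative at best; the direction check is the more transparent of the two and shows that sales and profit moved together in two of the three year-on-year comparisons.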
From the above analysis and the supporting literature, it can be concluded that there is a positive relation between the two variables: a change in one variable leads to a corresponding change in the other, i.e. profitability rises as sales rise.
PART-2
Descriptive statistics
Customers not eating rice

Rice
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   No          40        40.0        40.0               40.0
        Yes         60        60.0        60.0              100.0
        Total      100       100.0       100.0
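A frequency table of this kind can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the raw survey responses are not available, so the list is rebuilt from the reported counts (40 No, 60 Yes), and the 1 = Yes, 0 = No coding is an assumption made for the example.

```python
from collections import Counter
from statistics import mean, median

# Responses rebuilt from the reported counts (40 No, 60 Yes); the raw
# survey data is not available here.
responses = ["No"] * 40 + ["Yes"] * 60

def frequency_table(values, order):
    """Return rows of (label, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows, cumulative = [], 0.0
    for label in order:
        percent = 100.0 * counts[label] / total
        cumulative += percent
        rows.append((label, counts[label], percent, cumulative))
    return rows

table = frequency_table(responses, order=["No", "Yes"])
for label, freq, pct, cum in table:
    print(f"{label:<5}{freq:>5}{pct:>8.1f}{cum:>8.1f}")

# Assumed coding: 1 = Yes, 0 = No. The mean of a 0/1 variable is the
# share of respondents answering Yes.
coded = [1 if answer == "Yes" else 0 for answer in responses]
rice_mean = mean(coded)
rice_median = median(coded)
print(rice_mean, rice_median)
```

Under that coding, the mean of the rice variable is simply the proportion of respondents who answered Yes, which is why a mean can be read as a percentage for a two-valued variable.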
Interpretation: The rice frequency table shows that only 40 per cent of the respondents do not eat rice. Therefore, the majority of the respondents consume rice.
Gender – frequency distribution

Gender
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male        50        50.0        50.0               50.0
        Female      50        50.0        50.0              100.0
        Total      100       100.0       100.0

Interpretation: The gender frequency table shows that half of the respondents are male and half are female. Therefore, the sample comprised an equal number of male and female respondents.
Mean and median of ages

Statistics: Age
N   Valid       100
    Missing       0
Mean          20.35
Median        19.00

Interpretation: The calculated mean is 20.35, so on average the respondents are 20.35 years old. The median age is 19.00.
Mean and median of people who do eat rice

Statistics: Rice
N   Valid       100
    Missing       0
Mean            .60
Median         1.00

Interpretation: The table shows that the mean of the rice variable is 0.60. On the assumption that Yes is coded as 1 and No as 0, this corresponds to the 60 per cent of respondents who eat rice.
2.1 Presentation of screenshots of steps for clustering
Step 1: The data is entered in the input table of SPSS, as depicted in the image below:
Step 2: In the next step, click on the Analyze menu and then select Classify. A dropdown menu will appear. Select K-Means Cluster from the menu.
Step 3: Select all the variables and move them to the variable list on the other side of the dialog. Next, select ‘Iterate and classify’ as the method.
Step 4: Following are the results:

Initial Cluster Centers
            Cluster
            1       2
Gender      1       2
Age        13      26
Rice        1       0

Iteration History (a)
             Change in Cluster Centers
Iteration    1        2
1            4.489    2.731
2             .228     .384
3             .000     .000
a. Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 3. The minimum distance between initial centers is 13.077.

Final Cluster Centers
            Cluster
            1       2
Gender      1       2
Age        18      24
Rice        1       1
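The reported minimum distance between the initial centers can be checked by hand, and the iterate-and-classify idea can be sketched in a few lines. The Python below is a plain k-means (Lloyd) iteration written for illustration, not SPSS's own implementation; the initial centers are taken from the table above, while the respondent records are hypothetical because the raw data is not reproduced in this report.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, centers, max_iter=10):
    """Assign points to the nearest center, then recompute the centers,
    repeating until the centers stop changing or max_iter is reached."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)),
                          key=lambda i: euclidean(point, centers[i]))
            clusters[nearest].append(point)
        # New center = column-wise mean of the members; an empty cluster
        # keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else center
            for members, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Initial centers (Gender, Age, Rice) as reported in the SPSS output.
initial_centers = [(1, 13, 1), (2, 26, 0)]
initial_distance = euclidean(*initial_centers)
print(f"Distance between initial centers: {initial_distance:.3f}")

# Hypothetical respondents (Gender, Age, Rice); NOT the survey data.
respondents = [
    (1, 17, 1), (1, 18, 1), (2, 19, 1), (1, 18, 0),
    (2, 24, 1), (2, 25, 1), (1, 23, 0), (2, 26, 0),
]
final_centers, clusters = k_means(respondents, initial_centers)
for center, members in zip(final_centers, clusters):
    print(center, len(members))
```

Because age has a much wider range than the two-valued gender and rice variables, it dominates the distance calculation in this unscaled sketch; whether the variables were standardized before clustering is not stated in the report.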
Number of Cases in each Cluster
Cluster   1       56.000
          2       44.000
Valid            100.000
Missing             .000

Interpretation: K-means analysis is a method for partitioning the input data set into K partitions, also known as clusters, in which similar items are grouped together. The table shows that the larger number of cases (56 of 100) is assigned to the first cluster. The results also show that the gender values of the cluster centers barely change between the initial and the final solution, and that both groups consume rice (the final center for Rice is 1 in each cluster). The values within each cluster can therefore be seen as related to one another.
2.2 Data mining method
Businesses use various data mining techniques, including:
Classification analysis: Here the data is classified and divided into segments and classes. Under this technique the structure and identity of the data are defined and known. A typical example is the classification of email into spam, important, legitimate and others.
Clustering: Under this technique the structure of the data is determined by processing it and comparing it with similar data. In short, groups of similar data are identified in a database (Mendes and Vilela, 2017). For example, a company may cluster its customers into groups in order to determine their trends and deal with them accordingly.
Prediction: This is the most important data mining technique, under which future predictions and decisions are made by studying past data (Van Nguyen et al., 2020). An example is granting a loan to a customer on the basis of their past credit record. Here past information is taken as the standard and base for making future predictions.
Sequential pattern: With this technique the business discovers sequential series of events. Data is recorded in sequence so that relations within the data can be identified. A typical example is a sales pattern, which is studied on the basis of the sequence of data and trends.
Regression analysis: With this technique the relationship between variables is determined so that the characteristic value of the dependent variable can be estimated (Brook and Arnold, 2018). In other words, this method analyses how a change in one variable affects another. It is generally used for predicting the future, for example the impact of a change in price on demand for the organization's product.
2.3 SPSS vs Excel
SPSS is more beneficial than Excel because SPSS allows complex analyses, including factor, logistic and cluster analysis. Likewise, SPSS treats every column as one variable, which is used to determine relationships between variables, whereas Excel does not treat rows and columns in this way (Cleff, 2019). In addition, Excel keeps no paper trail of the steps taken, so an analysis cannot easily be replicated. SPSS can analyse data even when the number of variables and observations is very large, which becomes difficult or impossible in Excel; this also shows that SPSS is far better than Excel. Beyond this, SPSS makes tasks such as preparing pivot tables, performing statistical tests, creating tables and converting codes into values much easier than Excel, and hence SPSS is more advantageous than Excel.
On the other hand, SPSS involves a high cost in contrast to Excel.
This means that students may find SPSS harder to use than Excel because of its high cost. Likewise, SPSS requires more training, whereas Excel can be used without extensive training (Opie, 2019). Also, the graph feature in SPSS is less effective than that of Excel, which shows that Excel is easier to use and more advantageous than SPSS for presenting findings. In addition, Excel allows real-time calculation with formulas, which is not available in SPSS. For these reasons, it would be right to say that Excel is better than SPSS in terms of cost, ease of use and presentation.
REFERENCES
Books and journals
Abasova et al., 2018, June. Proposal of effective preprocessing techniques of financial data. In 2018 IEEE 22nd International Conference on Intelligent Engineering Systems (INES) (pp. 000293-000298). IEEE.
Bhattacharya, A., Morgan, N.A. and Rego, L.L., 2021. Examining why and when market share drives profit. Journal of Marketing.
Brook, R.J. and Arnold, G.C., 2018. Applied regression analysis and experimental design. CRC Press.
Cleff, T., 2019. Applied statistics and multivariate data analysis for business and economics: A modern approach using SPSS, Stata, and Excel. Springer.
Ellis, S.E. and Leek, J.T., 2018. How to share data for collaboration. The American Statistician. 72(1). pp.53-57.
Kaur, J. and Garg, K., 2019. Efficient Management of Web Data by Applying Web Mining Pre-processing Methodologies. In Software Engineering (pp. 115-122). Springer, Singapore.
Mendes, R. and Vilela, J.P., 2017. Privacy-preserving data mining: methods, metrics, and applications. IEEE Access. 5. pp.10562-10582.
Opie, C., 2019. Using Excel/SPSS in your research. Getting Started in Your Educational Research: Design, Data Production and Analysis, p.309.
Sandnes et al., 2020, October. Searching for extreme portions in distributions: A comparison of pie and bar charts. In International Conference on Cooperative Design, Visualization and Engineering (pp. 342-351). Springer, Cham.
Tenucci, A. and Supino, E., 2020. Exploring the relationship between product-service system and profitability. Journal of Management and Governance. 24(3). pp.563-585.
Van Nguyen et al., 2020. Predicting customer demand for remanufactured products: A data-mining approach. European Journal of Operational Research. 281(3). pp.543-558.
Williams Jr et al., 2018. The relationship between a comprehensive strategic approach and small business performance. Journal of Small Business Strategy. 28(2). pp.33-48.