Data Handling and Business Intelligence Report


Added on  2023/01/12

AI Summary
This report provides insights on data handling and business intelligence. It covers the use of Excel for data pre-processing, including the IF function, VLookup, charts and graphs. It also explains common data mining methods and the advantages of Weka over Excel. Additionally, it gives a specific example of clustering using audidealership.csv and Weka.

Table of Contents
PART 1
1.1 Use of Excel for pre-processing the data
1.1.1 Use of IF function in Excel
1.1.2 VLookup
1.1.3 Charts and Graphs
PART 2
2.1 Using the audidealership.csv provided in conjunction with Weka give a specific example of clustering
2.2 Explain the most common data mining methods that can be used in business with real examples
2.3 Advantage and disadvantage of Weka over Excel
REFERENCES
PART 1
1.1 Use of Excel for pre-processing the data
Microsoft Excel is a commercial spreadsheet application created by Microsoft and one of the
applications in the Microsoft Office suite. Its importance becomes clear as soon as you start
working in almost any job. It is used for basic calculations and provides graphing tools, pivot
tables and macros. Like other spreadsheet applications, Excel offers the basic features needed to
create spreadsheets, in which cells are arranged in rows and columns so that data can be organized
and manipulated (Niglas, 2007).
Data can also be shown in the form of line graphs, charts and histograms, and Excel lets us
examine many kinds of facts from different aspects of the data. MS Excel is used everywhere, from
manufacturing industry to small shops and government offices.
People who work with Excel in an office know that the following features of the application are
the ones used the most:
1) Pivot Tables
A PivotTable summarizes large amounts of Excel data from a range that is formatted so that the
first row contains headings and the rows below contain categories or values. The way the data
is summarized is flexible, but usually a PivotTable will contain the values of some or all of the
categories. If you are new to creating PivotTables, Excel 2013 can analyze your data and
recommend a PivotTable for you. Once you are comfortable with PivotTables, you can start from
scratch and make your own. To create a PivotTable, make sure that your data has column
headings or table headers and no blank rows (Berry and Linoff, 2004).
How this function works:
Click any cell in the range of cells or table and open the Recommended PivotTables dialog box.
Excel provides a selection of recommended PivotTables for your data. Click any PivotTable layout
to get a preview, then choose the one that works best for you and click OK. Excel then places the
PivotTable on a new worksheet and shows the field list so that you can rearrange the data
according to your needs.
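The summarizing step that a PivotTable performs can also be illustrated outside Excel. The following is a minimal Python sketch, not Excel's own implementation; the records and the field names (region, product, sales) are made up for illustration, and pivot_sum is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical sample records; the keys play the role of the heading row.
rows = [
    {"region": "North", "product": "Chairs", "sales": 120},
    {"region": "North", "product": "Tables", "sales": 300},
    {"region": "South", "product": "Chairs", "sales": 80},
    {"region": "North", "product": "Chairs", "sales": 50},
    {"region": "South", "product": "Tables", "sales": 210},
]

def pivot_sum(records, row_field, col_field, value_field):
    """Sum value_field for every (row_field, col_field) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for record in records:
        table[record[row_field]][record[col_field]] += record[value_field]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {row: dict(cols) for row, cols in table.items()}

pivot = pivot_sum(rows, "region", "product", "sales")
for region, totals in pivot.items():
    print(region, totals)
```

Each region becomes a row and each product a column, which is roughly what dragging fields into the Rows, Columns and Values areas of the field list does.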
2) Conditional Formatting
Conditional formatting, as its name suggests, changes the format of a cell depending on the
content of that cell, a range of cells, or any other cell or cells in the workbook. Conditional
formatting helps users quickly focus on important aspects of a spreadsheet, highlight errors and
identify important patterns in the data. With a conditional format, basic font and cell formatting
such as number format, font color and other font attributes, cell borders and cell fill color can
be applied. In addition, there is a range of graphical conditional formats that help visualize data
using icon sets, color scales or data bars.
The chosen conditional format is applied to a cell based on a condition that you set or a
condition that Excel derives by comparing the values of the cells in a range. So, for example, in a
list of employees' salaries, a conditional format can be applied to any salary in excess of a certain
amount, any employee who joined before a specific date, or anyone with a specific name.
Graphical conditional formats applied to the Salary column will, by default, be based on
an analysis of the highest and lowest values in the list, but this can be overridden if required.
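The two kinds of rule described above can be sketched in Python. This is only an illustration under assumed data: the names, salaries and the 50000 threshold are invented, and highlight and data_bar are hypothetical helpers rather than anything Excel exposes.

```python
# Hypothetical employee salaries (names and amounts are made up for illustration).
salaries = {"Asha": 42000, "Ben": 58000, "Carla": 75000, "Dev": 31000}

THRESHOLD = 50000  # assumed cut-off, analogous to a "greater than" rule

def highlight(salary, threshold=THRESHOLD):
    """Return the fill a 'greater than' rule would apply to one cell."""
    return "red fill" if salary > threshold else "no fill"

def data_bar(salary, lowest, highest, width=10):
    """Scale a value between the lowest and highest salary, like a data bar."""
    share = (salary - lowest) / (highest - lowest)
    return "#" * round(share * width)

low, high = min(salaries.values()), max(salaries.values())
for name, salary in salaries.items():
    print(f"{name:6} {salary:6} {highlight(salary):8} {data_bar(salary, low, high)}")
```

The first rule depends only on a condition set by the user, while the bar length depends on the highest and lowest values in the whole list, which mirrors the default behaviour of graphical formats.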
3) Sorting and Filtering
Excel spreadsheets help us understand large amounts of data. To make it easier to find what you
need, you can reorder the data or simply select the data you need, based on the criteria you set
within Excel. Sorting and filtering your data will save you time and make your spreadsheet more
effective. Suppose you have a list of hundreds of records, including dates, ages, names, cities,
and more. You can quickly organize data according to your needs using the sort and filter
features of Excel. When you sort information in a worksheet, you can organize data quickly and
find values quickly. You can sort the entire worksheet or a range or table of data. Sorting can be
done by one or more columns (Yan, Lee and Li, 2009).
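As a rough illustration of sorting by more than one column and filtering by a criterion, here is a minimal Python sketch. The records and field names are invented for the example.

```python
# Hypothetical records with the kinds of fields mentioned above.
records = [
    {"name": "Maya", "age": 34, "city": "Leeds"},
    {"name": "Omar", "age": 27, "city": "London"},
    {"name": "Lena", "age": 41, "city": "Leeds"},
    {"name": "Ivan", "age": 27, "city": "Bristol"},
]

# Sort by more than one column: first by age, then by name within the same age.
by_age_then_name = sorted(records, key=lambda r: (r["age"], r["name"]))

# Filter: keep only the rows that meet a criterion, here city == "Leeds".
leeds_only = [r for r in records if r["city"] == "Leeds"]

print([r["name"] for r in by_age_then_name])
print([r["name"] for r in leeds_only])
```

Sorting reorders all rows, whereas filtering leaves the order alone and only hides the rows that do not match.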

4) Basic Math
Numbers are the main content of any Excel spreadsheet. Using basic math
functions to manipulate those numbers is one of the features that make Excel so powerful.
Simple calculations can be entered in the formula bar in Excel just as they would be written on
paper. Like all Excel formulas, the calculation begins with the = sign. You can type the
calculation directly in the cell or formula bar, and when you press Enter the answer will
appear in the cell. Another option is to use several cells in a formula: for example, if cell A1
contains 87 and cell A2 contains 16, the formula =A1+A2 entered in cell A3 returns 103.
To perform basic mathematical operations such as addition, subtraction, multiplication or
division, we use the following arithmetic operators:
+ (plus sign) for addition
- (minus sign) for subtraction
* (asterisk) for multiplication
/ (forward slash) for division
Excel interprets an entry that begins with the = (equals) sign as a calculation and evaluates it
from left to right, applying multiplication and division before addition and subtraction
(Fisher, 2006).
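The cell example with 87 and 16 can be reproduced in a few lines of Python. This is only a sketch: the variables a1, a2 and a3 stand in for cells, and for these four operators Python follows the same order of operations as Excel.

```python
# Values standing in for cells A1 and A2 from the example above.
a1 = 87
a2 = 16

a3 = a1 + a2          # like =A1+A2
print(a3)             # 103

# The four basic operators are written the same way in Excel and Python.
print(a1 - a2)        # subtraction
print(a1 * a2)        # multiplication
print(a1 / a2)        # division

# Multiplication binds tighter than addition; brackets change the order.
print(2 + 3 * 4)      # 14
print((2 + 3) * 4)    # 20
```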
5) Mixed Type Charts
Mixed type or combo (combination) charts combine two styles of chart, such as Excel's
column chart and line chart. This format can be helpful for displaying two different types of
information or a range of values that vary greatly. For example, we can use a column chart to show
the number of homes sold between June and December and a line chart to make it easier to identify
the average selling price for each month.
1.1.1 Use of IF function in Excel
With the help of the IF function, you can apply a condition to your data so that if the condition
is TRUE one result is returned and if the condition is FALSE another result is returned.
The IF function can either be used on its own, which means that only one condition applies to
your data, or it can be nested, where several different conditions are applied to get the TRUE or
FALSE results (Pujari, 2001).
In this example, we will find the dates on which the superstore's sales and profits decreased and
the dates on which they increased, and then see step by step how to build the "IF function":
First, copy the order date, sales and profit columns to a separate sheet.
Then sort the data with the oldest date first, using the sort feature of the Excel sheet.
Select the cell where you want to enter the IF function.
Type the start of the formula in the cell: =IF(
Type the condition followed by a comma: B2>B3,
Type what should be shown when the condition is met. To show text, put it inside quotation marks: "Decrease"
Type a comma: ,
Type what should be shown when the condition is not met, again inside quotation marks: "Increase"
Then close the bracket and press the Enter key.
The finished IF function looks like this: =IF(B2>B3, "Decrease", "Increase")
The IF function says that if the value in cell B2 is greater than the value in B3, then show
Decrease, and otherwise show Increase.
After pressing the Enter key, the result appears in the selected cell. To see the result for the
remaining rows, drag the fill handle from cell D4 down to cell D8400.
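The logic of this IF formula, filled down a column, can be sketched in Python as follows. The sales figures are made up, excel_if is a hypothetical helper that imitates the three-argument form of IF, and the labels Decrease and Increase are the ones the formula is meant to show. The second function illustrates a nested IF with an assumed third label, "No change", which is not part of the report's example.

```python
def excel_if(condition, value_if_true, value_if_false):
    """Mimic =IF(condition, value_if_true, value_if_false)."""
    return value_if_true if condition else value_if_false

# Hypothetical sales figures, oldest first, standing in for column B.
sales = [500, 420, 610, 610, 300]

# Compare each row with the next one, as B2>B3 does when filled down.
labels = [
    excel_if(current > following, "Decrease", "Increase")
    for current, following in zip(sales, sales[1:])
]
print(labels)

# A nested IF adds a third outcome for equal values.
def trend(current, following):
    return excel_if(current > following, "Decrease",
                    excel_if(current < following, "Increase", "No change"))

print([trend(c, f) for c, f in zip(sales, sales[1:])])
```

Note that the single IF labels two equal values as Increase, because the condition B2>B3 is FALSE in that case; the nested version makes that case explicit.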
1.1.2 VLookup
We mainly use this function to search for a value in a table. Suppose there are two different
sheets, A and B. If we want the values of one column of the first sheet to appear in the other
sheet, both sheets must have one column of data in common; only then can the function find the
matching data in the second sheet. VLOOKUP() finds the reference (common) value, takes the value
in the same row, and returns it to the other sheet (Linoff and Berry, 2011).
Syntax: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
The lookup value is the value that we search for in the data. It can be text, a number or a
date, and we select it on the first sheet. Table array is the range on the sheet from
which we want the data; it contains the common column as well as the other columns in
which the required data is present, so it can span more than two columns. We select this range on
the other sheet. Col_index_num is the number of the column whose data we need.
Because the data is moved from the second sheet to the first sheet, the selection must start at
the column of reference data and include the column we want; we then enter the number of that
column here to tell Excel which column's data is required.
Demonstration of the lookup function on the given Superstore Sales data:
Using the same Excel sheet, the following steps are taken:
Lookup value: Use cells G2, H2 and I2 for the headings Order date, Sales and Profit. The results
will be shown in G3, H3 and I3. Select cell H3 and enter the lookup function; select cell G3 as
the lookup value.
Table array: Select the whole range from A2 to C8400 (A2:C8400).
Result column: The sales results are in B2 to B8400 (B2:B8400), the second column of the table
array, so col_index_num is 2. The optional [range_lookup] argument takes only TRUE or FALSE; FALSE
requests an exact match.
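To make the roles of lookup_value, table_array and col_index_num concrete, here is a minimal Python sketch of an exact-match lookup. It is an illustration only: the vlookup function below is a hypothetical stand-in, not Excel's implementation, and the dates and amounts are invented.

```python
def vlookup(lookup_value, table_array, col_index_num):
    """Exact-match lookup, like =VLOOKUP(value, table, col, FALSE).

    table_array is a list of rows; the first item of each row is the key column.
    col_index_num is 1-based, as in Excel.
    """
    for row in table_array:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return "#N/A"  # Excel shows #N/A when nothing matches

# Hypothetical rows: order date, sales, profit (made-up values).
table = [
    ("2009-01-01", 1200.0, -150.0),
    ("2009-04-03", 860.5, 95.0),
    ("2009-07-04", 430.0, 40.0),
]

print(vlookup("2009-04-03", table, 2))  # sales for that date
print(vlookup("2009-04-03", table, 3))  # profit for that date
print(vlookup("2010-01-01", table, 2))  # no such date
```

The key column must be the first column of the table, which is why the selection has to start at the column of reference data.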
1.1.3 Charts and Graphs
A very prominent feature in Excel is the pie chart. It helps a lot in analyzing data: complex
data becomes much easier to analyze once it is turned into a chart. Take the Sensex as an
example. When a particular share performs well among the top five shares, a chart of its
performance over the whole year makes it easy to see how many fluctuations were recorded in that
share during the year.
Mixed or Combination Charts
Mixed or combination charts are one of the best charting features in Excel. They combine two or
more chart types in a single chart, so that the series can be viewed together.
Steps:

Select the cells for which the line graph is to be made.
Go to the Insert tab and select the line chart.
[Figure: line chart of Sales and Profit by order date, 01/01/2009 to 12/10/2012; vertical axis from -20000 to 100000]
(Source: superstores sales.csv)
Interpretation: The data shown in the above chart indicates that in January 2009 the company
recorded its highest sales and its highest loss. The second highest sales appear in the year
2012. The chart shows large fluctuations in both sales and profits; the company faced heavy
losses during the years 2009 to 2010 and in the year 2012. The largest decrease in sales can be
found in the year 2009; the other years did not show as much of a decline as 2009.
PART 2
2.1 Using the audidealership.csv provided in conjunction with Weka give a
specific example of clustering
Interpretation: The above result shows a correlation value much lower than 1; this
indicates that a customer's income is not affected by whether the customer views an Audi (Olson
and Delen, 2008).
Interpretation: The figures shown in the graph above indicate that in January 2009 the company
recorded its highest sales and its largest loss. The second highest sales were recorded in 2012.
The graph shows significant changes in both sales and profit; the company suffered losses between
2009 and 2010 and again in 2012. The most noticeable decline in sales occurred in 2009, and no
comparable decline was seen in the other years (Birks et al., 2012).
2.2 Explain the most common data mining methods that can be used in
business with real examples
Data mining is the practice of automatically searching large stores of data to discover patterns
and trends that go beyond simple analysis. Data mining uses sophisticated mathematical
algorithms to segment data and evaluate the likelihood of future events. Data mining is also
known as Knowledge Discovery in Data (KDD). It is the process of analyzing hidden patterns in
data from various perspectives and classifying them into useful information, which is gathered in
common areas for data analysis and for use by data mining algorithms (Soukup and Davidson,
2002). In general, the benefits of data mining come from the ability to uncover hidden patterns
and relationships in data that can be used to make predictions that affect businesses. Some of the
methods of data mining are discussed below:
1. Neural Network: This data mining technique or model is based on biological neural
networks. It is a collection of neuron-like processing units with weighted connections between
them. Neural networks are used to model the relationship between input and output, and are
applied to classification, regression analysis, data processing and so on (Pujari, 2001). This
technique rests on three pillars:
Model
Learning algorithm (supervised or unsupervised)
Activation function
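A single processing unit of such a network can be sketched in Python. This is a minimal illustration, assuming a sigmoid activation and hand-picked weights; no learning algorithm is shown, and the inputs, weights and bias are hypothetical values.

```python
import math

def sigmoid(x):
    """A common activation function that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by the activation function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Hypothetical inputs and weights chosen only for illustration.
output = neuron(inputs=[0.5, 0.2], weights=[0.8, -0.4], bias=0.0)
print(round(output, 3))
```

In a trained network the weights are not chosen by hand but adjusted by the learning algorithm until the outputs match known examples.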
Outlier analysis or anomaly detection:
This data mining technique is used to identify data items that do not conform
to the expected pattern or expected behavior. These unexpected data items are treated as
outliers or noise. They are useful in many domains, such as credit card
fraud detection, intrusion detection, fault detection and so on. The technique is also
called outlier mining. For example, suppose a graph is plotted using some
data sets in our database and the best-fit line is drawn. The points close to the line
show the expected behavior, while a point far from the line is an outlier.
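The best-fit-line idea can be sketched in plain Python. This is one simple approach among many, shown under stated assumptions: the data points are invented, the line is fitted by ordinary least squares, and the distance threshold of 5.0 is an arbitrary choice for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def find_outliers(xs, ys, max_distance):
    """Return the points whose vertical distance from the fitted line exceeds max_distance."""
    slope, intercept = fit_line(xs, ys)
    return [
        (x, y) for x, y in zip(xs, ys)
        if abs(y - (slope * x + intercept)) > max_distance
    ]

# Hypothetical data: y is roughly 2 * x, except for one point.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 20.0, 10.1, 11.8]

outliers = find_outliers(xs, ys, max_distance=5.0)
print(outliers)
```

Because the outlier itself pulls the fitted line towards it, practical tools often use more robust fitting methods; the sketch only shows the basic principle.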
2. Decision Trees: A decision tree is a tree structure (as its name suggests), where
Each internal node represents a test on an attribute.
Each branch represents an outcome of the test.
Terminal (leaf) nodes hold the class label.
The topmost node is the root node, which consists of a simple question with two or more
answers. In this way the tree grows and a flowchart-like structure is produced.
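The structure described above can be sketched as nested Python dictionaries. The tree below is built by hand rather than learned from data, and its attributes (income, visited_showroom) and class labels are hypothetical.

```python
# A hand-built tree: internal nodes test an attribute, branches are outcomes,
# leaves hold the class label. Attributes and labels are hypothetical.
tree = {
    "attribute": "income",
    "branches": {
        "high": {"label": "likely buyer"},
        "low": {
            "attribute": "visited_showroom",
            "branches": {
                "yes": {"label": "possible buyer"},
                "no": {"label": "unlikely buyer"},
            },
        },
    },
}

def classify(node, record):
    """Walk from the root to a leaf, following the branch that matches the record."""
    while "label" not in node:
        outcome = record[node["attribute"]]
        node = node["branches"][outcome]
    return node["label"]

print(classify(tree, {"income": "low", "visited_showroom": "yes"}))
print(classify(tree, {"income": "high", "visited_showroom": "no"}))
```

A data mining tool would choose the tests and their order automatically from training data; the walk from root to leaf is the same.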
3. Sequential Pattern or Pattern Tracking: This data mining technique is used to
identify patterns that occur frequently over a given period of time. For example, the
sales manager of a clothing company sees that jacket sales increase just before the winter
season, or a bakery sees its sales rise around Christmas or New Year's Eve (Linoff and Berry, 2011).
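A very simple form of pattern tracking is to total sales per month and look for the months that stand out year after year. The following Python sketch assumes made-up jacket sales figures; real pattern mining algorithms are considerably more involved.

```python
from collections import Counter

# Hypothetical (month, units sold) records for jackets over two years.
sales = [
    ("Jan", 40), ("Jun", 5), ("Oct", 60), ("Nov", 90),
    ("Jan", 35), ("Jun", 8), ("Oct", 70), ("Nov", 95),
]

units_by_month = Counter()
for month, units in sales:
    units_by_month[month] += units

# The months with the highest totals hint at a recurring seasonal pattern.
top_months = [month for month, _ in units_by_month.most_common(2)]
print(top_months)
```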
2.3 Advantage and disadvantage of Weka over Excel
Advantage:
WEKA (Waikato Environment for Knowledge Analysis) is open source software developed
by the University of Waikato in New Zealand. The data mining tool is written in Java and can be
used on Windows, macOS and Linux. It is known for its comprehensive machine learning functions
and supports the most important data mining tasks such as clustering, association, regression
and classification (Read et al., 2016).
Disadvantage:
WEKA is less effective in some other techniques, such as cluster analysis, where only the main
procedures are offered. WEKA can also have processing problems when dealing with large amounts
of data, as the data mining tool tries to load all of it into RAM at once. A workaround is
offered by the simple command line interface (CLI), which handles large amounts of data better
(Read et al., 2016).
REFERENCES
Books and Journals
Berry, M.J. and Linoff, G.S., 2004. Data mining techniques: for marketing, sales, and customer
relationship management. John Wiley & Sons.
Birks, H.J.B., Lotter, A.F., Juggins, S. and Smol, J.P. eds., 2012. Tracking environmental change
using lake sediments: data handling and numerical techniques (Vol. 5). Springer Science
& Business Media.
Fisher, P.F. ed., 2006. Developments in Spatial Data Handling: 11th International Symposium
on Spatial Data Handling. Springer Science & Business Media.
Linoff, G.S. and Berry, M.J., 2011. Data mining techniques: for marketing, sales, and customer
relationship management. John Wiley & Sons.
Niglas, K., 2007. Media Review: Microsoft Office Excel Spreadsheet Software. Journal of
Mixed Methods Research, 1(3), pp.297-299.
Olson, D.L. and Delen, D., 2008. Advanced data mining techniques. Springer Science &
Business Media.
Pujari, A.K., 2001. Data mining techniques. Universities press.
Read, J., Reutemann, P., Pfahringer, B. and Holmes, G., 2016. Meka: a multi-label/multi-target
extension to weka. The Journal of Machine Learning Research, 17(1), pp.667-671.
Soukup, T. and Davidson, I., 2002. Visual data mining: Techniques and tools for data
visualization and mining. John Wiley & Sons.
Yan, X., Lee, S. and Li, N., 2009. Missing data handling methods in medical device clinical
trials. Journal of Biopharmaceutical Statistics, 19(6), pp.1085-1098.