Design and Development of a Data Analytics Tool

Verified

Added on  2020/10/05

|35
|5004
|498
AI Summary
The provided assignment requires students to design and develop a data analytics tool using MS Excel for data visualization. The report should detail the tool's features, results, and outcomes. References to various books and journals on database design, water distribution networks analysis, and spatial data analysis are also included. Students will need to make reasonable assumptions to support their design and create a suitable tool that can be used for data analytics.

Contribute Materials

Your contribution can guide someone’s learning journey. Share your documents today.
Document Page
Database and analytical principles

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
TABLE OF CONTENTS
INTRODUCTION...........................................................................................................................3
SESSION -1 DATABASE...............................................................................................................3
TASK 1............................................................................................................................................3
Design of logical model of Entity relationship of Auto sales database ......................................3
2. Produce a scripts by using SQL command to create set of database tables.............................9
3. By using SQL statements, populate the tables in the database that create 10 records in each
table............................................................................................................................................14
Data Manipulation Language (DML)........................................................................................21
4. Discuss about the strength and weakness of top down and bottom up approach in design. . 24
SECTION 2- DATA WAREHOUSING.......................................................................................25
TASK 2..........................................................................................................................................25
Designing star schema................................................................................................................25
SECTION 3- DATA ANALYTICS...............................................................................................27
TASK 3..........................................................................................................................................27
Car Manufacturer which contains most quantity of car models................................................27
Average fuel economy for the city and highway driving...........................................................27
Good and bad average fuel economy cars..................................................................................28
Car manufacturer which have 4-wheel drive and 2 wheel drive with engine power more than
3.5...............................................................................................................................................29
Critical reflection.......................................................................................................................31
RECOMMENDATION.................................................................................................................34
CONCLUSION..............................................................................................................................34
REFERENCES..............................................................................................................................35
Document Page
INTRODUCTION
Database is a collection of element that lives for a long terms. It is a type of software that
provide an interface to the database for storage of information and retrieval. This report will
discuss about the Entity relationship diagram on the basis of their modelling tools and also
representing the specific entities, attributes and their relationship between them. Furthermore,
this assignment will discuss about the SQL command to create table by using My SQL relational
database and it should represent the different level of table. It will critical analyse the strength
and weakness of top down or bottom up approach to design, develop the database within proper
justifications. Afterwards, this assignment will describe to design a star schema of auto sale and
also make a reasonable assumption that support the design. At last, this report will describe the
data analytics tool to finding the results and outcomes.
SESSION -1 DATABASE
TASK 1
Design of logical model of Entity relationship of Auto sales database
Illustration 1: ER Diagram
Document Page
Entity relationship diagram is a basically a graphical representation of information system that
depict the relationship among the entities, objects, concepts and events within system (Ghosh
and Bashar, 2018). It is based on the data modelling technique that can help to define the
enterprise process for designing a relational database. It involves ERD to provide the visual
starting point of database design that support to determine the information system requirement
through an organization.
In Above diagram, there are different type of entities represented such as car, seller and
buyer etc. Each entities has been shown their attributes in the form of tables.
For example- In car table, there are such attributes Mileage, year, body, type and engine etc.
Data Dictionary -
Entities attributes
Buyer_info Buyer_id (PK)
B_L_name
B_contact
B_F_name
B_M_name
Postcode_id (FK)
Buy_sell_history H_id (PK)
attributes
Purchase_date
sold_date
sold_price
car_id (FK)
buyer_id (FK)
Car Car_id (PK)
car_name
company_id (FK)
car_company company_id

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
company_name
car_detail Detail_id (PK)
engine_size
Co2 Emission
price
body_type
Mileage
color_id (FK)
model_id (FK)
seller_id (FK)
fuel_id (FK)
year_id (FK)
City city_id (PK)
city_name
country_id (FK)
color color_id (PK)
color
country country_id (PK)
country
Fuel_type Fuel_id
Fuel_type
Model Model_id
Model_name
Postcode postcode_id (PK)
postcode
city_id (FK)
Seller_info seller_id (PK)
S_contact
Document Page
S_M_name
S_F_name
postcode_id (FK)
Year Year_id (PK)
year
Table -1
This above table is representing all the entities that has some specific attributes and also create a
relationship between them. According to scenario, each table contain a unique key that help to
create a relationship between two or more tables.
In buyer_info table:
Buyer_id is a primary key that help to create a relationship with postcode table because
buyer store the postcode details in it. it has created a relational between buyer table and postcode
table. It is representing as a Foreign key in postcode_id . buyer_info and postcode has to create
one to one cardinality participation. It occurs the total participation between the postcode and
buyer tables.
Buyer_info table:
Primary key - Buyer_ id
Foreign key- postcode_id
In car table:
car_id is a primary key that help to create a relationship with car_company table because
company has stored the information of car so that it has created a relational between car and
car_company table. company_id is representing as a foreign key. Buyer and car has to create one
to one cardinality participation. It occurs the total participation between the entities.
car table:
Primary key – car_id
Foreign key – company_id
Document Page
In seller_info table:
Seller id is a primary key that help to create a relationship with seller table and postcode
because seller store the information of postcode, they want to sell the car so that it has created a
relational between seller and postcode. On the other hand, Postcode_id is representing as a
foreign key seller_info table. Seller and buyer has to create one to one cardinality participation. It
occurs the total participation between the seller and buyer.
Seller_info table:
Primary key – Seller_id
Foreign key – Postcode_id
In car_details table:
detail_id is a primary key that help to create a relationship with car_detail table and
colour table because it store the information about the car colour in database system. colour_id is
representing as a foreign key in car_detail table. Car_detail and color has to create one to one
cardinality participation. It occurs the total participation between the car_detail and color table.
Car_details table:
Primary key - detail_id
Foreign key- color_id
In car_details table contain the information about the model, year, fuel type and other
information so that it has required to use the foreign key in car_detail table to create a
relationship between them.
Foreign key : model_id , fuel_id , year_id etc.
Each key contain the specific values in the database system in proper manner.

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
Justification for using ER Diagram-
Designer choose the Entity relationship design because it helps to analyse and understand
every attribute in each table. It supports to identify that primary key has performed significant
role in the database system. It is also creating a relationship between the different tables. In my
opinion, it allows for designer to show the relationship between the group of data and gives a
simple yet an effective overview of different entities in the system. It also uses ER model to
summarised that what entities are needed in the relational database system. In this way, it can be
representing the relationship among the records in the tables (Roy-Hubara, Rokach and Shoval,
2018). It can use in the both as a reference for the designer when it has setting the foreign key
and as a means of determine the many to one, many to many relationships, which can resolve in
proper manner.
ER model is produced because it records the design of database that need to part of
technical manner. In the future, it will require the maintenance of database, which necessary. At
time, ER model plays important role to easily find the entities and attributes. It also helps to the
designer to check the overall working system of relational DBMS. Entity relationship diagram is
clearly representing an appropriate flow of information which required for any type of database
development. It has to be provide the proper guidance to understand the concept of development.
It can store all information or data in the form of tables to generate a specific ouput in proper
manner.
Document Page
2. Produce a scripts by using SQL command to create set of database tables.
SQL stands for structure query language which is mainly used to create, retrieve,
maintain the relational database system in proper manner. It is particularly applied in the work
with structure data where it has created relations associated with the data itself. A database
system is mainly used the SQL query language to perform different operations and functions.
Data definition language (DDL):
Data definition language is a type of standard for the commands that define the different
structure in the database. DDL statements modify, create and remove object in the tables or
indexes.
Create
Document Page

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
Document Page
Document Page
Alter

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
3. By using SQL statements, populate the tables in the database that create 10 records in each
table.
Buyer_info table:
Buy_sell_history table:
Document Page
Car table:
Car_company table:
Document Page
Car_detail table:

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
City_table:
Country_table:
Document Page
Color_table:
Fuel_type:
Document Page
Model table:
Postcode_table:

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
Seller_info table:
Year_table:
Document Page
Data Manipulation Language (DML)
DML stands for Data manipulation language that can help to insert, delete and update the
data into tables. It is basically defined the database structure. It is an efficient user interaction
with the system which has capability of DML that organize the command in proper manner.
Select : This command is basically used to retrieve the row from a table which is widely
used DML command in SQL.
Insert : This command is used to add one or more records to the database tables. It also
inserting the value into the tables in database system.
Insert Command:
Document Page

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
Document Page
4. Discuss about the strength and weakness of top down and bottom up approach in design.
Top down approach-
A top down approach is mainly used to break down the system and gain the insight into
the composition into sub systems (Kumar and Azad, 2019). It is an overview of system that
specify the formulation but not dealing with the first sub system. This approach is more coherent
with less redundancy and fewer to do thing in proper manner. It helps to maintain the
connectivity from top to bottom.
Strength-
It is high level design and end with the low level design or development.
It is based on autocratic and hierarchical style in terms of decision making. Simplicity and speed in decision making of complex situation.
Weakness- Low participation
Bottom up approach-
A bottom up approach is piecing together the system and give more complex system, thus
making the original sub system. This approach is based on the information process which based
on the incoming data from the environment. Decisions can be made and implemented very
quickly. It is mainly used in estimating process. This approach also helps in work breakdown
structure.
Strength-
High visibility High deployment
Weakness-
It lacks cohesion.
Risk generated in the operating without a clear strategy is more in this approach.
It is very tough to implement in the system
As per scenario, it is used in the top down approach to start the design from higher level to lower
level because it follows the step by step process to fullfill all the requirement in DBMS.
Document Page
SECTION 2- DATA WAREHOUSING
TASK 2
Designing star schema
Dimensional modelling is one of the main part of business dimensional life cycle
methodology that was developed by Kimball. It included various kinds of techniques, methods
and all the concepts that are used in data warehousing designing. This Kimball's dimensional
design process consist of four steps which helps in construction of schema with detailed designed
activities. It helps in identifying table (Zhu and et.al., 2017). Column names with defined
business rules. In this firstly a business process is defined with discrete business processes. It is
done with the help of ER diagram constructed. Then Grain of the fact table is chosen which helps
in defining the meaning of the fact table with maximum level of detailed warehouse. Then
according to the ER diagram Fact table is designed in order to analyse all the business events that
are to be gathered in the fact table. Then lastly all the tables are denormalized and dimensional
tables are designed and created. These dimensional tables are like references tables which helps
in understanding how fact table information is analyses.
Here, with the help of this Kimball's four step dimensional design process star schema of
Auto sales scenario has been designed which will help in analyzing auto sales of some goods.
According to the below ER diagram sales of all the car models sold from 2014 by different car
manufacturer has been discussed so according to this the dimensional table is divided into three
which are: car dimensional table, buyer dimensional table and seller dimensional table. All the
three dimensional tables consist of some keys that are used to design the fact table.
For designing a star schema, it can be used the different entities because there are the
strong entities that help to create a strong relational between them. According to auto sales, it is
the most suitable entities and their attributes in the database design.
The star schema for the above Auto sales scenario is shown below.

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
Illustration 2: Star Schema
Document Page
SECTION 3- DATA ANALYTICS
TASK 3
Data analysis is used to analyze and discover all the association with the data set is
Microsoft Excel. Data analysis tool is the best for the purpose of data visualization that help to
arrange all the information in proper manner (Roy-Hubara and Sturm, 2018). It is representing
the analysis details through charts, graphs etc. This tool has been used because it is effective
enough to answer all the questions by analyzing the data set given.
Steps: First of all data set provided was saved within the system, then as per the requirements of
the questions formulas were applied to the tables in order to achieve desired results then obtained
results graph was plotted from insert object option present within Excel.
Car Manufacturer which contains most quantity of car models
Car manufacturer which contains most quantity of car models is General Motors.
General motors 84
Volkswagen groups 33
General motors
Volkswagen groups
Document Page
Steps: From the above table, it can be stated that General motors consists most quantity of car
models as compared to Volkswagen. Also as per the requirements of the questions formulas were
applied to the tables in order to achieve desired results then obtained results graph was plotted
from insert object option which is present within Excel.
Average fuel economy for the city and highway driving
Average Fuel economy for the city driving is 20.075 whereas average fuel economy for highway
driving is 28.231.
From the above graph it is clear that average fuel economy of cars on highway is way more than
fuel economy of cars in the city.
Steps: From the above plotted graph, it can be stated that average fuel economy of both the
company vary accordingly in which city driving is more then the highway driving. Although, as
per the requirements of the questions formulas were applied to the tables in order to achieve
desired results and graph is plotted from the insert option within excel.
Good and bad average fuel economy cars
Here, good average fuel economy of auto and manual cars have been shown which is 23
for auto cars and 25 for manual cars.
0
5
10
15
20
25
30
City FE
Highway FE
Illustration 3: Average fuel economy for the city and highway driving

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
23
25
Auto
Ma nu
a l
Illustration 4: Average of fuel economy
Car manufacturer which have 4-wheel drive and 2 wheel drive with engine power more than 3.5
2-d 12
4-d 3
0
2
4
6
8
10
12
14
2-d
4-d
Illustration 5: Car manufacturer which have 4 wheel drive and 2 wheel drive with engine
power more than 3.5
Document Page
Car manufacturers which have 2 WD with more than 3.5 engine power are: Austin
Martin, BMW, General Motors, jaguar Land Rover, Mercedes-Benz, Porsche, Nissan, Hyundai,
Volkswagen, Chrysler Group LLC, kia, Toyota.
Car manufacturers which have 4 WD with more than 3.5 engine power are: Chrysler Group
LLC, Mercedes-Benz, Porsche.
Steps: according to the above presented graph, it has been stated that 2 two wheel drive is more
then 4 wheel drive as per the engine power 3.5. the data set was saved in the system and graph is
inserted according from the insert option within excel.
Document Page
Critical reflection
From the above assignment I have learnt how an ER diagram is designed using crow's
foot notations and using that learning and knowledge I was able to design the ER diagram
clearly. In the ER diagram I have clearly specified entity types (Niazkar and Afzali, 2017).
Attributes as well as relationship between them. With the help of these crow's foot notations ER
diagram was designed and constructed easily and with the help of this database table was
created. For creation of database tables proper SQL commands of data definition language were
used for creation of tables. After creation of database table proper defined SQL commands of
data modelling language were used in order to fill some data in those tables. After designing and
creating ER diagram and tables star schema was created which was designed with the help of ER
diagram. In my opinion, Structural query language is the most important part in terms of
database designs because it embedded with the specific scripts that help to design a table,
columns. By using SQL language, I entered the value in every tables in proper manner.
After completion of database and designing star schema of data warehouse data analysis
was done of the given data set (Tabar, Nasar and Basher, 2018). For this data analysis I had
chosen Microsoft Excel as a data analysis tool. With the help of this data analytics tool all the
required answers were identified which were required for data analysis of questions given in the
brief like finding and comparing average fuel efficiency, finding car manufacturer etc.
Strength of data analytics tool used
Microsoft Excel is used to store and analyse data stored in it. While using Microsoft
Excel I have identified various kinds of strength of excel analysis tool such as:
It is an effective data analysis tool that is mainly used to analysis large amount of data in
order to analysis the data, find current trends, patterns etc. that can be used to influence the
decisions. It also helps in creating graphs of the required calculations which has helped me to
analyse the knowledge required regarding data. I have also observed one of the main strength of
Excel is that it can store and analyse large number of data more easily and effectively (Uerlings,
Matusch and Weiskirchen, 2016). It can easily be used as an application in almost all kind of
devices like laptops, mobiles etc. It is can easily store all the tables of database with all the data
that is required to be stored in it will complete addressing of all the cells which helps in
identifying the required data. In this any kind of graph, pie chart or histogram can be created
which can be used for analysis. In this pivot tables can also be created that can be used to

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
summarize group of data that is required to be analysed. Another strength of Microsoft Excel is
that it can store any kind of data in any format which can be changed anytime as well as it will
automatically update the calculations or results according to the current data format. MS Excel is
the powerful analytical tools that has ability to analyse the large amount of information and
discover trends or patterns. It influences to designer to take right decisions. It has graphing
capability that allows summarising all the data or enhancing to organise in proper manner.
Recently, the upgrades excel spreadsheets that enhance to ability for performing different
operations such as sorting, filtering, searching etc. It also combines with the other tools like pivot
tables and graph to find the information what they want in quickly manner. It is scalable that can
use at home, offices on their own PC. On the other hand, Excel worksheet is based on the
collaborate at anywhere. It is applicable to perform mathematical and logical operations or
functions. It provides many formulas that support to solve both type of complex and simple
calculations.
Weakness of data analytics tool used
There are various kinds of weaknesses of Microsoft Excel which forces an individual to
kind about another tool for data analysis, such as:
One of the main weakness of Microsoft Excel is its security issues i.e. data stored in it
can easily be read by anyone who opens the document (Tigkas, Vangelis and Tsakiris, 2015).
Another weakness is that mostly important data is hidden in the raw data that is stored in it which
makes it difficult to analyse and find the required data that is required. Due to this analysis of the
data stored becomes difficult. There are way more chances of misinterpretation of data which
can create problems in final analysis of the data. Sometimes the process of data analysis of the
raw data that is stored in the spreadsheets is quite difficult which can become a hindrance in
proper analysis of the data for any kind of decision making. Sometimes, it has lot of time to
generate results where some part to need extra software when install in PC. MS Excel does not
offer the good variety of dynamic graph capabilities However, it customized into the standard
formats.
Document Page
Another data analytics tool which might have been used
Another data analytics tool that might have been used for analysis is Tableau. Tableau is
an effective tool that help to create an interactive data visualization which quickly work in proper
manner. This is very simple and user friendly to understand the need or requirement. It is
extremely easy to integrate with the different data sources and develop visualization for the
longer amount of data without crashing. It is an art of business process to design an effective
concept to collect any of data or information. It is a data visualization and analytics tool that is
used to understand the data stored in the database. It can connect with hundreds of data source
connections and can work on it as well as store all the calculations, graphs or figures etc. in its
internal memory (Rose and Lindsay, 2018). In this anyone can easily analyse data sets with
simple drag and drop products. Tableau can easily connect with the data set and analyse the data
faster as compared to excel. It is quite easy to use i.e. anyone can easily analyse their data with
simple drag and drop feature. It has a feature which helps in multiple view of data which further
helps in analysis of the data. Data and graph stored in it gets automatically updated whenever
there is any change in the data set or any value (Szewrański and et.al, 2017). It can also be shared
with anyone easily with just few clicks even with mobile phones. Data in this is protected so that
all the analysed results and graphs can be seen by the authorized user only after they login into
the system. Data analysis in this can be done much more easily and effectively and with the help
of its graphical representation divisions can be taken much more easily.
Document Page
RECOMMENDATION
It has recommended to use the database system that help to easily store all the
information about the car, buyer and seller etc. It has an ability to support the entire processing
of database system. It has needed to use the database system because it provide the multi model
which allow the decentralized that easily utilized the specific algorithm and task in the process
management. Relational Database is interactive to create a relationship between one table to
another in proper manner. The relational database has used the structure query language that
support to create overall database and creating a relationship between the tables in effective
ways. These recommendations will help to enhance the system, increase their reliability and will
also help the company to fetch their customers details effectively and efficiently as per the
requirements.
CONCLUSION
As per discussion, it concluded that Database is a type of system that collect many data
into single format. It is a type of software that provide an interface to the database for storage of
information and retrieval. Database system is a software that help to retrieve, update, store and
delete etc. these are different operations performed by the system in the form of tables. This
report is discussed about the Entity relationship diagram on the basis of their modelling tools and
also representing the specific entities, attributes and their relationship between them. Critical
analyse the strength and weakness of top down or bottom up approach to design, develop the
database within proper justifications It summarised the SQL command to create table by using
My SQL relational database and it should represent the different level of table. It describes to
design a star schema of auto sale and also make a reasonable assumption that support the design.
This report describes the data analytics tool to finding the results and outcomes. This report is
described about the data analytical tool such as MS Excel which help for data visualization.

Secure Best Marks with AI Grader

Need help grading? Try our AI Grader for instant feedback on your assignments.
Document Page
REFERENCES
Books and Journals
Ghosh, S. and Bashar, R., 2018, December. Automated Generation of ER Diagram from a Given
Text in Natural Language. In 2018 International Conference on Machine Learning and
Data Engineering (iCMLDE) (pp. 91-96). IEEE.
Kumar, K. and Azad, S.K., 2019. Determining All Possible Candidate Keys for Relational
Database Design. In Innovations in Soft Computing and Information Technology (pp.
241-248). Springer, Singapore.
Niazkar, M. and Afzali, S.H., 2017. Analysis of water distribution networks using MATLAB and
Excel spreadsheet: h‐based methods. Computer Applications in Engineering
Education. 25(1). pp.129-141.
Rose, K. and Lindsay, B.D., 2018. Everyday Assessment and Workflow Management in Access
Services via Tableau.
Roy-Hubara, N. and Sturm, A., 2018. Exploring the Design Needs for the New Database Era. In
Enterprise, Business-Process and Information Systems Modeling (pp. 276-290).
Springer, Cham.
Roy-Hubara, N., Rokach, L. and Shoval, P., 2018. Evaluation of a Design Method for Graph
Database. In Enterprise, Business-Process and Information Systems Modeling (pp. 291-
303). Springer, Cham.
Szewrański, S., and et.al., 2017. Spatial data analysis with the use of ArcGIS and Tableau
systems. In The Rise of Big Spatial Data (pp. 337-349). Springer, Cham.
Tabar, E., Nasar, A. and Basher, T., 2018. Pressure Transient Analysis by Using MS. Excel
Sheet and Computer Programming.
Tigkas, D., Vangelis, H. and Tsakiris, G., 2015. DrinC: a software for drought analysis based on
drought indices. Earth Science Informatics. 8(3). pp.697-709.
Uerlings, R., Matusch, A. and Weiskirchen, R., 2016. Reconstruction of laser ablation
inductively coupled plasma mass spectrometry (LA-ICP-MS) spatial distribution images
in Microsoft Excel 2007. International Journal of Mass Spectrometry. 395. pp.27-35.
Zhu, J., and et.al., 2017. Looking ahead makes query plans robust: Making the initial case with
in-memory star schema data warehouse workloads. Proceedings of the VLDB
Endowment. 10(8). pp.889-900.
1 out of 35
circle_padding
hide_on_mobile
zoom_out_icon
[object Object]

Your All-in-One AI-Powered Toolkit for Academic Success.

Available 24*7 on WhatsApp / Email

[object Object]