
Data Analysis Report of Australian Road Transport Crash Fatalities

   

Added on 2022-12-30
Table of Contents
Data analysis report of the Australian road transport crash fatalities from 2010 to 2018
Introduction
Authorization and purpose
Limitations
Scope
Methodology
Data setup
Exploratory data analysis
One variable analysis
One variable analysis 1
One variable analysis 2
One variable analysis 3
Two-variable analysis
Linear regression
Clustering
Brief explanation of k-means and clustering
Clustering analysis
Analysis
Conclusion
Reflections
References
Data analysis report of the Australian road transport crash fatalities
from 2010 to 2018
Introduction
Authorization and purpose
This data analysis report was assembled to find trends in Australian road
transport crash fatalities from 2010 to 2018. No specific goals were set in
advance; instead, any finding that appears interesting or significant is
critically analyzed to assist other researchers, business representatives
and government agencies.
Limitations
Only Australian road transport crash fatalities from 2010 to 2018 are
analyzed. The dataset from
(https://data.gov.au/data/dataset/australian-road-deaths-database) is the
only data that is processed and analyzed.
Scope
Data from (https://data.gov.au/data/dataset/australian-road-deaths-database)
will be set up and pre-processed. Three one-variable analyses will be
performed, followed by two two-variable analyses. A graph will be provided
with each analysis. Clustering and k-means will be briefly explained, and a
clustering analysis will be performed to group certain states. Linear
regression will be briefly explained, and two linear regression analyses
will be executed, with both models plotted.
Methodology
Research for the report was gathered from a CSV data spreadsheet from
(https://data.gov.au/data/dataset/australian-road-deaths-database). All
references were gathered from journal articles in the USC library database.
Data setup
Before the data is loaded into R, the raw “road transport 1.xlsx” file should
be pre-processed. Many fields are left unrecorded, and pre-processing the
file first makes it easier to cut these rows out in R. The dataset
“road transport 1.xlsx” must then be loaded into the workspace. In R, the
getwd() command returns the location of the workspace.
> getwd()
[1] "C:/Users/JARO/Documents"
Now that the location of the workspace is confirmed, “road transport 1.xlsx”
can be placed in the workspace location. Because the raw data is an Excel
workbook (.xlsx) and not a text file, the read_excel() function from the
readxl package is used to read the file into the variable road_transport_1.
library(readxl)
road_transport_1 <- read_excel("C:/Users/JARO/Downloads/road transport 1.xlsx")
The “cluster” library is loaded to visualize the clusters.
library(cluster)
The data is recorded as annual values shared across all the series. Because
of this, subplots must be generated to separate the different series into
categories. The data is now ready for analysis.
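The row-cutting step mentioned above can be sketched as follows. This is a minimal illustration, not the report's own code: it assumes that unrecorded fields are read in as NA, and the data frame toy and its values are invented stand-ins for road_transport_1 (only the column names `Speed Limit` and Age come from the report).

```r
# Hypothetical stand-in for road_transport_1; the values are invented.
toy <- data.frame(
  `Speed Limit` = c(60, 100, NA, 80, 110),
  Age           = c(23, NA, 41, 67, 35),
  check.names   = FALSE  # keep the space in the "Speed Limit" column name
)

# complete.cases() is TRUE for rows with no missing value in the chosen columns.
keep <- complete.cases(toy[, c("Speed Limit", "Age")])
toy_clean <- toy[keep, ]

nrow(toy)        # rows before filtering
nrow(toy_clean)  # rows after filtering
```

Restricting complete.cases() to the columns an analysis actually uses avoids discarding rows that are only missing unrelated fields.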
Exploratory data analysis
One variable analysis
One variable analysis 1
> boxplot(road_transport_1$`Speed Limit`, ylab="Speed Limit", main="Boxplot of Speed Limit")
Analysis:
The speed limit variable is displayed with a boxplot. This plot summarizes
the distribution of the posted speed limit at the location of each fatal
crash. A boxplot is preferred because it clearly shows the median of the
posted speed limit at the location of the crash, as well as the upper and
lower quartiles and any outlying values. The median speed limit at the
location of a crash is approximately 80 km/h. A substantial portion of the
values appears to be grouped in the upper quartile, as it is much more
compact than the lower quartile.
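The values a boxplot draws can also be read off numerically. The sketch below uses boxplot.stats() from base R on an invented vector, speed, which is a hypothetical stand-in for the `Speed Limit` column. It also shows that the centre line of a boxplot is the median, which need not equal the arithmetic mean.

```r
# Hypothetical speed limits in km/h; invented for illustration only.
speed <- c(50, 60, 60, 80, 80, 100, 100, 100, 110, 110)

# boxplot.stats() returns the numbers boxplot() draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
stats <- boxplot.stats(speed)$stats
names(stats) <- c("lower whisker", "lower hinge", "median",
                  "upper hinge", "upper whisker")
stats

# Points beyond the whiskers are reported separately as outliers.
outliers <- boxplot.stats(speed)$out
outliers

# The mean is computed separately and is not shown by a default boxplot.
mean(speed)
```

On the real data, the same call would be applied to road_transport_1$`Speed Limit` after the rows with unrecorded speed limits have been removed.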
One variable analysis 2
> hist(road_transport_1$Age)