
Statistics and Data Modelling Assignment - Desklib

   

Added on  2023-06-06

STATISTICS AND DATA MODELLING ASSIGNMENT
Name of Student
Name of University
Author Note

Table of Contents
Section 1: Introduction
Section 2: Analysis of single variable in Dataset 1
a)
b)
Section 3: Analysis of two variables in Dataset 1
a)
b)
c)
Section 4: Collect and Analyse Dataset 2
Section 5: Discussion and Conclusion
References

Section 1: Introduction
This paper studies the public transport system in New South Wales (NSW), Australia. Data was
obtained from the NSW government's open data site for transport, and a sample of it was used
to examine where the data suggests the government could expand and improve its services.
The Opal tap on and tap off dataset was used for this enquiry. The Opal card is an
all-purpose transport card that anyone who possesses it can use to travel by ferry, light
rail, bus and train. It also provides a way to track and record the travel patterns of
passengers, so that further developments can address the perceived issues and needs
(Culnane, Rubinstein and Teague 2017).
Ortega-Tong (2013) conducted a study in London using data from the Oyster card, a
smart card similar to the Opal card. The study used the data to classify passengers on the
basis of frequency of travel and type of traveller, that is, whether they were workers,
students or visitors travelling for business or leisure. The method used was cluster
analysis, based on characteristics relating to spatial variability, socio-demographic
condition, activity patterns and choice of mode. The resulting clusters were found to
represent and classify passenger behaviour. Four clusters were found: visitors travelling for
leisure, visitors travelling for business, registered users who use the mode regularly, and
those who use it occasionally rather than regularly.
Hence data from smart card transactions has proved useful for understanding
passenger behaviour and travel patterns. This study focuses on the mode of transport
and the frequency of tapping on and off in the state of NSW in Australia.
Dataset 1 is a sample of data obtained from the Opal Tap On and Tap Off Location -
8th to 14th August 2016 dataset, available via Transport for NSW Open Data. The dataset can be
accessed via the link https://opendata.transport.nsw.gov.au/dataset/opal-tap-on-and-tap-off. It is
therefore a secondary dataset (Creswell and Creswell 2017). The sample is of size 1000. The variable
mode gives the mode of transport, with four categories: bus, train, ferry and light rail. The data also
includes the date of each transaction, as day, month and year. The variable tap records the on or off
status. The location at which the tap took place is also included. These are all categorical data, except
the date variable, which is interval. The variable count is also of interval type, giving the total number
of taps on or off at a given location on a given date.
The second dataset was obtained by survey. The data was collected from travellers across NSW
using simple random sampling and hence is primary in nature. Simple random sampling is an
unbiased sampling technique which gives every member of a population an equal chance of inclusion
in the sample. It is a popular probability sampling technique, valued for being simple and robust. It
can, however, fail to capture the features of the population fully if the groups within the population
are of unequal size (Creswell and Creswell 2017). For example, if the number of students in the
population considered is lower than the number of workers, then the sample could fail to gather
enough information about the students. Nonetheless, it works fairly well if proper care is taken with
regard to such complexities. The variables on which data was collected are gender, mode of
transportation and the anticipated cost of public transport per month for the individual.
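The sampling idea described above can be illustrated with a minimal Python sketch. It is not the procedure used to collect Dataset 2. The population of 900 workers and 100 students, the sample size of 50 and the helper name simple_random_sample are all hypothetical, chosen only to show how a smaller group can end up with few observations.

```python
import random
from collections import Counter


def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement.

    Every member of the population has the same chance of being included.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population (illustrative numbers only, not survey data):
# students are far fewer than workers.
population = ["worker"] * 900 + ["student"] * 100

sample = simple_random_sample(population, 50, seed=1)
counts = Counter(sample)

# With students making up 10% of the population, a sample of 50 is expected
# to contain about 5 of them, which may be too few to study them separately.
print(counts)
```

If the smaller group matters for the research question, a larger sample or a stratified design is the usual remedy. Neither is claimed here to have been used in this study.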
Section 2: Analysis of single variable in Dataset 1
a)
The first research question of interest concerns the mode of transport used by
passengers in the period 8th August 2016 to 14th August 2016. The following table,
labelled Table 1, gives the numerical summary of the passengers in each mode of transport
within the given time frame.
Count of Column Labels
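A per-mode numerical summary of this kind could be produced along the following lines. This is a sketch under assumptions: the records below are hypothetical placeholders shaped like the variables described for Dataset 1 (mode, date, tap, location, count), not values from the actual sample, and summarise_by_mode is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical records (mode, date, tap, location, count); placeholder values only.
records = [
    ("bus", "2016-08-08", "on", "Stop A", 120),
    ("train", "2016-08-08", "off", "Station B", 450),
    ("ferry", "2016-08-09", "on", "Wharf C", 60),
    ("bus", "2016-08-09", "off", "Stop D", 80),
    ("light rail", "2016-08-10", "on", "Stop E", 30),
]


def summarise_by_mode(rows):
    """Return {mode: (number of records, total of the count variable)}."""
    n_rows = defaultdict(int)
    total = defaultdict(int)
    for mode, _date, _tap, _location, count in rows:
        n_rows[mode] += 1
        total[mode] += count
    return {mode: (n_rows[mode], total[mode]) for mode in n_rows}


summary = summarise_by_mode(records)
for mode, (n, total) in sorted(summary.items()):
    print(f"{mode:<10} records={n} total_taps={total}")
```

Reporting both the number of records and the summed count keeps the two possible readings of "count per mode" apart, since each record already aggregates many taps.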
