BUS105: Data Management Project: Exploring Popular Names Analysis
Added on 2022/12/27
AI Summary
This report, prepared for BUS105, explores data management and analysis of popular names in South Australia from 1944 to 2015. The report begins by outlining the data lifecycle stages, including generation, collection, storage, visualization, analysis, and action, and then discusses the use of big data software like Hadoop and NoSQL for storing large datasets. The analysis includes visualizations of name popularity in specific years (1945 and 2005) for both genders, using column charts to identify top names like Margaret, Jack, and Charlotte. Furthermore, the report compares the popularity of selected names (Jack, Noah, Charlie, Lucas, Charlotte, Amelia, Olivia, and Ava) over time, presenting data on amounts and ranks. The report also includes charts illustrating the popularity trends of these names over the years. Finally, the report concludes with recommendations for a gift shop owner, suggesting the purchase of trending names like Oliver and Charlotte, and emphasizes the importance of considering external data sources like movies, TV series, and popular games to predict future trends. The project uses data from data.sa.gov.au and other sources to support the analysis and recommendations.

Running head: ASSESSMENT 3
ASSESSMENT 3
Name of the Student
Name of the University
Author Note

Task 1:
The data life cycle consists of six sequential stages: Generation, Collection, Storage,
Visualization, Analysis and Action. Data is generated continuously by real-life events and is
gathered in the collection stage through surveys or from secondary sources such as medical
records and the sales data of different companies. In the storage stage the gathered data is
usually kept on computer hard drives or on cloud-based servers; a small amount of data can
sometimes be held in human memory as well (Li et al. 2015). The easiest way to understand
patterns in the data is then to visualize it. After that, the data is analysed with suitable
statistical tests to draw inferences, and the conclusions of those tests guide the actions the
business takes. In this case study the business problem is to identify popular names in South
Australia so that the boss can buy stock bearing those names and the company can sell all of
that stock.
Task 2:
If the data is obtained for many of the states in Australia over a period of more than 30
years, the resulting dataset will be very large and can be considered big data. Storing it
will therefore require more advanced software whose capacity can scale in proportion to the
growth of the data (Data.sa.gov.au. 2019). Popular software for storing big data and
displaying it in real time includes Hadoop and NoSQL databases such as Cassandra. These
systems are directly connected with direct-attached storage servers and hence provide a
solution for storing many years of data for multiple states (Sookhak et al. 2015).
Task 3:
A sample of 50 rows of female names from the year 1945 is taken and shown in the column chart
below.

Figure: Popularity of Names (column chart of the 1945 female sample; series: Amount and Position).
Names shown: MARGARET, JUDITH, JENNIFER, PATRICIA, LORRAINE, BARBARA, SANDRA, ROSEMARY, JANET,
DIANNE, MARY, MARILYN, RAELENE, ANNE, MARLENE, MAUREEN, VALERIE, MARIE, KAYE, ANN, GLENYS,
JOSEPHINE, JUNE, DENISE, SUZANNE.
The chart shows that the most popular female name in the chosen sample of 1945 is MARGARET
(Verginelli, Yao and Suuberg 2016).
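The sampling and charting step used in this task can be sketched in Python. This is only a minimal illustration under stated assumptions: the 50-row samples themselves are not reproduced in this report, so the rows below are stand-ins taken from the 2005 amounts listed in the Task 4 tables, and the helper names `sample_rows` and `text_column_chart` are hypothetical.

```python
import random

# Stand-in rows (name, amount): the 2005 amounts of the eight names tabulated
# in Task 4 of this report. The real task samples 50 rows from one sheet of
# the dataset; boys' and girls' names are mixed here only to have enough rows.
ROWS_2005 = [
    ("JACK", 198), ("NOAH", 78), ("CHARLIE", 46), ("LUCAS", 43),
    ("CHARLOTTE", 116), ("AMELIA", 81), ("OLIVIA", 111), ("AVA", 38),
]


def sample_rows(rows, k, seed=0):
    """Draw k rows without replacement and sort them by amount, largest first."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    picked = rng.sample(rows, min(k, len(rows)))
    return sorted(picked, key=lambda row: row[1], reverse=True)


def text_column_chart(rows, width=40):
    """Render one bar per name, scaled so the largest amount spans `width`."""
    largest = max(amount for _, amount in rows)
    lines = []
    for name, amount in rows:
        bar = "#" * round(width * amount / largest) if largest else ""
        lines.append(f"{name:<10} {bar} {amount}")
    return "\n".join(lines)


sample = sample_rows(ROWS_2005, k=5)
most_popular = sample[0][0]  # first row after sorting has the largest amount
print(text_column_chart(sample))
print("Most popular name in the sample:", most_popular)
```

Because the sample is sorted by amount before charting, the most popular name in the sample is simply the first bar, which mirrors how the top name is read off the column chart.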
Next, 50 rows from 2005F are taken and shown in a column chart.
Figure: Popularity of sample names of 2005F (column chart; series: Amount and Position).
Names shown: EMILY, ELLA, OLIVIA, ISABELLA, MIA, JASMINE, CAITLIN, HANNAH, SIENNA, RUBY, JADE,
HAYLEY, ZOE, AMBER, TAHLIA, TAYLA, LAUREN, BRIANNA, KIARA, JORJA, BETHANY, KATE, ERIN,
MADDISON, ABBEY.
Similarly, 50 rows from 1945M are chosen and the column chart is displayed below.
Figure: Popularity of names in sample of 1945M (column chart; series: Amount and Position).
Names shown: JOHN, PETER, TREVOR, BRIAN, BARRY, KEVIN, RICHARD, RONALD, GEOFFREY, COLIN,
DENNIS, RODNEY, DONALD, JEFFREY, WAYNE, TERENCE, DESMOND, DOUGLAS, NEVILLE, PHILLIP, BRENTON,
GARY, LESLIE, KEITH, ANDREW.
Similarly, 50 rows from 2005M are chosen and their popularity is displayed in the column chart.
Figure: Popularity of names in sample of 2005M (column chart; series: Amount and Position).
Names shown: JACK, THOMAS, ETHAN, RYAN, LIAM, SAMUEL, JACOB, BENJAMIN, JAYDEN, DYLAN, JORDAN,
MATTHEW, NOAH, HARRISON, OLIVER, MITCHELL, CAMERON, MAX, HENRY, HARRY, OSCAR, SEBASTIAN,
ANGUS, CHARLIE, MICHAEL.
Task 4:
1. The four boys' names selected are JACK, NOAH, CHARLIE and LUCAS, and the four girls' names
selected are CHARLOTTE, AMELIA, OLIVIA and AVA.
The rank of each of these names and the number of times it was used (the amount) are listed
in the following tables.
Name       Year  Amount  Rank
NOAH       1945       0  NA
NOAH       1955       0  NA
NOAH       1965       0  NA
NOAH       1975       1  460
NOAH       1985       1  449
NOAH       1995       5  235
NOAH       2005      78  24
NOAH       2015     123  4

Name       Year  Amount  Rank
CHARLIE    1945       0  NA
CHARLIE    1955       3  215
CHARLIE    1965       0  NA
CHARLIE    1975       0  NA
CHARLIE    1985       0  NA
CHARLIE    1995       6  206
CHARLIE    2005      46  47
CHARLIE    2015     103  5

Name       Year  Amount  Rank
JACK       1945       2  208
JACK       1955       3  215
JACK       1965       3  312
JACK       1975       4  230
JACK       1985      26  71
JACK       1995     178  9
JACK       2005     198  1
JACK       2015     125  3

Name       Year  Amount  Rank
LUCAS      1945       0  NA
LUCAS      1955       0  NA
LUCAS      1965       2  392
LUCAS      1975       9  151
LUCAS      1985      11  133
LUCAS      1995      25  83
LUCAS      2005      43  51
LUCAS      2015     102  6

Name       Year  Amount  Rank
CHARLOTTE  1945       0  NA
CHARLOTTE  1955       1  502
CHARLOTTE  1965       2  471
CHARLOTTE  1975       3  352
CHARLOTTE  1985       4  315
CHARLOTTE  1995      35  54
CHARLOTTE  2005     116  4
CHARLOTTE  2015     124  1

Name       Year  Amount  Rank
AMELIA     1945       2  246
AMELIA     1955       1  502
AMELIA     1965       2  471
AMELIA     1975       7  213
AMELIA     1985      25  81
AMELIA     1995      33  59
AMELIA     2005      81  12
AMELIA     2015     111  2

Name       Year  Amount  Rank
OLIVIA     1945       0  NA
OLIVIA     1955       2  347
OLIVIA     1965       4  323
OLIVIA     1975      16  119
OLIVIA     1985      17  114
OLIVIA     1995      72  18
OLIVIA     2005     111  5
OLIVIA     2015      97  3

Name       Year  Amount  Rank
AVA        1945       0  NA
AVA        1955       1  502
AVA        1965       0  NA
AVA        1975       0  NA
AVA        1985       0  NA
AVA        1995       1  712
AVA        2005      38  36
AVA        2015      96  4
Charts:
Figure: Popularity of CHARLIE over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of JACK over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of NOAH over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of LUCAS over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of CHARLOTTE over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of AMELIA over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of OLIVIA over years (x-axis: year; series: Amount and Rank).
Figure: Popularity of AVA over years (x-axis: year; series: Amount and Rank).
A further analysis could calculate the average amount of each name across all the years to
find which name is the most popular on average. Another could be an ANOVA test on the amounts
of the four boys' names and the four girls' names over the 30 years from 1985 to 2015, to see
whether the average amount of the four boys' names is significantly different from the
average amount of the four girls' names.
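Both suggested analyses can be sketched with the amounts already listed in the Task 4 tables. The sketch below is an illustration under assumptions rather than the analysis actually carried out: it pools the 1985 to 2015 amounts of the four boys' names into one group and those of the four girls' names into another, and it computes the one-way ANOVA F statistic by hand with the standard library only. The function names are hypothetical, and no p-value is produced; the F statistic would still need to be compared with the F distribution for the reported degrees of freedom.

```python
from statistics import mean

YEARS = [1945, 1955, 1965, 1975, 1985, 1995, 2005, 2015]

# Amounts copied from the Task 4 tables, one value per year in YEARS.
AMOUNTS = {
    "JACK":      [2, 3, 3, 4, 26, 178, 198, 125],
    "NOAH":      [0, 0, 0, 1, 1, 5, 78, 123],
    "CHARLIE":   [0, 3, 0, 0, 0, 6, 46, 103],
    "LUCAS":     [0, 0, 2, 9, 11, 25, 43, 102],
    "CHARLOTTE": [0, 1, 2, 3, 4, 35, 116, 124],
    "AMELIA":    [2, 1, 2, 7, 25, 33, 81, 111],
    "OLIVIA":    [0, 2, 4, 16, 17, 72, 111, 97],
    "AVA":       [0, 1, 0, 0, 0, 1, 38, 96],
}
BOYS = ["JACK", "NOAH", "CHARLIE", "LUCAS"]
GIRLS = ["CHARLOTTE", "AMELIA", "OLIVIA", "AVA"]


def average_amounts(amounts):
    """Mean amount per name across all tabulated years."""
    return {name: mean(values) for name, values in amounts.items()}


def amounts_since(names, first_year):
    """Pool the amounts of the given names from first_year onwards."""
    start = YEARS.index(first_year)
    return [value for name in names for value in AMOUNTS[name][start:]]


def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    group_means = [mean(group) for group in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


averages = average_amounts(AMOUNTS)
top_on_average = max(averages, key=averages.get)
print("Most popular on average:", top_on_average, averages[top_on_average])

f_stat, df_b, df_w = one_way_anova(
    [amounts_since(BOYS, 1985), amounts_since(GIRLS, 1985)]
)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, this ANOVA is equivalent to a two-sample t-test with equal variances, so either test could be used for the boys-versus-girls comparison.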
Task 5:
The 2015 data shows that the most popular male name in the most recent period is OLIVER,
with several other popular names behind it. Similarly, the most popular female name is
CHARLOTTE, followed in descending order by AMELIA and OLIVIA. Hence, the recommendation is
to buy at least 20 T-shirts bearing recently popular male and female names so that all the
shirts bought can be sold. The popularity of OLIVER has been increasing over the years, so
buying OLIVER shirts in future is also recommended; the same is true for CHARLOTTE. The
popularity of WILLIAM decreased slightly from 2005 to 2015, although an increasing trend was
observed before 2005. Hence, it is recommended to buy a moderate number of WILLIAM shirts.
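The trend reasoning behind these recommendations can be sketched in a few lines of Python. The amounts for OLIVER and WILLIAM are not tabulated in this report, so the sketch applies the same 2005-to-2015 comparison to the eight Task 4 names instead. The rule in `stock_advice` is a hypothetical simplification of the recommendation logic, not a method stated in the report.

```python
# (amount in 2005, amount in 2015), copied from the Task 4 tables.
AMOUNT_2005_2015 = {
    "JACK": (198, 125),
    "NOAH": (78, 123),
    "CHARLIE": (46, 103),
    "LUCAS": (43, 102),
    "CHARLOTTE": (116, 124),
    "AMELIA": (81, 111),
    "OLIVIA": (111, 97),
    "AVA": (38, 96),
}


def stock_advice(amount_before, amount_after):
    """Hypothetical rule: rising names are bought, falling names moderately."""
    if amount_after > amount_before:
        return "buy"
    return "buy moderately"


advice = {
    name: stock_advice(before, after)
    for name, (before, after) in AMOUNT_2005_2015.items()
}
for name, decision in advice.items():
    before, after = AMOUNT_2005_2015[name]
    print(f"{name:<10} {before:>4} -> {after:>4}  {decision}")
```

A two-point comparison ignores the longer trend before 2005, which the recommendation for WILLIAM does take into account, so a fuller version would look at more than one decade.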
Task 6:
In addition to the popularity data, data from other sources needs to be collected in order
to determine future patterns correctly. The names that have become popular in recent years
are influenced by the names of characters in movies, TV series and plays. Hence, collecting
the list of upcoming movies or TV series along with the corresponding character names can be
useful for finding a significant correlation. Additionally, player names in popular
sports in Australia, such as basketball, football and tennis, can influence how people name
their children. Thus, collecting the names of popular players can prove useful for predicting
which names will be popular in future (Data.gov.au 2019). The names of actors, actresses and
players can be found in Australia's official data source at https://data.gov.au/ and in other
sources.

References:
Data.gov.au, 2019. Search. [online] Available at: https://data.gov.au/ [Accessed 12 May
2019].
Data.sa.gov.au, 2019. Popular Baby Names - data.sa.gov.au. [online] Available at:
https://data.sa.gov.au/data/dataset/popular-baby-names [Accessed 12 May 2019].
Li, J., Tao, F., Cheng, Y. and Zhao, L., 2015. Big data in product lifecycle management. The
International Journal of Advanced Manufacturing Technology, 81(1-4), pp.667-684.
Sookhak, M., Gani, A., Khan, M.K. and Buyya, R., 2015. Dynamic remote data auditing for
securing big data storage in cloud computing (Doctoral dissertation, Fakulti Sains Komputer
dan Teknologi Maklumat, Universiti Malaya).
Verginelli, I., Yao, Y. and Suuberg, E.M., 2016. An Excel®‐Based Visualization Tool of
Two‐Dimensional Soil Gas Concentration Profiles in Petroleum Vapor Intrusion.
Groundwater Monitoring & Remediation, 36(2), pp.94-100.