Deakin University HSH746 Biostatistics Assignment 1: Data Description

Added on  2022/09/18

Homework Assignment
AI Summary
This assignment provides a comprehensive analysis of data extracted from the 2018 National Health Interview Survey (NHIS). The assignment involves interpreting and summarizing statistical data using Stata. It addresses various aspects of the dataset, including interview quarters, regional distributions, age demographics, and healthcare access. The student uses Stata commands to analyze variables such as U5MR over different years, places where individuals seek medical help, and the relationship between age and dental visits. The assignment also explores the recoding of variables and assesses the percentage of children who received flu shots. The analysis includes descriptive statistics, frequency tables, and cross-tabulations to provide insights into the health characteristics of the surveyed population. The document concludes with a list of cited references.
Running head: Biostatistics
Biostatistics
Name of Student
Name of the University
Table of Contents
Q1
Q2
Q3
Q4
Q5
Q6
Q7
Q8
Q9
References
Q1.
(d)
The target population is Norwegian drivers in the six counties of Finnmark, Oslo, Akershus,
Buskerud, Hedmark and Oppland.
Q2.
No, the study is not representative of the target population.
Because the rural areas of Norway are sparsely populated, completely random sampling, which
is a prerequisite for good research, could not be carried out.
A three-stage cluster procedure was used in both surveys, which included people from other areas.
Q3.
. tabstat year_1990 year_2000 year_2010 year_2018, statistics( mean sd iqr p50 ) columns(variables)
   Years |      1990      2000      2010      2018
---------+----------------------------------------
    mean |     48.89     38.69     27.28     21.31
      sd |     40.08     33.88     24.63     19.36
     iqr |     62.13     53.11     35.86     26.63
     p50 |     36.29     25.73     17.35     14.04
Summary statistics of U5MR for the years 1990, 2000, 2010 and 2018 are shown above: the
mean, standard deviation, interquartile range (IQR) and median (p50) for each year. The
mean U5MR decreased steadily over the years, from 48.89 in 1990 to 21.31 in 2018, and the
median fell from 36.29 to 14.04. The standard deviation, which measures the variability of
the data, and the IQR, which measures the spread of the middle 50% of the data, also
decreased from 1990 to 2018.
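The size of the decline can be made concrete with a short calculation. The sketch below is an independent arithmetic check in Python, not part of the Stata session; it uses only the mean and median values copied from the tabstat output, and the helper names percent_decline and strictly_decreasing are hypothetical.

```python
# Values copied from the tabstat output above (U5MR by year).
mean_u5mr = {1990: 48.89, 2000: 38.69, 2010: 27.28, 2018: 21.31}
median_u5mr = {1990: 36.29, 2000: 25.73, 2010: 17.35, 2018: 14.04}


def percent_decline(series, start, end):
    """Relative decline from start to end, as a percentage of the start value."""
    return (series[start] - series[end]) / series[start] * 100


def strictly_decreasing(series):
    """True if the value falls at every successive year in the series."""
    values = [series[year] for year in sorted(series)]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


mean_drop = percent_decline(mean_u5mr, 1990, 2018)
median_drop = percent_decline(median_u5mr, 1990, 2018)

print(f"Mean U5MR fell by {mean_drop:.1f}% between 1990 and 2018")
print(f"Median U5MR fell by {median_drop:.1f}% between 1990 and 2018")
print("Mean falls in every period:", strictly_decreasing(mean_u5mr))
print("Median falls in every period:", strictly_decreasing(median_u5mr))
```

In every year the mean is larger than the median, which is what one would expect if a small number of high values pull the mean upward.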
Q4.
by cusualpl, sort : tabulate intv_qrt age_p, summarize(cusualpl) nomeans nostandard
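-> cusualpl = 3
Frequencies of cusualpl
           | age_p
  intv_qrt |    0     2     3     4     5 | Total
-----------+------------------------------+------
         1 |    1     1     1     1     3 |    30
         2 |    0     2     0     3     1 |    27
         3 |    1     2     2     1     1 |    27
         4 |    0     0     0     1     0 |    11
-----------+------------------------------+------
     Total |    2     5     3     6     5 |    95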
The variable of interest here is cusualpl, which is coded as follows.
cusualpl Place USUALLY go when sick
1 Yes
2 There is NO place
3 There is MORE THAN ONE place
7 Refused
8 Not ascertained
9 Don't know
The question asks how many children aged 0 to 5 have at least one place to go when they
are sick in interview quarter 3 and interview quarter 4.
From the above analysis, a total of 27 children aged 0 to 5 had at least one place to go
when sick in interview quarter 3, and a further 11 children had at least one place to go
when sick in interview quarter 4.
Q5.
[Figure: bar chart. y axis: p50 of age_p (scale 0 to 15); x axis: cplkind categories 1 to 9.]
Bar chart with cplkind on the x axis and the 50th percentile (median) of age_p for each
category on the y axis.
cplkind Type of place to go when sick (most often)
1 Clinic or health center
2 Doctor's office or HMO
3 Hospital emergency room
4 Hospital outpatient department
5 Some other place
6 Doesn't go to one place most often
7 Refused
8 Not ascertained
9 Don't know
Q6.
. tabulate age_p cdnlongr, summarize(cdnlongr) nomeans nostandard
Frequencies of cdnlongr
| cdnlongr
age_p | 0 1 2 3 4 | Total
-----------+-------------------------------------------------------+----------
1 | 333 85 9 1 0 | 432
2 | 218 155 34 8 2 | 421
3 | 133 221 56 15 2 | 430
4 | 47 268 66 24 2 | 413
5 | 18 292 57 29 1 | 404
6 | 11 316 76 12 5 | 424
7 | 10 300 59 13 4 | 391
8 | 6 322 64 18 9 | 421
9 | 4 346 58 20 6 | 436
10 | 3 376 84 34 7 | 508
11 | 3 318 72 24 4 | 428
12 | 4 354 72 20 10 | 466
13 | 3 362 84 16 6 | 480
14 | 4 347 70 45 13 | 483
15 | 2 385 82 34 12 | 526
16 | 2 410 104 40 15 | 582
17 | 1 364 104 52 17 | 547
-----------+-------------------------------------------------------+----------
Total | 802 5221 1151 405 115 | 7792
The information required is the number of children aged 17 or under who saw a dentist at
least once in the last year.
The variable cdnlongr records how long it has been since each child last saw a dentist.
cdnlongr How long since last saw a dentist? Include all types of dentists
0 Never
1 6 months or less
2 More than 6 months, but not more than 1 year ago
3 More than 1 year, but not more than 2 years ago
4 More than 2 years, but not more than 5 years ago
5 More than 5 years ago
7 Refused
8 Not ascertained
9 Don't know
The categories of interest are cdnlongr = 1 and cdnlongr = 2, which together cover a dental
visit within the last year.
Thus the total number of children aged 1 to 17 who had a dental visit at least once in the
last year is:
(5221 + 1151) = 6372 children.
D. 6,372 children aged 1-17 years had at least one dental visit during the year.
Q7.
tabulate doctor_howlong
RECODE of cdnlongr | Freq. Percent Cum.
---------------------------------------+-----------------------------------
6 months or less to less than 1 year | 802 10.29 10.29
more than 1 year to less than 2 years | 6,372 81.78 92.07
2 years to more than 5 years | 405 5.20 97.27
The variable cdnlongr is recoded into doctor_howlong, which is defined as follows:
doctor_howlong =
0 if a dentist has never been seen or talked to
1 if a dentist was last seen or talked to within the last year (6 months or less, up to 1 year)
2 if a dentist was last seen or talked to more than 1 year but not more than 2 years ago
3 if a dentist was last seen or talked to more than 2 years ago (up to more than 5 years)
The value labels were added and the frequency counts of the new variable were tabulated.
Matching the frequencies to the Q6 column totals shows that the row labels printed in the
output are shifted by one category: 802 children (10.29%) had never been to a dentist
(code 0), 6,372 (81.78%) had been to a dentist within the last year (code 1), and 405
(5.20%) had last been to a dentist more than 1 year but not more than 2 years ago (code 2).
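The recode can be checked by hand from the Q6 column totals. The sketch below is an independent check in Python rather than the Stata recode itself. The mapping (0 to 0, 1 and 2 to 1, 3 to 2, 4 and 5 to 3) is an assumption based on the definitions given for doctor_howlong, and the Q6 table shows no column for code 5, so the count for category 3 is incomplete.

```python
# Column totals of cdnlongr copied from the Q6 table (codes 0 to 4 only).
# The count for code 5 is not shown there, so category 3 below is incomplete.
cdnlongr_counts = {0: 802, 1: 5221, 2: 1151, 3: 405, 4: 115}
total_children = 7792  # overall total reported in the Q6 table

# Assumed recode rule, read from the definitions of doctor_howlong:
# 0 -> 0, 1 and 2 -> 1, 3 -> 2, 4 and 5 -> 3.
recode_map = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

# Add up the original counts under their new category codes.
doctor_howlong_counts = {}
for code, count in cdnlongr_counts.items():
    new_code = recode_map[code]
    doctor_howlong_counts[new_code] = doctor_howlong_counts.get(new_code, 0) + count

# Express each recoded category as a percentage of all children.
doctor_howlong_pct = {
    code: count / total_children * 100
    for code, count in doctor_howlong_counts.items()
}

for code in sorted(doctor_howlong_counts):
    print(code, doctor_howlong_counts[code], f"{doctor_howlong_pct[code]:.2f}%")
```

Under this mapping, categories 0, 1 and 2 reproduce the frequencies 802, 6,372 and 405 and the percentages 10.29, 81.78 and 5.20 reported in the tabulation.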
Q8.
The data set was checked for inconsistent or missing values. There are no missing values,
but in the age variable there are two cases with ages of 19 and 22, which is inconsistent
with the information provided. The two cases were therefore removed using the drop
command in Stata.
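The cleaning step can be illustrated on a toy example. The sketch below uses Python, not the Stata commands actually run, and the five records are hypothetical because the real data set is not shown. The upper age limit of 17 is an assumption taken from the age ranges used elsewhere in the assignment.

```python
# Hypothetical records for illustration only; the real data set is not shown.
records = [
    {"id": 1, "age_p": 4},
    {"id": 2, "age_p": 19},  # outside the expected age range
    {"id": 3, "age_p": 17},
    {"id": 4, "age_p": 22},  # outside the expected age range
    {"id": 5, "age_p": 11},
]

MAX_CHILD_AGE = 17  # assumption: the sample covers children aged 17 or under

# Step 1: look for missing ages (none are expected).
missing = [r for r in records if r["age_p"] is None]

# Step 2: flag ages that are inconsistent with a child sample.
out_of_range = [r for r in records
                if r["age_p"] is not None and r["age_p"] > MAX_CHILD_AGE]

# Step 3: keep only the consistent cases.
cleaned = [r for r in records if r not in missing and r not in out_of_range]

print("missing:", len(missing))
print("dropped ages:", sorted(r["age_p"] for r in out_of_range))
print("remaining cases:", len(cleaned))
```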
Q9.
The variable agegroup_1to5 is created which recodes the continuous age variable into a new
categorical variable:
= 1 if child’s age is 1-5 years old
= 0 if child’s age is 6-17 years old.
. tabulate agegroup_1to5
RECODE of |
age | Freq. Percent Cum.
------------+-----------------------------------
0 | 384 77.11 77.11
1 | 114 22.89 100.00
------------+-----------------------------------
Total | 498 100.00
Thus, in the data set, 77.11 percent of children fall in the older category (6-17 years)
and 22.89 percent fall in the younger category (1-5 years).
. tabulate flu12 agegroup_1to5, summarize(agegroup_1to5) nomeans nostandard
Frequencies of RECODE of age
| RECODE of age
flu12 | 0 1 | Total
-----------+----------------------+----------
1 | 173 73 | 246
2 | 204 38 | 242
8 | 1 1 | 2
9 | 6 2 | 8
-----------+----------------------+----------
Total | 384 114 | 498
The percentage of younger children that had the flu shot: 73/114 = 64.04%
The percentage of older children that had the flu shot: 173/384 = 45.05%
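The two percentages are column percentages of the cross-tabulation: the number of children with flu12 = 1 in each age group divided by that group's total. The sketch below recomputes them in Python as an arithmetic check. It assumes that flu12 = 1 means the child received a flu shot, as the text implies, and the dictionary names are hypothetical.

```python
# Counts copied from the flu12 by agegroup_1to5 cross-tabulation above.
# Assumption: flu12 == 1 means the child received a flu shot.
had_flu_shot = {"younger (1-5)": 73, "older (6-17)": 173}
group_total = {"younger (1-5)": 114, "older (6-17)": 384}

# Column percentage: children with a flu shot divided by all children in the group.
flu_shot_pct = {
    group: had_flu_shot[group] / group_total[group] * 100
    for group in group_total
}

for group, pct in flu_shot_pct.items():
    print(f"{group}: {pct:.2f}% had the flu shot")
```

The group totals used as denominators include the few children coded 8 or 9 on flu12; excluding them would raise both percentages slightly.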