Name: ____________________ ID number: ________________________
MASSEY UNIVERSITY
College of Sciences
School of Fundamental Sciences (Statistics)
Introductory Biostatistics 161.130
Assignment 1: Sampling and Exploratory Data Analysis
Due date: Friday 16th August 2019
Total marks: 45 Assessment value: 12%
The population data
The population we are considering for this assignment is the 10,000
kuku (New Zealand green-lipped mussels) growing in a mussel farm in
the Marlborough Sounds. Variables of interest are the length of the
kuku (in millimetres), grade (small, medium or large) and sex (male or
female).
Each kuku (mussel) has a unique ID. The population consists of:
1948 large kuku with ID numbers from 1 to 1948.
4457 medium kuku with ID numbers from 1949 to 6405.
3595 small kuku with ID numbers from 6406 to 10000.
You do not have access to the population data. You will generate a sample of ID numbers and then
we will give you a file containing information on the length, grade and sex for each of the kuku
identified in your sample.
Part A: Sampling method [9 marks]
You are to use Excel to create a representative sample (size 100) of the population. Your
sample will consist of 100 ID numbers.
Ensure all 100 ID numbers are in a single column.
Click on the following link to open a Shiny app and copy your 100 ID numbers into the
“Sample row identifiers” box.
http://shiny.massey.ac.nz/jcmarsha/student_data/?p=161120&y=2019&a=1
This will generate a file containing ID, length, grade and sex for each of the kuku in your
sample. This is your sample data.
Keep an electronic copy of your sample data for use in Assignments 2 and 3.
Use your sample data to answer the following questions in the answer spaces provided.
(You can re-size the answer spaces.)
1. What type of sampling method have you chosen to use? [1 mark]
Simple random sampling technique
2. What other sampling methods could you have used and why is the method that you chose
better? [3 marks]
Stratified random sampling
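As an illustration of how a stratified sample could be sized, the following Python sketch applies proportional allocation to the three grade strata listed in the population description. The variable names are hypothetical, and proportional allocation is assumed here as one common choice; the assignment does not prescribe it.

```python
# Stratum sizes taken from the population description (large, medium, small kuku).
strata = {"large": 1948, "medium": 4457, "small": 3595}
population_size = sum(strata.values())  # 10,000 kuku in total
sample_size = 100

# Proportional allocation (an assumption for this sketch): each stratum's
# share of the sample matches its share of the population.
allocation = {
    grade: round(sample_size * count / population_size)
    for grade, count in strata.items()
}

print(allocation)
```

With these stratum sizes the rounded allocations happen to add up to 100. In general, rounding can leave the total one or two off the target sample size, so it is worth checking the sum.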
3. Describe how you used Excel to create your sample. [4 marks]
To generate random ID numbers in Excel, use the built-in function RANDBETWEEN:
i. Select a cell and type =RANDBETWEEN(1,10000).
ii. Use the autofill handle (+) to fill the formula down until 100 sample ID numbers are generated.
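For comparison, the same sampling step can be sketched in Python. RANDBETWEEN draws each value independently, so the same ID can appear more than once in the column. The sketch below instead samples without replacement, which guarantees 100 distinct IDs. The function name `draw_sample_ids` and the seed value are hypothetical choices for illustration, not part of the assignment.

```python
import random

POPULATION_SIZE = 10_000  # kuku IDs run from 1 to 10000
SAMPLE_SIZE = 100


def draw_sample_ids(seed=None):
    """Draw a simple random sample of distinct IDs (without replacement)."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    ids = rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE)
    return sorted(ids)


sample_ids = draw_sample_ids(seed=2019)

# Print one ID per line so the output can be pasted as a single column.
for kuku_id in sample_ids:
    print(kuku_id)
```

Fixing the seed is optional. It is used here only so that the same 100 IDs can be regenerated later.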
4. Attach a copy of your sample to the Appendix. [1 mark]
Part B: Exploring your sample data [36 marks]
Use Excel and incorporate the output into your report.
1. Descriptive analysis of kuku grades according to sex.
a. Produce a contingency table of kuku grades according to sex. [2 marks]
Sex      Large  Medium  Small  Total
Female   6      22      22     50
Male     9      24      17     50
Total    15     46      39     100
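The row and column totals of the contingency table can be checked with a short Python sketch. The cell counts are copied from the table above. The variable names are hypothetical.

```python
# Cell counts copied from the contingency table (sex by grade).
counts = {
    "Female": {"Large": 6, "Medium": 22, "Small": 22},
    "Male": {"Large": 9, "Medium": 24, "Small": 17},
}
grades = ["Large", "Medium", "Small"]

# Row totals: number of kuku of each sex in the sample.
row_totals = {sex: sum(by_grade.values()) for sex, by_grade in counts.items()}

# Column totals: number of kuku of each grade in the sample.
col_totals = {grade: sum(counts[sex][grade] for sex in counts) for grade in grades}

# The grand total should equal the sample size.
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```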
b. Draw a suitable graph of the distribution of kuku grades according to sex. [2 marks]
