STAT701 2019 Assignment One: Bronchitis Trial Data and Analysis
Added on 2023/04/08
Homework Assignment
AI Summary
This assignment solution addresses the analysis of clinical trial data related to a bronchitis treatment, likely for a statistics course (STAT701). The assignment involves importing and manipulating data from CSV and SAS datasets, which contain information on patient demographics and vital signs (weight, height, blood pressure, pulse) from two studies (HI002 and HI004) investigating a new treatment (Haemoph) versus a placebo. The solution demonstrates the use of SAS data step programming to process the data, create summary tables, and generate descriptive statistics (mean, standard deviation, minimum, maximum) grouped by site. The analysis includes the creation of tables for each study and a combined table, presenting results for variables like RANDOM_N, VISIT_I1, PULSE, and protocol. Visualizations, though not fully detailed, are also mentioned, indicating an exploration of data distributions. The student has successfully converted the CSV data into SAS datasets, and the solution shows the results of printing the data to verify its structure and content. The assignment's objective is to analyze the demographic data and vital signs, potentially to compare the two treatment groups based on the collected data.

3/24/20 Results: Summary
localhost:10080/SASStudio/38/sasexec/submissions/f95eef40-6e5a-4804-9a24- 1/
STATS701 2019
Assignment One
Data step programming and producing Tables
<submitted by>
<Prof Name>
<Univ>
<Submission date>
Introduction
A CSV file named "attach-file-1552982775398.csv" was provided. It was converted to a SAS dataset
with the following program:
proc import datafile = '/folders/myfolders/attach-file-1552982775398.csv'
out = work.attach
dbms = CSV
;
run;
The conversion produced the following output:
3/24/2019 Results: WORK.ATTACH
Obs SITE SUBJ_ID RANDOM_N SUBJ_INI VISIT_ID VISIT_I1 PAGE PAGE_RPT WEIGHT HEIGHT SYSTOLIC DIASTOLI PULSE protocol H002 H004 studorder
1 5 81 81 MIT 1 . 5 . 70.2 1.6 156 88 72 4 0 1 2
2 5 46 46 DPO 2 . 10 . . . . . . 2 1 0 1
3 5 79 79 LAD 1 . 5 . 80.8 1.72 142 94 76 4 0 1 2
4 5 53 53 I-M 1 . 5 . 51.4 1.53 134 84 62 2 1 0 1
5 5 43 43 PDJ 4 . 16 . 127 . 150 88 74 2 1 0 1
6 5 50 50 MJB 1 . 5 . 63.4 1.75 120 72 64 4 0 1 2
7 5 49 49 R-L 1 . 5 . 60.8 1.6 128 82 66 4 0 1 2
8 5 46 46 DPO 4 . 16 . 97 . 130 80 96 2 1 0 1
9 5 121 121 K-L 4 . 16 . 83 . 138 66 64 4 0 1 2
10 5 101 101 PCH 4 . 16 . 65 . 118 60 50 4 0 1 2
11 5 50 50 RML 2 . 10 . 108.5 . 136 96 90 2 1 0 1
12 5 100 100 STP 2 . 10 . 115 . 146 86 64 4 0 1 2
13 5 80 80 LVR 1 . 5 . 73.6 1.65 128 88 58 4 0 1 2
14 5 45 45 JWM 1 . 5 . 77 1.75 146 88 88 2 1 0 1
15 5 48 48 FRK 1 . 5 . 56.8 1.54 168 94 70 2 1 0 1
16 5 47 47 KJH 4 . 16 . 77.8 . 134 84 94 2 1 0 1
17 5 51 51 D-M 1 . 5 . 59 1.59 108 84 56 4 0 1 2
18 5 54 54 ARP 2 . 10 . 105.9 . . . 84 4 0 1 2
19 5 51 51 J-R 1 . 5 . 66.4 1.68 126 68 66 2 1 0 1
20 5 50 50 RML 1 . 5 . 108 1.94 138 96 94 2 1 0 1
21 5 51 51 J-R 4 . 16 . 65 . 172 68 62 2 1 0 1
22 5 98 98 MJB 2 . 10 . 81.5 . 144 86 98 4 0 1 2
23 5 84 84 A-M 1 . 5 . 90.6 1.65 110 84 58 4 0 1 2
24 5 121 121 K-L 2 . 10 . 82.6 . 124 64 74 4 0 1 2
….
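Before going further, the import can be checked without printing the whole table. The following is a minimal sketch, assuming only the dataset work.attach created by the PROC IMPORT step above; it is not part of the submitted program.

```sas
/* List the variable names, types and lengths that PROC IMPORT inferred */
proc contents data=work.attach;
run;

/* Print only the first 24 observations instead of the full dataset */
proc print data=work.attach(obs=24);
run;
```

PROC CONTENTS is a quick way to confirm that columns such as WEIGHT and HEIGHT were read as numeric rather than character.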
In addition, four SAS datasets were provided:
attach-file-1552982738479.sas7bdat
attach-file-1552982738490.sas7bdat
attach-file-1552982753093.sas7bdat
attach-file-1552982753098.sas7bdat
To make them easier to use in SAS, the files were renamed as follows:
attach-file-1552982738479.sas7bdat alldemog file1
attach-file-1552982738490.sas7bdat alltreats file2
attach-file-1552982753093.sas7bdat allspiro file3
attach-file-1552982753098.sas7bdat allsmokhist file4
The renamed files are now in the SAS workspace.
To check the contents of these datasets, we ran the following program:
/* Copy the supplied dataset into WORK.NEW so it can be inspected */
data new;
    set '/folders/myfolders/file1.sas7bdat';
run;
/* Print the copy (WORK.NEW) */
proc print data=new;
run;
From the above we have the following results:
3/24/2019 Results: WORK.NEW
Obs SITE SUBJ_ID RANDOM_N SUBJ_INI VISIT_ID VISIT_I1 PAGE PAGE_RPT GENDER INFCON_D INFCON_M INFCON_Y BIRTH_DD BIRTH_MM BIRTH_YY protocol H002 H004 studorder
1 001 2101 025 ATT 1 . 1 . M 20 03 2006 27 01 1932 2 1 0 1
2 001 026 026 G-O 1 . 1 . M 28 03 2006 23 12 1943 2 1 0 1
3 001 027 027 BEM 1 . 1 . F 12 04 2006 08 05 1938 2 1 0 1
4 001 028 028 WOR 1 . 1 . M 24 04 2006 30 04 1936 2 1 0 1
5 001 029 029 JFH 1 . 1 . M 26 04 2006 09 10 1928 2 1 0 1
6 001 030 030 JPB 1 . 1 . M 01 05 2006 10 06 1939 2 1 0 1
7 003 001 001 PJB 1 . 1 . F 07 03 2006 27 06 1925 2 1 0 1
8 003 002 002 WRE 1 . 1 . M 07 03 2006 06 05 1928 2 1 0 1
9 003 2003 003 L-B 1 . 1 . M 09 03 2006 19 12 1925 2 1 0 1
10 003 2004 004 FJG 1 . 1 . M 09 03 2006 31 01 1918 2 1 0 1
11 003 2005 005 DLR 1 . 1 . M 13 03 2006 08 09 1923 2 1 0 1
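An alternative to quoting the physical file path is to assign a library reference to the folder. The following is a minimal sketch, assuming the renamed files sit in /folders/myfolders as above; the libref name mydata is hypothetical and does not appear in the original program.

```sas
/* "mydata" is a hypothetical libref pointing at the folder used above */
libname mydata '/folders/myfolders';

/* file1.sas7bdat in that folder can now be addressed as mydata.file1 */
proc print data=mydata.file1(obs=11);
run;
```

With a libref in place, file2, file3 and file4 can be read the same way (mydata.file2 and so on) without a separate copying data step.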
For the next dataset, the code was written as shown in the screen below:
We get the following results:
These datasets come from a trial of a new treatment for bronchitis. The data files contain a small
subset of the variables measured in the clinical trials, including SITE, SUBJ_ID,
RANDOM_N, SUBJ_INI, VISIT_ID, VISIT_I1, WEIGHT, HEIGHT, SYSTOLIC,
DIASTOLI and PULSE. We are told that there are two treatments, Haemoph and placebo, and that the
datasets for these two treatments have been combined.
Result output:
The demography table (File1 in our case, an assumption made for simplicity) contains
information for the treatment groups. The output is presented in the required format:
SITE  N Obs  Variable         Mean     Std Dev      Minimum      Maximum    N
1       147  WEIGHT     81.6851064  18.4494248   49.0000000  131.6000000  141
             HEIGHT      1.6657143   0.0935637    1.4400000    1.8900000   49
             SYSTOLIC  126.7021277  19.6322420   86.0000000  220.0000000  141
             DIASTOLI   73.7375887   9.2964243   54.0000000  100.0000000  141
             PULSE      77.2056738   8.7729436   56.0000000  100.0000000  141
3       161  WEIGHT     73.2484076  14.2946987   41.2000000  115.8000000  157
             HEIGHT      1.6620000   0.0977033    1.4400000    1.8900000   55
             SYSTOLIC  129.7006369  18.6895916   87.0000000  190.0000000  157
             DIASTOLI   78.1019108  11.6592566   48.0000000  101.0000000  157
             PULSE      78.6687898  12.3503288   55.0000000  115.0000000  157
4         9  WEIGHT     86.4666667  10.6306162   73.0000000   99.0000000    9
             HEIGHT      1.7200000   0.0854400    1.6400000    1.8100000    3
             SYSTOLIC  130.1111111   8.3732378  115.0000000  140.0000000    9
             DIASTOLI   75.1111111   6.3333333   70.0000000   86.0000000    9
             PULSE      87.6666667  16.9336942   52.0000000  115.0000000    9
5        99  WEIGHT     77.1077778  20.4299913   44.0000000  127.0000000   90
             HEIGHT      1.6875758   0.0923658    1.4900000    1.9400000   33
             SYSTOLIC  132.2619048  16.5275375   94.0000000  172.0000000   84
             DIASTOLI   78.5952381  10.9358690   48.0000000   96.0000000   84
             PULSE      72.8089888  12.7830467   50.0000000  102.0000000   89
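A table with this layout (one block per site, with N Obs, Mean, Std Dev, Minimum, Maximum and N) is what PROC MEANS prints when a CLASS statement is used. The following is a minimal sketch, assuming the vital-sign variables are in work.attach; it is not necessarily the code that produced the output above.

```sas
/* Descriptive statistics for the vital signs, one block per site */
proc means data=work.attach mean std min max n;
    class site;                                   /* grouping variable */
    var weight height systolic diastoli pulse;    /* analysis variables */
run;
```

The N column counts non-missing values for each variable, which is why it can be smaller than N Obs (for example, HEIGHT appears to be recorded at only some visits).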
The graphs:
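Plots of this kind can be drawn with PROC SGPLOT. The following is a minimal sketch, assuming the data are in work.attach; the choice of a box plot of PULSE by site and a histogram of WEIGHT is illustrative and may differ from the plots in the original submission.

```sas
/* Box plot of pulse, one box per site */
proc sgplot data=work.attach;
    vbox pulse / category=site;
run;

/* Histogram of weight with a fitted normal density overlaid */
proc sgplot data=work.attach;
    histogram weight;
    density weight;
run;
```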
SITE  N Obs  Variable         Mean     Std Dev      Minimum      Maximum       Median    N
1       147  RANDOM_N   53.2653061  19.2582758   25.0000000  104.0000000   55.0000000  147
             VISIT_I1            .           .            .            .            .    0
             PULSE      77.2056738   8.7729436   56.0000000  100.0000000   78.0000000  141
             protocol    3.7551020   0.6578487    2.0000000    4.0000000    4.0000000  147
3       161  RANDOM_N   24.8695652  30.0275146    1.0000000  116.0000000   14.0000000  161
             VISIT_I1            .           .            .            .            .    0
             PULSE      78.6687898  12.3503288   55.0000000  115.0000000   78.0000000  157
             protocol    3.4037267   0.9177342    2.0000000    4.0000000    4.0000000  161
4         9  RANDOM_N   81.6666667  14.7563546   62.0000000   92.0000000   91.0000000    9
             VISIT_I1            .           .            .            .            .    0
             PULSE      87.6666667  16.9336942   52.0000000  115.0000000   88.0000000    9
             protocol    3.3333333   1.0000000    2.0000000    4.0000000    4.0000000    9
5        99  RANDOM_N   66.9393939  22.4907320   43.0000000  121.0000000   54.0000000   99
             VISIT_I1            .           .            .            .            .    0
             PULSE      72.8089888  12.7830467   50.0000000  102.0000000   72.0000000   89
             protocol    3.1515152   0.9934853    2.0000000    4.0000000    4.0000000   99
Discussion: The demography table contains information on each treatment group and on both groups combined, covering variables
such as site, weight, systolic and diastolic blood pressure, and pulse. The data were grouped by site so that the tables fit within three to four
pages, and visualizations were also produced. Summary statistics are reported, and the plots were generated from the combined dataset. The two
groups were combined by site, and the data were sorted before combining, which produced the output presented above. A selection of
relevant graphs is also included above to show the distribution of the data.
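The sort-then-combine step described in the discussion can be sketched as follows. This is a minimal sketch, not the submitted code: it assumes H002 and H004 in work.attach are 0/1 indicators that identify the two groups (as the printed rows suggest), and the dataset names study002, study004 and combined are hypothetical.

```sas
/* Split the imported data using the two indicator variables */
data work.study002 work.study004;
    set work.attach;
    if H002 = 1 then output work.study002;
    else if H004 = 1 then output work.study004;
run;

/* Each input must be sorted by SITE before it can be interleaved */
proc sort data=work.study002; by site; run;
proc sort data=work.study004; by site; run;

/* Interleave the two sorted datasets so the result stays ordered by SITE */
data work.combined;
    set work.study002 work.study004;
    by site;
run;
```

Using SET with a BY statement interleaves the rows rather than simply stacking one dataset after the other, so the combined data can be passed directly to BY-group processing on SITE.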





