SAS Assignment: Data Analysis of Basketball and US Presidents
Added on 2023/02/01
Homework Assignment
AI Summary
This SAS assignment involves a comprehensive data analysis project using the SAS programming language. The assignment begins with importing and manipulating data from multiple sources, including a SAS dataset and CSV files containing basketball scoring statistics and information on US presidents. The solution utilizes procedures such as PROC FORMAT for creating custom formats, PROC TABULATE for generating tables with specified layouts, and PROC PRINT for displaying datasets with specific formatting requirements. Data manipulation includes handling missing values, date formatting, and creating new variables based on existing data. The assignment also covers data import from CSV files, data summarization using PROC SUMMARY, and generating reports with titles, footnotes, and page numbering. The final output includes PDF files containing the SAS code, log, and generated output, demonstrating the student's ability to perform data analysis tasks using SAS and adhere to specific formatting and reporting requirements.

/*1. Begin your program with the required header, filename, and libname
statements. As always, your program must include comments in the
appropriate places. Use filename statements to define the paths to the
two raw data files. The readonly option does not apply to these two
filename statements. */
/* Assign the library MyLib and copy hiscores.sas7bdat into the work library */
libname MyLib '/folders/myfolders/Project-13-967790/';
data hiscores;
set MyLib.hiscores;
run;
/*2. For this assignment the output file must be created with the pages
in a landscape layout. The date is to only be displayed on the final
section of the output. The SAS output portion of your PDF file should
start on page number 2. */
/*3. Create a user defined format that can be used to display the score
type in the downloaded data set as shown in the Type column on the first
page of the sample output. (In the data, PT is an abbreviation for Point
and FT is short for Free Throw.) */
/*4. Use the tabulate procedure to produce the output shown in the posted
results based on the hiscores data set. Use labels as needed. Apply your
user defined format to display the values of score type shown in the
Type column. */
/* User-defined format $typefmt. that displays the score type in words */
proc format;
   value $typefmt
      '2PT' = "Two-Point"
      '3PT' = "Three-Point"
      'FT'  = "Free Throw";
run;
options orientation=landscape; /* Output pages in a landscape layout */

options nodate pageno=2 number; /* No date; the SAS output starts on page number 2 */
title1 '2012 Basketball Scoring Analysis';
title2 'Baskets Made by Class and Position';
proc tabulate data=hiscores;
   class class position score_type;
   var bpg;
   table score_type = 'Type',
         class = 'Class' * bpg = 'Baskets per Game' * (Mean Median)
         position = 'Position' * bpg = 'Baskets per Game' * (Mean Median);
   format score_type $typefmt.;
run;
/*5. Write a data step similar to the one shown in the lectures that
converts the USPres.csv file to a SAS dataset named USPres in the work
library. This data step will include infile, input, format, and label
statements. You will need to make certain decisions about this data (such
as column widths and the presentation of dates) based on your
observations. You will need to use the firstobs option for the infile
statement to tell SAS to skip the column titles that are in the raw data.
Unlike the data step option by the same name, this option does not use
parentheses. The dates must be presented in the WORDDATE format when the
dataset is viewed or printed. It is expected that you will get the error
shown below but you should not have any other errors or warnings in your
log.
a. The columns must be in the order shown in the printed output without
using any statements in the print procedure to control the order.
b. Even though the current president is still serving, we want an end of
term date so we can do some additional calculations. Add a line of code
to this data step that will set the end of term date (Exited) to April
15, 2019 when the end of term date is missing.
c. Create a new variable that contains the number of years each president
has been out of office. This calculation should be able to automatically
adjust if the code is run again a year from now. We will not be taking
into account rounding or partial years. This value

will be calculated by simply subtracting the year of the presidents first
term, by using a conditional statement based on values in the data to
restrict the observations printed. Use the appropriate enhancements to
ensure that the label breaks as shown in the eCampus output.*/
/* Read USPres.csv directly from its path and build the USPres data set */
options nodate pageno=3 number; /* No date; this section starts on page number 3 */
title 'U.S. Presidents Starting with Cleveland'; /* Title */
data USPres;
   /* LENGTH comes first so the columns are stored in the printed order */
   length President $27 Party $10 Home_State $13 Presidency
          NewDatevar1 NewDatevar2 Years 8;
   /* DSD treats consecutive commas as a missing value; FIRSTOBS=2 skips the header row */
   infile '/folders/myfolders/Project-13-967790/3332935_1560390127_USPres.csv'
          dlm=',' dsd firstobs=2;
   length Took_office Left_office $10; /* Raw date text as read from the file */
   input Presidency President $ Took_office $ Left_office $ Party $ Home_State $;
   NewDatevar1 = input(Took_office, anydtdte32.); /* Convert the date text to SAS dates */
   NewDatevar2 = input(Left_office, anydtdte32.);
   format NewDatevar1 NewDatevar2 worddate19.; /* Display the dates in WORDDATE form */
   if NewDatevar2 = . then NewDatevar2 = '15Apr2019'd; /* Replace a missing end of term with April 15, 2019 */
   Years = intck('year', NewDatevar2, today()); /* Current year minus the year the term ended */
   label NewDatevar1 = 'Start of Term' NewDatevar2 = 'End of Term'
         Presidency = '#' Home_State = 'Home State'
         Years = 'Years Out of Office'; /* Label the variables */
   drop Took_office Left_office; /* The raw date text is not needed in the output */
   if Presidency >= 22 then output; /* Keep presidents from Cleveland onward */
run;
proc print data=USPres label noobs; /* Print the data set; column order comes from the data step */
run;

/*7. The 2012stats file contains the following variables: Rank, Name,
Team, Class, Position, Number of games played (Games), Field Goals Made
(FGM), Three-point Field Goals Made (3FG), Free Throws (FT), Total Points
(PTS), Average points per game (PPG), and Conference (Conf). Use notepad,
Word, or some other text editing program to open this file and examine
the data and the file layout. You will need to make certain decisions
about this data (such as column widths) based on your observations. Write
a data step that converts the csv file to a SAS dataset in your work
library. Be sure to check the log very carefully when you read in the raw
data. You should not get invalid data errors when reading this data. If
you do, then you most likely do not have your data step set up to read
the raw data correctly. */
/* Point a fileref at 2012stats.csv and import it into the work library as stats */
filename statsraw '/folders/myfolders/Project-13-967790/3332936_394450699_2012stats.csv';
proc import datafile=statsraw
   dbms=csv
   out=stats
   replace; /* Overwrite work.stats when the program is rerun */
   getnames=yes; /* Take the variable names from the header row */
run;
/*8. Use the SUMMARY procedure to create an output data set that contains
the average number of games played grouped by the values in the variable
CLASS (FR, SO, etc.).
9. Print out the data portion of the summary data set. Make sure the date
is printed at the top of the page for this output step. Use other
enhancements to supply titles and a footnote as shown. Suppress the
printing of observation numbers.
10. At the end of your program include statements to ensure that titles
and footnotes do not get carried over to any subsequent output generated
during this SAS session.
11. Review the output from steps 8 and 9 above. Before converting

your program to PDF, write a two or three sentence paragraph in a comment
block at the bottom of your program describing any conclusions or
observations that you have made from the summary data set.
12. Convert the program and log to PDF files and submit them to eCampus
along with your SAS output.*/
options date pageno=4 number; /* Print the date; this section starts on page number 4 */
title1 '2012 Division I Mens Basketball Scoring Analysis'; /* Set title1 */
title2 'Average Games Played by Class'; /* Set title2 */
footnote1 'Based on Raw Data Prior to Cleansing'; /* Set footnote1 */
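proc summary data=stats;
   class class;
   var Games;
   output out=stats_out mean(Games)=avg_games; /* Average games played per class */
run;
proc print data=stats_out noobs; /* Print the summary data set without observation numbers */
   format avg_games 7.4;
run;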
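/* A minimal sketch for step 10, not taken from the posted solution:
TITLE and FOOTNOTE statements with no text cancel every title and
footnote line currently in effect, so none of them carry over to later
output in this SAS session. It assumes no further titled output follows. */
title;
footnote;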