# Data Processing and Data Analyzing Techniques Assignment

Added on 2019-09-18

9 Pages2047 Words399 Views

IntroductionIn this assignment, you have to perform pre-specified, step-by-step tasks toensure you are familiar with (very) basic data downloading, data processing anddata analyzing basic techniques. This task should take you less than a day of work.If you find any of these questions complex or it takes your more than one day toperform them, please consider that you would need to devote additional time toAssignment 3. This task is much easier than the required tasks for Assignment 3.InstructionsPlease fill in the blanks in the questions below. That is, you need to replace “?”with the appropriate value and/or explanation. To obtain the answers, you need to perform the tacks listed in bullet points by thespecified order. The questions are marked with the “Q:” sign and the blanks to befilled with “???”. Inside [], I specify the value of that question, out of a maximumof 10 points for this assignment.You may use any software for this assignment (Excel, Eviews, SPSS, Stata, SAS,Matlab, Python, and etc.). My preferred software is SAS+ STATA. SAS helps youmanipulate the data and clean the data in a visible way (you also learn SQL) andSTATA has many packages to run the regression and output the results. Erasmuspurchases license of STATA (which is much cheaper...) for students, so you couldeasily install it. Therefore I provide some useful detailed STATA code instructionsincluded ingreen courier new font where coding is relevant.I add some [NOTE: ...] that you may find useful.Task SummaryIn this assignment, we will evaluate the effect of executive competition on firmleverage. And explore the effect conditional on CEO gender and CEO tenure.The baseline model would be:leveraget+1=α+βcompensationt+Controlvart+εt

Part 1 Obtaining Compensation DatasetsLogin in WRDSDownload “Compustat/ExecuComp/ Annual Compensation” data. oDownload the entire database by CUSIP for the (fiscal) years 2006-2015. You need the variables CUSIP and Issue Number; Executive IDnumber; CEOANN -- Annual CEO Flag; SALARY -- Salary ($); TDC1 --Total Compensation (Salary + Bonus + Other An; OPTION_AWARDS –Value of Option Awards – FAS 123R ($); YEAR -- Fiscal Year; CEOgender—Gender; Date became CEOoUse None Compression Type and the Output and Date Format moreappropriate to you. You may save this query if you prefer.o[NOTE: Make sure that you handle cusip identifiers as a stringvariable. Excel (and other non-advanced software) will have anobsession in converting the all number identifiers to numbers insteadof text. This would imply that zeros on the left would be deleting,creating possible problems in your matching.i.e. make sure that cusip “0000360” is not converted to cusip “360”)Keep only CEOs’ data: variable ceoann has to be equal to “CEO”.keep if ceoann=="CEO" First drop all observations generated by firms with more than one CEO in agiven year, and by CEOs work in more than one firm in a given year.sort cusip year execidby cusip year: gen count_obs = [_N]drop if count_obs > 1 // Firm has more than 1 CEO in a year.drop count_obssort execid year cusipby execid year: gen count_obs = [_N]drop if count_obs > 1 // CEO works in more than 1 firm in ayear.drop count_obsCalculate the variables ln_salary, ln_totalcomp, ln_option. ln_salary is thelogarithm of salary, ln_totalcomp is the logarithm of tdc1, ln_option Is thelogarithm of option_awards.

gen ln_salary = ln(salary) gen ln_totalcomp = ln(tdc1)gen ln_option = ln(option_awards) Generate 8-digit CUSIP identifier from the first 8 digits in the CUSIPgen cusip_8d = cusipWinsorize the three variables at the 1 %, 99% to prevent extreme values.That means that you need to change the values above 99% into 99% andthe value below 1% to 1%. Importantly this the winsorization processshould take once every year in case there is any time trend. There is anavailable package for this process which is winsor2. You could install itaccording to the following code.ssc install winsor2 //not every package is directlyavailable, you might want to search it first, just type findit***(so easy!) and then install it. Now you have the package, let’s kick out the outliers.winsor2 ln_salary, suffix(_w) cuts(1 99) by(year)//here yougenerate a new variable, ln_salary_wFor such kind of repeating work, we could use some loop. Try to google“foreach stata” and see whether you could make it more efficient.Let’s call this database, “CEO”.Hint for Q1 &2 (google “summary statistics Stata”)Q1: [0.5 point] For “CEO”, report the following and [0.5 point] explain howwinsorize process changes your data.VariablesObs.MeanStandardDeviationMinimumMaximumSkewnessKurtosisln_salary???????ln_salary_w???????ln_totalcomp???????ln_totalcomp_w???????ln_option???????

## End of preview

Want to access all the pages? Upload your documents or become a member.