Study on the Impact of Depression on Cigarette Smoking Frequency


Added on  2023/05/29

AI Summary
This report investigates the impact of age, gender, and depression on cigarette smoking frequency using a large sample of the general public. The study analyzes the number of cigarettes smoked per day (cgtsday) in relation to age (agea), gender (gndr), and feelings of depression (fltdpr). Hypothesis testing reveals significant differences in smoking habits between genders, a weak positive correlation between age and smoking, and a notable impact of depression on smoking levels. A multiple regression model confirms the linear relationship between the control variables and smoking frequency. The findings highlight the importance of considering demographic and psychological factors in understanding and addressing smoking habits, particularly among women and individuals experiencing depression.
RUNNING HEAD: Impact of Depression on Cigarette Smoking Frequency – An Age and Gender-Based Study
Impact of Depression on Cigarette Smoking
Frequency – An Age and Gender-Based Study
Abstract
The present study analyzed a large sample of the general population, including a significant
number of subjects who continued smoking. The outcome variable was "cgtsday", the number of
cigarettes smoked per day by an individual. The first two demographic control variables were
"agea", the age of the subjects, and "gndr", the gender of the subjects. The third control
variable was the feeling of depression in the last week, denoted by "fltdpr" in the survey. The
dependence of the outcome variable on the three control variables was tested through hypothesis
testing. The null hypothesis of no gender difference in cigarette smoking was rejected at the 5%
level in favor of the alternative, implying that males smoked significantly more than females.
There was a statistically significant correlation indicating that the cigarette smoking rate
increased with age. However, the association was very weak: a one-year increase in age
corresponded to an increase of 0.065 cigarettes per day. Cross-tabulation with the feeling of
depression in the last week showed that the low smoking group was statistically different from
the other groups. From the t-test statistics of the regression model, the control variables were
found to be statistically significant at the 5% level of significance. The control variables were
found to significantly affect the cigarette smoking rate in a linear manner.
Table of Contents
Abstract
Introduction
Rationale and Hypotheses
Methodology
Categorical Variable Generation for “cgtsday”
One Variable Summary
Control variables
Outcome Variable
Inferential Analysis / Interpretation of Findings
Hypothesis H01 Test: (T-test)
Hypothesis H02 Test: (Pearson’s Correlation)
Hypothesis H03 Test: (ANOVA)
Confirmatory test for Hypothesis H03: (Chi-Square Test for Categorical “cgtsday_cat”)
Multiple Regression Model
Conclusion
References
Appendix (DO FILE OF STATA CODES)
Introduction
Cigarette smoking became commonplace at the very beginning of the 20th century.
Many men started smoking around the time of the First World War, and many women became addicted
to nicotine in the era of the Second World War. Time trends have generally shown that more men
than women smoke at a younger age, and although the percentages have dropped globally, the
decline is more pronounced in men than in women (Fenech & Bonassi, 2011; Ng et al., 2014).
Cigarette smoking is the most important and decisive factor that accelerates the age-related
decline in lung function. The aim of this study was to analyze the effects of age, gender, and
recent depression on smoking habits (Leventhal & Zvolensky, 2015). The main goal was to see
whether smokers show a decrease in smoking frequency with increasing age and recent depressive
mood (Boden, Fergusson & Horwood, 2010). The study was conducted by analyzing a large sample of
the general population, including a significant number of subjects who had increased smoking
habits. The prevalence of smoking and the burden of smoking-related diseases have lately been
shifting toward women. With a large number of young women continuing to smoke, there is a clear
need to learn more about ways to help them quit this harmful habit (Corona et al., 2010). This
article also examines smoking and gender behavior, with special attention to the experience of
aging.
One outcome variable and three control variables were identified from the survey data set
provided for this quantitative report. The outcome variable was “cgtsday”, the number of
cigarettes smoked per day by an individual. The first two demographic control variables were
“agea”, indicating the age of the subjects (continuous), and “gndr”, denoting the gender of the
subjects (categorical and nominal). The third control variable was the feeling of depression in
the last week, denoted by “fltdpr” in the survey (categorical and ordinal). The dependence of the
outcome variable on the three control variables was tested through hypothesis testing. A final
multiple regression model was constructed to assess the impact of the control variables on the
frequency of cigarette smoking.
Rationale and Hypotheses
Over a lifetime, the number of cigarettes a person smokes ultimately determines the quality of
the air they breathe. In women, the frequency of smoking affects health, especially given the
biological intricacies of the female body. Some reviews have also shown that the percentage of
smokers has decreased since the mid nineteenth century, together with a steady increase in the
number of ex-smokers (O'Loughlin, Karp, Koulis, Paradis, & DiFranza, 2009). Smoking in young
women is particularly worrisome because it not only entails a risk to their own long-term health,
but also has a direct impact on reproductive function and on the health of their children (Strine
et al., 2008). In the most recent cohorts, the prevalence of smoking among adult women exceeds
that among men. This trend is now predominant in European and Asian countries. Many longitudinal
studies of people who have quit smoking have demonstrated a greater loss of lung function in
heavy smokers than in light smokers. Other studies had already shown in the late nineteenth
century that patients with adverse mental health had a high proportion of addiction to cigarette
smoking. Today, researchers are considering using both cross-sectional and longitudinal
information and taking recent depression problems into account (Thun, DeLancey, Center, Jemal &
Ward, 2009). The established link between lung cancer and smoking was followed by a constant
expansion of data on the risks of smoking.
To study the impact of the control variables on smoking habit and frequency of smoking, three
hypotheses were framed. These hypotheses were framed to answer the primary research question of
this article: “Do age, gender, and recent depression have any impact on the frequency of
cigarette smoking in daily life?” Considering the types of the variables, the null hypotheses
were structured against appropriate alternative hypotheses. The three sets of hypotheses were all
tested at the 5% level of significance.
H01: Males and females were equally inclined to smoke.
Against
HA1: There was a significant difference in the smoking habits of men and women.
H02: Age had no correlation with the smoking frequency of the sample subjects.
Against
HA2: There was a significant association between the age of the subjects and their smoking frequency.
H03: Recent depression had no impact on the number of cigarettes smoked.
Against
HA3: Recent depression had a significant impact on the frequency of smoking.
Methodology
Four variables were extracted from the master data file to evaluate the impact of age, gender,
and recent depression on cigarette addiction and smoking frequency. STATA version 15.0 was used
as the data analysis software for the present research. Missing responses, and responses where
subjects did not want to answer, were handled in the STATA environment. The dependent variable
"cgtsday" denoted the number of cigarettes smoked per day, and responses from 2390 subjects were
examined. Of these 2390 responses, 1837 missing values were re-coded to a missing-value code for
the descriptive and inferential analyses and were then treated as non-existent in the STATA
environment. Hence, 553 valid responses were analyzed in the further investigation. To test the
effect of gender on cigarette smoking, an independent t-test was used. The relation of smoking
frequency with the age of the subjects was tested with Pearson’s correlation coefficient.
Categorical Variable Generation for “cgtsday”
A new categorical variable “cgtsday_cat” was generated, with three categories of smokers based
on the frequency of smoking. Subjects smoking 0 to 15 cigarettes a day were labeled “Low
Smoking”, those smoking 16 to 30 cigarettes a day were labeled “Medium Smoking”, and those
smoking 31 to 45 cigarettes a day were labeled “High Smoking”. A chi-square test was used to test
for a difference in smoking category across past depression levels. Finally, a multiple
regression model was constructed to test the linear relationship between the outcome and control
variables.
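The banding rule described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-points (15, 30, 45), not the study's own code, which is the Stata do-file in the appendix; the function name `smoking_category` and the choice to return `None` for missing or out-of-range values are assumptions made for this example.

```python
import math

# Upper bounds of the three smoking bands, as described in the text.
CUTPOINTS = (15, 30, 45)
LABELS = ("Low Smoking", "Medium Smoking", "High Smoking")


def smoking_category(cigs_per_day):
    """Return the band label for a daily cigarette count.

    Returns None when the value is missing or lies above the highest
    band (hypothetical handling chosen for this sketch).
    """
    if cigs_per_day is None or math.isnan(cigs_per_day):
        return None  # missing responses stay missing
    for upper, label in zip(CUTPOINTS, LABELS):
        if cigs_per_day <= upper:
            return label
    return None  # above 45: outside the three bands described in the report


# Boundary values fall into the lower band (15 is Low, 16 is Medium).
print([smoking_category(x) for x in (0, 15, 16, 30, 31, 45)])
```

Treating each cut-point as an inclusive upper bound matches the 0 to 15, 16 to 30, and 31 to 45 ranges given in the text.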
One Variable Summary
Control variables
From Figure 1 it can be noted that there were 1102 male (P = 46.11%) and 1288 female subjects
(P = 53.89%) in the sample. The predominance of female subjects in the sample indicated that
females were addicted to smoking despite the various biological hazards caused by smoking
(Dechanet et al., 2010).
[Figure: bar chart "Gender Distribution"; x-axis: Gender; y-axis: Percent; bar values 46.11 and 53.89]
Figure 1: Gender Distribution of the Subjects in the Sample
Age-wise exploration of the subjects showed that the average age of male subjects was 49.72
years (SD = 18.58 years), ranging from 15 to 96 years. The average age of females was 49.10
years (SD = 17.85 years), ranging from 15 to 97 years. From Figure 2, the comparative analysis of
age for the two genders revealed that the median age of males was slightly greater than that of
females. The spread of age for the middle 50% of subjects was similar for males and females.
From Figure 2 it was also noted that the distribution of age was right-skewed. This was probably
due to the presence of a few very old subjects in the sample data.
[Figure: histograms of "Age of respondent, calculated" with fitted normal curves; panels: Male, Female; y-axis: Percent]
Figure 2: Distribution of Age and Normal Curve Fitted
The third control variable was "fltdpr", denoting how often a subject felt depressed in the
last week. The variable had four categories, ranging from almost no depression felt in the last
week to feeling depressed almost all of the time in the last week. From the frequency table in
Table 1, it was noted that most of the subjects (N = 1730, P = 73.03%) felt almost no
depression. Next, 527 subjects (P = 22.25%) felt depressed some of the time in the past week.
The remaining 4.73% of the subjects were depressed either most of the time or all of the time in
the past week. The 21 missing values were assigned a separate code so that the STATA environment
recognized them as missing.
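The reported percentages can be cross-checked against the counts with a short calculation. The sketch below assumes that the percentages were computed over the 2390 subjects minus the 21 missing responses; that denominator is inferred from the figures in the text rather than stated there, and the variable names are illustrative.

```python
# Counts taken from the text; the valid denominator is an inference.
total_subjects = 2390
missing_fltdpr = 21
valid = total_subjects - missing_fltdpr

counts = {"almost no depression": 1730, "some of the time": 527}

# Percentage share of each reported category among valid responses.
shares = {name: round(100 * n / valid, 2) for name, n in counts.items()}

# Subjects left over for the two highest depression categories combined.
remaining_count = valid - sum(counts.values())
remaining_share = round(100 * remaining_count / valid, 2)

print(valid, shares, remaining_count, remaining_share)
```

Under this assumption the shares reproduce the 73.03%, 22.25%, and 4.73% quoted above.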
Table 1: Feeling Depressed Categories with Frequency
[Figure: bar chart "Feeling Depression"; x-axis: Felt depressed, how often past week; y-axis: Percent; bar values 73.03, 22.25, 3.335, 1.393]
Figure 3: Distribution of Feeling Depressed in Past Week
Outcome Variable
The dependent or outcome variable "cgtsday" denoted the number of cigarettes smoked in a day.
For the 553 valid subjects, the average number of cigarettes smoked in a day was 13.91 (SD =
7.86), with values ranging from zero to 45. Among the 553 subjects, 348 were low smokers, 191
were medium smokers, and 14 were heavy smokers. The categories were described earlier in the
methodology section. Gender-wise investigation revealed that males smoked on average 15.26 (SD =
8.63) cigarettes, ranging from 0 to 45. The average for females was 12.58 (SD = 6.77)
cigarettes, ranging from 0 to 40. From the density plot of smoking in Figure 4, it was identified
that in the low smoking category the density of females was higher than that of males. The
density of males in the medium smoking category was higher than that of females. In the high
smoking zone, the genders were not comparable: the frequency of males was very much higher than
that of females.
[Figure: kernel density curves of cgtsday for the two genders; x-axis: 0 to 50; y-axis: density]
Figure 4: Gender wise Comparison for Cigarette Smoking
Note: Red line for females, blue line for males
Inferential Analysis / Interpretation of Findings
Hypothesis H01 Test: (T-test)
The first hypothesis was tested using an independent two-sample t-test at the 5% level of
significance, comparing the average cigarette consumption of the two genders. A statistically
significant difference in the average smoking habits of the genders was found. There was strong
evidence that males smoked significantly more cigarettes than females (t = 4.06, p < 0.05). The
null hypothesis H01 was therefore rejected in favor of the alternative hypothesis at the 5%
level, concluding that males smoked significantly more than females.
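For readers who want to see how an unequal-variance (Welch) t statistic is formed from summary statistics, a minimal Python sketch follows. The means and standard deviations are the ones reported in the Outcome Variable section; the per-gender group sizes among the 553 smokers are not reported, so the sizes used in the example call (276 and 277) are purely hypothetical placeholders, and the function name `welch_t` is an assumption of this sketch.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Reported means and SDs; group sizes are HYPOTHETICAL (not in the report).
t_stat, df = welch_t(15.26, 8.63, 276, 12.58, 6.77, 277)
print(round(t_stat, 2), round(df, 1))
```

The statistic is the difference in means divided by the combined standard error, so its size depends on the actual group sizes, which would have to be read from Table 2.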
Table 2: T-test for Gender Comparison for Cigarette Smoking
Note: A one-tailed test was used here, as the mean smoking rate for males was greater than that for females
Hypothesis H02 Test: (Pearson’s Correlation)
The correlation between age and the number of cigarettes smoked in a day was evaluated with
Pearson’s correlation coefficient. The correlation coefficient was positive (r = 0.173, p <
0.05) and statistically significant at the 5% level. The low positive correlation can also be
observed in the scatter plot of age against cigarette smoking frequency in Figure 5. The null
hypothesis H02 was rejected against the two-tailed alternative hypothesis HA2 at the 5% level.
Hence, there was a statistically significant correlation indicating that the cigarette smoking
rate increased with age. However, the association was very weak: for a one-year increase in age,
cigarette smoking increased by 0.065 cigarettes per day on average. For a difference of 5 years
in age, average cigarette consumption would therefore change by only about 0.33 cigarettes per
day (0.065 × 5). The result was thus statistically significant, but its practical significance
was modest.
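The size of the age effect can be made concrete by multiplying the reported per-year change (0.065 cigarettes per day) by an age difference. The snippet below is a simple arithmetic sketch; the function name `expected_change` is illustrative, and it assumes the linear relationship holds across the whole age range.

```python
# Reported average increase in cigarettes per day for each year of age.
SLOPE_PER_YEAR = 0.065


def expected_change(years, slope=SLOPE_PER_YEAR):
    """Expected change in daily cigarettes for a given age difference."""
    return slope * years


# Age differences of 1, 5, 10 and 50 years.
for years in (1, 5, 10, 50):
    print(years, round(expected_change(years), 3))
```

With this slope, a 5-year difference corresponds to 0.065 × 5 = 0.325 cigarettes per day, and it takes a difference of roughly 46 years to reach 3 cigarettes per day.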
[Figure: scatter-plot matrix of "How many cigarettes smoke on typical day" and "Age of respondent, calculated"]
Figure 5: Correlation Matrix between Age and Cigarette Smoking Frequency
Hypothesis H03 Test: (ANOVA)
The third hypothesis H03, regarding the impact of recent depression level on the cigarette
smoking rate, was tested with a one-way ANOVA. The categorical levels of smoking were later
cross-tabulated with the four levels of past depression for confirmatory analysis. First, at the
5% level, the ANOVA model tested the difference in smoking rates across the four depression
levels. Always-depressed subjects smoked more than the other subjects, with an average of 18.5
cigarettes per day. From Table 3 it could be inferred that there was a statistically significant
difference between the smoking rates for the different depression levels (F = 4.72, p < 0.05). A
post-hoc analysis was conducted for pairwise comparison of smoking rates between depression
levels. The Bonferroni pairwise comparison identified a significant difference between the
smoking frequencies of frequently depressed and not-at-all depressed subjects. No other pairwise
comparison based on depression levels was significant. Hence, the null hypothesis H03 was
rejected in favor of HA3 at the 5% level of significance, concluding that depression levels had a
significant effect on the cigarette smoking rates of the subjects.
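The Bonferroni idea used in the post-hoc step can be illustrated briefly: with four depression levels there are six pairwise comparisons, and each comparison is judged against a stricter threshold (or, equivalently, each raw p-value is multiplied by the number of comparisons). The sketch below uses the category labels from the appendix do-file; the helper name `bonferroni_adjust` is hypothetical and no p-values from the study are reproduced.

```python
from itertools import combinations

# Depression levels as labelled in the appendix do-file.
levels = ["Not Depressed", "Sometimes", "Frequently", "Always Depressed"]

# Every unordered pair of levels is one post-hoc comparison.
pairs = list(combinations(levels, 2))

alpha = 0.05
per_comparison_alpha = alpha / len(pairs)


def bonferroni_adjust(p_value, n_comparisons=len(pairs)):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


print(len(pairs), round(per_comparison_alpha, 4))
```

Because the threshold shrinks as the number of comparisons grows, it is unsurprising that only one of the six pairs remained significant after adjustment.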
Table 3: ANOVA for Cigarette Smoking Frequency for Different depression Levels
Confirmatory test for Hypothesis H03: (Chi-Square Test for Categorical “cgtsday_cat”)
A chi-square test of independence was used as a confirmatory analysis to assess the relation
between the smoking categories and the four depression levels. The results showed that the
percentage of low smoking was considerably higher among subjects with low depression. The
difference in smoking frequencies across the three categories was clearly apparent. The
cross-tabulation with the feeling of depression in the last week found that the Low Smoking
group was statistically different (χ² = 13.06, p < 0.05) from the other groups based on
depression levels.
Table 4: Chi-Square Table for Depression and Cigarette Smoking Rate
Multiple Regression Model
Finally, a multiple regression model was constructed to assess the degree of linear
association between the outcome variable and the control variables. The association was first
assessed through the correlation coefficients between the outcome and control variables. In the
hypothesis testing section, a positive and statistically significant, but low, association was
observed between age and the rate of smoking (r = 0.173, p < 0.05). Spearman's rank correlation
was used to assess the association of "cgtsday" with gender ("gndr") and feeling depressed
("fltdpr"), as both variables are categorical. A low positive, yet statistically significant,
correlation (rho = 0.09, p < 0.05) was found between the cigarette smoking rate and the feeling
of depression. A negative and statistically significant correlation (rho = -0.15, p < 0.05) was
noted between gender and cigarette smoking frequency. This coefficient implied that females
tended to smoke less than males.
The multiple regression model was constructed with "cgtsday" as the outcome variable and
"gndr", "fltdpr", and "agea" as control variables. The model was statistically significant at
the 5% level (F(3, 546) = 12.57, p < 0.05). The regression equation was calculated as below,
$$\text{cgtsday} = 1.40\,\text{fltdpr} + 0.06\,\text{agea} - 2.29\,\text{gndr} + 12.31$$
Table 5: Multiple Regression Model for "cgtsday" on "gndr", "fltdpr", and "agea"
The model implied that the level of depression positively affected smoking frequency. Keeping
other factors constant, an increase of one level of depression increased cigarette smoking
frequency by 1.4 cigarettes per day. Similarly, with the other control factors fixed, being
female had a negative effect on smoking frequency: females smoked 2.29 fewer cigarettes on
average than males. There was a low positive relation with age, and an increase of one year of
age was found to increase smoking by about 0.06 cigarettes per day. The intercept of the model,
corresponding to all control variables being zero, was 12.31 cigarettes per day. From the t-test
statistics, the control variables were found to be statistically significant at the 5% level of
significance. The linearity assumptions of the regression model were found to be satisfied.
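To show how the fitted coefficients combine, the sketch below evaluates the regression equation for a single subject. It assumes the raw survey coding that the appendix do-file passes to the regression (gndr: 1 = male, 2 = female; fltdpr: 1 to 4, from least to most depressed); the function name `predict_cgtsday` and the example subject are illustrative, not results from the study.

```python
# Coefficients as reported in the regression equation.
COEF = {"const": 12.31, "fltdpr": 1.40, "agea": 0.06, "gndr": -2.29}


def predict_cgtsday(fltdpr, agea, gndr):
    """Predicted cigarettes per day from the reported linear model.

    Assumed coding: fltdpr 1..4, agea in years, gndr 1 = male, 2 = female.
    """
    return (COEF["const"]
            + COEF["fltdpr"] * fltdpr
            + COEF["agea"] * agea
            + COEF["gndr"] * gndr)


# Hypothetical subjects: 40 years old, lowest depression level.
male_40 = predict_cgtsday(fltdpr=1, agea=40, gndr=1)
female_40 = predict_cgtsday(fltdpr=1, agea=40, gndr=2)
print(round(male_40, 2), round(female_40, 2))
```

Holding age and depression level fixed, the two predictions differ by exactly the gender coefficient, which is how the 2.29-cigarette gap in the text should be read.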
[Figure: residuals plotted against fitted values]
Figure 6: Heteroscedasticity Check for the Post-Regression Model
From the residual plot, no specific pattern was identified: the residuals were randomly
scattered, so no heteroscedasticity was observed in the multiple regression model.
Multicollinearity in the regression model was assessed with the variance inflation factor (VIF).
It was noted from Table 6 that the VIF values were approximately 1, indicating that there was no
multicollinearity concern in the regression model. This was also consistent with the low
correlation between the control factors.
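The reason a VIF near 1 signals little multicollinearity follows from its definition, VIF = 1 / (1 - R²), where R² comes from regressing one control variable on the others. The snippet below is a small sketch of that formula; the function name `vif_from_r2` and the sample R² values are illustrative and are not taken from Table 6.

```python
def vif_from_r2(r_squared):
    """Variance inflation factor for a predictor, given the R-squared
    from regressing that predictor on the remaining predictors."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Illustrative R-squared values only (not from the study).
for r2 in (0.0, 0.02, 0.5, 0.9):
    print(r2, round(vif_from_r2(r2), 2))
```

A predictor that is almost unrelated to the others has an R² close to zero and therefore a VIF close to 1, which is the pattern the text describes.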
Table 6: Multicollinearity in Regression
Conclusion
The results of this investigation confirmed that smoking, particularly compulsive smoking, is
accelerated by a decline in mental health. The findings suggest that age has a beneficial effect
on smoking reduction and that a reduction in the number of cigarettes smoked also has a
beneficial effect among younger subjects. As in most prior studies, the relationship between
smoking and depression level was found to be dose-dependent and, in light smokers, also
age-dependent. In many past studies in which a beneficial effect of smoking cessation on the
decline in depression was observed, heavy smokers and light smokers were considered separately,
as light smokers tend to quit smoking more often than heavy smokers (Mykletun, Overland, Aarø,
Liabø, & Stewart, 2008). Separate analysis of heavy and light smokers who quit smoking showed
that the effect of controlling for mental condition was most pronounced among light smokers. In
general, dependence on smoking has an accelerating effect on younger as well as elderly
subjects. This is in line with the results of the regression analysis.
In the present study, the sequence of depression first and smoking later was examined for a
possible cause-and-effect relation (Munafò, Hitsman, Rende, Metcalfe, & Niaura, 2008).
Given the trend toward smoking cessation, it is conceivable that subjects who became depressed
and stopped smoking could encourage others to do so. Given the main findings obtained from using
the available data to test the three hypotheses, it was possible to infer that including
economic and cultural background could have helped in a more precise analysis. The findings
demonstrate that subjects who became depressed were more likely to keep smoking. Regarding the
hypothesis that depression leads to smoking, a significant relationship was found between past
depression and subsequent smoking. The evidence suggests that subjects who became depressed were
more inclined to smoke, even at a low level of depression. Subjects at an earlier stage of life
were found to be less addicted to nicotine. The positive association of age with smoking rate,
together with the increasing level of depression, was a matter of concern. The findings
additionally suggest that the subjects who smoked as a result of depression were more often
female than male (Jha, 2009).
In future research, the inter-relation between age and depression might reveal the true reason
for the increased rate of nicotine addiction. Along with age, the place of origin of the
subjects would also have helped in analysing the contribution of weather to increased smoking.
***********************************End***********************************************
References
Boden, J. M., Fergusson, D. M., & Horwood, L. J. (2010). Cigarette smoking and depression:
tests of causal linkages using a longitudinal birth cohort. The British Journal of
Psychiatry, 196(6), 440-446.
Corona, G., Lee, D. M., Forti, G., O'Connor, D. B., Maggi, M., O'Neill, T. W., ... & Finn, J. D.
(2010). Age‐related changes in general and sexual health in middle‐aged and older men:
Results from the European Male Ageing Study (EMAS). The Journal of Sexual
Medicine, 7(4pt1), 1362-1380.
Dechanet, C., Anahory, T., Mathieu Daude, J. C., Quantin, X., Reyftmann, L., Hamamah, S., ...
& Déchaud, H. (2010). Effects of cigarette smoking on reproduction. Human
Reproduction Update, 17(1), 76-95.
Fenech, M., & Bonassi, S. (2011). The effect of age, gender, diet, and lifestyle on DNA damage
measured using micronucleus frequency in human peripheral blood lymphocytes.
Mutagenesis, 26(1), 43-49.
Jha, P. (2009). Avoidable global cancer deaths and total deaths from smoking. Nature Reviews
Cancer, 9(9), 655.
Leventhal, A. M., & Zvolensky, M. J. (2015). Anxiety, depression, and cigarette smoking: A
transdiagnostic vulnerability framework for understanding emotion–smoking
comorbidity. Psychological Bulletin, 141(1), 176.
Munafò, M. R., Hitsman, B., Rende, R., Metcalfe, C., & Niaura, R. (2008). Effects of
progression to cigarette smoking on depressed mood in adolescents: evidence from the
National Longitudinal Study of Adolescent Health. Addiction, 103(1), 162-171.
Mykletun, A., Overland, S., Aarø, L. E., Liabø, H. M., & Stewart, R. (2008). Smoking in relation
to anxiety and depression: evidence from a large population survey: the HUNT
study. European Psychiatry, 23(2), 77-84.
Ng, M., Freeman, M. K., Fleming, T. D., Robinson, M., Dwyer-Lindgren, L., Thomson, B., ... &
Murray, C. J. (2014). Smoking prevalence and cigarette consumption in 187 countries,
1980-2012. JAMA, 311(2), 183-192.
O'Loughlin, J., Karp, I., Koulis, T., Paradis, G., & DiFranza, J. (2009). Determinants of first puff
and daily cigarette smoking in adolescents. American Journal of Epidemiology, 170(5),
585-597.
Strine, T. W., Mokdad, A. H., Balluz, L. S., Gonzalez, O., Crider, R., Berry, J. T., & Kroenke, K.
(2008). Depression and anxiety in the United States: findings from the 2006 behavioral
risk factor surveillance system. Psychiatric Services, 59(12), 1383-1390.
Thun, M. J., DeLancey, J. O., Center, M. M., Jemal, A., & Ward, E. M. (2009). The global
burden of cancer: priorities for prevention. Carcinogenesis, 31(1), 100-110.
Appendix (DO FILE OF STATA CODES)
* Comments are the lines that begin with an asterisk
* Dataset defined as follows
* Dataset in the destination specified
set more off
cd "C:\Users\Eoin\Documents\UCD\SOC40830\2017_2018\data_do_files"
dir
use ess7ie_2014.dta
* one outcome and three control variables selected
* selecting required variables
keep idno cgtsday fltdpr gndr agea
table cgtsday
save "C:\Users\Eoin\Documents\UCD\SOC40830\2017_2018\wb2014_2014.dta"
clear
* new data set needs to be used for purpose of analyses
* use new data set
use "C:\Users\Eoin\Documents\UCD\SOC40830\2017_2018\wb2014_2014.dta"
**********************************************
**************1. Descriptive Details**********
***********************************************
* To describe the contents of selected independent variables:
des agea gndr fltdpr
* To summarise age:
sum agea
* To view quartiles of age and view specific percentiles:
tabstat agea, statistics(min p25 p50 p75 p90 max count mean sd)
by gndr, sort : summarize agea
* To tabulate gender for table representation
tab gndr
* To cross-tabulate gender with cgtsday
tab cgtsday gndr, col
* To split summary statistics for age by gender
bysort gndr: sum agea
************************************
***** 2. coding missing values *****
************************************
* First, the value labels of cgtsday
label list cgtsday
sum cgtsday
* Missing value treatment for cgtsday: recode the survey's missing codes to the extended missing value .a
mvdecode cgtsday, mv(666 777 888 999=.a)
* tabulating the cgtsday
tab cgtsday
***********************************************
***** 3. recoding and labelling variables *****
***********************************************
* recoding gender by changing the categories to male and female
label list gndr
* recoding male as 0 and female as 1
recode gndr (1=0) (2=1), gen(gender)
* label for the new values
label define gender 0 "Male" 1 "Female"
label values gender gender
label list gender
*tabulate recoded gender
table gender
*checking age and removing missing values
label list agea
mvdecode agea, mv(999=.a)
*tabulate age of the subjects
tab agea
***************************************************
***** 4. recoding by sorting data into groups *****
***************************************************
* recoding cgtsday variable by changing the categories
* irecode() returns 0 for 0-15 (Low), 1 for 16-30 (Medium), 2 for 31-45 (High)
gen cgtsday_cat = irecode(cgtsday, 15, 30, 45)
label variable cgtsday_cat "cigarettes smoke intervals"
table cgtsday_cat
*creating three categories for cross tabulation with depression level
label define cgtsday_cat 0 "Low Smoking" 1 "Medium Smoking" 2 "High Smoking"
label values cgtsday_cat cgtsday_cat
label list cgtsday_cat
*tabulate the cigarette smoking categories
table cgtsday_cat
* To view value labels alone:
label list fltdpr
* recoding fltdpr variable by changing the categories
mvdecode fltdpr, mv(7 8=.a)
table fltdpr
recode fltdpr (1=0) (2=1) (3=2) (4=3) , gen(fltdpr_n)
*relabelling the depression categories for practical significance
label define fltdpr_n 0 "Not Depressed" 1 "Sometimes" 2 "Frequently" 3 "Always Depressed"
label values fltdpr_n fltdpr_n
label list fltdpr_n
*saving the new recoded variables
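* the file already exists from the earlier save, so it must be overwritten with replace
save "C:\Users\Eoin\Documents\UCD\SOC40830\2017_2018\wb2014_2014.dta", replace
*using new file for graphical calculation
use "C:\Users\Eoin\Documents\UCD\SOC40830\2017_2018\wb2014_2014.dta", clear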
*********************************
***** 5. data visualisation *****
*********************************
* histogram with percentage of age
hist agea, percent bin(10) normal
* kernel density plot for age
kdensity agea, bwidth(2)
* generating overlay distribution curves for two genders
twoway (kdensity cgtsday if gndr==1) ///
       (kdensity cgtsday if gndr==2)
tabulate cgtsday_cat gndr
* BOXPLOT for age and cigarette smoking
graph box agea, over(gndr)
graph box cgtsday, over(gndr)
* labeling outlier points
graph box cgtsday, medtype(line) mark(1, mlabel(cgtsday))
* SCATTERPLOT for cgtsday with age
scatter cgtsday agea, xscale(range(0 100)) yscale(range(0 45)) ///
    ylabel(0(5)45) xlabel(0(10)100) title(Cigarette Smoking and Age)
graph export smoke_age.tif
* BAR CHARTS for gender
label list gndr
*treating the missing values
mvdecode gndr, mv(9=.a)
hist gndr, discrete percent xlabel(1(1)2) /*
*/addlabels xlabel(1(1)2, valuelabel angle(45)) /*
*/gap(15) title(Gender Distribution)
* bar diagram for fltdpr
hist fltdpr, discrete percent xlabel(1(1)4) /*
*/addlabels xlabel(1(1)4, valuelabel angle(45)) /*
*/gap(15) title(Feeling Depression)
*****************************************
***** 6. Hypothesis Testing *****
*****************************************
*two sample T-test for cgtsday by gender
ttest cgtsday, by(gndr) unequal
*one-way ANOVA for cgtsday with fltdpr
oneway cgtsday fltdpr_n, bonferroni tabulate
*Chi-Square as confirmatory test
tabulate cgtsday_cat fltdpr_n, cell chi2
* CORRELATION for cgtsday with agea
des cgtsday agea
*correlations significant at the level specified in brackets (.05)
pwcorr cgtsday agea, sig star(.05) print(.05)
graph matrix cgtsday agea
*Spearman's correlation for categorical variables
spearman cgtsday fltdpr gndr, stats(rho p) star(0.05)
* We run the regression model
regress cgtsday fltdpr gndr agea
* HETEROSCEDASTICITY checking for post regression
*To plot the model residuals vs. the fitted values:
rvfplot, yline(0)
* MULTICOLLINEARITY (variance inflation factor) assumption for post regression
vif
***********************************End***********************************************