Machine Learning for Depression Prediction via Social Media Analysis

Added on  2023/01/05

Report
AI Summary
This report provides a comprehensive literature review on the application of machine learning and natural language processing (NLP) techniques for predicting depression levels based on social media posts. The report examines various research papers that explore the use of social media data to identify and assess depression. It discusses methodologies such as sentiment analysis, LIWC, BDI-II, CES-D, and machine learning algorithms like Decision Trees, Support Vector Machines, K-Nearest Neighbors, and Ensemble methods. The review highlights the importance of social media as a platform for expressing emotions and the potential of these techniques to detect and classify different levels of depression. Furthermore, the report identifies research gaps, such as time limitations and sample size, and outlines the methodologies employed in the studies, including data collection, feature extraction, and classification techniques. The report emphasizes the ethical and security considerations associated with analyzing social media data for depression prediction and underscores the potential of these methods to provide valuable insights into mental health.
Document Page
Prediction of depression level from social media posts using
machine learning and natural language processing
Authors Name/s per 1st Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Authors Name/s per 2nd Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Abstract— The significant emphasis of this paper is to deal
with predicting the level of depression expressed in social
media posts using machine learning and natural language
processing techniques. The report critically assesses articles
that identify levels of depression from posts made on different
social media platforms.
Keywords—LIWC, BDI-II, CES-D, LINGUISTIC
DIMENSIONS, DECISION TREE, SUPPORT VECTOR
MACHINE, K NEAREST NEIGHBOUR, ENSEMBLE
I. INTRODUCTION
People from all walks of life use social media platforms to
express their personal feelings and views. These posts can
reflect the actions, behavior and thinking of people who are
mentally depressed. Social media platforms are popular because
they are consistently available and accessible to any user with
an Internet connection. Several studies in this area indicate
that depression levels can be assessed using machine learning
and natural language processing techniques.
II. LITERATURE REVIEW
Conway and O’Connor (2016) state that social media platforms
are consistently available and attract many people who are
depressed at times. Social networking platforms are closely
connected with mental illness, since they make it possible to
analyze the sentiments and opinions of their users. Severe
cases of depression can lead to suicide. The article states
that depression related to personal life is a leading cause of
suicide among people of different ages, and that students are
the group most prone to depression that results in suicide [1].
The article addresses the prediction of depression from social
media and puts great emphasis on techniques for collecting
depression-related data from various platforms. Crawling is one
important technique: it collects data from Twitter through the
open API and saves the data in a database [16]. The article
describes an enhanced web application that detects one of four
depression levels: minimal, mild, moderate and severe. Data
from platforms such as Twitter and Facebook are collected with
the help of the BDI-II and analyzed with text analysis APIs.
The article also describes the use of deep learning methods for
detecting and predicting depression levels [5]. Posts
associated with depression are then classified using two
approaches: binary classification and multiclass
classification. The article also introduces Recursive Neural
Tensor Networks and the Stanford Sentiment Treebank, which are
used to detect the level of sentiment in a single post.
Overall, the article describes methods that may help detect the
depression levels expressed on different social media
platforms.
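The four levels and the two classification settings described above can be illustrated with a minimal Python sketch. The score ranges are the commonly published BDI-II ranges (0-13, 14-19, 20-28, 29-63) and are an assumption here, since the cut-offs used in the reviewed article are not stated in this report. The function names and the rule for collapsing four levels into two classes are hypothetical.

```python
# Commonly published BDI-II total-score ranges. These are an assumption:
# the reviewed article's own cut-offs are not given in this report.
BDI_II_LEVELS = [
    (0, 13, "minimal"),
    (14, 19, "mild"),
    (20, 28, "moderate"),
    (29, 63, "severe"),
]


def bdi_ii_level(total_score):
    """Map a BDI-II total score (0-63) to one of four depression levels."""
    for low, high, label in BDI_II_LEVELS:
        if low <= total_score <= high:
            return label
    raise ValueError(f"BDI-II total must be in 0..63, got {total_score}")


def to_binary_label(level):
    """Collapse the four levels into a two-class target.

    Hypothetical rule for illustration: 'minimal' counts as not depressed,
    every other level counts as depressed.
    """
    return "not_depressed" if level == "minimal" else "depressed"


if __name__ == "__main__":
    for score in (5, 17, 24, 40):
        level = bdi_ii_level(score)              # multiclass target
        print(score, level, to_binary_label(level))  # plus binary target
```

The multiclass setting keeps all four labels as targets, while the binary setting trains on the collapsed label only.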
As per Aldarwish and Ahmed (2019), depression is a mood
disorder that is mainly characterized by a continuous feeling
of sadness and a loss of interest. Since the usage of social
media is increasing day by day, access to social networking
websites has become comparatively easier. This has in turn
increased the posting and sharing of personal feelings by
individuals [11], who now express their emotions more openly.
Social media platforms can therefore be used with enhanced
tools and techniques to identify posts that indicate
depression and to analyze the depressed users of these
platforms. First, the user data are collected and classified
with the help of a Support Vector Machine, and then the level
of depression in these posts is examined [3]. The prediction
model is developed in RapidMiner and consists of two datasets
and seven main operators. The prediction method draws on user
data from different social networking sites and focuses on the
category or level of depression. The article provides various
sample posts that illustrate the symptoms and levels of
depression. It also sets out the matrix used to calculate the
depression level, thereby identifying the users
associated with the depressing posts within the social
networking sites.
According to Schwartz et al. (2016), depression has a major
effect on both the personal and the public health of
individuals. Mental illness is a leading and growing cause of
disability in modern times. The main focus of the paper is the
use of crowdsourcing for the collection of the data [7].
Several methods are then used to measure the severity of the
depression expressed in a post. The article uses classifiers
to predict the vulnerability to depression associated with a
post. It identifies the CES-D as a primary tool for
determining the depression level of posts [9]. The article
discusses and analyzes the measurement of depressive behavior
using tools applied to posts on different social networking
sites. It also lists the features associated with depression,
which make the analysis of depression levels easier, and the
collected data are analyzed for
the depression factors.
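As a minimal sketch of how a CES-D total is usually computed, the following Python code assumes the standard 20-item form, with each answer coded 0 to 3 and the four positively worded items (4, 8, 12 and 16) reverse-scored. The screening threshold of 16 is a commonly used convention and is an assumption here; the reviewed article may have used a different threshold. The function names are hypothetical.

```python
from typing import List

# Positively worded CES-D items (1-based) that are reverse-scored.
REVERSE_SCORED_ITEMS = {4, 8, 12, 16}

# Commonly used screening threshold; an assumption, not taken from the
# reviewed article.
SCREENING_CUTOFF = 16


def cesd_total(responses: List[int]) -> int:
    """Sum 20 CES-D answers (each 0-3), reverse-scoring items 4, 8, 12, 16."""
    if len(responses) != 20:
        raise ValueError("CES-D needs exactly 20 responses")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if answer not in (0, 1, 2, 3):
            raise ValueError(f"item {item_number}: answer must be 0, 1, 2 or 3")
        if item_number in REVERSE_SCORED_ITEMS:
            total += 3 - answer
        else:
            total += answer
    return total


def is_above_cutoff(responses: List[int]) -> bool:
    """True when the total reaches the assumed screening threshold."""
    return cesd_total(responses) >= SCREENING_CUTOFF


if __name__ == "__main__":
    answers = [1] * 20  # toy answers, not real survey data
    print(cesd_total(answers), is_above_cutoff(answers))
```

The resulting total ranges from 0 to 60 and can serve as the label against which language features from a user's posts are compared.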
Guntuku et al. (2019) also state that social media platforms
are important tools for predicting depression among the
individuals who use them. The article describes a method that
combines user-level social media data with a survey on the
posts of different social media profiles. The posts are
tokenized with the Happier Fun Tokenizer, which is aware of
emoticons and social media conventions. This helps
differentiate normal posts from posts that contain severely
depressive words or sentences. Moreover, the article uses LIWC
(Linguistic Inquiry and Word Count), which provides 73
psycholinguistic categories, to represent each user’s language
as well as country, so that posts written in different
languages can be analyzed [2]. Machine learning tools are then
used in the later parts of the research to identify positive
and negative correlations in the data posted on social
networking sites, which indicate the user's depression level.
The psychological stress that users experience in daily life is
visible on websites such as Facebook and Twitter. These data
are gathered from the posts and then analysed, which assigns
them to categories of depression level. The analysis is done
mainly with machine learning tools, which are considered
effective for measuring depression levels [15]. Moreover, the
ethical and security considerations of working with posts from
social platforms are also taken into account, so that the posts
can be verified with respect to the stages of depression. The
article examines sample posts to clearly reflect the depression
stages of the posts on social networking sites.
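The idea behind LIWC-style features can be sketched in Python as the share of a post's words that fall into each category. The word lists below are a toy, hypothetical lexicon for illustration only; the real LIWC dictionary is proprietary and far larger. The simple regular-expression tokenizer is also an assumption and, unlike the Happier Fun Tokenizer mentioned above, it ignores emoticons.

```python
import re
from collections import Counter

# Toy, hypothetical lexicon. The real LIWC dictionary is proprietary
# and much larger; these words only illustrate the counting step.
TOY_LEXICON = {
    "negative_emotion": {"sad", "hopeless", "tired", "alone", "cry"},
    "positive_emotion": {"happy", "great", "love", "fun"},
    "first_person": {"i", "me", "my", "myself"},
}


def tokenize(post):
    """Lowercase the post and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", post.lower())


def category_percentages(post):
    """Return, per category, the percentage of tokens found in that category."""
    tokens = tokenize(post)
    counts = Counter()
    for token in tokens:
        for category, words in TOY_LEXICON.items():
            if token in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero for empty posts
    return {cat: 100.0 * counts[cat] / total for cat in TOY_LEXICON}


if __name__ == "__main__":
    print(category_percentages("I feel so sad and alone"))
```

Each post or user then becomes a fixed-length vector of category percentages that a machine learning model can take as input.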
As stated by Islam and Kabir (2019), the depression level
of posts by users of social media platforms can be determined
with the help of support vector machines. These are described
as linear classifiers of the posts that mainly put an emphasis
on the detection of anomalies. The article also suggests the
use of a decision tree, which helps classify the empirical
data collected in the data set. In this article the data
analysis is conducted with the help of MATLAB [8]. Four
classifiers are used: Support Vector Machine, K-Nearest
Neighbors, Decision Tree and Ensemble. These classifiers help
detect the depression levels expressed in the posts of social
media users. The figure below shows the accuracy of depression
detection on the social media dataset for the various
classifiers.
(Image: Depression Accuracy)
(Source: Islam and Kabir, 2019)
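To make the classifier comparison concrete, the following is a minimal Python sketch of one of the four classifiers, k-nearest neighbors, together with the accuracy measure. It is not the MATLAB implementation used in the article. The feature vectors and labels are made-up toy values (hypothetical percentages of negative-emotion and first-person words), so the resulting accuracy says nothing about the published results.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data for illustration only: (negative-emotion %, first-person %).
TRAIN_X = [(30, 20), (25, 15), (28, 18), (2, 5), (5, 4), (3, 6)]
TRAIN_Y = ["depressed"] * 3 + ["not_depressed"] * 3
TEST_X = [(27, 17), (4, 5), (26, 19), (1, 3)]
TEST_Y = ["depressed", "not_depressed", "depressed", "not_depressed"]

if __name__ == "__main__":
    predictions = [knn_predict(TRAIN_X, TRAIN_Y, x) for x in TEST_X]
    print(predictions, accuracy(TEST_Y, predictions))
```

The same accuracy function can be applied to the predictions of any other classifier, which is how a per-classifier comparison like the one in the figure is typically produced.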
III. RESEARCH GAP
The first research gap identified in carrying out this
research is the time limitation. More time would allow the
research to be done in greater depth and would give the reader
more detailed information about the research paper. The
research study does not mention a proper sample size for the
data collection, which makes it difficult to carry the research
to the next level. Another research gap is the level of detail
in which the research topic is studied. A larger budget would
allow the project to be studied in much more detail.
IV. METHODOLOGY
The study of the depression levels expressed in posts on
social networking sites focuses on four types of features:
emotional process, linguistic style, temporal process, and all
of these features together. Machine learning strategies are
then applied to each feature type. The classification
techniques identified in the review are decision tree, support
vector machine, k nearest neighbor and ensemble [10]. First,
in the data set exploration step, data are collected from the
posts on several social media platforms. These data are then
stored in a database so that they can be analyzed with respect
to depression levels. After the analysis of the data, the data
set is prepared using the LIWC software. This software helps
in the analysis of the text and supports processing of the posts
line by line. The data are then processed according to
positive and negative comments, which divides them into
depression and non-depression classes. After this analysis,
the significant features of the posts are highlighted by the
feature extraction methods of the LIWC software [6]. The
features are grouped by process, such as psychological
processes, linguistic processes and the other grammar
categories found in the posts. Depressive behavior is measured
using emotional variables: positive, negative, anger, sad and
anxiety. The temporal categories are present focus, future
focus and past focus. The 9 linguistic dimensions, such as
articles, auxiliary verbs, prepositions, conjunctions, adverbs,
negations, pronouns and verbs, are used for the calculation of
the depression levels. The data are then passed to support
vector machines for the linear classification of depression
[4]. The overall methodology thus describes the steps to be
followed for determining the depression levels of posts made by
different individuals on social networking sites. The figure
below shows the overall procedure, which uses machine learning
and natural language processing techniques to analyze
depression levels from the posts on social networking sites.
(Image: Procedure of depression detection)
(Source: Islam and Kabir, 2019)
From the figure it can be said that, by following the steps
presented above, the depression level of the users can be
analysed from their posts on social networking websites.
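The final step of the procedure, linear classification with a support vector machine, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the method of the reviewed article: the model is trained by plain subgradient descent on the hinge loss with L2 regularisation, the learning rate, regularisation strength and epoch count are arbitrary choices, and the two-dimensional feature vectors (hypothetical scaled rates of negative and positive emotion words) are toy values.

```python
def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01):
    """Fit weights and bias by subgradient descent on the hinge loss.

    Labels must be +1 (depressed) or -1 (not depressed).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            margin = y * score
            # L2 regularisation shrinks the weights on every step.
            weights = [w * (1.0 - lr * reg) for w in weights]
            if margin < 1.0:
                # The sample is inside the margin: move the boundary.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the separating line."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Toy features for illustration only: (negative emotion, positive emotion).
SAMPLES = [(3.0, 0.2), (2.5, 0.5), (2.8, 0.3),
           (0.3, 2.0), (0.5, 2.5), (0.2, 2.2)]
LABELS = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    w, b = train_linear_svm(SAMPLES, LABELS)
    print([predict(w, b, x) for x in SAMPLES])
```

In practice the input vector would combine the emotional, temporal and linguistic features described above, and a tested library implementation would normally replace this hand-written training loop.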
V. CONCLUSION
Overall, the report finds that depression testing tools can
be applied to social media posts to analyse the number of
users who are depressed. Machine learning tools and natural
language processing methods are used to improve the prediction
of depression levels from the social media posts of
individuals. It can thus be concluded that the report analyses
the methods and the procedure through which depression levels
can be measured from social media posts, in support of the
analysis of individuals suffering from depression.
VI. REFERENCES
[1] A.H. Orabi, P. Buddhitha, M.H. Orabi and D. Inkpen.
Deep learning for depression detection of twitter users. In
Proceedings of the Fifth Workshop on Computational
Linguistics and Clinical Psychology: From Keyboard to
Clinic, 2018, June, (pp. 88-97).
[2] A. Benton, M. Mitchell and D. Hovy. Multi-task learning
for mental health using social media text. arXiv preprint
arXiv:1712.03538, 2017.
[3] S.C. Guntuku, D.B. Yaden, M.L. Kern, L.H. Ungar and
J.C. Eichstaedt. Detecting depression and mental illness on
social media: an integrative review. Current Opinion in
Behavioral Sciences, 18, 2017, pp.43-49.
[4] J. Liu, E.R. Weitzman, and R. Chunara. Assessing
behavior stage progression from social media data. In
Proceedings of the 2017 ACM Conference on Computer
Supported Cooperative Work and Social Computing, 2017,
February, (pp. 1320-1333). ACM.
[5] M. Conway and D. O’Connor. Social media, big data, and
mental health: current advances and ethical implications.
Current opinion in psychology, 9, 2016, pp.77-82.
[6] A. Wongkoblap, M.A. Vadillo and V. Curcin. Researching
mental health disorders in the era of social media: systematic
review. Journal of medical Internet research, 19(6), 2017,
p.e228.
[7] H.A. Schwartz, M. Sap, M.L. Kern, J.C. Eichstaedt, A.
Kapelner, M. Agrawal, E. Blanco, L. Dziurzynski, G. Park, D.
Stillwell and M. Kosinski. Predicting individual well-being
through the language of social media. In Biocomputing 2016:
Proceedings of the Pacific Symposium, 2016, (pp. 516-527).
[8] R.A. Calvo, D.N. Milne, M.S. Hussain and H. Christensen.
Natural language processing in mental health applications
using non-clinical texts. Natural Language Engineering,
23(5), 2017, pp.649-685.
[9] M.J. Vioulès, B. Moulahi, J. Azé and S. Bringay.
Detection of suicide-related posts in Twitter data streams.
IBM Journal of Research and Development, 62(1), 2018,
pp.7-1.
[10] J. Hirschberg and C.D. Manning. Advances in natural
language processing. Science, 349(6245), 2015, pp.261-266.
[11] S. Tsugawa, Y. Kikuchi, F. Kishino, K. Nakajima, Y. Itoh
and H. Ohsaki. Recognizing depression from twitter activity.
In Proceedings of the 33rd annual ACM conference on human
factors in computing systems, 2015, April, (pp. 3187-3196).
ACM.
[12] M. Aldarwish and H. Ahmed. Predicting Depression
Levels Using Social Media Posts. 13th ed. Kingdom of Saudi
Arabia: International Symposium on Autonomous
Decentralized Systems (2019).
[13] S. Guntuku, A. Buffone, K. Jaidka, J. Eichstaedt and L.
Ungar. Understanding and Measuring Psychological Stress
Using Social Media. 1st ed. Pennsylvania: University of
Pennsylvania, (2019).
[14] M. Islam and M. Kabir. Depression detection from social
network data using machine learning techniques. Health
Information Science and Systems, (2019).
[15] M. Choudhury, M. Gamon, S. Counts and E. Horvitz.
Predicting Depression via Social Media. Redmond WA
98052: Proceedings of the Seventh International AAAI
Conference on Weblogs and Social Media, 2019.
[16] A. Khan, A. Khan and M. Husain. Analysis of Mental
State of Users using Social Media to predict Depression! A
Survey. International Journal of Advanced Research in
Computer Science, 2019.