
Prediction of Depression Level Using Social Media Posts in Machine Learning and Natural Language Processing

   

Added on  2023-01-05

Prediction of depression level using social media posts with
machine learning and natural language processing
Authors Name/s per 1st Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Authors Name/s per 2nd Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Abstract— The significant emphasis of this paper is to deal
with predicting the level of depression expressed in social
media posts with the help of machine learning and natural
language processing techniques. The paper critically assesses
articles that identify levels of depression from the posts made
on different social media platforms.
Keywords—LIWC, BDI-II, CES-D, LINGUISTIC
DIMENSIONS, DECISION TREE, SUPPORT VECTOR
MACHINE, K NEAREST NEIGHBOUR, ENSEMBLE
I. INTRODUCTION
Social media platforms are widely used by people from different
sectors to express their personal feelings and views. These
posts can reflect the actions and behavior of people with
depression as well as their way of thinking. The platforms are
popular because they are consistently available and accessible
to users over the Internet. Several studies in this area
indicate that the level of depression can be assessed using
machine learning and natural language processing techniques.
II. LITERATURE REVIEW
Conway and O’Connor (2016) stated that social media platforms
are consistently available and attract many people who are
depressed at times. Social networking platforms are closely
related to mental illness and can be used to analyze the
sentiments and opinions of their users. Cases of severe
depression can lead to suicide. The article states that
depression in personal life is a leading cause of suicide among
people of different ages, and that students are the age group
most prone to depression and resulting suicide [1]. The article
addresses the prediction of depression from social media and
emphasizes techniques for collecting depression-related data
from various platforms. Crawling is one such technique: it
collects data from Twitter through the open API and saves it in
a database [16]. The article describes a Web application that
detects one of four depression levels: minimal, mild, moderate
or severe. Data from platforms such as Twitter and Facebook are
collected with the help of the BDI-II and then analyzed using
text analysis APIs. The article also describes the use of deep
learning methods for detecting and predicting depression levels
[5]. Posts associated with depression are then classified using
two approaches: binary classification and multiclass
classification. The article also introduces Recursive Neural
Tensor Networks and the Stanford Sentiment Treebank, which
detect the level of sentiment associated with a single post.
Overall, the article presents methods that may help detect the
depression levels found on different social media platforms.
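The four levels named above match the score bands conventionally
used with the BDI-II. The following is a minimal sketch, assuming
the standard cut-offs (0-13 minimal, 14-19 mild, 20-28 moderate,
29-63 severe). The function names are hypothetical, and the
reviewed article does not state how its Web application maps
scores to levels or where it draws the binary boundary.

```python
def bdi_level(total_score: int) -> str:
    """Map a BDI-II total score (0-63) to one of four severity levels."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI-II total score must be between 0 and 63")
    if total_score <= 13:
        return "minimal"
    if total_score <= 19:
        return "mild"
    if total_score <= 28:
        return "moderate"
    return "severe"


def is_depressed(total_score: int) -> bool:
    """Binary view of the same score.

    Assumption: everything above the 'minimal' band counts as positive.
    """
    return bdi_level(total_score) != "minimal"
```

The first function corresponds to the multiclass setting and the
second to the binary setting described in the review.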
As per Aldarwish and Ahmed (2019), depression is a mood disorder
caused mainly by a continuous feeling of sadness and a loss of
interest. Since the use of social media is increasing day by
day, access to social networking websites has become
comparatively easier. This has in turn increased the posting and
sharing of personal feelings by individuals [11], allowing them
to express their emotions more openly. Social media platforms
therefore use advanced tools and techniques to identify posts
that indicate significant depression and to analyze the
depressed users of these platforms. First, the user data is
collected with the help of a Support Vector Machine, and the
level of depression in these posts is then examined [3]. The
model for predicting depression is developed in RapidMiner and
consists of two datasets and seven main operators. The
prediction method draws on different social networking sites
that reflect a large amount of user data, with an emphasis on
the category or level of depression. The article provides
various sample posts that represent the symptoms and levels of
depression. It also sets out the calculation matrix for the
level of depression, thereby identifying the users associated
with depressive posts on social networking sites.
According to Schwartz et al. (2016), depression has a major
effect on both personal and public health. Mental illness is
stated to be a leading cause of disability, which is increasing
in modern times. The main emphasis of the paper is the use of
crowdsourcing to collect the data [7]. Several methods are then
used to measure the severity of the depression expressed in a
post. The article uses classifiers to predict how strongly a
post is associated with depression. It identifies the CES-D as a
primary tool for determining the depression levels within the
posts [9]. The measurement of depressive behavior is discussed
and analyzed with the help of tools applied to posts on
different social networking sites. The article also states the
features associated with depression, which makes the depression
levels easier to analyze, and the primary data collected are
analyzed for depression factors.
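The CES-D mentioned above is a 20-item self-report questionnaire.
The following is a minimal sketch of its usual scoring, assuming
the standard convention: each item is answered 0-3, the four
positively worded items (4, 8, 12 and 16) are reverse-scored, and
the total ranges from 0 to 60. The function names are
hypothetical, and the threshold of 16 is the commonly used
screening cut-off rather than a value taken from the reviewed
article.

```python
REVERSED_ITEMS = {4, 8, 12, 16}  # positively worded items, 1-based numbering
SCREENING_CUTOFF = 16            # commonly used threshold (assumption here)


def cesd_score(responses):
    """Return the CES-D total (0-60) for 20 answers, each coded 0-3."""
    if len(responses) != 20:
        raise ValueError("CES-D expects exactly 20 responses")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if answer not in (0, 1, 2, 3):
            raise ValueError("each response must be 0, 1, 2 or 3")
        # Reverse-score the positively worded items: 0 <-> 3, 1 <-> 2.
        total += 3 - answer if item_number in REVERSED_ITEMS else answer
    return total


def screens_positive(responses):
    """True when the total reaches the assumed screening cut-off."""
    return cesd_score(responses) >= SCREENING_CUTOFF
```

A score computed this way could serve as the label against which
features extracted from a user's posts are compared.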
Guntuku et al. (2019) also stated that social media platforms
are important tools for predicting depression among the
individuals who use them. The article describes a method that
uses user-level social media data to survey the posts associated
with different social media profiles. The posts are analyzed
with the help of the Happier Fun Tokenizer, an emoticon-aware
and social-media-aware tokenizer. This helps to differentiate
normal posts from posts that contain severely depressive words
or sentences. Moreover, the article states the use of LIWC
(Linguistic Inquiry and Word Count), which provides 73
psycholinguistic categories for representing each user’s
language as well as country when analyzing posts written in
different languages [2]. Machine learning tools are used in the
later parts of the research to show the positive and negative
correlations among the data posted on social networking sites
and so describe the user’s depression level. The psychological
stress that affects users in their daily lives is clearly
visible on websites such as Facebook and Twitter. These data are
accumulated from the posts and then analyzed, which assigns them
to categories of depression level. This analysis is done mainly
with machine learning tools, which are considered effective for
measuring depression levels [15]. Moreover, the machine
processing tools also adhere to the ethical and security
considerations that apply to posts in social platform
environments, so that the posts can be verified with respect to
the stages of depression. The article examines sample posts to
reflect clearly the depression stages of the posts associated
with social networking sites.
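The LIWC step described above amounts to counting how often the
words of each category occur in a post. The following is a
minimal sketch of that idea only. The real LIWC dictionary is
proprietary, so the three-category lexicon here is a made-up toy
example, and the regular-expression tokenizer is a plain stand-in
that, unlike the Happier Fun Tokenizer, is not emoticon-aware.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; NOT the LIWC dictionary.
TOY_LEXICON = {
    "negative_emotion": {"sad", "hopeless", "tired", "alone"},
    "positive_emotion": {"happy", "glad", "love"},
    "first_person": {"i", "me", "my", "myself"},
}


def tokenize(post):
    """Lowercase the post and split it into simple word tokens."""
    return re.findall(r"[a-z']+", post.lower())


def category_proportions(post):
    """Return, per category, the share of tokens that belong to it."""
    tokens = tokenize(post)
    counts = Counter()
    for token in tokens:
        for category, words in TOY_LEXICON.items():
            if token in words:
                counts[category] += 1
    total = len(tokens)
    return {
        category: (counts[category] / total if total else 0.0)
        for category in TOY_LEXICON
    }
```

Calling `category_proportions("I feel so sad and alone")` yields
one proportion per category, which can then be used as a feature
vector for a classifier.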
As stated by Islam and Kabir (2019), the depression level of the
posts made by users of social media platforms can be determined
with the help of support vector machines. These are described as
linear classifiers of the posts that place emphasis on the
detection of anomalies. The article also suggests the use of a
decision tree, which helps to classify the empirical data
collected in the data set. In this article the data analysis is
conducted with the help of MATLAB [8]. Four classifiers are
used: Support Vector Machine, K-Nearest Neighbors, decision tree
and ensemble. These classifiers help to detect the depression
levels associated with the posts of social media users. The
figure below shows the accuracy of the various classifiers in
detecting depression on the social media dataset.
(Image: Depression Accuracy)
(Source: Islam and Kabir, 2019)
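To make two of the four classifier types concrete, the sketch
below implements a k-nearest-neighbors prediction and a
majority-vote ensemble using only the Python standard library.
It is an illustration, not the MATLAB analysis reported by Islam
and Kabir. The feature vectors (negative-emotion share,
first-person share), the labels and the function names are all
invented for the example.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def majority_vote(predictions):
    """Combine the outputs of several classifiers into one ensemble label."""
    return Counter(predictions).most_common(1)[0][0]


# Invented toy data: (negative-emotion share, first-person share) per post.
TOY_FEATURES = [
    (0.30, 0.20), (0.25, 0.15), (0.28, 0.22),
    (0.02, 0.05), (0.05, 0.04), (0.01, 0.06),
]
TOY_LABELS = ["depressed"] * 3 + ["not depressed"] * 3
```

In practice each base classifier (for example an SVM, a decision
tree and k-nearest neighbors) would produce one prediction per
post, and `majority_vote` would combine them. Using an odd number
of voters avoids ties in the binary case.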
III. RESEARCH GAP
The main research gap identified in carrying out this research
is the time limitation. More time would allow the research to be
done in greater depth and would give the reader more detailed
information. In addition, the research study does not mention a
proper sample size for the data collection, which makes it
difficult to carry the research to the next level. Another gap
is that the research topic is not studied in sufficient detail;
a larger budget would allow the project to be studied in much
more detail.
IV. METHODOLOGY
The overall study of the depression levels associated with posts
on social networking sites focuses on four types of features:
emotional process, linguistic style, temporal process, and all
of these features together. Machine learning strategies are then
applied to each feature type. Moreover, the classification
techniques identified in the review are decision tree, support
vector machine, k-nearest neighbor and ensemble [10]. First, the
data set exploration step collects the data provided in posts on
several social media platforms. These data are then stored in a
database so that they can be analyzed with respect to depression
levels. After the data are analyzed, the data set is prepared
using the LIWC software. This software helps in the analysis of
the text and provides strategies that help in the processing of
the posts