Added on  2022-12-27
Case Study B
Sentiment analysis is a technique for gauging customers' attitudes toward topics, products and services of interest. It is a pivotal technology for providing insights that enhance the business bottom line in campaign tracking, customer-centric marketing strategy and brand awareness. Sentiment analytics approaches produce sentiment categories such as ‘positive’, ‘negative’ and ‘neutral’. More specific human emotions are also a topic of interest. There are two major streams of methods for developing a sentiment analytics engine: the dictionary-based and the machine learning-based approaches. In this assignment, you are required to perform sentiment analytics based on both approaches.
As a data scientist, you are required to perform a number of data analytics tasks. You are tasked to develop both dictionary-based and machine learning-based sentiment analytics engines using the R programming language and apply them to predict the sentiments of hotel review tweets from a sample of data. You are also required to use SAS Sentiment Analysis Studio to compare the results.
Develop a statistical model using SAS Sentiment Analysis Studio and evaluate the accuracies
Use the data folder ‘hotel_tweets’, which contains ‘negative’ and ‘positive’ tweets for training and testing.
Build a statistical model using SAS Sentiment Analysis (either simple or advanced). You may change configurations in the advanced model to obtain the best training accuracy.
EXPLANATION:
Preprocessing steps
The target of the following preprocessing is to create a Bag-of-Words representation of the
data. The steps are executed in the following order:
1. Cleansing
A. Remove URLs
B. Remove usernames (mentions)
C. Remove tweets with *Not Available* text
D. Remove special characters
E. Remove numbers
2. Text processing
A. Tokenize
B. Transform to lowercase
C. Stem
3. Build word list for Bag-of-Words
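Steps 2 and 3 of the list above can be sketched as follows. This is a minimal illustration and not the code used in this case study: the function names (naive_stem, process, build_word_list, to_bag_of_words), the min_count parameter and the two sample tweets are all assumptions, and naive_stem is a toy suffix stripper standing in for a real stemmer such as NLTK's PorterStemmer.

```python
from collections import Counter


def naive_stem(token):
    # Toy suffix stripper (assumption): a stand-in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def process(text):
    # 2A tokenize on whitespace, 2B lowercase, 2C stem each token.
    return [naive_stem(tok.lower()) for tok in text.split()]


def build_word_list(token_lists, min_count=1):
    # Step 3: count tokens across all tweets, keep those seen at least
    # min_count times, and sort them so the column order is stable.
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return sorted(word for word, n in counts.items() if n >= min_count)


def to_bag_of_words(tokens, word_list):
    # One count per word in word_list, in word_list order.
    counts = Counter(tokens)
    return [counts[word] for word in word_list]


# Hypothetical, already cleansed tweets.
tweets = ["Loved the rooms", "The room smelled bad"]
token_lists = [process(t) for t in tweets]
word_list = build_word_list(token_lists)
vectors = [to_bag_of_words(tokens, word_list) for tokens in token_lists]
```

Each tweet ends up as a fixed-length vector of word counts, which is the form a classifier expects. Raising min_count is one way to drop rare words and keep the word list small.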
Cleansing
For the purpose of cleansing, the TwitterCleanup class was created. It consists of methods
that execute all of the tasks shown in the list above. Most of them are implemented using
regular expressions. The class exposes its interface through the iterate() method, which yields
every cleanup method in the proper order.
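A class of this shape could look like the sketch below. Only the class name TwitterCleanup and the iterate() method come from the description above; the individual method names, the regular expressions, the list-of-strings input and the clean() helper are assumptions made for illustration.

```python
import re


class TwitterCleanup:
    """Hypothetical reconstruction; each method maps a list of tweets to a new list."""

    def iterate(self):
        # Yield the cleanup methods in the order of steps 1A to 1E.
        yield self.remove_urls
        yield self.remove_usernames
        yield self.remove_not_available
        yield self.remove_special_chars
        yield self.remove_numbers

    @staticmethod
    def remove_urls(tweets):
        return [re.sub(r"https?://\S+|www\.\S+", " ", t) for t in tweets]

    @staticmethod
    def remove_usernames(tweets):
        return [re.sub(r"@\w+", " ", t) for t in tweets]

    @staticmethod
    def remove_not_available(tweets):
        # Drops whole tweets whose text is just the placeholder.
        return [t for t in tweets if t.strip() != "Not Available"]

    @staticmethod
    def remove_special_chars(tweets):
        return [re.sub(r"[^A-Za-z0-9\s]", " ", t) for t in tweets]

    @staticmethod
    def remove_numbers(tweets):
        return [re.sub(r"\d+", " ", t) for t in tweets]


def clean(tweets):
    # Apply every cleanup step in order, then collapse leftover whitespace.
    for step in TwitterCleanup().iterate():
        tweets = step(tweets)
    return [" ".join(t.split()) for t in tweets]


sample = ["Loved the stay at @GrandHotel! http://t.co/abc 10/10", "Not Available"]
cleaned = clean(sample)
```

Having iterate() yield the methods keeps the ordering in one place, so the caller only needs a loop and cannot apply the steps out of order.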
Used Python Libraries
Read data
Creating dependent and independent variable / train and test split
Converting the object-type DataFrame columns to int, because classifier
algorithms don’t accept object columns
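The variable split, the integer conversion and the train/test split can be sketched in plain Python as below. This is an assumption-laden illustration, not the code used here: the label encoding (negative = 0, positive = 1), the helper names, the 25% test share, the seed and the sample rows are all hypothetical, and in practice a library routine such as scikit-learn's train_test_split would normally do the splitting.

```python
import random

# Assumed encoding of the string labels as integers.
LABEL_TO_INT = {"negative": 0, "positive": 1}


def split_features_labels(rows):
    # rows holds (bag-of-words vector, label string) pairs.
    X = [vector for vector, _ in rows]            # independent variables
    y = [LABEL_TO_INT[label] for _, label in rows]  # dependent variable as int
    return X, y


def simple_train_test_split(X, y, test_size=0.25, seed=42):
    # Shuffle the row indices reproducibly, then cut off the test share.
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [X[i] for i in train_idx],
        [X[i] for i in test_idx],
        [y[i] for i in train_idx],
        [y[i] for i in test_idx],
    )


# Hypothetical rows: word-count vectors with their sentiment labels.
rows = [
    ([0, 1, 1, 0, 1], "positive"),
    ([1, 0, 1, 1, 1], "negative"),
    ([0, 1, 0, 0, 1], "positive"),
    ([1, 0, 0, 1, 0], "negative"),
]
X, y = split_features_labels(rows)
X_train, X_test, y_train, y_test = simple_train_test_split(X, y)
```

Fixing the seed makes the split reproducible, so accuracies measured on the test portion can be compared across runs.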