Text Analysis on Patients’ Response on WEKA

   

Added on  2022-11-07

WEKA 1
TEXT ANALYSIS ON PATIENTS’ RESPONSE ON WEKA
NAME OF AUTHOR
NAME OF PROFESSOR
NAME OF CLASS
STATE AND CITY
DATE
Executive Summary
The data are downloaded from the EDU website at the URL
http://mlr.cs.umass.edu/ml/. The download is a zip file whose contents are
datasets in plain text format. The analysis is then carried out as required, and
the required visualizations are produced with the analysis tool specified for this
exercise. Before the analysis, an introductory part sets out the analytics
background, the motivation and the aim of the exercise behind this report. After
the introduction comes a data introduction that describes the data processing
steps, summarizes and visualizes every variable, and reports the data size and the
data quality observed in the analysis. A further section explains the main data
techniques employed to satisfy the application aim: the technique chosen, whether
classification or clustering, the data mining algorithms adopted to execute that
technique, and a discussion of the counterpart algorithm that proves viable for
the analysis. The next section evaluates and demonstrates the algorithm models
selected for the chosen technique. The report ends with a conclusion.
Introduction of Data Analytics Background
This assignment relies on machine learning and a number of data mining
techniques. The software used is WEKA, which was developed at the University of
Waikato. It has a rich community, since students, professors and specialists are
all involved in improving its features. This makes WEKA grow in popularity and
favour among machine learning software. Additionally, what makes WEKA highly
interactive is that it does not require code to be written in order to obtain
results. The user simply selects tabs and chooses the machine learning algorithms
to apply, without much hassle. Anyone who lacks coding skills is therefore
covered: they can choose either WEKA or RapidMiner, since both are elaborate
machine learning tools that do not require code to produce results (Abusnaina,
Abdullah and Kattan, 2015).
Machine learning is now widespread and is closely associated with big data. It is
the field in which the algorithms that let computer systems learn from data are
based. As the name suggests, it is the process of making a machine learn how to
operate on and process the data it is given. Take, for example, credit card
transactions and email communication. Fraudulent transactions occur at different
times among those that are not fraudulent. Systems are therefore set up to detect
fraudulent transactions and prevent them from taking place. This gives security to
the company that handles the credit cards and protects the income of the companies
and people that receive payments, since the fraudulent transactions are stopped at
the first point. A good example is withdrawing payments from PayPal: when someone
starts a withdrawal and then cancels it in order to raise the amount to be
withdrawn, the withdrawal process is restricted, so that it is limited for a time
and takes several hours. This is likely to make anyone with ill intentions towards
an individual's PayPal account give up, since the limit stays in place for quite
some time. In the case of email communication, each email received is classified as
spam or non-spam based on its content. The non-spam emails are delivered to
people's inboxes, whereas the spam emails are sent to the spam folders.
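
The content-based spam classification described above can be illustrated with a minimal sketch. It assumes a naive Bayes classifier with add-one smoothing, written in plain Python; the four training messages and the function names `train` and `classify` are made up for illustration and are not part of the report's WEKA workflow.

```python
import math
from collections import Counter

def train(examples):
    """Count words and documents per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability for the text."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        # Start from the log prior of the label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training messages, invented for this sketch.
TRAINING = [
    ("win free money now", "spam"),
    ("claim your free prize today", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("please review the attached report", "non-spam"),
]

model = train(TRAINING)
print(classify("free prize money", *model))
print(classify("monday report meeting", *model))
```

The same idea, learning from labelled text and then scoring new text, is what a classifier in WEKA does through its graphical interface without any code.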
In our case, we will use WEKA to perform machine learning tasks that help classify
which URLs give genuine and truthful advice on patients' health. Feedback is
obtained from patients who have visited hospitals for treatment. This feedback
consists mostly of tweets that are sent via people's URLs. The data collected
therefore comprise the ID of the person tweeting, the date details of when each
tweet was sent, the description of each individual's response and, finally, the URL
from which the tweet was sent. Classification is done with regard to the URL, as
some URLs are not trustworthy while others are very trustworthy.
Summary of Dataset
As downloaded from the website, the dataset is in text format and cannot be
loaded into WEKA in this form. It needs to be transformed into a CSV file. The
dataset transformed into the CSV file has eight variables in total: ID, DAY,
MONTH, DATE, TIME, YEAR, DESCRIPTION and URL. ID is the ID of the person sending
the tweets and the DAY variable is the day the tweet was sent. The URL and the
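
The text-to-CSV transformation can be sketched in Python. This is a rough illustration under stated assumptions, not the procedure actually used for the report: it assumes each raw line is pipe-delimited as `id|Day Mon DD HH:MM:SS +0000 YYYY|tweet text with a URL`, and the sample line, `parse_line` and `convert` are hypothetical names and data introduced here.

```python
import csv
import io
import re

# The eight variables named in the report.
FIELDS = ["ID", "DAY", "MONTH", "DATE", "TIME", "YEAR", "DESCRIPTION", "URL"]
URL_PATTERN = re.compile(r"https?://\S+")

def parse_line(line):
    """Split one raw line into the eight variables (assumed layout)."""
    tweet_id, timestamp, text = line.rstrip("\n").split("|", 2)
    # Assumed timestamp layout: 'Thu Apr 09 01:31:50 +0000 2015'.
    day, month, date, time, _offset, year = timestamp.split()
    urls = URL_PATTERN.findall(text)
    description = URL_PATTERN.sub("", text).strip()
    return {
        "ID": tweet_id, "DAY": day, "MONTH": month, "DATE": date,
        "TIME": time, "YEAR": year, "DESCRIPTION": description,
        "URL": urls[0] if urls else "",
    }

def convert(lines, out_file):
    """Write a header row and one CSV row per non-empty input line."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    for line in lines:
        if line.strip():
            writer.writerow(parse_line(line))

# Made-up sample line in the assumed format.
SAMPLE = "123|Thu Apr 09 01:31:50 +0000 2015|Example health tip http://example.com/a"

buffer = io.StringIO()
convert([SAMPLE], buffer)
print(buffer.getvalue())
```

For a real file, `convert` could be given an open text file as `lines` and a file opened with `newline=""` as `out_file`; the resulting CSV can then be opened in WEKA.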