Project: Phishing Website Detection Using Machine Learning Techniques

Added on  2023/01/20

AI Summary
This project delves into the critical issue of phishing website detection, employing machine learning techniques to identify and mitigate the risks associated with these malicious sites. The project begins with an introduction to phishing, its goals, and objectives, emphasizing the use of machine learning in detecting phishing websites. A comprehensive literature review is conducted, comparing various machine learning algorithms and features, including lexical and host-based features, to understand their effectiveness in classifying phishing URLs. The project also highlights the importance of user education and software-based approaches in combating phishing attacks. The methodology utilizes a positivism philosophy, incorporating both human and scientific perspectives, with primary data collection planned. A detailed project schedule, including a Gantt chart, outlines the tasks, duration, and dependencies. The conclusion emphasizes the significance of online algorithms and the need for continuous adaptation to counter evolving phishing tactics, along with future research directions. The project aims to provide a framework for understanding and combating phishing attacks, contributing to enhanced cybersecurity measures.
Introduction:
The term phishing is a variant spelling of fishing.
It is an attack in which the attacker lures the user into visiting
a fake website.
The fake website is made to look similar to a particular
legitimate website.
The attacker then steals the user's personal
information.
Goals
The main goal of this project is to understand the basic concept of phishing
and how machine learning can be used to detect phishing websites.
Phishing website detection techniques are broadly classified into two
major categories: user education and software.
The user education approach consists of educating users about safe
browsing practices. The software
approach consists of
Objectives
To understand phishing
To understand how machine
learning can be used to
detect phishing websites
Literature review:
Ma et al. compared several batch learning algorithms for classifying
phishing URLs and showed that combining host-based and lexical
features gives the highest classification accuracy. The paper also
compared the performance of batch algorithms with that of online
algorithms when the full feature set was used. They found that online
algorithms, in particular Confidence-Weighted (CW), outperform the
batch algorithms.
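To make the batch versus online distinction concrete, the sketch below shows an online learner that processes one labeled URL at a time and updates its weights only when it makes a mistake. It is a plain perceptron, not the Confidence-Weighted algorithm discussed by Ma et al.; the tokenizer, the class name, and the toy URL stream are assumptions made for illustration only.

```python
import re
from collections import defaultdict


def url_tokens(url: str) -> list:
    """Split a URL into lowercase alphanumeric tokens (bag-of-words features)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class OnlinePerceptron:
    """Mistake-driven online linear classifier over sparse token features.

    Illustrative stand-in for an online learner; this is NOT Confidence-Weighted.
    """

    def __init__(self):
        # Unseen tokens implicitly have weight 0.0, so new features may appear
        # at any point in the stream without retraining from scratch.
        self.weights = defaultdict(float)
        self.bias = 0.0

    def score(self, tokens) -> float:
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def predict(self, tokens) -> int:
        """Return +1 (phishing) or -1 (legitimate)."""
        return 1 if self.score(tokens) > 0 else -1

    def update(self, tokens, label: int) -> bool:
        """Process one labeled example; adjust weights only on a mistake."""
        mistake = self.predict(tokens) != label
        if mistake:
            for t in tokens:
                self.weights[t] += label
            self.bias += label
        return mistake


# Made-up stream of (URL, label) pairs: +1 = phishing, -1 = legitimate.
STREAM = [
    ("http://example.test/secure/login", 1),
    ("http://example.test/docs/guide", -1),
    ("http://example.test/login/verify", 1),
    ("http://example.test/docs/api", -1),
]

model = OnlinePerceptron()
mistakes = sum(model.update(url_tokens(url), label) for url, label in STREAM)
```

A batch learner would need the whole dataset before fitting; here each call to `update` touches a single example, which is what allows the model to follow a changing stream of URLs.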
Literature review:
Garera et al. used logistic regression over hand-selected features to
classify phishing URLs. The features include the presence of red-flag
keywords in the URL, features based on Google's PageRank, and
Google's Web page quality guidelines. However, a direct comparison
with their approach is difficult without access to the same URLs and
features.
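As a rough illustration of logistic regression over hand-selected features, the sketch below builds one binary feature per red-flag keyword and fits the weights by batch gradient descent. The keyword list, the toy URLs, and the hyperparameters are assumptions for demonstration; they are not the features or data used by Garera et al., and the PageRank and page-quality features are omitted.

```python
import math

# Hypothetical red-flag keywords, chosen for illustration; not the paper's list.
RED_FLAG_KEYWORDS = ("login", "signin", "verify", "account", "secure", "banking")


def keyword_features(url: str) -> list:
    """One binary feature per red-flag keyword: 1.0 if it occurs in the URL."""
    lowered = url.lower()
    return [1.0 if kw in lowered else 0.0 for kw in RED_FLAG_KEYWORDS]


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x) -> float:
    """Estimated probability that feature vector x belongs to a phishing URL."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic_regression(samples, labels, lr=0.5, epochs=200):
    """Fit weights by batch gradient descent on the mean log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


# Toy, made-up URLs (1 = phishing, 0 = legitimate), for demonstration only.
TOY_URLS = [
    ("http://example.test/secure/login/verify", 1),
    ("http://example.test/account/signin", 1),
    ("http://example.test/news/today", 0),
    ("http://example.test/products/list", 0),
]

X = [keyword_features(url) for url, _ in TOY_URLS]
y = [label for _, label in TOY_URLS]
weights, bias = train_logistic_regression(X, y)
```

After training, `predict_proba(weights, bias, keyword_features(url))` gives a score between 0 and 1 for a new URL; a threshold such as 0.5 turns it into a phishing or legitimate decision.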
Literature review:
McGrath and Gupta did not construct a classifier; instead they carried
out a comparative analysis of phishing and non-phishing URLs across
datasets. The authors compared non-phishing URLs drawn from the
DMOZ Open Directory Project with phishing URLs obtained from
PhishTank. The features analyzed in the paper include IP addresses,
thin WHOIS records (containing the date and the information provided
by the registrar), geographic information, and lexical features of the
URL such as character distribution, length, and the presence of
predefined brand names.
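The lexical features listed above can be computed from the URL string alone. The following is a minimal sketch under stated assumptions: the brand list, the feature names, and the use of Shannon entropy as a summary of the character distribution are illustrative choices, not taken from McGrath and Gupta. Host-based features such as WHOIS records and geographic information need external lookups and are left out.

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlparse

# Hypothetical brand list for illustration only; not taken from the paper.
BRAND_NAMES = ("paypal", "ebay", "amazon")


def host_is_ip(host: str) -> bool:
    """Return True when the host part is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def char_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(url: str) -> dict:
    """Compute a few lexical features from the URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "host_length": len(host),
        "digit_count": sum(ch.isdigit() for ch in url),
        "dot_count": host.count("."),
        "char_entropy": char_entropy(lowered),
        "host_is_ip": host_is_ip(host),
        "has_brand_name": any(brand in lowered for brand in BRAND_NAMES),
    }
```

For example, `lexical_features("http://192.168.0.1/paypal/login")` flags both an IP-address host and a brand name in the path, two of the signals the comparison examined.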
Summary of review:
Comparing different features across different data mining algorithms shows that
efficient classification is possible when lexical features are used.
To protect end users from visiting phishing websites, phishing URLs should be
identified by analyzing their lexical and host-based features.
One problem in this domain is that cyber criminals constantly create new
strategies for breaching defense measures.
Project planning:
Methodology
A positivist philosophy would be used for the entire research.
This philosophy combines the human perspective with the scientific
perspective.
Primary data is to be collected.
Both quantitative and qualitative data would be used.
Project planning:
Scope:
The project aims to understand how phishing websites
can be detected using a machine learning
framework.
Project Schedule:
ID | Task Mode | Task Name | Duration | Start | Finish | Predecessors | Resource Names
1 | Auto Scheduled | Research Project | 48 days | Wed 12/7/16 | Fri 2/10/17 | |
2 | Auto Scheduled | Research Proposal | 8 days | Fri 4/26/19 | Tue 5/7/19 | |
3 | Auto Scheduled | Choosing a topic for research | 3 days | Fri 4/26/19 | Tue 4/30/19 | |
4 | Auto Scheduled | Background Study of the Research | 6 days | Fri 4/26/19 | Fri 5/3/19 | |
5 | Auto Scheduled | Development of the Research Question | 3 days | Wed 5/1/19 | Fri 5/3/19 | 3 |
6 | Auto Scheduled | Designing the Conceptual Framework | 2 days | Mon 5/6/19 | Tue 5/7/19 | 4 |
7 | Auto Scheduled | Development of the Research Question | 2 days | Mon 5/6/19 | Tue 5/7/19 | 5 |
8 | Auto Scheduled | Research Proposal Submission | 0 days | Tue 5/7/19 | Tue 5/7/19 | 6 |
9 | Auto Scheduled | Review of the Literature and Collection of the Data | 27 days | Fri 4/26/19 | Mon 6/3/19 | |
10 | Auto Scheduled | Reviewing the available literature | 10 days | Wed 5/8/19 | Tue 5/21/19 | 7 |
11 | Auto Scheduled | Selecting target population for collecting the data | 2 days | Fri 4/26/19 | Mon 4/29/19 | |
12 | Auto Scheduled | Collecting the Data of the Research Study | 9 days | Wed 5/22/19 | Mon 6/3/19 | 10 |
13 | Auto Scheduled | Analysing the gathered data | 4 days | Tue 4/30/19 | Fri 5/3/19 | 11 |
14 | Auto Scheduled | Submission of the Draft Research paper | 0 days | Mon 6/3/19 | Mon 6/3/19 | 12 |
15 | Auto Scheduled | Submission of the Final Project Paper | 10 days | Fri 4/26/19 | Thu 5/9/19 | |
16 | Auto Scheduled | Critical Analysis of the findings | 3 days | Mon 5/6/19 | Wed 5/8/19 | 13 |
17 | Auto Scheduled | Concluding the Findings of the Study | 3 days | Fri 4/26/19 | Tue 4/30/19 | |
18 | Auto Scheduled | Recommendations | 1 day | Thu 5/9/19 | Thu 5/9/19 | 16 |
19 | Auto Scheduled | Submitting the Final Project Report | 0 days | Tue 4/30/19 | Tue 4/30/19 | 17 |
Gantt Chart:
Work Breakdown Structure
Research Project
  Research Proposal
    Choosing a topic for research
    Background Study of the Research
    Development of the Research Question
    Designing the Conceptual Framework
    Development of the Research Question
    Research Proposal Submission
  Review of the Literature and Collection of the Data
    Reviewing the available literature
    Selecting target population for collecting the data
    Collecting the Data of the Research Study
    Analysing the gathered data
    Submission of the Draft Research paper
  Submission of the Final Project Paper
    Critical Analysis of the findings
    Concluding the Findings of the Study
    Recommendations
    Submitting the Final Project Report
Conclusion and Future Direction:
Phishing has become a serious network security problem, causing financial losses of
billions of dollars for customers as well as for e-commerce companies.
Phishing has also made e-commerce sites distrusted and less attractive to ordinary
customers.
This risk can be reduced with an algorithm that constantly adapts to new examples
and new features of phishing URLs.
Online algorithms provide a better learning method than batch-based learning
mechanisms.
Going forward, it is important to understand the different aspects of online learning,
along with the collection of data, so as to identify new trends in phishing activities,
like the fast changing of
Reference:
S. Naaz, Detection of Phishing Websites Using Machine Learning Approach, 2019.
K.L. Chiew, C.L. Tan, K. Wong, K.S. Yong, and W.K. Tiong, A new hybrid ensemble feature selection framework for machine learning-based phishing detection system, Information Sciences, 484, pp. 153-166, 2019.
S. Smadi, N. Aslam, and L. Zhang, Detection of online phishing email using dynamic evolving neural network based on reinforcement learning, Decision Support Systems, 107, pp. 88-102, 2018.
J. Ma, L.K. Saul, S. Savage, and G.M. Voelker, Beyond blacklists: learning to detect malicious web sites from suspicious URLs, in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245-1254, ACM, June 2009.
J. Ma, L.K. Saul, S. Savage, and G.M. Voelker, Identifying suspicious URLs: an application of large-scale online learning, in Proceedings of the 26th Annual International Conference on Machine Learning, pp. 681-688, ACM, June 2009.
J. Hong, The state of phishing attacks, Communications of the ACM, 55(1), pp. 74-81, 2012.
D.K. McGrath and M. Gupta, Behind Phishing: An Examination of Phisher Modi Operandi, LEET, 8, p. 4, 2008.