Cyber Security and Analytics: Data Analytics for Intrusion Detection

Added on  2022/11/09

This report delves into the application of data analytics for intrusion detection within the realm of cybersecurity. It explores various tools such as Weka, Zoho Analytics, and Microsoft Power BI, alongside data analytics techniques like decision trees, backpropagation algorithms, and clustering. The report examines the use of datasets from the UCI repository, including a bank dataset, and analyzes the performance of different techniques based on accuracy metrics. Furthermore, it investigates the analysis of network traffic using pcap files and Bro IDS tools, covering data formats such as CSV, CAP, and BRO files. The report also discusses the importance of testing data applications, the use of confusion matrices, and the limitations of overfitting, concluding with recommendations and future work directions in the field of intrusion detection, including combining DM and NBA approaches for better performance in current IDS.
Network security
Student’s Name:
Institution Affiliation:
Section 1
A data analytics platform bundles the activities of data analysis into a single package.
Dedicated tools are used to analyze large volumes of data and to extract useful insight
from them. These tools essentially serve as a data-as-a-platform instrument. The three
tools considered here are as follows:
Weka analytics tool - Weka (Waikato Environment for Knowledge Analysis) is a
well-known suite of machine learning software written in Java and developed at the
University of Waikato, New Zealand. Weka is free software available under the GNU
General Public License. It is a collection of machine learning algorithms for solving
real-world data mining problems, and it runs on almost any platform. The algorithms
can either be applied directly to a dataset or called from your own Java code [5].
Zoho Analytics - Zoho Analytics focuses on ease of use, an especially important
characteristic as data tools mature. It is a self-service option, meaning that users
do not need the help of IT staff or professional data scientists to extract insight
from data. Notably, the Zoho Analytics software has a simplified interface as well as
a spreadsheet-style interface [1].
Microsoft Power BI - Power BI is a good option for organizations looking for an
easy on-ramp into big data analytics, and it is an especially clear choice for those
that have already standardized on a Microsoft stack. Power BI provides cloud-based
business analytics and integrates what Microsoft calls "content packs" with pre-built
dashboards and reports for various kinds of analysis and data monitoring. The
collaboration features in the platform enable users to share data and dashboards, and
the platform also provides alerting capabilities [2].
Data analytics techniques
Decision tree – A decision tree takes the form of a tree made up of different kinds of
nodes: a starting (root) node, test (internal) nodes, and ending (leaf) nodes, joined by
branches [3].
Backpropagation algorithm – The backpropagation algorithm trains a neural network on a
dataset. Two types of signals flow in opposite directions: function signals, which
propagate forward from the inputs, and error signals, which propagate backward from the
outputs [4].
Quantitative analysis – It evaluates patterns and correlations in a dataset using
statistical methods. The analysis relies on large cross-sections, and a large sample size
allows the results obtained to be generalized to the whole dataset. Results are numerical
and can therefore be used for further numerical comparisons and operations to uncover
and measure relationships.
Qualitative analysis – It relies on a human, verbal description of data patterns and
relations. The data sample analyzed is smaller than the volume of data examined with
quantitative analysis, and the goal is to dig deeper into the data semantics. Because of
the relatively small sample size, the analysis cannot generalize its conclusions to the
overall dataset. The results are not numerical but descriptive, and they cannot produce a
numerical dataset. The output of qualitative analysis is a description of the
interdependencies expressed in words [6].
Clustering (unsupervised machine learning) – It is an unsupervised learning technique in
which data is classified (segmented) into groups, with similarity of data attributes as
the segmentation criterion. There is no need for prior knowledge of classes. Classes are
formed implicitly as a result of the grouping rather than being defined in advance.
Clustering algorithms differ in the way data is grouped, and different algorithms use
different cluster identification methods [7].
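The clustering description above can be made concrete with a small sketch. The report does not name a specific clustering algorithm, so k-means is used here purely as one familiar example. The records, the attribute meanings, and the choice of the first k points as starting centroids are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters by Euclidean distance (k-means sketch)."""
    # Simplifying assumption: the first k points serve as the initial centroids.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical two-attribute records (not taken from the report's datasets).
records = [(1.0, 1.2), (5.0, 5.2), (0.8, 1.0), (5.3, 4.9), (1.1, 0.9), (4.8, 5.1)]
labels, centers = kmeans(records, k=2)
print(labels)
print(centers)
```

No class labels are supplied anywhere in the sketch. The two groups emerge only from the similarity of the attribute values, which is the property the description emphasizes.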
Demonstration
Datasets
We use a dataset from the UCI repository: a bank dataset with 11 attributes and 600 data
items. The link to the dataset is provided in the instruction file.
Results
Here are the results of analyzing the bank dataset in the Weka software. We compare two
techniques: the decision tree and the backpropagation algorithm. MLP (multilayer
perceptron) is the backpropagation technique and J48 is the decision tree algorithm [8].
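To give a feel for how a decision tree learner chooses its test nodes, the sketch below computes information gain for a categorical split. J48 is Weka's implementation of C4.5, which normalizes this quantity into a gain ratio. The sketch shows only the plain information gain and does not reproduce Weka's code. The rows and the attribute names (`married`, `mortgage`, `subscribed`) are hypothetical and are not drawn from the bank dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one categorical attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical records, for illustration only.
rows = [
    {"married": "yes", "mortgage": "yes", "subscribed": "yes"},
    {"married": "yes", "mortgage": "no", "subscribed": "yes"},
    {"married": "no", "mortgage": "yes", "subscribed": "no"},
    {"married": "no", "mortgage": "no", "subscribed": "no"},
]
gain_married = information_gain(rows, "married", "subscribed")
gain_mortgage = information_gain(rows, "mortgage", "subscribed")
print(gain_married, gain_mortgage)
```

In this toy data `married` separates the classes perfectly while `mortgage` carries no information, so a tree learner would test `married` first.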
The snippet below shows the accuracy chart for the bank dataset.
Here is the snippet for the evaluation parameters on the vote dataset.
The snippet below shows the comparative accuracy across both datasets, that is, the bank
and vote datasets.
We have classified the UCI datasets with two techniques, the decision tree and the
backpropagation algorithm, using the Weka analytics tool. The performance of the two
techniques is compared using accuracy measures such as the true positive (TP) and false
positive (FP) rates [9].
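The accuracy measures mentioned above can be derived from a confusion matrix. The following is a minimal sketch with made-up labels. It does not reproduce the figures obtained in Weka, and the function names are chosen here for illustration.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Accuracy, TP rate (recall), and FP rate from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tp_rate": tp / (tp + fn) if tp + fn else 0.0,
        "fp_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical ground truth and classifier output.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
predicted = ["yes", "yes", "no", "no", "no", "yes", "no", "yes"]
counts = confusion_counts(actual, predicted, positive="yes")
summary = rates(*counts)
print(counts, summary)
```

A classifier is preferable when its TP rate is high and its FP rate is low. Accuracy alone can hide a poor FP rate when the classes are imbalanced.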
Section 2
Pcap files are captured at router 1 in the figure shown below. The purpose of this
configuration is to capture both the normal and the abnormal traffic in the network. We
configured the setup with the IXIA tool, which is used to generate both the normal and
the attack traffic in the network.
Traffic analysis
The table shown below gives the simulation period and the byte counts in the network. It
also indicates both the normal and the abnormal traffic in the network.
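A traffic summary of this kind could be produced from a CSV export along the following lines. This is only a sketch: the column names (`time_s`, `bytes`, `label`) and the sample rows are hypothetical, since the actual layout of the capture files is not reproduced here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV extract; real column names depend on how the capture was exported.
SAMPLE = """\
time_s,bytes,label
0,1200,normal
1,800,normal
2,5000,attack
3,300,normal
4,7000,attack
"""


def bytes_per_label(csv_text):
    """Sum the byte counts of all records sharing the same traffic label."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["label"]] += int(row["bytes"])
    return dict(totals)


totals = bytes_per_label(SAMPLE)
attack_share = totals["attack"] / sum(totals.values())
print(totals, round(attack_share, 3))
```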
Architectural Framework
The table shown below elaborates on all the features of the CSV files.