Received January 10, 2018, accepted March 18, 2018, date of publication March 27, 2018, date of current version April 25, 2018.
Digital Object Identifier 10.1109/ACCESS.2018.2820092
A New Intrusion Detection System Based
on Fast Learning Network and Particle
Swarm Optimization
MOHAMMED HASAN ALI1, BAHAA ABBAS DAWOOD AL MOHAMMED2,
ALYANI ISMAIL2, (Member, IEEE), AND MOHAMAD FADLI ZOLKIPLI1
1Faculty of Computer Systems and Software Engineering, University Malaysia Pahang, Malaysia 26300
2Department of Computer and Communication Systems Engineering, Faculty of Engineering, Universiti Putra Malaysia, Malaysia 43400
Corresponding author: Mohammed Hasan Ali (mh180250@gmail.com)
ABSTRACT A supervised intrusion detection system is a system that can learn from
examples of previous attacks in order to detect new attacks. Artificial neural network (ANN)-based
intrusion detection is promising for reducing the number of false negatives and false positives, because an ANN
can learn from actual examples. In this paper, a learning model for the fast learning
network (FLN) based on particle swarm optimization (PSO) is proposed and named PSO-FLN.
The model has been applied to the problem of intrusion detection and validated on the well-known
KDD99 dataset. The developed model has been compared against a wide range of meta-heuristic algorithms for
training extreme learning machine and FLN classifiers. PSO-FLN outperformed the other learning approaches
in testing accuracy.
INDEX TERMS Fast learning network, KDD Cup 99, intrusion detection system, particle swarm
optimization.
I. INTRODUCTION
In recent years, computer network security has become a major concern
of the computing community due to the rapid development of
technologies and internet services. Developments in
computer technology have enabled various new possibilities,
including the ability to remotely manage and control systems,
as well as opening up a gateway to a multitude of information
through online sources. Organizational-level cyber security
has consequently become a chief concern. Goodarzi et al. [1]
explored the problems faced by organizations in keeping their
information protected, available, and reliable. This has created
the motivation for keeping systems secure from any external
system, program, or person aiming to break the security
line of the network. Many tools and applications have been
developed to increase the security of environments such as
systems, networks, and computers. The Intrusion Detection
System (IDS) is one of those tools; it tries to protect systems
from intruders. An IDS monitors a single machine or a computer
network for intruders [2]. It is useful not only in detecting
successful intrusions, but also in monitoring attempts to break
security, which provides important information for timely
countermeasures [3].
The initial proposal to use intrusion detection to address
misuse and network attacks on computers was
put forth by Denning [4] in 1987. The process is implemented
by an intrusion detection system. Presently, a wide variety of
such systems is available; however, [5] points out the general
ineffectiveness and insufficiency of the
present commercially available systems, which brings to light
the need for ongoing research on more dynamic intrusion
detection systems. To carry out intrusion
detection, ongoing or attempted intrusions or attacks on the
system or network must be identified. This identification
process includes data collection, behavior classification, data
reduction, and lastly reporting and response; it is referred
to as ID [6].
An IDS attempts to determine whether monitored user
activity or network traffic is malicious. If a malicious attack
is detected, an alarm is generated. Various
techniques are available for IDSs to distinguish an
attack, such as anomaly detection or attack signatures;
Green et al. [7] also point out that the success of an IDS
depends upon these techniques. One of the principal
factors governing the efficacy of an IDS is the quality of the
VOLUME 6, 2018
2169-3536 2018 IEEE. Translations and content mining are permitted for academic research only.
Personal use is also permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
20255

M. H. Ali et al.: New Intrusion Detection System Based on FLN and PSO
feature construction and feature selection algorithms. To
improve the overall efficiency of the IDS, the
number of traffic features used must be reduced without
adversely affecting classification accuracy.
In recent times, there has been a rapid
increase in the use of Artificial Intelligence (AI)
in a large number of fields, such
as computer vision, robotics, control, communication, and
various engineering fields. AI comprises several subfields,
such as neural networks, evolutionary search, expert
systems, and fuzzy systems. Although many researchers
prefer AI models with interpretability, such as heuristic
knowledge-based models like fuzzy systems,
artificial neural networks (ANNs), which have no explicit
interpretability, are considered more effective AI
models when a learning scheme is feasible. This is due to
their power to capture knowledge from the examples
provided to them. This has created a strong motivation
for researchers to build supervised learning models
that predict intrusion attacks based on collected data sets of
examples of various attacks. A very large number
of methods have been used in different
intrusion-detection models to perform a diverse set of
important tasks, including machine
learning-based, hybrid ANN-based, and integrated
techniques. Additionally, as presented by Kiranyaz et al. [8],
hybrid data mining schemes, hierarchical hybrid
intelligent system models, and ensemble learning approaches
have all gained popularity in the works reviewed.
The remainder of the present work is arranged as follows.
Section II reviews related work. Section III describes
the KDD99 data set. Section IV formulates the problem.
Section V presents the developed methodology. Section VI gives
the results and discussion. Section VII concludes and
summarizes the work.
II. RELATED WORK
Artificial Neural Networks (ANNs), given their
ability to approximate complex nonlinear mappings directly
from input patterns, have been frequently used in a variety of
applications with great success [7], [9]. The free parameters
of ANNs are typically determined from training samples using
gradient descent algorithms. This, however, brings
issues related to local minima and a relatively slow
learning process. Owing to these shortcomings, training
ANNs can take considerable time and yield a suboptimal
solution [8]. Reducing the number of computing iterations
while simultaneously decreasing the training time has
therefore become a hot topic [9]–[11].
To address the aforementioned problems,
Huang et al. [12] proposed a new artificial neural
network known as the Extreme Learning Machine (ELM).
ELM is a learning approach for the Single Hidden
Layer Feedforward Neural Network (SLFN) in which the
input weights and the biases of the hidden nodes are
generated randomly without tuning, and the output weights
are determined analytically.
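
The ELM idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the tanh activation, the uniform range [-1, 1] for the random weights, the function names `elm_train` and `elm_predict`, and the toy two-blob data are all choices made here for illustration.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng):
    """Train an SLFN the ELM way: random hidden layer, analytic output layer."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn at random and never tuned.
    # (The range [-1, 1] is an assumption of this sketch.)
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer output matrix
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Return raw output scores; argmax over columns gives the class."""
    return np.tanh(X @ W + b) @ beta

# Toy two-class data (illustrative only, not KDD99): two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.5, size=(50, 2)),
               rng.normal(1.0, 0.5, size=(50, 2))])
y = np.repeat([0, 1], 50)
T = np.eye(2)[y]  # one-hot targets

W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
train_accuracy = float((pred == y).mean())
print("training accuracy:", train_accuracy)
```

No iterative weight update appears anywhere: the only "training" is a single pseudoinverse, which is where the speed of ELM comes from.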
The extreme learning machine, as explored by Huang et al.
(2004), avoids several disadvantages of gradient descent-based
learning algorithms for SLFNs. Research on the approximation
abilities of Feed-Forward Neural Networks (FFNNs)
focuses on two primary aspects: universal approximation
on compact input sets and approximation on a finite set of
training samples [12]. Some general advantages of ELM
algorithms are a simple and robust implementation, a tendency
to converge to the smallest training error and the smallest norm
of weights, and generally good performance with extremely
fast running times. These properties, among others, help
differentiate ELM from other SLFN algorithms.
The ELM algorithm trains in three steps: first,
assigning random weights to the input-hidden layer; second,
calculating the hidden layer output matrix; and third,
calculating the output layer weights using the Moore-Penrose
generalized inverse [11]. Based on the idea of ELM, Li et al. [13]
proposed a novel Fast Learning Network (FLN). The FLN is
a Double Parallel Forward Neural Network (DPFNN) [14],
which is essentially a parallel connection of a multilayer
FFNN and an SLFN. In a DPFNN, the output nodes receive both
the re-coded external information from the hidden nodes and
the external information itself directly from the input
nodes. In the FLN, the input weights and the hidden layer biases
are generated randomly, while an
analytical approach based on the least squares method is used
to determine the weights connecting the
output layer to the hidden layer and the weights
connecting the output nodes directly to the input nodes. Compared
with related methods, the FLN is capable of reaching
good generalization performance at high speed, with impressive
stability in most scenarios, while using a smaller
number of hidden units.
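
The difference between the FLN and a plain ELM-trained SLFN can be illustrated with a short sketch. It is a simplified reading of the description above, under several assumptions made here: a linear output layer, a tanh hidden activation, an added output bias column, and the hypothetical names `fln_design_matrix`, `fln_train`, and `fln_predict`. The key point is that the output nodes see the raw inputs and the hidden outputs side by side, and all output-side weights are solved in one least-squares step.

```python
import numpy as np

def fln_design_matrix(X, W_in, b):
    """Stack raw inputs, hidden outputs, and a bias column side by side."""
    G = np.tanh(X @ W_in + b)            # re-coded information from hidden nodes
    ones = np.ones((X.shape[0], 1))      # output bias (an assumption of this sketch)
    return np.hstack([X, G, ones])

def fln_train(X, T, n_hidden, rng):
    n_features = X.shape[1]
    # Random, untuned input weights and hidden biases, as in ELM.
    W_in = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    A = fln_design_matrix(X, W_in, b)
    # One least-squares solve yields both the input-to-output and the
    # hidden-to-output weights (rows of W_out, in that order).
    W_out = np.linalg.pinv(A) @ T
    return W_in, b, W_out

def fln_predict(X, W_in, b, W_out):
    return fln_design_matrix(X, W_in, b) @ W_out

# Toy two-class data (illustrative only, not KDD99).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.5, size=(50, 2)),
               rng.normal(1.0, 0.5, size=(50, 2))])
y = np.repeat([0, 1], 50)
T = np.eye(2)[y]

W_in, b, W_out = fln_train(X, T, n_hidden=5, rng=rng)
fln_accuracy = float((fln_predict(X, W_in, b, W_out).argmax(axis=1) == y).mean())
print("training accuracy:", fln_accuracy)
```

Because the direct input-to-output path already captures the linear part of the mapping, the hidden layer only has to model what remains, which is one plausible reason a smaller number of hidden units can suffice.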
To build an effective and reliable ANN-based
intrusion-detection system, a comprehensive data set is
needed for teaching the ANN model.
Although several data sets for such
knowledge building exist in the literature, there is a significant challenge that
needs to be addressed in this respect. More specifically, most
of the data sets do not provide enough examples for teaching
the models explicitly, because
some attacks occur infrequently. This raises the question of how to rely
on the few available examples of attacks in order
to build generalizable knowledge that AI models can use
to detect similar attacks that were not stored. One
common dataset used for training models on intrusion attacks
is KDD99.
Although the ELM approach to training both SLFN and
FLN is quite easy and provides non-iterative learning for the
model, it has one important limitation: it has
an infinite number of degrees of freedom for reaching a
classification result. In other words, there is no single deterministic
solution for training an SLFN network with basic ELM training.
If the possible weights of the input-hidden layer
connections are regarded as potential solutions for training the ELM,
some sets of solutions are superior to others when the
goal is to obtain the best knowledge extraction from the data set.
We call the process of finding those solutions, based upon an
extension of ELM, a developed ELM. Our goal is to design a
learning mechanism based on two factors: the nature of the
data set, and the nature of the evaluation measures that
will be used to evaluate the learning mechanism or
algorithm.
III. DATA SET KDD99
ANN-based intrusion detection has to be trained on a selected
dataset. To demonstrate the effectiveness of our
model, we choose KDD99, the most cited dataset in
the intrusion detection literature. Furthermore, we present the
issues with this dataset that are addressed in the literature.
A. OVERVIEW OF KDD99
KDD Cup 99 is considered the most accepted research dataset
and is highly appropriate for benchmarking performance; [17] also notes
its use in comparing the effectiveness of various approaches
to network intrusion detection. KDD Cup 99 is built on the data
captured in the DARPA'98 IDS program [18]. DARPA'98
contains approximately 4 GB of compressed raw (binary)
tcpdump data, covering roughly 7 weeks of monitored
network traffic. This data can be processed into
about 5 million connection records, each about 100 bytes. The KDD
training data set consists of approximately 4,900,000 single
connection vectors, each of which contains 41 features and
is labeled as either normal or an attack [19]. Each attack
falls into exactly one of four categories, as detailed
below:
Denial of Service Attack (DoS): an attack in which the attacker
makes resources too busy to handle other
requests, or consumes specific resources to an
extent that denies access to legitimate users.
User to Root Attack (U2R): a form of security exploitation
whereby the attacker first gains access to a normal user
account through conventional means, and thereafter
attempts to gain root access to the system by exploiting
a vulnerability.
Remote to Local Attack (R2L): an attack in which an
attacker attempts to access a system over a network.
The attacker can only transmit data packets over the
network and attempts to gain access to the machine
by exploiting some vulnerability.
Probing Attack (Probe): an attack in which an attacker attempts to
acquire information about a network in order to evade the system's
security protocols.
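
The four categories above can be made concrete by mapping individual connection labels to their category. The sketch below is illustrative: the attack names are the labels commonly listed for the public KDD Cup 99 training set and should be checked against the dataset's own documentation, the helper `categorize` is hypothetical, and the `sample_labels` list is invented toy input, not real dataset statistics.

```python
from collections import Counter

# Attack label -> category, as commonly listed for the KDD Cup 99 training set
# (an assumption of this sketch; verify against the dataset documentation).
ATTACK_CATEGORY = {
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "buffer_overflow": "U2R", "loadmodule": "U2R",
    "perl": "U2R", "rootkit": "U2R",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L",
    "multihop": "R2L", "phf": "R2L", "spy": "R2L",
    "warezclient": "R2L", "warezmaster": "R2L",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
}

def categorize(label):
    """Map a raw label such as 'smurf.' to normal, DoS, U2R, R2L, or Probe."""
    name = label.strip().rstrip(".")  # raw labels carry a trailing period
    if name == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(name, "unknown")

# Toy input (invented for illustration only).
sample_labels = ["normal.", "smurf.", "smurf.", "neptune.",
                 "ipsweep.", "guess_passwd.", "rootkit."]
counts = Counter(categorize(label) for label in sample_labels)
print(counts)
```

Counting categories this way on a real training file is a quick check of how unevenly the classes are represented before any model is trained.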
Since 1999, a large number of researchers have assessed their
IDS models using KDD Cup 99. KDD
Cup 99 has thus been a working benchmark data set for over
15 years, and it is still easily accessible today.
The objective of the KDD 99 IDS competition was to create a
standard data set for the surveying and evaluation of research
in intrusion detection [15]. Researchers have encountered some
difficulties in training with KDD99. Olusola et al. [16]
analyzed the KDD 99 data set in order to select relevant
features. They reported that some features or attributes were
not related to any attack [17], and they used 10% of the
whole data set to perform their analysis.
IV. FORMULATING THE PROBLEM
Intrusion detection based on ANN is built using features
gathered about several types of attacks. Usually, building
knowledge from gathered data requires a sufficient amount
of comprehensive data. Unfortunately, in
intrusion detection it is not feasible to gather
sufficient knowledge for learning, or at least for balanced learning
between the different classes (refer to the problem described
for KDD99 in the previous section). Therefore, the learning
algorithm has to be carefully optimized according to the nature of
the dataset. This leads us to investigate how to identify
the optimization parameters of the learning algorithm. In this
work, the problem is formulated as an optimization
problem. More specifically, the problem is how to find the
optimal weight values of the hidden layer neurons in both SLFN
and FLN in order to obtain the highest testing accuracy.
Such a problem is addressed in the literature as a heuristic
search in the space of solutions, where the aim is
to maximize an objective function that represents the accuracy
of the classification of attacks. Mathematically, let
the testing accuracy be the function f(x), where
x = (x1, x2, ..., xn) denotes the randomly selected
weights of the hidden layer of the network. Our problem is presented
in equation (1).
$$
x^{*} = \arg\max_{x} f(x) \quad \text{s.t.} \quad (x_1, x_2, \ldots, x_n) \in [-1, 1]^n \tag{1}
$$
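
One way to read the objective in equation (1) is as a function that decodes a candidate vector x into hidden-layer parameters, solves the output weights analytically, and returns the testing accuracy. The sketch below follows that reading for an ELM-trained SLFN. Several points are assumptions of this sketch rather than statements from the paper: the hidden biases are packed into x alongside the weights, the activation is tanh, the names `make_objective` and `f` are hypothetical, and the toy blob data stands in for KDD99.

```python
import numpy as np

def make_objective(X_train, y_train, X_test, y_test, n_hidden, n_classes):
    """Build f(x): testing accuracy of an ELM whose hidden layer is given by x."""
    n_features = X_train.shape[1]
    n_params = (n_features + 1) * n_hidden  # weights plus biases (assumption)
    T = np.eye(n_classes)[y_train]          # one-hot training targets

    def f(x):
        # Enforce the box constraint x in [-1, 1]^n from equation (1).
        x = np.clip(np.asarray(x, dtype=float), -1.0, 1.0)
        W = x[: n_features * n_hidden].reshape(n_features, n_hidden)
        b = x[n_features * n_hidden:]
        # Output weights are still solved analytically, as in basic ELM.
        beta = np.linalg.pinv(np.tanh(X_train @ W + b)) @ T
        pred = (np.tanh(X_test @ W + b) @ beta).argmax(axis=1)
        return float((pred == y_test).mean())

    return f, n_params

# Toy two-class data (illustrative only), split into train and test halves.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1.0, 0.5, size=(60, 2)),
               rng.normal(1.0, 0.5, size=(60, 2))])
y = np.repeat([0, 1], 60)
f, n_params = make_objective(X[::2], y[::2], X[1::2], y[1::2],
                             n_hidden=5, n_classes=2)

x_candidate = rng.uniform(-1.0, 1.0, size=n_params)
accuracy = f(x_candidate)
print("f(x) for one random candidate:", accuracy)
```

Basic ELM evaluates f at a single random x and stops; treating f as an objective lets a search procedure compare many candidates and keep the best one.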
V. DEVELOPED METHODOLOGY
This section presents the developed methodology for this
research. First, particle swarm optimization (PSO) is
presented. Second, the Fast Learning Network (FLN)
is presented. Third, our adaptation of
PSO to build FLN-based training for IDS is presented.
A. PARTICLE SWARM OPTIMIZATION
Particle Swarm Optimization (PSO) is a parallel evolutionary
computation technique, originally introduced by Kennedy and
Eberhart and described by Mishra and
Sengupta [23]. The algorithm was developed based on a
social behavior metaphor. The performance of the PSO algorithm
is strongly influenced by its tuning parameters, which govern what is often
referred to as the exploration-exploitation tradeoff.
Exploration describes the ability to assess various regions of
the problem space in an attempt to pinpoint a good optimum,
preferably the global one. Exploitation describes the ability
to focus the search in the near vicinity of a promising candidate
solution in order to locate the optimum effectively and quickly.
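
A minimal sketch of a standard global-best PSO shows where the tuning parameters enter. It is not the paper's PSO-FLN configuration: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, the iteration count, the function name `pso_maximize`, and the toy objective are all assumptions chosen for illustration. The search box [-1, 1]^n matches the constraint in equation (1).

```python
import random

def pso_maximize(f, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, lo=-1.0, hi=1.0, seed=0):
    """Global-best PSO that maximizes f over the box [lo, hi]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # each particle's best position so far
    pbest_val = [f(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]                              # inertia
                             + c1 * r1 * (pbest[i][d] - pos[i][d])      # own memory
                             + c2 * r2 * (gbest[d] - pos[i][d]))        # swarm memory
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_objective(x):
    """Concave toy function with its maximum (value 0) at x = (0.3, ..., 0.3)."""
    return -sum((xi - 0.3) ** 2 for xi in x)

best_x, best_val = pso_maximize(toy_objective, dim=3)
print(best_x, best_val)
```

A larger `w` keeps particles moving along their current direction and favors exploration, while larger `c1` and `c2` pull them toward known good positions and favor exploitation. Replacing `toy_objective` with a testing-accuracy function over hidden-layer weights gives the general shape of PSO-driven training.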
