UCO: A Unified Cybersecurity Ontology

   

Zareen Syed, Ankur Padia, Tim Finin, Lisa Mathews and Anupam Joshi
University of Maryland, Baltimore County, Baltimore, MD 21250
{zsyed, ankurpadia, finin, math1, joshi}@umbc.edu
Abstract
In this paper we describe the Unified Cybersecurity Ontology
(UCO), which is intended to support information integration
and cyber situational awareness in cybersecurity systems. The
ontology incorporates and integrates heterogeneous data and
knowledge schemas from different cybersecurity systems and
from the most commonly used cybersecurity standards for
information sharing and exchange. UCO has also been mapped
to a number of existing cybersecurity ontologies as well as to
concepts in the Linked Open Data cloud (Berners-Lee, Bizer,
and Heath 2009). Similar to DBpedia (Auer et al. 2007), which
serves as the core for general knowledge in the Linked Open
Data cloud, we envision UCO serving as the core for the
cybersecurity domain, evolving and growing over time as
additional cybersecurity data sets become available. We also
present a prototype system and concrete use cases supported
by UCO. To the best of our knowledge, this is the first
cybersecurity ontology that has been mapped to general world
ontologies to support broader and more diverse security use
cases. We compare the resulting ontology with previous
efforts, discuss its strengths and limitations, and describe
potential directions for future work.
Introduction
Cybersecurity data and information are usually generated by
different tools, sensors and systems, expressed using
different standards and formats, published by different
sources, and often scattered as isolated pieces of
information. Furthermore, cybersecurity data is available in
structured, semi-structured and unstructured forms from both
internal sources (within the organization) and external
sources (outside the organization). Unifying such scattered
information will provide better visibility and situational
awareness to cybersecurity analysts. Such integration can
also support deep investigations and help analysts move from
a reactive approach to a more proactive and eventually a
predictive approach.
Semantic Web technologies provide representation languages
for building a common framework that allows data to be
shared, integrated and reused across application, enterprise
and community boundaries. Languages such as RDF and OWL
represent the semantics of an entity as a set of things or
concepts rather than as strings of words. They provide rich
constructs to represent information that is not only machine
readable but also machine understandable, thus facilitating
semantic integration and sharing of information from
heterogeneous sources. Languages like OWL have well-defined
constructs for mapping classes and instances in an internal
knowledge base to the corresponding classes and instances in
external knowledge bases. This mapping exposes a larger pool
of knowledge and helps provide a more complete picture and
better situational awareness.
Copyright © 2016, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Figure 1: Things vs. Strings. “Strings” are ambiguous and
can refer to different concepts in the real world. “Things” are
precise and reference unique concepts using unique identi-
fiers, such as Web URIs.
Semantic Web technologies represent real world entities as
concepts rather than strings, because strings are lexical and
ambiguous. Each concept is associated with a globally unique
identifier called a URI. For example, the string “Georgia”
may refer to the state of Georgia in the United States or to
the country of Georgia (Figure 1). Moreover, concepts can be
associated with attributes and can have relations with other
concepts. These attributes and relations can be used to build
up a context for the concept. An entity like “Georgia
(country)” can have “longitude” and “latitude” as attributes,
which provide information about its location on the map and
its neighboring countries. Such information can be used to
derive inferences about the possible source of an attack. For
example, if an incident originates from “Georgia (country)”
and its neighboring country is Russia, then it may raise more
alarms if many cybersecurity attacks have originated from
Russia in the past (Figure 2). Furthermore, these relations
can help in connecting the dots and relating an incident to
similar incidents to gain insight into the source and
motivation of the attack.
The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence
Artificial Intelligence for Cyber Security: Technical Report WS-16-03
Figure 2: Semantic relations enable support for complex
security use cases. For example, if “Georgia (country)” has a
“neighbor” relation with “Russia”, it may raise more alarms
if several past incidents originated from Russia.
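The “things vs. strings” idea and the neighbor-based inference can be sketched in a few lines of Python. This is only an illustrative sketch: the `example.org` URIs, the `neighbor` predicate name, the incident counts and the threshold are all assumptions made up for the example, not part of UCO or of any real data set.

```python
# Hypothetical URIs; real systems would reuse identifiers from a
# knowledge base such as DBpedia.
GEORGIA_STATE = "http://example.org/place/Georgia_(U.S._state)"
GEORGIA_COUNTRY = "http://example.org/place/Georgia_(country)"
RUSSIA = "http://example.org/place/Russia"

# The ambiguous string "Georgia" maps to two distinct concepts.
LABEL_INDEX = {"Georgia": {GEORGIA_STATE, GEORGIA_COUNTRY}}

# Relations between concepts as (subject, predicate, object) triples.
FACTS = {
    (GEORGIA_COUNTRY, "neighbor", RUSSIA),
}

# Assumed number of past incidents per origin (illustrative values).
PAST_INCIDENTS = {RUSSIA: 12}


def neighbors(concept):
    """Return all concepts linked to `concept` by the neighbor relation."""
    return {o for (s, p, o) in FACTS if s == concept and p == "neighbor"}


def raises_more_alarms(origin, threshold=10):
    """True if the origin or one of its neighbors has many past incidents."""
    candidates = {origin} | neighbors(origin)
    return any(PAST_INCIDENTS.get(c, 0) >= threshold for c in candidates)
```

With these assumed facts, an incident attributed to the country concept triggers the heightened alert, while one attributed to the U.S. state does not, even though both share the label “Georgia”.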
Semantic technologies are used by big data companies
like Google, Microsoft, Facebook and Apple (Domingue,
Fensel, and Hendler 2011) for information sharing and in-
teroperability and supporting high level functions like an-
alyzing queries, providing semantic search and answering
questions. In order to achieve situational awareness, cyber-
security systems need to transition to produce and consume
semantic information about likely entities, relations, actions,
events, intentions and plans.
We have developed the Unified Cybersecurity Ontology (UCO) as
an effort to help evolve cybersecurity standards from a
syntactic representation to a more semantic representation.
Our work makes several contributions:
1. UCO provides a common understanding of the cybersecurity
domain and unifies the most commonly used cybersecurity
standards.
2. Unlike existing cybersecurity ontologies, which have been
developed independently, UCO has been mapped to a number of
publicly available cybersecurity ontologies to promote
ontology sharing, integration and reuse. UCO serves as a
backbone for linking cybersecurity ontologies.
3. UCO maps concepts to general world knowledge sources, i.e.
the Linked Open Data cloud, to support diverse use cases.
4. We describe important use cases that can be supported by
unifying cybersecurity data with existing general world
knowledge through UCO.
5. We have generated a catalog of cybersecurity standards
that is available online (see footnote 1).
This paper is organized as follows: In section 2 we briefly
introduce RDF and a subset of OWL language, OWL DL.
In section 3 we outline our approach for ontology construc-
tion and describe the UCO ontology along with other related
ontologies. Section 4 presents the design and implementa-
tion of a demonstration system with real world cybersecu-
rity data that uses the UCO ontology to support a number of
use cases. We review related work in section 5 and conclude
with a summary for future work in section 6.
Preliminaries
Resource Description Framework (RDF)
The Resource Description Framework (see footnote 2) is a W3C
standard for representing knowledge as a semantic graph in
which the nodes represent entities, concepts or literal
values and the arcs represent relations. Thus, we can think
of a knowledge base as a collection of triples, each with a
subject, a predicate and an object. The subject is usually
the entity that is being represented. The predicate
represents an attribute or a relation of the subject and
associates it with an object. The object can be a literal or
a resource. Typically, each resource is identified by a URI.
An example of an RDF triple is < John, studiesAt, School >.
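To make the triple model concrete, the following minimal sketch stores a knowledge base as a set of subject, predicate, object tuples and queries it by pattern. It uses plain Python rather than an RDF library; the `http://example.org/` namespace, the `name` predicate and the `match` helper are hypothetical and serve only as illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # either a resource URI or a literal value


EX = "http://example.org/"  # hypothetical namespace for this sketch

# A tiny knowledge base: one resource-valued and one literal-valued triple.
graph = {
    Triple(EX + "John", EX + "studiesAt", EX + "School"),
    Triple(EX + "John", EX + "name", "John"),
}


def match(graph, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    )
```

For example, `match(graph, p=EX + "studiesAt")` returns the single triple stating where John studies, and `match(graph, s=EX + "John")` returns everything known about John.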
OWL DL
OWL DL (see footnote 3) is a sublanguage of OWL based on
Description Logics, a decidable fragment of First Order
Logic used for knowledge representation. OWL DL is a W3C
standard for representing knowledge and is more expressive
than RDF. Formal definitions of some of the constructs used
in DL are shown in Table 1. Further details on the constructs
can be found in (Baader 2003).
Approach
Our approach to supporting cyber situational awareness has
been to develop a core cybersecurity ontology that
facilitates data sharing across different formats and
standards and allows reasoning to infer new information. We
have surveyed, reviewed and cataloged existing cybersecurity
standards and ontologies and selected the most common and
widely used standards to incorporate into UCO. In this
section, we first briefly outline the advantages of using
Semantic Web languages and then describe the UCO ontology
along with its design considerations. In the next section we
describe how linking cybersecurity information to external
knowledge sources makes it feasible to support diverse and
complex use cases.
1. http://tinyurl.com/ptqkzpq
2. http://www.w3.org/RDF/
3. http://www.w3.org/TR/owl-guide/
Table 1: Syntax and Semantics of Description Logic constructors
Name | Syntax | Semantics | Symbol
Top | ⊤ | Δ^I | AL
Bottom | ⊥ | ∅ | AL
Intersection | C ⊓ D | C^I ∩ D^I | AL
Union | C ⊔ D | C^I ∪ D^I | U
Negation | ¬C | Δ^I \ C^I | C
Value restriction | ∀R.C | {a ∈ Δ^I | ∀b. (a,b) ∈ R^I → b ∈ C^I} | AL
Existential quant. | ∃R.C | {a ∈ Δ^I | ∃b. (a,b) ∈ R^I ∧ b ∈ C^I} | E
Nominal | I | I^I ⊆ Δ^I with |I^I| = 1 | O
Qualified number restriction (less than or equal to) | ≤ nR.C | {a ∈ Δ^I | |{b ∈ Δ^I | (a,b) ∈ R^I ∧ b ∈ C^I}| ≤ n} | Q
Qualified number restriction (equal to) | = nR.C | {a ∈ Δ^I | |{b ∈ Δ^I | (a,b) ∈ R^I ∧ b ∈ C^I}| = n} | Q
Qualified number restriction (greater than or equal to) | ≥ nR.C | {a ∈ Δ^I | |{b ∈ Δ^I | (a,b) ∈ R^I ∧ b ∈ C^I}| ≥ n} | Q
Role hierarchy | R1 ⊑ R2 | {(a,b) ∈ Δ^I × Δ^I | (a,b) ∈ R1^I → (a,b) ∈ R2^I} | H
Role inverse | R⁻ | {(b,a) ∈ Δ^I × Δ^I | (a,b) ∈ R^I} | I
Role composition | R1 ∘ R2 | {(a,c) | ∃b. (a,b) ∈ R1^I ∧ (b,c) ∈ R2^I} | R
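The set-theoretic semantics of these constructors can be checked by hand on a small finite interpretation. The sketch below is an illustration only: the domain, the extensions of the concepts C and D, and the extension of the role R are arbitrary assumptions, and the function names are invented for this example rather than taken from any reasoner.

```python
# A finite interpretation: domain, concept extensions and a role extension.
DOMAIN = {"a", "b", "c"}
C = {"a", "b"}
D = {"b", "c"}
R = {("a", "b"), ("a", "c"), ("b", "b")}


def successors(role, x):
    """All y such that (x, y) is in the role extension."""
    return {y for (s, y) in role if s == x}


def intersection(c, d):
    return c & d


def union(c, d):
    return c | d


def negation(c):
    return DOMAIN - c


def value_restriction(role, c):
    """Elements all of whose role successors lie in c (vacuously true if none)."""
    return {x for x in DOMAIN if successors(role, x) <= c}


def existential(role, c):
    """Elements with at least one role successor in c."""
    return {x for x in DOMAIN if successors(role, x) & c}


def at_most(n, role, c):
    """Elements with at most n role successors in c."""
    return {x for x in DOMAIN if len(successors(role, x) & c) <= n}


def inverse(role):
    return {(y, x) for (x, y) in role}
```

Note that an element without any R-successors, such as "c" here, satisfies every value restriction but no existential restriction, which is a common source of confusion when reading the two rows of the table.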
Advantages of Semantic Web Languages
RDF represents information as a directed graph and is
unambiguous compared to XML, which is tree based and allows
multiple representations of the same information. Because
RDF and OWL have formal semantics grounded in First Order
Logic, they are preferable for dealing with security
situations. RDF and OWL follow a decentralized philosophy
that allows knowledge to be built incrementally, shared and
reused. For example, properties can be defined separately
from classes (unlike in Object Oriented Programming). OWL
facilitates information integration by providing rich
semantic constructs for schema mapping, such as Sub Class,
Sub Property, Equivalent Class, Equivalent Property, Same As,
Union Of and Intersection Of, to represent complex facts
(Table 1). Furthermore, powerful off-the-shelf reasoners
exist for OWL, which enable detecting inconsistencies during
data sharing. For example, if two classes, “Malware” and
“Virus”, are constrained to be disjoint and data sets
imported from different sources assert that the same software
is both a Malware and a Virus, the reasoner will infer an
inconsistency. Semantic Web technologies are well
established, and powerful reasoners are available both as
Open Source Software and as commercial products.
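The disjointness example can be sketched as a simple consistency check over class assertions merged from two sources. This is a toy stand-in for what an OWL reasoner does, not a reasoner itself; the file names and the `find_inconsistencies` helper are hypothetical, and the disjointness of “Malware” and “Virus” is simply the constraint assumed in the text above.

```python
# Disjointness constraint taken from the example in the text.
DISJOINT_PAIRS = [("Malware", "Virus")]

# Class assertions (individual, class) imported from two hypothetical sources.
source_a = {("sample.exe", "Malware")}
source_b = {("sample.exe", "Virus"), ("tool.exe", "Virus")}


def find_inconsistencies(assertions, disjoint_pairs):
    """Return (individual, class1, class2) for every violated disjointness."""
    types = {}
    for individual, cls in assertions:
        types.setdefault(individual, set()).add(cls)
    conflicts = []
    for individual, classes in sorted(types.items()):
        for first, second in disjoint_pairs:
            if first in classes and second in classes:
                conflicts.append((individual, first, second))
    return conflicts
```

Each source is consistent on its own; the conflict only appears once the two sets of assertions are merged, which is the situation that arises when data is shared across organizations.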
Unified Cybersecurity Ontology (UCO)
The Unified Cybersecurity Ontology (UCO) is an extension of
the Intrusion Detection System (IDS) ontology (Undercoffer et
al. 2004) developed earlier by our group to describe events
related to cybersecurity. Our group has been working on a
number of projects that focus on individual components of a
unified cybersecurity framework to analyze different data
streams and assert facts in a triple store (Undercoffer et
al. 2004; More et al. 2012; Mulwad et al. 2011). UCO is
essential for unifying information from heterogeneous sources
and for supporting reasoning and rule writing. The ontology
supports inferring new information from existing information.
It also supports capturing the specialized knowledge of a
cybersecurity analyst, which can be expressed using ontology
classes and terms as well as rules. Rules are used to infer
new information that cannot be derived by an OWL reasoner
alone. Figure 3 demonstrates a generic rule to infer an
attack and alert the host. The rule uses terms from UCO to
connect information within the organization with external
information available on the web. The rule states that if a
web text description contains some vulnerability terms,
mentions some security exploit, and mentions a certain
product (with some specific version) and some process which
is being executed and which is also logged by the scanner,
and an out-bound port has been opened, then there is a
possibility of an attack on the host system with the Means
and Consequences mentioned in the ontology.
Figure 3: The UCO ontology facilitates writing generic rules
and combining evidence from multiple sources.
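The rule described above can be approximated as a plain Python predicate that combines evidence from web text with evidence observed on the host. This is a rough sketch and not the rule shown in Figure 3: the class and field names are invented for illustration, and checking that the mentioned product version is installed on the host is an added assumption about how the text evidence would be tied to a specific machine.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class WebTextEvidence:
    """Facts extracted from a web text description (hypothetical structure)."""
    vulnerability_terms: Set[str]
    exploit: Optional[str]
    product: str
    version: str
    process: str


@dataclass
class HostEvidence:
    """Facts observed inside the organization (hypothetical structure)."""
    host: str
    installed: Dict[str, str]      # product name -> installed version
    scanner_logged_processes: Set[str]
    open_outbound_ports: Set[int]


def possible_attack(text: WebTextEvidence, host: HostEvidence) -> bool:
    """True when all conditions of the rule hold for this host."""
    return (
        bool(text.vulnerability_terms)
        and text.exploit is not None
        # Assumption: the mentioned product and version must be on the host.
        and host.installed.get(text.product) == text.version
        and text.process in host.scanner_logged_processes
        and bool(host.open_outbound_ports)
    )
```

A rule engine over a triple store would express the same conjunction declaratively with UCO terms; the point of the sketch is only that the conclusion depends on evidence from both external and internal sources.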
UCO provides a common understanding of the cybersecurity
domain. Among all cybersecurity standards and formats, STIX
(Structured Threat Information eXpression) (Barnum 2012) is
the most comprehensive effort to unify cybersecurity
information sharing, and it enables extensions by
incorporating vocabulary from several other standards.
However, in STIX the information is represented in XML and
therefore cannot support the reasoning that is supported by
UCO. We have created Unified Cybersecurity Ontol-