UCO: A Unified Cybersecurity Ontology
VerifiedAdded on 2022/11/13
|8
|5971
|495
AI Summary
The Unified Cybersecurity Ontology (UCO) is intended to support information integration and cyber situational awareness in cybersecurity systems. The ontology incorporates and integrates heterogeneous data and knowledge schemas from different cybersecurity systems and most commonly used cybersecurity standards for information sharing and exchange. The UCO ontology has also been mapped to a number of existing cybersecurity ontologies as well as concepts in the Linked Open Data cloud. We also present a prototype system and concrete use cases supported by the UCO ontology.
Contribute Materials
Your contribution can guide someone’s learning journey. Share your
documents today.
UCO: A Unified Cybersecurity Ontology
Zareen Syed, Ankur Padia, Tim Finin, Lisa Mathews and Anupam Joshi
University of Maryland, Baltimore County, Baltimore, MD 21250
{zsyed, ankurpadia, finin, math1, joshi}@umbc.edu
Abstract
In this paper we describe the Unified Cybersecurity On-
tology (UCO) that is intended to support information in-
tegration and cyber situational awareness in cybersecu-
rity systems. The ontology incorporates and integrates
heterogeneous data and knowledge schemas from dif-
ferent cybersecurity systems and most commonly used
cybersecurity standards for information sharing and ex-
change. The UCO ontology has also been mapped to a
number of existing cybersecurity ontologies as well as
concepts in the Linked Open Data cloud (Berners-Lee,
Bizer, and Heath 2009). Similar to DBpedia (Auer et
al. 2007) which serves as the core for general knowl-
edge in Linked Open Data cloud, we envision UCO to
serve as the core for cybersecurity domain, which would
evolve and grow with the passage of time with addi-
tional cybersecurity data sets as they become available.
We also present a prototype system and concrete use
cases supported by the UCO ontology. To the best of our
knowledge, this is the first cybersecurity ontology that
has been mapped to general world ontologies to sup-
port broader and diverse security use cases. We compare
the resulting ontology with previous efforts, discuss its
strengths and limitations, and describe potential future
work directions.
Introduction
Cybersecurity data and information is usually generated by
different tools, sensors and systems expressed using differ-
ent standards and formats, published by different sources
and is often scattered as isolated pieces of information.
Furthermore, cybersecurity data is available in structured,
semi-structured and unstructured forms from both, internal
sources i.e. within the organization, and external sources i.e.
outside the organization. Unifying such scattered informa-
tion will provide better visibility and situational awareness
to cybersecurity analysts. Also, such integration can support
deep investigations and help transitioning from reactive ap-
proach to a more proactive and eventually a predictive ap-
proach.
Semantic Web technologies provide representation lan-
guages to build a common framework that allows data to
Copyright c 2016, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
be shared, integrated and reused across applications, enter-
prises as well as community boundaries. Languages, such as
RDF and OWL, represent the semantics of an entity as a set
of things or concepts rather than strings of words. They pro-
vide rich constructs to represent information that is not only
machine readable, but also machine understandable, thus fa-
cilitating semantic integration and sharing of information
from heterogeneous sources. Languages like OWL have well
defined constructs to map classes and instances present in
the internal knowledge base to corresponding classes and in-
stances in external knowledge bases. This mapping exposes
a larger pool of knowledge and helps in providing a more
complete picture and situational awareness.
Figure 1: Things vs. Strings. “Strings” are ambiguous and
can refer to different concepts in the real world. “Things” are
precise and reference unique concepts using unique identi-
fiers, such as Web URIs.
Semantic Web technologies represent real world entities
as concepts rather than strings, as strings are lexical and
ambiguous. Concepts are associated with a globally unique
identifier called URI. For example, the string “Georgia” may
refer to “Georgia state” in the United States or “Georgia
country” (Figure 1). Moreover, concepts can be associated
with attributes and can have relations with other concepts.
These attributes and relations can be used to build up a con-
text for the concept. An entity like “Georgia country” can
have “longitude” and “latitude” as attributes, which provide
The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence
Artificial Intelligence for Cyber Security: Technical Report WS-16-03
195
Zareen Syed, Ankur Padia, Tim Finin, Lisa Mathews and Anupam Joshi
University of Maryland, Baltimore County, Baltimore, MD 21250
{zsyed, ankurpadia, finin, math1, joshi}@umbc.edu
Abstract
In this paper we describe the Unified Cybersecurity On-
tology (UCO) that is intended to support information in-
tegration and cyber situational awareness in cybersecu-
rity systems. The ontology incorporates and integrates
heterogeneous data and knowledge schemas from dif-
ferent cybersecurity systems and most commonly used
cybersecurity standards for information sharing and ex-
change. The UCO ontology has also been mapped to a
number of existing cybersecurity ontologies as well as
concepts in the Linked Open Data cloud (Berners-Lee,
Bizer, and Heath 2009). Similar to DBpedia (Auer et
al. 2007) which serves as the core for general knowl-
edge in Linked Open Data cloud, we envision UCO to
serve as the core for cybersecurity domain, which would
evolve and grow with the passage of time with addi-
tional cybersecurity data sets as they become available.
We also present a prototype system and concrete use
cases supported by the UCO ontology. To the best of our
knowledge, this is the first cybersecurity ontology that
has been mapped to general world ontologies to sup-
port broader and diverse security use cases. We compare
the resulting ontology with previous efforts, discuss its
strengths and limitations, and describe potential future
work directions.
Introduction
Cybersecurity data and information is usually generated by
different tools, sensors and systems expressed using differ-
ent standards and formats, published by different sources
and is often scattered as isolated pieces of information.
Furthermore, cybersecurity data is available in structured,
semi-structured and unstructured forms from both, internal
sources i.e. within the organization, and external sources i.e.
outside the organization. Unifying such scattered informa-
tion will provide better visibility and situational awareness
to cybersecurity analysts. Also, such integration can support
deep investigations and help transitioning from reactive ap-
proach to a more proactive and eventually a predictive ap-
proach.
Semantic Web technologies provide representation lan-
guages to build a common framework that allows data to
Copyright c 2016, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
be shared, integrated and reused across applications, enter-
prises as well as community boundaries. Languages, such as
RDF and OWL, represent the semantics of an entity as a set
of things or concepts rather than strings of words. They pro-
vide rich constructs to represent information that is not only
machine readable, but also machine understandable, thus fa-
cilitating semantic integration and sharing of information
from heterogeneous sources. Languages like OWL have well
defined constructs to map classes and instances present in
the internal knowledge base to corresponding classes and in-
stances in external knowledge bases. This mapping exposes
a larger pool of knowledge and helps in providing a more
complete picture and situational awareness.
Figure 1: Things vs. Strings. “Strings” are ambiguous and
can refer to different concepts in the real world. “Things” are
precise and reference unique concepts using unique identi-
fiers, such as Web URIs.
Semantic Web technologies represent real world entities
as concepts rather than strings, as strings are lexical and
ambiguous. Concepts are associated with a globally unique
identifier called URI. For example, the string “Georgia” may
refer to “Georgia state” in the United States or “Georgia
country” (Figure 1). Moreover, concepts can be associated
with attributes and can have relations with other concepts.
These attributes and relations can be used to build up a con-
text for the concept. An entity like “Georgia country” can
have “longitude” and “latitude” as attributes, which provide
The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence
Artificial Intelligence for Cyber Security: Technical Report WS-16-03
195
Secure Best Marks with AI Grader
Need help grading? Try our AI Grader for instant feedback on your assignments.
Figure 2: Semantic Relations enable supporting complex se-
curity use cases, for example, if “Georgia (country)” has
“neighbor” relation with “Russia” it may raise more alarms
if several past incidents originated from Russia.
information about its location on the map and its neighbor-
ing countries. Moreover, such information can be used to
derive inferences about possible source of attack. For exam-
ple, if an incident originates from “Georgia country” and it’s
neighboring country is Russia, then it may raise more alarms
if many cybersecurity attacks have originated from Russia in
the past (Figure 2). Furthermore, these relations can help in
connecting the dots and relating incidents with similar inci-
dents to gain insight into the source and motivation of the
attack.
Semantic technologies are used by big data companies
like Google, Microsoft, Facebook and Apple (Domingue,
Fensel, and Hendler 2011) for information sharing and in-
teroperability and supporting high level functions like an-
alyzing queries, providing semantic search and answering
questions. In order to achieve situational awareness, cyber-
security systems need to transition to produce and consume
semantic information about likely entities, relations, actions,
events, intentions and plans.
We have developed Unified Cybersecurity Ontology
(UCO) as an effort to help evolve the cybersecurity stan-
dards from a syntactic representation to a more semantic
representation. We see several contributions that our work
has to offer:
1. UCO ontology provides a common understanding of cy-
bersecurity domain and unifies most commonly used cy-
bersecurity standards.
2. Compared to existing cybersecurity ontologies which
have been developed independently, UCO has been
mapped to a number of existing publicly available cyber-
security ontologies to promote ontology sharing, integra-
tion and reuse. UCO serves as a backbone for linking cy-
bersecurity ontologies.
3. UCO maps concepts to general world knowledge sources
i.e. Linked Open Data cloud to support diverse use cases.
4. We describe important use cases that can be supported by
unifying cybersecurity data with existing general world
knowledge through the UCO ontology.
5. We have generated a catalog of cybersecurity standards
that is available online1.
This paper is organized as follows: In section 2 we briefly
introduce RDF and a subset of OWL language, OWL DL.
In section 3 we outline our approach for ontology construc-
tion and describe the UCO ontology along with other related
ontologies. Section 4 presents the design and implementa-
tion of a demonstration system with real world cybersecu-
rity data that uses the UCO ontology to support a number of
use cases. We review related work in section 5 and conclude
with a summary for future work in section 6.
Preliminaries
Resource Description Framework (RDF)
The Resource Description Framework 2 is a W3C standard
to represent knowledge as a semantic graph in which the
nodes represent entities, concepts or literal values and the
arcs represent relations. Thus, we can think of a knowledge
bases as a collection of triples with a subject, predicate and
object. The subject is usually the entity that is being rep-
resented. The predicate represents an attribute or a relation
of the subject and is used to associate with an object. The
object can be a literal or a resource. Typically, each of the
resource is identified with a URI. Example of an RDF triple
can be < John, studiesAt, School >.
OWL DL
OWL DL3, a sublanguage of OWL, which is based on De-
scription Logics is a tractable fragment of First Order Logic
and is used for knowledge representation. OWL DL is a
W3C standard to represent knowledge and is more expres-
sive compared to RDF. Formal definitions of some of the
constructs used in DL are shown in Table 1. Further details
on the constructs can be obtained from (Baader 2003).
Approach
Our approach to support cyber situational awareness has
been through the development of a core cybersecurity ontol-
ogy that facilitates data sharing across different formats and
standards and allows reasoning to infer new information. We
have surveyed, reviewed and cataloged existing cybersecu-
rity standards and ontologies and selected the most common
and widely used standards to incorporate in UCO ontology.
In this section, we first briefly outline the advantages of us-
ing Semantic Web languages and describe the UCO ontol-
ogy along with its design considerations. We describe the
feasibility to support diverse and complex use cases by link-
ing cybersecurity information to external knowledge sources
in the next section.
1http://tinyurl.com/ptqkzpq
2http://www.w3.org/RDF/
3http://www.w3.org/TR/owl-guide/
196
curity use cases, for example, if “Georgia (country)” has
“neighbor” relation with “Russia” it may raise more alarms
if several past incidents originated from Russia.
information about its location on the map and its neighbor-
ing countries. Moreover, such information can be used to
derive inferences about possible source of attack. For exam-
ple, if an incident originates from “Georgia country” and it’s
neighboring country is Russia, then it may raise more alarms
if many cybersecurity attacks have originated from Russia in
the past (Figure 2). Furthermore, these relations can help in
connecting the dots and relating incidents with similar inci-
dents to gain insight into the source and motivation of the
attack.
Semantic technologies are used by big data companies
like Google, Microsoft, Facebook and Apple (Domingue,
Fensel, and Hendler 2011) for information sharing and in-
teroperability and supporting high level functions like an-
alyzing queries, providing semantic search and answering
questions. In order to achieve situational awareness, cyber-
security systems need to transition to produce and consume
semantic information about likely entities, relations, actions,
events, intentions and plans.
We have developed Unified Cybersecurity Ontology
(UCO) as an effort to help evolve the cybersecurity stan-
dards from a syntactic representation to a more semantic
representation. We see several contributions that our work
has to offer:
1. UCO ontology provides a common understanding of cy-
bersecurity domain and unifies most commonly used cy-
bersecurity standards.
2. Compared to existing cybersecurity ontologies which
have been developed independently, UCO has been
mapped to a number of existing publicly available cyber-
security ontologies to promote ontology sharing, integra-
tion and reuse. UCO serves as a backbone for linking cy-
bersecurity ontologies.
3. UCO maps concepts to general world knowledge sources
i.e. Linked Open Data cloud to support diverse use cases.
4. We describe important use cases that can be supported by
unifying cybersecurity data with existing general world
knowledge through the UCO ontology.
5. We have generated a catalog of cybersecurity standards
that is available online1.
This paper is organized as follows: In section 2 we briefly
introduce RDF and a subset of OWL language, OWL DL.
In section 3 we outline our approach for ontology construc-
tion and describe the UCO ontology along with other related
ontologies. Section 4 presents the design and implementa-
tion of a demonstration system with real world cybersecu-
rity data that uses the UCO ontology to support a number of
use cases. We review related work in section 5 and conclude
with a summary for future work in section 6.
Preliminaries
Resource Description Framework (RDF)
The Resource Description Framework 2 is a W3C standard
to represent knowledge as a semantic graph in which the
nodes represent entities, concepts or literal values and the
arcs represent relations. Thus, we can think of a knowledge
bases as a collection of triples with a subject, predicate and
object. The subject is usually the entity that is being rep-
resented. The predicate represents an attribute or a relation
of the subject and is used to associate with an object. The
object can be a literal or a resource. Typically, each of the
resource is identified with a URI. Example of an RDF triple
can be < John, studiesAt, School >.
OWL DL
OWL DL3, a sublanguage of OWL, which is based on De-
scription Logics is a tractable fragment of First Order Logic
and is used for knowledge representation. OWL DL is a
W3C standard to represent knowledge and is more expres-
sive compared to RDF. Formal definitions of some of the
constructs used in DL are shown in Table 1. Further details
on the constructs can be obtained from (Baader 2003).
Approach
Our approach to support cyber situational awareness has
been through the development of a core cybersecurity ontol-
ogy that facilitates data sharing across different formats and
standards and allows reasoning to infer new information. We
have surveyed, reviewed and cataloged existing cybersecu-
rity standards and ontologies and selected the most common
and widely used standards to incorporate in UCO ontology.
In this section, we first briefly outline the advantages of us-
ing Semantic Web languages and describe the UCO ontol-
ogy along with its design considerations. We describe the
feasibility to support diverse and complex use cases by link-
ing cybersecurity information to external knowledge sources
in the next section.
1http://tinyurl.com/ptqkzpq
2http://www.w3.org/RDF/
3http://www.w3.org/TR/owl-guide/
196
Table 1: Syntax and Semantics of Description Logic constructors
Name Syntax Semantics Symbol
Top > ∆ I AL
Bottom ⊥ φ AL
Intersection C u D CI ∩ DI AL
Union C t D CI ∪ DI U
Negation ¬C ∆ I \ DI C
Value restriction ∀R.C {a ∈ ∆I | ∀b. (a,b) ∈RI → b ∈CI } AL
Existential quant. ∃R.C {a ∈ ∆I | ∀b. (a,b) ∈RI ∧ b ∈CI } E
Nominal I II ⊆ ∆I with |II | = 1 O
Qualified Number restriction (less than) ≤ nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | ≤n } Q
Qualified Number restriction (equal than) = nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | =n } Q
Qualified Number restriction (greater than) ≥ nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | ≥n } Q
Role Hierarchy R1 v R 2 { (a, b) ∈ ∆I × ∆I | (a, b) ∈ RI
1 → (a, b) ∈ RI
2 } H
Role Inverse R− { (b, a) ∈ ∆I × ∆I | (a, b) ∈ RI } I
Role Composition R1 ◦ R2 { (a, c) | ∃b. (a, b) ∈ RI
1 ∧ (b, c) ∈ RI
2 } R
Advantages of Semantic Web Languages
RDF is a directed graph and unambiguous compared to
XML, which is tree based and has multiple representation
for the same information. As RDF and OWL have formal se-
mantics grounded in First Order Logic they are more prefer-
able for dealing with security situations. RDF and OWL
have a decentralized philosophy which allows incremental
building of knowledge, and its sharing and reuse. For exam-
ple, properties can be defined separately from classes (un-
like Object Oriented Programming). OWL facilitates infor-
mation integration by providing rich semantic constructs for
schema mapping such as Sub Class, Sub Property, Equiv-
alent Class, Equivalent Property, Same As, Union Of, In-
tersection Of etc. to represent complex facts (Table 1). Fur-
thermore, OWL has powerful off-the-shelf reasoners, which
enable detecting inconsistencies during data sharing. For ex-
ample, if there is a constraint for two classes, “Malware” and
“Virus”, to be disjoint and the data sets imported from dif-
ferent sources mention the same software to be both a Mal-
ware and Virus, then in such cases the reasoner will infer
an inconsistency. Semantic Web technologies are well es-
tablished and there are powerful reasoners available both as
Open Source Software and Commercial products.
Unified Cybersecurity Ontology (UCO)
The Unified Cybersecurity Ontology (UCO) is an extension
to Intrusion Detection System ontology (IDS) (Undercoffer
et al. 2004) developed earlier by our group to describe events
related to cybersecurity. Our group has been working on a
number of projects that focus on individual components of
a unified cybersecurity framework to analyze different data
streams and assert facts in a triple store (Undercoffer et al.
2004; More et al. 2012; Mulwad et al. 2011). The UCO on-
tology is essential for unifying information from heteroge-
neous sources and supporting reasoning and rule writing.
The ontology supports reasoning and inferring new informa-
tion from existing information. The ontology also supports
capturing specialized knowledge of a cybersecurity analyst
which can be expressed using ontology classes and terms
as well as rules. Rules are used to infer new information
which cannot be captured with an OWL reasoner. Figure 3
demonstrates a generic rule to infer an attack and alert the
host. The rule uses terms from UCO ontology to connect
information within the organization with external informa-
tion available on the web. The rule states that if the web text
description consists of some vulnerability terms, mentions
some security exploit, has text mentioning a certain product
(with some specific version) and some process which is be-
ing executed, which in turn is also logged by the scanner,
and there is an opening up of an out-bound port; then there
is a possibility of an attack on the host system with Means
and Consequences mentioned in the ontology.
Figure 3: The UCO ontology facilitates writing generic rules
and combining evidence from multiple sources.
UCO ontology provides a common understanding of cy-
bersecurity domain. Among all cybersecurity standards and
formats, STIX (Structured Threat Information eXpression)
(Barnum 2012) is the most comprehensive effort to unify
cybersecurity information sharing and enables extensions
by incorporating vocabulary from several other standards.
However, in STIX the information is represented in XML
and therefore cannot support reasoning which is supported
by UCO. We have created Unified Cybersecurity Ontol-
197
Name Syntax Semantics Symbol
Top > ∆ I AL
Bottom ⊥ φ AL
Intersection C u D CI ∩ DI AL
Union C t D CI ∪ DI U
Negation ¬C ∆ I \ DI C
Value restriction ∀R.C {a ∈ ∆I | ∀b. (a,b) ∈RI → b ∈CI } AL
Existential quant. ∃R.C {a ∈ ∆I | ∀b. (a,b) ∈RI ∧ b ∈CI } E
Nominal I II ⊆ ∆I with |II | = 1 O
Qualified Number restriction (less than) ≤ nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | ≤n } Q
Qualified Number restriction (equal than) = nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | =n } Q
Qualified Number restriction (greater than) ≥ nR.C {a ∈ ∆I | | { ∀b ∈ ∆I | (a,b) ∈ RI ∧ b ∈CI } | ≥n } Q
Role Hierarchy R1 v R 2 { (a, b) ∈ ∆I × ∆I | (a, b) ∈ RI
1 → (a, b) ∈ RI
2 } H
Role Inverse R− { (b, a) ∈ ∆I × ∆I | (a, b) ∈ RI } I
Role Composition R1 ◦ R2 { (a, c) | ∃b. (a, b) ∈ RI
1 ∧ (b, c) ∈ RI
2 } R
Advantages of Semantic Web Languages
RDF is a directed graph and unambiguous compared to
XML, which is tree based and has multiple representation
for the same information. As RDF and OWL have formal se-
mantics grounded in First Order Logic they are more prefer-
able for dealing with security situations. RDF and OWL
have a decentralized philosophy which allows incremental
building of knowledge, and its sharing and reuse. For exam-
ple, properties can be defined separately from classes (un-
like Object Oriented Programming). OWL facilitates infor-
mation integration by providing rich semantic constructs for
schema mapping such as Sub Class, Sub Property, Equiv-
alent Class, Equivalent Property, Same As, Union Of, In-
tersection Of etc. to represent complex facts (Table 1). Fur-
thermore, OWL has powerful off-the-shelf reasoners, which
enable detecting inconsistencies during data sharing. For ex-
ample, if there is a constraint for two classes, “Malware” and
“Virus”, to be disjoint and the data sets imported from dif-
ferent sources mention the same software to be both a Mal-
ware and Virus, then in such cases the reasoner will infer
an inconsistency. Semantic Web technologies are well es-
tablished and there are powerful reasoners available both as
Open Source Software and Commercial products.
Unified Cybersecurity Ontology (UCO)
The Unified Cybersecurity Ontology (UCO) is an extension
to Intrusion Detection System ontology (IDS) (Undercoffer
et al. 2004) developed earlier by our group to describe events
related to cybersecurity. Our group has been working on a
number of projects that focus on individual components of
a unified cybersecurity framework to analyze different data
streams and assert facts in a triple store (Undercoffer et al.
2004; More et al. 2012; Mulwad et al. 2011). The UCO on-
tology is essential for unifying information from heteroge-
neous sources and supporting reasoning and rule writing.
The ontology supports reasoning and inferring new informa-
tion from existing information. The ontology also supports
capturing specialized knowledge of a cybersecurity analyst
which can be expressed using ontology classes and terms
as well as rules. Rules are used to infer new information
which cannot be captured with an OWL reasoner. Figure 3
demonstrates a generic rule to infer an attack and alert the
host. The rule uses terms from UCO ontology to connect
information within the organization with external informa-
tion available on the web. The rule states that if the web text
description consists of some vulnerability terms, mentions
some security exploit, has text mentioning a certain product
(with some specific version) and some process which is be-
ing executed, which in turn is also logged by the scanner,
and there is an opening up of an out-bound port; then there
is a possibility of an attack on the host system with Means
and Consequences mentioned in the ontology.
Figure 3: The UCO ontology facilitates writing generic rules
and combining evidence from multiple sources.
UCO ontology provides a common understanding of cy-
bersecurity domain. Among all cybersecurity standards and
formats, STIX (Structured Threat Information eXpression)
(Barnum 2012) is the most comprehensive effort to unify
cybersecurity information sharing and enables extensions
by incorporating vocabulary from several other standards.
However, in STIX the information is represented in XML
and therefore cannot support reasoning which is supported
by UCO. We have created Unified Cybersecurity Ontol-
197
Table 2: Statistics for UCO and related ontologies
CCE CVE CVSS UCO Total
Axiom 11 21 197 633 862
Class Count 1 3 35 106 145
Object Property Count 0 2 32 59 93
Data Property Count 5 6 3 45 59
Individual Count 0 0 23 7 30
Equivalent classes 0 0 3 16 19
DL Expressivity AL ALUHO(D) ALUHOQ(D) ALCROIQ(D)
Table 3: Statistics for existing ontologies mapped to UCO
CPC CY CYB C DM KC MC STIX STO
Axiom 7915 296 117 2 63 2 8808 60
Class Count 1219 21 10 1 12 1 1303 15
Object Property Count 6 22 5 0 5 0 114 21
Data Property Count 3 19 13 0 0 0 47 0
Individual Count 10 70 25 0 0 0 91 0
Equivalent classes 2 4 2 0 26 0 17 10
DL Expressivity ALCHO(D) ALUOQ(D) ALUOQ(D) AL ALCIQ AL ALCHOIQ(D) ALE
ogy as a semantic version of STIX. In addition to map-
ping to STIX, UCO has also been extended with a num-
ber of relevant cybersecurity standards, vocabularies and on-
tologies such as CVE4, CCE5, CVSS6, CAPEC7, CYBOX8,
KillChain9 and STUCCO 10. To support diverse use cases,
UCO ontology has been mapped to general world knowl-
edge available through Google’s knowledge graph, DBpe-
dia knowledge base (Auer et al. 2007), Yago knowledge
base (Suchanek, Kasneci, and Weikum 2008) etc. Linking
to these knowledge sources provides access to large number
of datasets for different domains (e.g. geonames) as well as
terms in different languages (e.g. Russian).
Below we describe the list of important classes present in
UCO ontology:
1. Means: This class describes various methods of execut-
ing an attack and consists of sub-classes like BufferOver-
Flow, SynFlood, LogicExploit, Tcp-PortScan etc., which
can further consist of their own sub-classes. The Means
class maps to TTP field in STIX which characterizes
specific details of observed or potential attacker Tactics,
Techniques and Procedures.
2. Consequences: This class describes the possible out-
comes of an attack. It consists of sub-classes like
DenialOfService, LossOfConfiguration, PrivilegeEscala-
tion, UnauthUser, etc. It maps to Observables in STIX.
3. Attack: This class characterizes a cyber threat attack and
is mapped to Incident in STIX.
4https://cve.mitre.org/
5https://cce.mitre.org/
6https://www.first.org/cvss
7https://capec.mitre.org/
8https://cyboxproject.github.io/
9http://vistology.com/ont/STIX/killchain.owl
10https://github.com/stucco/ontology/blob/master/
stucco schema.json
4. Attacker: This class represents identification or charac-
terization of the adversary and is mapped to ThreatActor
in STIX.
5. AttackPattern: Attack Patterns are descriptions of com-
mon methods for exploiting software providing the at-
tackers perspective and guidance on ways to mitigate their
effect. An example of attack pattern is Phishing.
6. Exploit: This class characterizes description of an indi-
vidual exploit and maps to ExploitType in STIX schema.
7. Exploit Target: Exploit Targets are vulnerabilities or
weaknesses in software, systems, networks or configura-
tions that are targeted for exploitation by the TTP (cyber
threat adversary Tactic, Technique or Procedure).
Figure 4: UCO ontology serves as the core for cybersecurity
Linked Open Data (LOD) cloud
8. Indicator: A cyber threat indicator is made up of a pat-
tern identifying certain observable conditions as well as
198
CCE CVE CVSS UCO Total
Axiom 11 21 197 633 862
Class Count 1 3 35 106 145
Object Property Count 0 2 32 59 93
Data Property Count 5 6 3 45 59
Individual Count 0 0 23 7 30
Equivalent classes 0 0 3 16 19
DL Expressivity AL ALUHO(D) ALUHOQ(D) ALCROIQ(D)
Table 3: Statistics for existing ontologies mapped to UCO
CPC CY CYB C DM KC MC STIX STO
Axiom 7915 296 117 2 63 2 8808 60
Class Count 1219 21 10 1 12 1 1303 15
Object Property Count 6 22 5 0 5 0 114 21
Data Property Count 3 19 13 0 0 0 47 0
Individual Count 10 70 25 0 0 0 91 0
Equivalent classes 2 4 2 0 26 0 17 10
DL Expressivity ALCHO(D) ALUOQ(D) ALUOQ(D) AL ALCIQ AL ALCHOIQ(D) ALE
ogy as a semantic version of STIX. In addition to map-
ping to STIX, UCO has also been extended with a num-
ber of relevant cybersecurity standards, vocabularies and on-
tologies such as CVE4, CCE5, CVSS6, CAPEC7, CYBOX8,
KillChain9 and STUCCO 10. To support diverse use cases,
UCO ontology has been mapped to general world knowl-
edge available through Google’s knowledge graph, DBpe-
dia knowledge base (Auer et al. 2007), Yago knowledge
base (Suchanek, Kasneci, and Weikum 2008) etc. Linking
to these knowledge sources provides access to large number
of datasets for different domains (e.g. geonames) as well as
terms in different languages (e.g. Russian).
Below we describe the list of important classes present in
UCO ontology:
1. Means: This class describes various methods of execut-
ing an attack and consists of sub-classes like BufferOver-
Flow, SynFlood, LogicExploit, Tcp-PortScan etc., which
can further consist of their own sub-classes. The Means
class maps to TTP field in STIX which characterizes
specific details of observed or potential attacker Tactics,
Techniques and Procedures.
2. Consequences: This class describes the possible out-
comes of an attack. It consists of sub-classes like
DenialOfService, LossOfConfiguration, PrivilegeEscala-
tion, UnauthUser, etc. It maps to Observables in STIX.
3. Attack: This class characterizes a cyber threat attack and
is mapped to Incident in STIX.
4https://cve.mitre.org/
5https://cce.mitre.org/
6https://www.first.org/cvss
7https://capec.mitre.org/
8https://cyboxproject.github.io/
9http://vistology.com/ont/STIX/killchain.owl
10https://github.com/stucco/ontology/blob/master/
stucco schema.json
4. Attacker: This class represents identification or charac-
terization of the adversary and is mapped to ThreatActor
in STIX.
5. AttackPattern: Attack Patterns are descriptions of com-
mon methods for exploiting software providing the at-
tackers perspective and guidance on ways to mitigate their
effect. An example of attack pattern is Phishing.
6. Exploit: This class characterizes description of an indi-
vidual exploit and maps to ExploitType in STIX schema.
7. Exploit Target: Exploit Targets are vulnerabilities or
weaknesses in software, systems, networks or configura-
tions that are targeted for exploitation by the TTP (cyber
threat adversary Tactic, Technique or Procedure).
Figure 4: UCO ontology serves as the core for cybersecurity
Linked Open Data (LOD) cloud
8. Indicator: A cyber threat indicator is made up of a pat-
tern identifying certain observable conditions as well as
198
Secure Best Marks with AI Grader
Need help grading? Try our AI Grader for instant feedback on your assignments.
Figure 5: Advanced use cases can be supported using mappings between UCO and general world ontologies in linked open data
cloud
contextual information about the patterns meaning, how
and when it should be acted on, etc. This class is mapped
to IndicatorType in STIX schema and Indicator class in
CAPEC ontology.
The ontology also has a number of classes to characterize
Processes, Network State, Files, System, Hardware and Kill
Chain along with Kill Chain Phases. The statistics for UCO
and related ontologies are given in Table 2 & 3 where ontolo-
gies in Table 2 are developed by our group. These ontologies
are independent and do not have any overlapping classes.
However, to generate a connected graph, the UCO ontology
has a few classes representing parent class of each of the
other ontology. Such a design allows easy maintenance of
the ontology as different ontologies are loosely coupled and
hence each ontology can evolve independently. As shown in
Table 2, CCE contains a class and 5 data type properties like
description, references, platform etc. CVSS ontology con-
tains 35 classes and includes classes like base group, envi-
ronmental group and temporal group, which are represented
as the combination of other classes. Similar with CVE which
contains 3 classes and 8 properties available from the XML
schema. As compared to other ontologies, we designed UCO
to represent and considerably extend STIX framework with
additional classes and defined relations among them. For ex-
ample IPAddress, Software, WebBrowser are a few classes
with WebBrowser being the subclass of Software. Moreover,
there are 16 classes in the ontology which are defined as
the combination of other classes. For example, Product is
represented as the union of Software and Hardware. Fig-
ure 4 shows UCO ontology serving as the core ontology for
linking with other cybersecurity ontologies and LOD cloud.
To facilitate data integration from multiple freely available
knowledge base, we mapped UCO to Linked Open Data
(LOD) cloud. Such an extension allows an analysts to fetch
data from multiple freely available data sources with differ-
ent schema but represented using semantic web technolo-
gies. An example of a mapping from UCO to DBpedia is
shown below:
<uco : acrobat reader owl : sameAs dbr : Adobe Acrobat>
Here, “uco:” is the namespace used for Unified Cybersecu-
rity Ontology and the mapping asserts that the Adobe Ac-
robat from DBpedia (represented with “dbr” namespace) is
same as the acrobat reader present in UCO ontology.
Prototype System Design and Use Cases
We have designed a system that uses STUCCO extrac-
tors11 to extract entities from the National Vulnerability
Database (NVD) XML file. We implemented code to gen-
erate < subject, predicate, object >triples from the XML
file. We defined mappings between entities obtained from
NVD data to corresponding entities in DBpedia. The triples
and the mappings were loaded on to the Fuseki server as
it supports federated queries to integrate data from multiple
sources and it supports reasoning. Any triple store with these
capabilities can be used to host the triples and integrate data.
Ontology Use Cases
In Figure 5 we show several advanced use cases that can be
supported using mappings between UCO and general world
ontologies that cannot be supported by individual ontologies
alone. Below we describe each use-case with corresponding
SPARQL queries. We also show a snippet of results obtained
by executing the same query over a sample NVD data set
that was loaded into the Fuseki triple store. For each use
cases “db” represents the namespace for DBpedia resources.
11https://github.com/stucco/extractors
199
cloud
contextual information about the patterns meaning, how
and when it should be acted on, etc. This class is mapped
to IndicatorType in STIX schema and Indicator class in
CAPEC ontology.
The ontology also has a number of classes to characterize
Processes, Network State, Files, System, Hardware and Kill
Chain along with Kill Chain Phases. The statistics for UCO
and related ontologies are given in Table 2 & 3 where ontolo-
gies in Table 2 are developed by our group. These ontologies
are independent and do not have any overlapping classes.
However, to generate a connected graph, the UCO ontology
has a few classes representing parent class of each of the
other ontology. Such a design allows easy maintenance of
the ontology as different ontologies are loosely coupled and
hence each ontology can evolve independently. As shown in
Table 2, CCE contains a class and 5 data type properties like
description, references, platform etc. CVSS ontology con-
tains 35 classes and includes classes like base group, envi-
ronmental group and temporal group, which are represented
as the combination of other classes. Similar with CVE which
contains 3 classes and 8 properties available from the XML
schema. As compared to other ontologies, we designed UCO
to represent and considerably extend STIX framework with
additional classes and defined relations among them. For ex-
ample IPAddress, Software, WebBrowser are a few classes
with WebBrowser being the subclass of Software. Moreover,
there are 16 classes in the ontology which are defined as
the combination of other classes. For example, Product is
represented as the union of Software and Hardware. Fig-
ure 4 shows UCO ontology serving as the core ontology for
linking with other cybersecurity ontologies and LOD cloud.
To facilitate data integration from multiple freely available
knowledge base, we mapped UCO to Linked Open Data
(LOD) cloud. Such an extension allows an analysts to fetch
data from multiple freely available data sources with differ-
ent schema but represented using semantic web technolo-
gies. An example of a mapping from UCO to DBpedia is
shown below:
<uco : acrobat reader owl : sameAs dbr : Adobe Acrobat>
Here, “uco:” is the namespace used for Unified Cybersecu-
rity Ontology and the mapping asserts that the Adobe Ac-
robat from DBpedia (represented with “dbr” namespace) is
same as the acrobat reader present in UCO ontology.
Prototype System Design and Use Cases
We have designed a system that uses STUCCO extrac-
tors11 to extract entities from the National Vulnerability
Database (NVD) XML file. We implemented code to gen-
erate < subject, predicate, object >triples from the XML
file. We defined mappings between entities obtained from
NVD data to corresponding entities in DBpedia. The triples
and the mappings were loaded on to the Fuseki server as
it supports federated queries to integrate data from multiple
sources and it supports reasoning. Any triple store with these
capabilities can be used to host the triples and integrate data.
Ontology Use Cases
In Figure 5 we show several advanced use cases that can be
supported using mappings between UCO and general world
ontologies that cannot be supported by individual ontologies
alone. Below we describe each use-case with corresponding
SPARQL queries. We also show a snippet of results obtained
by executing the same query over a sample NVD data set
that was loaded into the Fuseki triple store. For each use
cases “db” represents the namespace for DBpedia resources.
11https://github.com/stucco/extractors
199
observable reader
uco:CVE-2002-2435 db:Adobe Acrobat
uco:CVE-2002-2436 db:Adobe Acrobat
uco:CVE-2002-2437 db:Adobe Acrobat
Figure 6: SPARQL Query and results for Use Case 1
Use-Case #1: Vulnerabilities associated with PDF
Readers: An organization or a security analyst may be in-
terested in finding the kind of vulnerabilities associated with
a specific type of software. The CVE entries only mention
software, however one can also retrieve the associated type
of software if these software are linked to external knowl-
edge sources such as Google’s knowledge graph or DBpe-
dia. For example, from CVE we have the information that
Adobe Acrobat has a certain vulnerability identified with
CVE entry reference CVE-2015-5115. Mapping Adobe Ac-
robat instance to the corresponding instance in DBpedia and
YAGO resources will provide additional information that it
is a type of Yago:PDFReaders. This mapping with enable
answering queries asking for vulnerabilities associated with
a specific category of software, such as PDF readers. The
SPARQL query shown in Figure 6 demonstrates this use-
case.
product company vulnerability
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2435
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2436
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2437
Figure 7: SPARQL Query and results for Use Case 2
Use-Case #2: Vulnerabilities associated with products
from a given company: Another interesting use-case is to
explore vulnerabilities associated with products from a given
company. Again by mapping software instances to exter-
nal knowledge sources, one can find the name of the com-
pany which developed the software. The following SPARQL
query, shown in Figure 7, retrieves vulnerabilities for prod-
ucts along with information about the source company.
similarSoftware
db:STDU Viewer
db:Preview (Mac OS)
db:Pdfescape
db:Adobe Digital Editions
db:Adobe Creative Suite
Figure 8: SPARQL Query and Results for Use Case 3
Use-Case #3: Suggest similar software to given soft-
ware: After knowing information about a certain vulnerabil-
ity a security analyst may be interested in finding alternate
software that doesn’t have the given vulnerability. For ex-
ample in case of a PDF readers the SPARQL query, shown
in Figure 8, retrieves software of the same type filtering out
“acrobat” software.
company vulnerability count
db:Opera Software 864
db:Adobe Systems 74
Figure 9: SPARQL Query and Results for Use Case 4
Use-Case #4: Assess impact of changing vendors: In
case an organization is interested in changing vendors, they
can assess the impact of the vendor by using the SPARQL
query shown in Figure 9 which creates a summary of vul-
nerability counts associated with the products from different
vendors.
200
uco:CVE-2002-2435 db:Adobe Acrobat
uco:CVE-2002-2436 db:Adobe Acrobat
uco:CVE-2002-2437 db:Adobe Acrobat
Figure 6: SPARQL Query and results for Use Case 1
Use-Case #1: Vulnerabilities associated with PDF
Readers: An organization or a security analyst may be in-
terested in finding the kind of vulnerabilities associated with
a specific type of software. The CVE entries only mention
software, however one can also retrieve the associated type
of software if these software are linked to external knowl-
edge sources such as Google’s knowledge graph or DBpe-
dia. For example, from CVE we have the information that
Adobe Acrobat has a certain vulnerability identified with
CVE entry reference CVE-2015-5115. Mapping Adobe Ac-
robat instance to the corresponding instance in DBpedia and
YAGO resources will provide additional information that it
is a type of Yago:PDFReaders. This mapping with enable
answering queries asking for vulnerabilities associated with
a specific category of software, such as PDF readers. The
SPARQL query shown in Figure 6 demonstrates this use-
case.
product company vulnerability
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2435
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2436
db:Adobe Acrobat db:Adobe Systems uco:CVE-2002-2437
Figure 7: SPARQL Query and results for Use Case 2
Use-Case #2: Vulnerabilities associated with products
from a given company: Another interesting use-case is to
explore vulnerabilities associated with products from a given
company. Again by mapping software instances to exter-
nal knowledge sources, one can find the name of the com-
pany which developed the software. The following SPARQL
query, shown in Figure 7, retrieves vulnerabilities for prod-
ucts along with information about the source company.
similarSoftware
db:STDU Viewer
db:Preview (Mac OS)
db:Pdfescape
db:Adobe Digital Editions
db:Adobe Creative Suite
Figure 8: SPARQL Query and Results for Use Case 3
Use-Case #3: Suggest similar software to given soft-
ware: After knowing information about a certain vulnerabil-
ity a security analyst may be interested in finding alternate
software that doesn’t have the given vulnerability. For ex-
ample in case of a PDF readers the SPARQL query, shown
in Figure 8, retrieves software of the same type filtering out
“acrobat” software.
company vulnerability count
db:Opera Software 864
db:Adobe Systems 74
Figure 9: SPARQL Query and Results for Use Case 4
Use-Case #4: Assess impact of changing vendors: In
case an organization is interested in changing vendors, they
can assess the impact of the vendor by using the SPARQL
query shown in Figure 9 which creates a summary of vul-
nerability counts associated with the products from different
vendors.
200
Related Work
STIX (Barnum 2012) is the most comprehensive standard
to unify cybersecurity information sharing and enables ex-
tensions by incorporating vocabulary from several other
standards. Most cybersecurity systems use STIX represen-
tation expressed in XML. A major limitation of XML is
that it cannot support reasoning. One of the earliest efforts
for developing ontologies to support reasoning for cyber-
security was by our group in 2003 (Pinkston et al. 2003;
Undercoffer et al. 2004). Undercoffer et al. implemented a
target centric ontology for the domain of intrusion detec-
tion composed of 23 classes and 190 properties. (More et al.
2012) from our group extended this IDS ontology to incor-
porate and integrate cybersecurity related information from
heterogeneous sources. The UCO ontology is a further ex-
tension and enhancement of the IDS ontology. It represents
and maps publicly available standards and ontologies in the
cybersecurity domain. (Ulicny et al. 2014) created a STIX
ontology based on the STIX schema along with a number of
related ontologies. We have defined mappings between UCO
and STIX ontology classes. (Iannacone et al. 2015) devel-
oped STUCCO ontology for integrating different structured
and un-structured data sources along with data extractors.
STUCCO ontology is composed of 15 entity types and 115
properties and is defined using JSON-schema. We translated
STUCCO from JSON to OWL in order to map it to UCO on-
tology. (Vaibhav Khadilkar and Thuraisingham ) developed
CPE ontology for Common Platform Enumerations in Na-
tional Vulnerability Database. The ontology is described in
the report however it is not publicly available for download
and hence we could not map it to UCO. A number of on-
tologies have been developed for specialized domains within
cybersecurity such as ontology for insider threats in finance
domain (Kul and Upadhyaya 2015), ontology for network
security attack (Simmonds, Sandilands, and van Ekert 2004;
Chan et al. 2015) and a cloud security ontology (Amit Hen-
dre and Joshi 2014). UCO ontology can serve as a backbone
to link these specialized ontologies.
Conclusion and Future Work Directions
The UCO ontology provides a common understanding of cy-
bersecurity domain and unifies most commonly used cyber-
security standards. Unlike existing independent and isolated
cybersecurity ontologies, UCO has been mapped to pub-
licly available ontologies in the cybersecurity domain and
hence offers more coverage. In addition to that, UCO is also
mapped to concepts in general world knowledge sources to
support diverse use cases. To the best of our knowledge this
is the first such effort in the area of cybersecurity ontolo-
gies to unify cybersecurity information with general world
knowledge about entities and relations. We presented differ-
ent use cases that demonstrate the utility and value of UCO
ontology in supporting diverse security scenarios. We briefly
discuss promising future work directions below.
Temporal Representation and Reasoning: Cybersecu-
rity data and information may have a temporal component,
for example timestamps associated with files, system logs
and network events etc. The current version of UCO ontol-
ogy uses a very basic representation of time where time is
represented as a data property associated with classes that
represent events. A number of frameworks and representa-
tions have been proposed in research such as OWL-Time
(Hobbs and Pan 2004) and time-entry (Pan and Hobbs 2004)
which provide vocabularies for stating facts about temporal
instants and intervals. In the future, we plan to extend UCO
ontology to represent time instances and intervals so that it
can support temporal reasoning.
Modeling Uncertainty and Confidence: Ontology de-
sign requires crisp logic i.e. any sentences in these languages
such as asserted facts, domain knowledge, or reasoning re-
sults must be either true or false and nothing in between.
Real world domains contain uncertain knowledge because
of incomplete or partial information that is true only to a cer-
tain degree. Probability theory is a natural choice for dealing
with this kind of uncertainty. There is a large body of liter-
ature on fuzzy logic and probabilistic reasoning. More re-
cent developments seek to combine First Order Logic with
probabilistic models, such as the work on Markov Logic
Networks (Richardson and Domingos 2006) and Bayesian
Logic (Milch et al. 2007). Future work directions include
reviewing different approaches, identifying and analyzing
shortcomings and encountered challenges followed by the
choice of suitable representation to extend UCO ontology.
Cybersecurity Information Extraction from Unstruc-
tured Data: Cybersecurity vulnerabilities are typically
identified and published publicly but response has always
been slow in covering up these vulnerabilities because there
is no automatic mechanism to understand and process this
unstructured text published on the web. There is a strong
need for systems that can automatically analyze unstructured
text and extract vulnerability entities and concepts from vari-
ous non-traditional unstructured data sources such as cyber-
security blogs, security bulletins and hackers forums. This
information extraction task will help expediting the process
of understanding and realizing the vulnerabilities and thus
making systems secure at faster rate. There is some initial
work in (Mulwad et al. 2011; Joshi, Lal, and Finin 2013;
Jones et al. 2015). The UCO ontology can benefit these
approaches by guiding the extraction process and also in
checking consistency of the extracted facts.
Acknowledgements
The research described in this paper was supported by a seed
grant from MITRE.
References
Amit Hendre, T. F., and Joshi, K. P. 2014. Ontology de-
scribing Cloud data threats, security controls and compli-
ance standards. http://ebiquity.umbc.edu/resource/html/id/
361/Cloud-Security-and-Compliance-Ontology. [Online;
accessed 21-October-2015].
Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak,
R.; and Ives, Z. 2007. Dbpedia: A nucleus for a web of open
data. Springer.
Baader, F. 2003. The description logic handbook: the-
201
STIX (Barnum 2012) is the most comprehensive standard
to unify cybersecurity information sharing and enables ex-
tensions by incorporating vocabulary from several other
standards. Most cybersecurity systems use STIX represen-
tation expressed in XML. A major limitation of XML is
that it cannot support reasoning. One of the earliest efforts
for developing ontologies to support reasoning for cyber-
security was by our group in 2003 (Pinkston et al. 2003;
Undercoffer et al. 2004). Undercoffer et al. implemented a
target centric ontology for the domain of intrusion detec-
tion composed of 23 classes and 190 properties. (More et al.
2012) from our group extended this IDS ontology to incor-
porate and integrate cybersecurity related information from
heterogeneous sources. The UCO ontology is a further ex-
tension and enhancement of the IDS ontology. It represents
and maps publicly available standards and ontologies in the
cybersecurity domain. (Ulicny et al. 2014) created a STIX
ontology based on the STIX schema along with a number of
related ontologies. We have defined mappings between UCO
and STIX ontology classes. (Iannacone et al. 2015) devel-
oped STUCCO ontology for integrating different structured
and un-structured data sources along with data extractors.
STUCCO ontology is composed of 15 entity types and 115
properties and is defined using JSON-schema. We translated
STUCCO from JSON to OWL in order to map it to UCO on-
tology. (Vaibhav Khadilkar and Thuraisingham ) developed
CPE ontology for Common Platform Enumerations in Na-
tional Vulnerability Database. The ontology is described in
the report however it is not publicly available for download
and hence we could not map it to UCO. A number of on-
tologies have been developed for specialized domains within
cybersecurity such as ontology for insider threats in finance
domain (Kul and Upadhyaya 2015), ontology for network
security attack (Simmonds, Sandilands, and van Ekert 2004;
Chan et al. 2015) and a cloud security ontology (Amit Hen-
dre and Joshi 2014). UCO ontology can serve as a backbone
to link these specialized ontologies.
Conclusion and Future Work Directions
The UCO ontology provides a common understanding of cy-
bersecurity domain and unifies most commonly used cyber-
security standards. Unlike existing independent and isolated
cybersecurity ontologies, UCO has been mapped to pub-
licly available ontologies in the cybersecurity domain and
hence offers more coverage. In addition to that, UCO is also
mapped to concepts in general world knowledge sources to
support diverse use cases. To the best of our knowledge this
is the first such effort in the area of cybersecurity ontolo-
gies to unify cybersecurity information with general world
knowledge about entities and relations. We presented differ-
ent use cases that demonstrate the utility and value of UCO
ontology in supporting diverse security scenarios. We briefly
discuss promising future work directions below.
Temporal Representation and Reasoning: Cybersecu-
rity data and information may have a temporal component,
for example timestamps associated with files, system logs
and network events etc. The current version of UCO ontol-
ogy uses a very basic representation of time where time is
represented as a data property associated with classes that
represent events. A number of frameworks and representa-
tions have been proposed in research such as OWL-Time
(Hobbs and Pan 2004) and time-entry (Pan and Hobbs 2004)
which provide vocabularies for stating facts about temporal
instants and intervals. In the future, we plan to extend UCO
ontology to represent time instances and intervals so that it
can support temporal reasoning.
Modeling Uncertainty and Confidence: Ontology de-
sign requires crisp logic i.e. any sentences in these languages
such as asserted facts, domain knowledge, or reasoning re-
sults must be either true or false and nothing in between.
Real world domains contain uncertain knowledge because
of incomplete or partial information that is true only to a cer-
tain degree. Probability theory is a natural choice for dealing
with this kind of uncertainty. There is a large body of liter-
ature on fuzzy logic and probabilistic reasoning. More re-
cent developments seek to combine First Order Logic with
probabilistic models, such as the work on Markov Logic
Networks (Richardson and Domingos 2006) and Bayesian
Logic (Milch et al. 2007). Future work directions include
reviewing different approaches, identifying and analyzing
shortcomings and encountered challenges followed by the
choice of suitable representation to extend UCO ontology.
Cybersecurity Information Extraction from Unstruc-
tured Data: Cybersecurity vulnerabilities are typically
identified and published publicly but response has always
been slow in covering up these vulnerabilities because there
is no automatic mechanism to understand and process this
unstructured text published on the web. There is a strong
need for systems that can automatically analyze unstructured
text and extract vulnerability entities and concepts from vari-
ous non-traditional unstructured data sources such as cyber-
security blogs, security bulletins and hackers forums. This
information extraction task will help expediting the process
of understanding and realizing the vulnerabilities and thus
making systems secure at faster rate. There is some initial
work in (Mulwad et al. 2011; Joshi, Lal, and Finin 2013;
Jones et al. 2015). The UCO ontology can benefit these
approaches by guiding the extraction process and also in
checking consistency of the extracted facts.
Acknowledgements
The research described in this paper was supported by a seed
grant from MITRE.
References
Amit Hendre, T. F., and Joshi, K. P. 2014. Ontology de-
scribing Cloud data threats, security controls and compli-
ance standards. http://ebiquity.umbc.edu/resource/html/id/
361/Cloud-Security-and-Compliance-Ontology. [Online;
accessed 21-October-2015].
Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak,
R.; and Ives, Z. 2007. Dbpedia: A nucleus for a web of open
data. Springer.
Baader, F. 2003. The description logic handbook: the-
201
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
ory, implementation, and applications. Cambridge univer-
sity press.
Barnum, S. 2012. Standardizing cyber threat intelligence in-
formation with the structured threat information expression
(stix). MITRE Corporation 11.
Berners-Lee, T.; Bizer, C.; and Heath, T. 2009. Linked data-
the story so far. International Journal on Semantic Web and
Information Systems 5(3):1–22.
Chan, P.; Theron, J.; van Heerden, R.; and Leenen, L.2015.
An ontological knowledge base for cyber network attack
planning. In Iccws 2015-The Proceedings of the 10th In-
ternational Conference on Cyber Warfare and Security, 69.
Academic Conferences Limited.
Domingue, J.; Fensel, D.; and Hendler, J. A. 2011. Intro-
duction to the semantic web technologies. Handbook of Se-
mantic Web Technologies 1–41.
Hobbs, J. R., and Pan, F. 2004. An ontology of time for
the semantic web. ACM Transactions on Asian Language
Information Processing (TALIP) 3(1):66–85.
Iannacone, M.; Bohn, S.; Nakamura, G.; Gerth, J.; Huffer,
K.; Bridges, R.; Ferragut, E.; and Goodall, J. 2015. De-
veloping an ontology for cyber security knowledge graphs.
In Proceedings of the 10th Annual Cyber and Information
Security Research Conference, 12. ACM.
Jones, C. L.; Bridges, R. A.; Huffer, K.; and Goodall, J.
2015. Towards a relation extraction framework for cyber-
security concepts. arXiv preprint arXiv:1504.04317.
Joshi, A.; Lal, R.; and Finin, T. 2013. Extracting cybersecu-
rity related linked data from text. In Semantic Computing
(ICSC), 2013 IEEE Seventh International Conference on,
252–259. IEEE.
Kul, G., and Upadhyaya, S. 2015. A preliminary cyber on-
tology for insider threats in the financial sector. In Proceed-
ings of the 7th ACM CCS International Workshop on Man-
aging Insider Security Threats, 75–78. ACM.
Milch, B.; Marthi, B.; Russell, S.; Sontag, D.; Ong, D. L.;
and Kolobov, A. 2007. 1 blog: Probabilistic models with
unknown objects. Statistical relational learning 373.
More, S.; Matthews, M.; Joshi, A.; and Finin, T. 2012.
A knowledge-based approach to intrusion detection model-
ing. In IEEE Symposium on Security and Privacy Workshops
(SPW), 75–81. IEEE.
Mulwad, V.; Li, W.; Joshi, A.; Finin, T.; and Viswanathan, K.
2011. Extracting information about security vulnerabilities
from web text. In Web Intelligence and Intelligent Agent
Technology (WI-IAT), 2011 IEEE/WIC/ACM International
Conference on, volume 3, 257–260. IEEE.
Pan, F., and Hobbs, J. R. 2004. Time in owl-s. In Pro-
ceedings of the AAAI Spring Symposium on Semantic Web
Services, 29–36.
Pinkston, J.; Undercoffer, J.; Joshi, A.; and Finin, T. 2003. A
target-centric ontology for intrusion detection. University of
Maryland, Baltimore County Department of Computer Sci-
ence and Electrical Engineering.
Richardson, M., and Domingos, P. 2006. Markov logic net-
works. Machine learning 62(1-2):107–136.
Simmonds, A.; Sandilands, P.; and van Ekert, L. 2004. An
ontology for network security attacks. In Applied Comput-
ing. Springer. 317–323.
Suchanek, F. M.; Kasneci, G.; and Weikum, G.2008. Yago:
A large ontology from wikipedia and wordnet. Web Seman-
tics: Science, Services and Agents on the World Wide Web
6(3):203–217.
Ulicny, B. E.; Moskal, J. J.; Kokar, M. M.; Abe, K.; and
Smith, J. K. 2014. Inference and ontologies. In Cyber
Defense and Situational Awareness. Springer. 167–199.
Undercoffer, J.; Pinkston, J.; Joshi, A.; and Finin, T. 2004.
A target-centric ontology for intrusion detection. In 18th
International Joint Conference on Artificial Intelligence, 9–
15.
Vaibhav Khadilkar, J. R., and Thuraisingham, B. Seman-
tic web implementation scheme for national vulnerability
database (common platform enumeration data). Technical
report, The University of Texas at Dallas.
202
sity press.
Barnum, S. 2012. Standardizing cyber threat intelligence in-
formation with the structured threat information expression
(stix). MITRE Corporation 11.
Berners-Lee, T.; Bizer, C.; and Heath, T. 2009. Linked data-
the story so far. International Journal on Semantic Web and
Information Systems 5(3):1–22.
Chan, P.; Theron, J.; van Heerden, R.; and Leenen, L.2015.
An ontological knowledge base for cyber network attack
planning. In Iccws 2015-The Proceedings of the 10th In-
ternational Conference on Cyber Warfare and Security, 69.
Academic Conferences Limited.
Domingue, J.; Fensel, D.; and Hendler, J. A. 2011. Intro-
duction to the semantic web technologies. Handbook of Se-
mantic Web Technologies 1–41.
Hobbs, J. R., and Pan, F. 2004. An ontology of time for
the semantic web. ACM Transactions on Asian Language
Information Processing (TALIP) 3(1):66–85.
Iannacone, M.; Bohn, S.; Nakamura, G.; Gerth, J.; Huffer,
K.; Bridges, R.; Ferragut, E.; and Goodall, J. 2015. De-
veloping an ontology for cyber security knowledge graphs.
In Proceedings of the 10th Annual Cyber and Information
Security Research Conference, 12. ACM.
Jones, C. L.; Bridges, R. A.; Huffer, K.; and Goodall, J.
2015. Towards a relation extraction framework for cyber-
security concepts. arXiv preprint arXiv:1504.04317.
Joshi, A.; Lal, R.; and Finin, T. 2013. Extracting cybersecu-
rity related linked data from text. In Semantic Computing
(ICSC), 2013 IEEE Seventh International Conference on,
252–259. IEEE.
Kul, G., and Upadhyaya, S. 2015. A preliminary cyber on-
tology for insider threats in the financial sector. In Proceed-
ings of the 7th ACM CCS International Workshop on Man-
aging Insider Security Threats, 75–78. ACM.
Milch, B.; Marthi, B.; Russell, S.; Sontag, D.; Ong, D. L.;
and Kolobov, A. 2007. 1 blog: Probabilistic models with
unknown objects. Statistical relational learning 373.
More, S.; Matthews, M.; Joshi, A.; and Finin, T. 2012.
A knowledge-based approach to intrusion detection model-
ing. In IEEE Symposium on Security and Privacy Workshops
(SPW), 75–81. IEEE.
Mulwad, V.; Li, W.; Joshi, A.; Finin, T.; and Viswanathan, K.
2011. Extracting information about security vulnerabilities
from web text. In Web Intelligence and Intelligent Agent
Technology (WI-IAT), 2011 IEEE/WIC/ACM International
Conference on, volume 3, 257–260. IEEE.
Pan, F., and Hobbs, J. R. 2004. Time in owl-s. In Pro-
ceedings of the AAAI Spring Symposium on Semantic Web
Services, 29–36.
Pinkston, J.; Undercoffer, J.; Joshi, A.; and Finin, T. 2003. A
target-centric ontology for intrusion detection. University of
Maryland, Baltimore County Department of Computer Sci-
ence and Electrical Engineering.
Richardson, M., and Domingos, P. 2006. Markov logic net-
works. Machine learning 62(1-2):107–136.
Simmonds, A.; Sandilands, P.; and van Ekert, L. 2004. An
ontology for network security attacks. In Applied Comput-
ing. Springer. 317–323.
Suchanek, F. M.; Kasneci, G.; and Weikum, G.2008. Yago:
A large ontology from wikipedia and wordnet. Web Seman-
tics: Science, Services and Agents on the World Wide Web
6(3):203–217.
Ulicny, B. E.; Moskal, J. J.; Kokar, M. M.; Abe, K.; and
Smith, J. K. 2014. Inference and ontologies. In Cyber
Defense and Situational Awareness. Springer. 167–199.
Undercoffer, J.; Pinkston, J.; Joshi, A.; and Finin, T. 2004.
A target-centric ontology for intrusion detection. In 18th
International Joint Conference on Artificial Intelligence, 9–
15.
Vaibhav Khadilkar, J. R., and Thuraisingham, B. Seman-
tic web implementation scheme for national vulnerability
database (common platform enumeration data). Technical
report, The University of Texas at Dallas.
202
1 out of 8
Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
© 2024 | Zucol Services PVT LTD | All rights reserved.