Our business group is developing marketing materials for a commercial Visual Abstraction Database for Emergency Response (Vader).
The current tutorial is offered, from the research group, in support of this
effort.
The software and data for this tutorial are at the Free download.
This paper assumes that one has worked through the introductory tutorials. For more discussion please call Dr. Paul Prueitt at 703-981-2676.
Event detection illustrates factual information within the context of an inventory of the event types. Events indicate that specific programs (such as an e-mail program) are talking between two or more machines.
The research group must inventory
the known event types as the first step in demonstrating the commercial viability
of Vader systems.
We see event types as being informative about
1) expert tacit knowledge of the domain
2) objects of perceptual invariance in the domain expert's mind
In this tutorial, we illustrate a "source port behavior" using snort data. Of course there are many ways of doing this. Like object perception in the human mind, the visual Abstractions can be seen from different angles; however, the number and kind of objects do not change simply through the perceptual act. Each object has its own reality.
Visual Abstractions are virtual objects. With an object knowledge base, expert tacit
knowledge is directly used to control response processes. A perception/action cycle, with memory and
reinforcement, is established as experts work on real problems.
This has to be demonstrated, but we
are very close to this demonstration.
Section 1: A short note on the analytic conjecture
Up to now, the SLIP conjectures have
been simple link analysis. However, any
logical construction that involves a calculus on the names of the columns, and
on the contents of the column cells is a viable conjecture.
For example:
The value in the 5th column is x, y or z,
the time is between p and q,
and there is an occurrence of the value a, b, or c in the third column
after the found value in the 5th column
will produce syntagmatic units of the form < x, r, a >. Once one has a set of syntagmatic units
{ < x, r, a > },
one can define a set of atoms and relationship types. One could even define several types of atoms during the conjecture. The x values and the a values can be treated differently, for example, so that the r relationship has a temporal or causal aspect.
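The example conjecture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SLIP implementation: the function name apply_conjecture, the record layout (a timestamp plus a list of column values), the relation label "followed_by", and the choice to apply the time window to the first match only are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple

# Assumed record layout: (timestamp, list of column values).
Record = Tuple[float, Sequence[str]]
Unit = Tuple[str, str, str]


def apply_conjecture(
    records: List[Record],
    first_values: Set[str],
    second_values: Set[str],
    start: float,
    end: float,
    first_col: int = 4,   # the 5th column, zero-based
    second_col: int = 2,  # the 3rd column, zero-based
    relation: str = "followed_by",
) -> List[Unit]:
    """Return syntagmatic units (x, r, a) for the example conjecture."""
    ordered = sorted(records, key=lambda rec: rec[0])
    units: List[Unit] = []
    for i, (t, cols) in enumerate(ordered):
        # The time window is applied to the first match only (assumption).
        if not (start <= t <= end):
            continue
        x = cols[first_col]
        if x not in first_values:
            continue
        # Look for a, b or c in the third column of any later record.
        for _, later_cols in ordered[i + 1:]:
            a = later_cols[second_col]
            if a in second_values:
                units.append((x, relation, a))
    return units


records = [
    (1.0, ["-", "-", "q", "-", "x"]),
    (2.0, ["-", "-", "a", "-", "w"]),
    (3.0, ["-", "-", "b", "-", "y"]),
    (9.0, ["-", "-", "c", "-", "x"]),
]
units = apply_conjecture(records, {"x", "y", "z"}, {"a", "b", "c"}, 0.0, 5.0)
```

The x values and the a values in the resulting units could then be treated as one or as two types of atoms, with the relation label carrying the temporal reading.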
These logical calculi on column names and contents are beyond the scope of the tutorial, except to indicate that most of the Intrusion Detection System (IDS) rules can be converted directly into a logical construction that visualizes atoms and link-types. These atoms and links form a substructure for the production of visual clues regarding the nature and variation of the IDS events. The IDS log file need not be the starting point for visual abstraction.
Observationally, we see that the conjectures produce object invariance that is driven by data content. For each conjecture there are between 3 and 10 major compounds and between 10 and 50 smaller ones. A conjecture can be applied to a data stream by building a dual buffer, where real-time data is accumulated into one buffer while the other buffer is used to build the event Chemistry. The dual-buffering architecture is commonly used in real-time compression and encryption streaming.
The conjecture acts as a convolution over the data stream to produce a real-time imaging of exactly those events that are occurring in the data stream.
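The dual-buffer idea can be sketched as follows. This is a minimal single-threaded illustration, assuming a fixed batch size; the class name DualBuffer and its methods are hypothetical, and a real system would hand the full buffer to a separate worker so that accumulation is never blocked.

```python
from typing import Callable, List


class DualBuffer:
    """Accumulate items in one buffer while the other one is processed."""

    def __init__(self, capacity: int, process: Callable[[List], None]) -> None:
        self.capacity = capacity
        self.process = process      # e.g. the step that applies the conjecture
        self.filling: List = []     # receives incoming records
        self.ready: List = []       # most recent full batch

    def push(self, item) -> None:
        self.filling.append(item)
        if len(self.filling) >= self.capacity:
            self.swap()

    def swap(self) -> None:
        # The full buffer becomes the processing buffer and a fresh,
        # empty buffer takes over accumulation.
        self.filling, self.ready = [], self.filling
        if self.ready:
            self.process(self.ready)


batches: List[List[int]] = []
buffer = DualBuffer(capacity=3, process=lambda batch: batches.append(list(batch)))
for record_number in range(7):
    buffer.push(record_number)
buffer.swap()  # flush the partial batch at the end of the stream
```

Each processed batch plays the role of one window of the convolution over the data stream.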
Section 2: Snort data
We received 1833 snort records, and placed six columns of this data into a datawh.txt file.
Figure 1: (source port, destination IP) conjecture
We developed a (source port, destination IP) conjecture.
Dean Rich and others have talked with us about a specific "source port behavior" and we have read about this in the books on intrusion detection.
We have an internal mental model of
"source port behavior".
"Source
port behavior" is part of what is "in the data" and can be useful as a means to point precisely at an incident. The source port will often be incremented each time a packet is sent from the source IP address to the destination IP address. So this should look like a type of reverse port scan.
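One way to look for incrementing source ports directly in the records is sketched below. It is an illustration only: the function name incrementing_runs, the (source IP, source port, destination IP) tuple layout, the step of exactly one, the minimum run length, and the source addresses in the sample data are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Packet = Tuple[str, int, str]  # (source IP, source port, destination IP)


def incrementing_runs(
    packets: Iterable[Packet], min_length: int = 3
) -> Dict[Tuple[str, str], List[List[int]]]:
    """Find runs of source ports that increase by one, per (src IP, dst IP)."""
    ports_by_pair: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for src_ip, src_port, dst_ip in packets:  # packets in arrival order
        ports_by_pair[(src_ip, dst_ip)].append(src_port)

    found: Dict[Tuple[str, str], List[List[int]]] = {}
    for pair, ports in ports_by_pair.items():
        runs: List[List[int]] = []
        current: List[int] = []
        for port in ports:
            if current and port == current[-1] + 1:
                current.append(port)
            else:
                if len(current) >= min_length:
                    runs.append(current)
                current = [port]
        if len(current) >= min_length:
            runs.append(current)
        if runs:
            found[pair] = runs
    return found


# Made-up sample packets; only the destination IP appears in the tutorial.
sample = [
    ("10.0.0.5", 1024, "192.168.10.249"),
    ("10.0.0.5", 1025, "192.168.10.249"),
    ("10.0.0.5", 1026, "192.168.10.249"),
    ("10.0.0.7", 4000, "192.168.10.249"),
    ("10.0.0.5", 2000, "192.168.10.249"),
]
runs = incrementing_runs(sample)
```

A check of this kind is a complement to the visual Abstraction, not a replacement for it, since it only finds the one pattern it was written to find.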
Port scans should be viewable using any one of several conjectures such as (destination port, source IP), (destination port, destination IP) or the conjugate conjectures (source IP, destination port), (destination IP, destination port).
"Source
port behavior" can be used to illustrate the operational properties of visual Abstraction. In the conjecture (source port, destination IP), the destination IP organizes source port log file events. Compounds suggest how source ports use the various destination IP addresses and how events are reported to the snort log.
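To make the idea of a compound concrete, the sketch below groups atoms (source ports) that are linked through a shared destination IP. It uses plain connected components via union-find as a stand-in; this is an assumption made for illustration and is not the SLIP clustering itself. The function name compounds and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def compounds(pairs: Iterable[Tuple[Hashable, Hashable]]) -> List[Set[Hashable]]:
    """Group atoms that share a link value, e.g. (source port, destination IP)."""
    parent: Dict[Hashable, Hashable] = {}

    def find(x: Hashable) -> Hashable:
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_atom_for_link: Dict[Hashable, Hashable] = {}
    for atom, link in pairs:
        parent.setdefault(atom, atom)
        if link in first_atom_for_link:
            # Same link value seen before: merge the two groups.
            parent[find(atom)] = find(first_atom_for_link[link])
        else:
            first_atom_for_link[link] = atom

    groups: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for atom in parent:
        groups[find(atom)].add(atom)
    return sorted(groups.values(), key=len, reverse=True)


# Made-up (source port, destination IP label) pairs.
pairs = [
    (1024, "A"), (1025, "A"),
    (1026, "B"), (1027, "B"),
    (1025, "B"),
    (5000, "C"),
]
result = compounds(pairs)
```

In this sample, port 1025 talks to both A and B, so the ports seen at A and the ports seen at B fall into one compound, while port 5000 stays on its own.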
The reader is encouraged to look in the data set and investigate some of these compounds.
Section 3: Perceptual priming using pictorial icons
The issue of mental imaging and conceptual priming of the individual experience of knowledge is a question of science. A complete description of the issue of mental priming is highly relevant to our proposed National cyber defense knowledge base.
The White Hats have many mental images of Cyberspace events, and these mental images can be triggered by a visual Abstraction (perceptual priming).
The visual Abstractions can also be used to transfer some domain specific tacit knowledge from a White Hat (or CERT domain specialist) to someone who knows nothing about Internet phenomena. The transfer of knowledge can also take place within a highly trained community. Using the event types, this community can protect the core infrastructure of the Internet.
The human awareness that is primed will sometimes lead to the mental recognition that
1) Something is understood
2) Or that something is "there" but not understood
For example, our snort data set was given to the research group with the following information:
"There were some vulnerability scans and other things going on."
To investigate the vulnerability
scans one should copy the snort2 folder (from snort2.zip) and delete all files
in the data folder except the datawh.txt file.
Then build a conjecture and look at the compounds.
Figure 2: A possible source port behavior event type occurring to Dip = 192.168.10.249
We will take a slightly different
approach. We first filter the data to
produce a datawh.txt file having only one specific IP that appears to have been
scanned.
Having identified this potential port scan, we used the SLIPCore to manually bring the single IP address into a category so that report generation could produce a file having only Dip = 192.168.10.249. Clearly some automation would be helpful here, but the I-RIB data structure is readily available for this type of query and retrieval.
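The kind of automation mentioned here could be as simple as a plain text filter. The sketch below is not the SLIPCore or I-RIB mechanism; it assumes, hypothetically, that each record is one whitespace-delimited line and that the destination IP sits in a known column. The function names, the column index, and the sample lines are made up for the example.

```python
from typing import Iterable, List, Optional


def filter_by_value(
    lines: Iterable[str],
    column: int,
    value: str,
    delimiter: Optional[str] = None,  # None splits on any whitespace
) -> List[str]:
    """Keep only the lines whose field at `column` equals `value`."""
    kept: List[str] = []
    for line in lines:
        fields = line.split(delimiter)
        if len(fields) > column and fields[column] == value:
            kept.append(line)
    return kept


def filter_file(src_path: str, dst_path: str, column: int, value: str) -> int:
    """Write the matching records of src_path to dst_path; return the count."""
    with open(src_path, encoding="utf-8") as src:
        kept = filter_by_value(src, column, value)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(kept)
    return len(kept)


# Made-up records; the real column layout of datawh.txt may differ.
sample_lines = [
    "10.0.0.5 1024 192.168.10.249 80",
    "10.0.0.7 4000 192.168.10.12 25",
    "10.0.0.5 1025 192.168.10.249 80",
]
matches = filter_by_value(sample_lines, column=2, value="192.168.10.249")
```

Calling filter_file on a copy of datawh.txt with the correct column index would produce a reduced file of the kind used in the rest of this section.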
Figure 3: Report generated for all records having Dip = 192.168.10.249
Having retrieved the 193 records from the original dataset, we now develop events using the (source port, destination IP) conjecture (see Figure 4).
Figure 4: The conjecture (source port, dest IP) in the query set
By inspection of Figure 5 we verify that there is only one event compound, and that this compound is defined by the relationship to a single destination IP, 192.168.10.249.
This single IP address has been passed information from 95 source ports. It is natural to ask if the complete set of source ports is related to the same source IP, or is related to source IPs that are known to have a common locus of control.
Figure 5: The view of the single compound having the proper number (95) of atoms
The "source port behavior" is likely to be best shown in a (source IP, dest port) conjecture. But we will see this same "object" using any one of several conjectural rules.
The notion of a conjectural rule is part of an OSI patent (currently under development) that protects the vertical market development by OSI Trusted Partners.
Section 4: Enumeration of event-types
One may descriptively enumerate all event-types and thus define what one means by an event, taken in the abstract, and what is meant by each event, considered by itself.
Event types will become fully
enumerated in the knowledge base part of Vader systems. One then has available a state-gesture model
to implement priming of human perception through the services of the root_KOS.
The vertical market will prove the horizontal technology.
We have not yet enumerated the Cyber
Security event-types, because we still have some open issues. The descriptive enumeration (DE) of event
types is a knowledge acquisition process that is governed by a process model for enumeration.
What are the proper data instrumentation, data sensor parameterization, and SLIP/CLIP analytic conjecture needed to produce good visual Abstraction? When is visual Abstraction evocative of the mental experience of CERT analysts and other White Hats? We have only the first few examples. But clearly the event type knowledge base will develop rapidly, once domain experts begin to use the tool set.
When the first commercial Vader is prototyped, we will see new types of expectations from the interface design.
For example, when a port scan event appears to be starting we will see an anticipatory object appear in the Vader interface.
Figure 6: Mock up of a Vader controller
As the scan occurs, we should
see the development of the scan profile.
Other real time objects will be viewable, as will be incident
histories. Variations from the typical
object representation will be seen.
After the scan has been
completed we will register the event at a higher level of organizational
detail. Petri type models anticipate
what might happen because of a scan of a certain type. Incident histories detail the behavior of a
program of a particular type. Incident histories detail that a particular type
of event has in fact occurred at a specific time.
The four-layer taxonomy { bit stream, intrusion, incident, and policy } is seen in Figure 7.
Figure 7: The first of four PowerPoint slides on SenseMaking
The four-layer "stratified taxonomy" is to be used in a Vader or CDKB system in order to communicate within professional communities. Knowledge management methodology and technology are useful in facilitating this type of collaborative communication.
Thus we see that the facilitation of terminology development and use in a community of practice is an essential aspect related to the notion of a CERT center.
Other uses of visual Abstraction involve:
1) Automatic known event detection
2) Novelty detection
3) Automatic and mediated response
a. Automatic and mediated adjustment of sensors and instrumentation
b. Automatic and mediated adjustment of vulnerability exposure
c. Routing of activity into a HoneyNet
d. Denial of access response
4) Assembly of historical accounts as incident records
Once event types are identified, the Vader knowledge base can be instructed to alert the Vader viewer that specific events are present in the data stream. We can describe the "event" by name and short description and modify the object formation process to look just for this "object".