Intrusion Detection Systems Based on Anomaly Detection Techniques

Giorgio Giacinto
DIEE - UniCA, Pattern Recognition
and Applications Group
PRISE 2007 - Rome, 6 June 2007
Research activities @DIEE
PRA Group
The Pattern Recognition and Applications
Group of the DIEE (UniCA, Italy) is active in
the field of intrusion detection in computer
networks based on statistical pattern
recognition
Current activities are focused on developing
“robust” anomaly detectors that exhibit
low false alarm rates
resistance to evasion
Outline of this talk
Anomaly-Based Intrusion Detection System
Hidden Markov Models (HMM) and their use
for Intrusion Detection
A Network-Based IDS based on HMM
Experimental results on FTP and SMTP
traffic
Conclusions
Anomaly-Based Intrusion
Detection Systems
Intrusion detection can be performed in two
ways
by detecting the traffic patterns that match
the signatures of known attacks
by labelling as anomalous all traffic patterns that
deviate significantly from a model of normal
(legitimate) traffic
Anomaly detection is aimed at detecting new
attacks
Objective of this work
To devise a network- and anomaly-based
intrusion detection system based on the
analysis of sequences of commands for a
particular TCP service
At present, analysis of sequences of commands
has been proposed for host-based IDS (i.e., for
detecting misuse of OS commands)
On the other hand, at the network level the
analysis of sequences has been performed at the
packet level
An example - normal FTP
traffic
% ftp diee.unica.it
Connected to ftp.diee.unica.it.
220 diee FTP server (Version 5.53 Wed Jul 25 9:30:00 CET 2006) ready.
USER (ftp.diee.unica.it:yourlogin): davide
331 Password required for davide.
PASS:
230 User davide logged in.
ftp> PWD
257 “/”
ftp> quit
221 Goodbye.
An example - anomalous FTP
traffic
% ftp diee.unica.it
Connected to ftp.diee.unica.it.
220 diee FTP server (Version 5.53 Wed Jul 25 9:30:00 CET 2006) ready.
USER (ftp.diee.unica.it:yourlogin): anonymous
230 User anonymous logged in.
ftp> DELE file.txt
250 Requested file action okay, completed.
ftp> quit
221 Goodbye.
The proposed analysis of sequences of commands
should be a component of an anomaly-based N-IDS.
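As a rough illustration of how a session like the two shown could be reduced to a sequence of symbols, the sketch below keeps only the verb of each client command and the three-digit code of each server reply. The `(direction, text)` input format and the name `to_symbols` are assumptions made for this example; they are not taken from the proposed system.

```python
import re

def to_symbols(lines):
    """Map FTP control-channel lines to symbols.

    `lines` is a list of (direction, text) pairs, where direction is
    "C" for client commands and "S" for server replies (assumed format).
    """
    symbols = []
    for direction, text in lines:
        text = text.strip()
        if not text:
            continue
        if direction == "S":
            # Server replies start with a three-digit code, e.g. "230 User ..."
            match = re.match(r"(\d{3})", text)
            if match:
                symbols.append(match.group(1))
        else:
            # Client commands: keep only the verb, drop the arguments
            symbols.append(text.split()[0].upper())
    return symbols

# The anomalous session, written as control-channel lines
anomalous = [
    ("S", "220 diee FTP server ready."),
    ("C", "USER anonymous"),
    ("S", "230 User anonymous logged in."),
    ("C", "DELE file.txt"),
    ("S", "250 Requested file action okay, completed."),
    ("C", "QUIT"),
    ("S", "221 Goodbye."),
]
print(to_symbols(anomalous))
```

The resulting list of symbols is the kind of sequence the rest of the talk models.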
A model for sequences of
commands
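[Figure: the symbols 220, USER, 230, DELE observed at times t1, t2, t3, t4 along the time axis t, with the corresponding states Q1, Q2, Q3, Q4]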
Finite state automata
Can be applied in cases of sequences generated
by a grammar
Probabilistic Models
The transitions between states are described by
probability chains $P(v_n) = P(v_n \mid v_{n-1}, v_{n-2}, \ldots, v_{n-k})$
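To make the probability chain concrete, here is a minimal sketch that estimates $P(v_n \mid v_{n-1}, \ldots, v_{n-k})$ by counting how often each symbol follows each context of length k. The function name `fit_markov_chain` and the two training sequences are hypothetical, loosely modelled on a normal FTP login.

```python
from collections import Counter, defaultdict

def fit_markov_chain(sequences, k=1):
    """Estimate P(v_n | previous k symbols) by relative frequencies."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for n in range(k, len(seq)):
            context = tuple(seq[n - k:n])
            counts[context][seq[n]] += 1
    return {
        context: {sym: c / sum(nxt.values()) for sym, c in nxt.items()}
        for context, nxt in counts.items()
    }

# Hypothetical training sequences of reply codes and commands
training = [
    ["220", "USER", "331", "PASS", "230", "PWD", "257", "QUIT", "221"],
    ["220", "USER", "331", "PASS", "230", "QUIT", "221"],
]
model = fit_markov_chain(training, k=1)
print(model[("230",)])   # {'PWD': 0.5, 'QUIT': 0.5}
```

The number of possible contexts grows exponentially with k, which is one way to see why higher-order chains become expensive.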
Probabilistic Models
Markov chains
Their computational cost increases as the order of
the process increases
Hidden Markov Models
They are based on the knowledge of
The set of symbols that can be expected in any
sequence
A set of training sequences
HMM of order 1 can model processes of any order
The states of the model are hidden
Motivation for using HMMs
The length of the sequence is not known in
advance
The correlations between the elements in the
sequence are not known in advance
The internal state of the machine responding
to the commands is unknown
The proposed HMM-based
N-IDS
Hidden Markov Models
Let $q_t$ be the state of the HMM at time $t$
A Hidden Markov Model is made up of
A set $S$ of $N$ hidden states
$S = \{S_1, S_2, \ldots, S_N\}$
A set of $M$ symbols emitted from the states
$V = \{v_1, v_2, \ldots, v_M\}$
Hidden Markov Models
A probability distribution of the state
transitions
$A = \{a_{ij}\}$, $a_{ij} = P(q_t = S_j \mid q_{t-1} = S_i)$, $1 \le i, j \le N$
$\sum_{j=1}^{N} a_{ij} = 1$
The probability of emission of symbols
$b_j(k) = P(v_k \mid q_t = S_j)$
where $b_j(k)$ is the probability that the symbol $v_k$ is emitted at time $t$, given that the state is $S_j$
Joint probability of emitted
symbols and hidden variables
Given the probability of being in the state $S_i$ at
the beginning of the process ($t = 1$)
$\pi_i = P(q_1 = S_i)$, $1 \le i \le N$
the joint probability of emitted symbols ($y$)
and states ($q$) can be expressed as
$P(y_1^T, q_1^T) = P(q_1) \prod_{t=1}^{T-1} P(q_{t+1} \mid q_t) \prod_{t=1}^{T} P(y_t \mid q_t)$
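The joint probability can be evaluated directly for one particular state path. The sketch below is only an illustration: the function name `joint_probability` and the toy model with 2 states and 3 symbols are made up, and states and symbols are represented by integer indices.

```python
def joint_probability(pi, A, B, states, observations):
    """P(y_1..T, q_1..T) for a given state path and observation sequence.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[j][k] : probability of emitting symbol k while in state j
    """
    p = pi[states[0]]                        # P(q_1)
    for t in range(len(states) - 1):
        p *= A[states[t]][states[t + 1]]     # P(q_{t+1} | q_t)
    for q, y in zip(states, observations):
        p *= B[q][y]                         # P(y_t | q_t)
    return p

# Toy model with N = 2 states and M = 3 symbols (hypothetical values);
# every row of A and B sums to 1
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]

p = joint_probability(pi, A, B, states=[0, 0, 1], observations=[0, 1, 2])
print(p)
```

For this path the product is 0.6 · 0.7 · 0.3 for the states and 0.5 · 0.4 · 0.6 for the emissions.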
Training HMM
Evaluating a sequence
Training
For a given sequence of symbols $\{O_l\}$, find
the HMM $\lambda$ which maximises $P(O_l \mid \lambda)$
An iterative procedure is employed (Baum-Welch)
Evaluation
Compute the value of $P(O \mid \lambda)$ for a given
sequence of symbols $\{O\}$
The forward-backward procedure is employed
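For the evaluation step, the forward pass of the forward-backward procedure is enough to obtain the probability of a sequence given the model. The sketch below is a plain, unscaled version on a hypothetical toy model. A real implementation would normally work with scaled or log probabilities to avoid underflow on long sequences. The brute-force function is included only to check the recursion on a short sequence.

```python
from itertools import product

def forward_probability(pi, A, B, observations):
    """Compute P(O | model) with the forward recursion."""
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for o in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: P(O | model) = sum_i alpha_T(i)
    return sum(alpha)

def brute_force_probability(pi, A, B, observations):
    """Sum the joint probability over every possible state path."""
    total = 0.0
    for path in product(range(len(pi)), repeat=len(observations)):
        p = pi[path[0]] * B[path[0]][observations[0]]
        for t in range(1, len(path)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][observations[t]]
        total += p
    return total

# Toy model with 2 states and 3 symbols (hypothetical values)
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]

O = [0, 1, 2]
p_forward = forward_probability(pi, A, B, O)
print(p_forward, brute_force_probability(pi, A, B, O))
```

The two printed values agree, but the forward recursion costs on the order of N²T operations instead of enumerating all Nᵀ paths.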
Dictionary of symbols
As a first choice the dictionary of symbols can be
made up of all the commands defined in the RFC
Drawback: a large number of symbols are not actually
used, so they can unnecessarily increase the
computational cost of the HMM
Solution
Only the symbols in the training set are used
If new symbols are observed during operation
Either they are discarded
Or they are substituted by a wildcard symbol NaS (Not a Symbol)
Use of NaS
Dictionary of symbols {a, b, c, d, NaS}
Analysed sequence of symbols [a c g b g a d]
This sequence is thus transformed into
[a c NaS b NaS a d]
It is easy to see that the probability of
emission of NaS is zero, as NaS never appears in the training sequences
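The substitution in this example can be written in a few lines. The function name `substitute_unknown` is chosen for illustration only.

```python
NAS = "NaS"  # wildcard for symbols not seen at training time

def substitute_unknown(sequence, dictionary):
    """Replace every symbol outside the dictionary with NaS."""
    return [s if s in dictionary else NAS for s in sequence]

dictionary = {"a", "b", "c", "d"}
mapped = substitute_unknown(list("acgbgad"), dictionary)
print(mapped)   # ['a', 'c', 'NaS', 'b', 'NaS', 'a', 'd']
```

The positions of the unknown symbols are preserved, so the model still sees that something unexpected occurred between known commands.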
Discarding symbols not seen
at training time
Dictionary of symbols {a, b, c, d}
Analysed sequence of symbols [a c g b g a d]
This sequence is thus transformed into
[a c b a d]
by deleting the unknown symbol ‘g’
Attacks can still be detected if the sequence of
known symbols is anomalous
If the anomaly is the presence of symbol ‘g’, then
the attack is not detected!
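A short sketch of the discarding strategy makes the blind spot visible: a sequence containing the unknown symbol and one without it become indistinguishable. The function name `discard_unknown` is an assumption for this example.

```python
def discard_unknown(sequence, dictionary):
    """Drop every symbol that was not seen at training time."""
    return [s for s in sequence if s in dictionary]

dictionary = {"a", "b", "c", "d"}
with_g = discard_unknown(list("acgbgad"), dictionary)
without_g = discard_unknown(list("acbad"), dictionary)
print(with_g)               # ['a', 'c', 'b', 'a', 'd']
print(with_g == without_g)  # True
```

Any model that only sees the filtered sequence will assign both inputs the same score.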
Experimental results
The proposed system has been tested on two
datasets of real traffic collected on the network of
Tiscali Services SpA
FTP
SMTP
The sequences of commands have been extracted by
Snort and the proposed technique has been
implemented as a plug-in for Snort
A number of attacks have been generated using
commercial and open-source software
Experiments on FTP traffic
40,000 “legitimate” sequences
5 training sets have been extracted by drawing
randomly 32,000 sequences
Consequently, for each training set, a test set of
8,000 sequences was available
Each training set has been subdivided into 10
sets of 3,200 sequences each
50 HMMs have been trained and their outputs
have been combined
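The partitioning described here can be sketched as follows. Integers stand in for the command sequences, and the function name `make_splits`, the fixed seed, and the choice of disjoint consecutive subsets are assumptions; the slides do not say how the 10 subsets were drawn.

```python
import random

def make_splits(sequences, n_runs=5, n_train=32000, n_subsets=10, seed=0):
    """Return, for each run, (list of training subsets, test set)."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        shuffled = sequences[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        size = n_train // n_subsets
        # One HMM would be trained on each of these subsets
        subsets = [train[i * size:(i + 1) * size] for i in range(n_subsets)]
        runs.append((subsets, test))
    return runs

# Placeholder data: integers stand in for the 40,000 sequences
runs = make_splits(list(range(40000)))
print(len(runs), len(runs[0][0]), len(runs[0][0][0]), len(runs[0][1]))
# 5 10 3200 8000
```

With 5 runs and 10 subsets per run, this yields the 50 training subsets mentioned on the slide.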
Attacks
The 40,000 sequences used for training and
testing are considered “attack-free” as they
resulted from the filtering of the traffic by
Snort.
22 attacks have been generated using the
IDS-Informer software, which is used to test
the detection capabilities of IDS
Experimental setup
[Figure: repeated 5 times, 80% of the 40,000 sequences are randomly sampled as 32,000 training sequences; the remaining 8,000 test sequences and the 22 attack sequences make up the test set]
Experimental Results
Dictionaries without NaS
30 States
                    AUC             Detection Rate 100%   FA 1%
                                    FA%                   D.R.%           FA(actual)%
                    avg (std dev)   avg (std dev)         avg (std dev)   avg (std dev)
100 HMM             0.873 (0.006)   89.77 (4.14)          58.72 (2.52)    0.72 (0.23)
Arithmetic Mean     0.874 (0.002)   97.27 (0.61)          63.63 (4.54)    0.31 (0.15)
Geometric Mean      0.878 (0.002)   96.19 (1.50)          76.36 (2.03)    0.71 (0.20)
Decision Templates  0.933 (0.004)   93.84 (8.40)          65.54 (3.80)    0.35 (0.16)
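The "Arithmetic Mean" and "Geometric Mean" rows refer to rules for combining the outputs of the individual HMMs. A minimal sketch of the two rules follows, assuming each HMM outputs a probability for the analysed sequence. The scores are invented, and the Decision Templates rule is not covered here.

```python
import math

def arithmetic_mean(scores):
    """Average of the individual model outputs."""
    return sum(scores) / len(scores)

def geometric_mean(scores):
    """Geometric mean, computed in the log domain.

    A single zero output forces the combined score to zero.
    """
    if any(s == 0.0 for s in scores):
        return 0.0
    return math.exp(sum(math.log(s) for s in scores) / len(scores))

# Hypothetical outputs of four HMMs for one sequence
scores = [0.02, 0.05, 0.0, 0.04]
print(arithmetic_mean(scores))
print(geometric_mean(scores))   # 0.0
```

The geometric mean lets any single model veto a sequence, whereas the arithmetic mean smooths over individual disagreements.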
Experimental results
Dictionaries with NaS
20 States
                    AUC             Detection Rate 100%   FA 1%
                                    FA%                   D.R.%           FA(actual)%
                    avg (std dev)   avg (std dev)         avg (std dev)   avg (std dev)
100 HMM             0.967 (0.004)   59.60 (4.46)          90.94 (1.27)    0.74 (0.22)
Arithmetic Mean     0.974 (0.002)   79.23 (3.02)          92.72 (6.1)     0.33 (0.16)
Geometric Mean      0.972 (0.001)   52.27 (3.06)          95.45 (0)       0.89 (0.09)
Decision Templates  0.965 (0.002)   52.34 (9.69)          95.45 (0)       0.41 (0.18)
Experimental results
Dictionaries with NaS
25 States
                    AUC             Detection Rate 100%   FA 1%
                                    FA%                   D.R.%           FA(actual)%
                    avg (std dev)   avg (std dev)         avg (std dev)   avg (std dev)
100 HMM             0.969 (0.01)    55.15 (7.11)          93.54 (2.33)    0.78 (0.26)
Arithmetic Mean     0.973 (0.004)   54.65 (5.23)          92.72 (6.1)     0.44 (0.12)
Geometric Mean      0.971 (0.002)   54.70 (5.22)          95.45 (0)       0.97 (0.05)
Decision Templates  0.962 (0.003)   88.08 (11.16)         95.45 (0)       0.54 (0.11)
Experimental results
Dictionaries with NaS
30 States
                    AUC             Detection Rate 100%   FA 1%
                                    FA%                   D.R.%           FA(actual)%
                    avg (std dev)   avg (std dev)         avg (std dev)   avg (std dev)
100 HMM             0.969 (0.003)   57.85 (3.57)          92.97 (0.65)    0.74 (0.16)
Arithmetic Mean     0.974 (0.0004)  55.92 (0.62)          95.45 (0)       0.53 (0.13)
Geometric Mean      0.971 (0.0008)  55.00 (1.20)          95.45 (0)       1.01 (0.10)
Decision Templates  0.962 (0.004)   86.02 (8.12)          95.45 (0)       0.62 (0.17)
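The "FA 1%" columns report the detection rate at a fixed false alarm rate. One way such an operating point could be computed is sketched below, assuming that a low score means an anomalous sequence. The function names and the scores are hypothetical, and this is not necessarily the procedure used to produce the tables.

```python
def threshold_for_false_alarm_rate(normal_scores, target_fa):
    """Pick a threshold so that at most `target_fa` of the normal
    sequences score strictly below it (low score = anomalous)."""
    ranked = sorted(normal_scores)
    n_allowed = int(target_fa * len(ranked))
    return ranked[n_allowed]

def rate_below(scores, threshold):
    """Fraction of sequences that would raise an alarm."""
    return sum(s < threshold for s in scores) / len(scores)

# Hypothetical scores: 100 normal sequences and 4 attacks
normal = [i / 100 for i in range(1, 101)]
attacks = [0.001, 0.02, 0.03, 0.5]

threshold = threshold_for_false_alarm_rate(normal, target_fa=0.05)
fa = rate_below(normal, threshold)    # false alarm rate on normal traffic
dr = rate_below(attacks, threshold)   # detection rate on attacks
print(threshold, fa, dr)              # 0.06 0.05 0.75
```

If the threshold is chosen on one set of normal sequences and then applied to a separate test set, the false alarm rate measured there can differ from the target, which is one plausible reading of the "FA(actual)%" column.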
Experiments on SMTP traffic
5,500 sequences used for training
5,500 sequences used for testing
22 attacks (Nessus)
Performance has been assessed by
comparing the alarms raised by the HMM
with those raised by the rules in Snort
(2.6.0.2)
Snort configuration
preprocessor smtp: \
ports { 25 } \
inspection_type stateful \
ignore_tls_data \
normalize cmds \
valid_cmds { MAIL RCPT HELO EHLO HELP } \
alert_unknown_cmds \
normalize_cmds { EXPN VRFY RCPT } \
max_command_line_len 500 \
max_header_line_len 500 \
max_response_line_len 500 \
alt_max_command_line_len 500 { MAIL } \
alt_max_command_line_len 500 { RCPT } \
alt_max_command_line_len 500 { HELP HELO ETRN } \
alt_max_command_line_len 555 { EXPN VRFY } \
max_num_cmds 1000000000 \
enable_cmd_order_hmm
Performance assessment
Alarms
Test Set  FA, rules on validation    DR, rules on attacks    FA, anomaly on validation    DR, anomaly on attacks
I         54 (0.98%)                 2 (9.09%)               162 (2.95%)                  7 (31.82%)
II        85 (1.55%)                 2 (9.09%)               293 (5.33%)                  7 (31.82%)
Anomaly Detection Alarms
          Validation set                    Attack set
Test Set  Unknown command   Command order   Unknown command   Command order
I         67                95              6                 1
II        111               182             6                 1
Conclusions
The use of HMM for detecting anomalies in the use
of TCP services showed a good capability of
detecting attacks characterised by
the use of infrequent commands
an atypical order of commands
This module is designed to work in combination with
other anomaly-based modules (e.g., based on
content analysis) to improve the detection
capabilities and decrease the false alarm rate.