Intrusion Detection Systems Based on Anomaly Detection Techniques
Giorgio Giacinto
DIEE - UniCA, Pattern Recognition and Applications Group
PRISE 2007 - Rome, 6 June 2007

Research activities @DIEE PRA Group
- The Pattern Recognition and Applications Group of the DIEE (UniCA, Italy) is active in the field of intrusion detection in computer networks based on statistical pattern recognition
- Current activities are focused on developing “robust” anomaly detectors that exhibit
  - low false alarm rates
  - hardness of evasion

Outline of this talk
- Anomaly-Based Intrusion Detection Systems
- Hidden Markov Models (HMM) and their use for Intrusion Detection
- A Network-Based IDS based on HMM
- Experimental results on FTP and SMTP traffic
- Conclusions

Anomaly-Based Intrusion Detection Systems
- Intrusion detection can be performed in two ways
  - by detecting the traffic patterns that match the signatures of known attacks
  - by labelling as anomalous all traffic patterns that deviate significantly from a model of normal (legitimate) traffic
- Anomaly detection is aimed at detecting new attacks

Objective of this work
- To devise a network- and anomaly-based intrusion detection system based on the analysis of sequences of commands for a particular TCP service
- At present, the analysis of sequences of commands has been proposed for host-based IDS (i.e., for detecting misuse of OS commands)
- On the other hand, at the network level the analysis of sequences has been performed at the packet level

An example - normal FTP traffic

    % ftp diee.unica.it
    Connected to ftp.diee.unica.it.
    220 diee FTP server (Version 5.53 Wed Jul 25 9:30:00 CET 2006) ready.
    USER (ftp.diee.unica.it:yourlogin): davide
    331 Password required for davide.
    PASS:
    230 User davide logged in.
    ftp> PWD
    257 “/”
    ftp> quit
    221 Goodbye.
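As a rough illustration of what a "sequence of commands" could look like for the normal session above, the sketch below reduces a session to a list of symbols (command verbs and three-digit reply codes). The one-line-per-message layout of `SESSION` and the helper name `to_symbols` are assumptions made for this example; the talk does not describe how its Snort plug-in extracts the sequences.

```python
import re

# Hypothetical, simplified view of the normal session above:
# one protocol message per entry (an assumption, not the talk's format).
SESSION = [
    "220 diee FTP server ready.",
    "USER davide",
    "331 Password required for davide.",
    "PASS ****",
    "230 User davide logged in.",
    "PWD",
    '257 "/"',
    "QUIT",
    "221 Goodbye.",
]

def to_symbols(lines):
    """Keep only the leading token of each line: a reply code or a command verb."""
    symbols = []
    for line in lines:
        match = re.match(r"\s*(\d{3}|[A-Za-z]{3,4})\b", line)
        if match:
            symbols.append(match.group(1).upper())
    return symbols

print(to_symbols(SESSION))
```

Under the same assumptions, an anonymous login followed directly by `DELE` yields a different symbol sequence, and it is this difference in ordering that a sequence model is meant to pick up.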

An example - anomalous FTP traffic

    % ftp diee.unica.it
    Connected to ftp.diee.unica.it.
    220 diee FTP server (Version 5.53 Wed Jul 25 9:30:00 CET 2006) ready.
    USER (ftp.diee.unica.it:yourlogin): anonymous
    230 User anonymous logged in.
    ftp> DELE file.txt
    250 Requested file action okay, completed.
    ftp> quit
    221 Goodbye.

- The proposed analysis of sequences of commands should be a component of an anomaly-based N-IDS.

A model for sequences of commands
[Figure: the symbols 220, USER, 230, DELE observed at times t1, t2, t3, t4, with the corresponding states Q1, Q2, Q3, Q4]
- Finite state automata
  - Can be applied in cases of sequences generated by a grammar
- Probabilistic Models
  - The transitions between states are described by probability chains
    $P(v_n) = P(v_n \mid v_{n-1}, v_{n-2}, \ldots, v_{n-k})$

Probabilistic Models
- Markov chains
  - Their computational cost increases as the order of the process increases
- Hidden Markov Models
  - They are based on the knowledge of
    - the set of symbols that can be expected in any sequence
    - a set of training sequences
  - HMM of order 1 can model processes of any order
  - The states of the model are hidden

Motivation for using HMMs
- The length of the sequence is not known in advance
- The correlations between the elements in the sequence are not known in advance
- The internal state of the machine responding to the commands is unknown

The proposed HMM-based N-IDS

Hidden Markov Models
- Let $q_t$ be the state of the HMM at time $t$
- A Hidden Markov Model is made up of
  - a set $S$ of $N$ hidden states, $S = \{S_1, S_2, \ldots, S_N\}$
  - a set $V$ of $M$ symbols emitted from the states, $V = \{v_1, v_2, \ldots, v_M\}$
  - a probability distribution of the state transitions
    $A = \{a_{ij}\}$, $a_{ij} = P(q_t = S_j \mid q_{t-1} = S_i)$, $1 \le i, j \le N$, with $\sum_{j=1}^{N} a_{ij} = 1$
  - the probability of emission of symbols
    $b_j(k) = P(v_k \mid q_t = S_j)$
    where $b_j(k)$ is the probability that the symbol $v_k$ is emitted at time $t$, given that the state at time $t$ is $S_j$

Joint probability of emitted symbols and hidden variables
- Given the probability of being in the state $S_i$ at the beginning of the process ($t = 1$)
  $\pi_i = P(q_1 = S_i)$, $1 \le i \le N$
- the joint probability of emitted symbols ($y$) and states ($q$) can be expressed as
  $P(y_1^T, q_1^T) = P(q_1) \prod_{t=1}^{T-1} P(q_{t+1} \mid q_t) \prod_{t=1}^{T} P(y_t \mid q_t)$

Training HMM - Evaluating a sequence
- Training
  - For a given set of sequences of symbols $\{O_l\}$, find the HMM model $\lambda$ which maximises $P(O_l \mid \lambda)$
  - An iterative procedure is employed (Baum-Welch)
- Evaluation
  - Compute the value of $P(O \mid \lambda)$ for a given sequence of symbols $O$
  - The forward-backward procedure is employed

Dictionary of symbols
- As a first choice the dictionary of symbols can be made up of all the commands defined in the RFC
  - Drawback: a large number of symbols are not actually used, so they can unnecessarily increase the computational cost of the HMM
- Solution
  - Only the symbols in the training set are used
  - If new symbols are observed during operation
    - either they are discarded
    - or they are substituted by a wildcard symbol NaS (Not a Symbol)

Use of NaS
- Dictionary of symbols: {a, b, c, d, NaS}
- Analysed sequence of symbols: [a c g b g a d]
- This sequence is thus transformed into [a c NaS b NaS a d]
- NaS never occurs in the training set, so its probability of emission is zero

Discarding symbols not seen at training time
- Dictionary of symbols: {a, b, c, d}
- Analysed sequence of symbols: [a c g b g a d]
- This sequence is thus transformed into [a c b a d] by deleting the unknown symbol ‘g’
- Attacks can still be detected if the sequence of known symbols is anomalous
- If the anomaly is the presence of symbol ‘g’, then the attack is not detected!

Experimental results
- The proposed system has been tested on two datasets of real traffic collected on the network of Tiscali Services SpA
  - FTP
  - SMTP
- The sequences of commands have been extracted by Snort, and the proposed technique has been implemented as a plug-in for Snort
- A number of attacks have been generated using commercial and open-source software

Experiments on FTP traffic
- 40,000 “legitimate” sequences
- 5 training sets have been extracted, each by randomly drawing 32,000 sequences
- Consequently, for each training set, a test set of 8,000 sequences was available
- Each training set has been subdivided into 10 sets of 3,200 sequences each
- 50 HMM have been trained and their outputs have been combined

Attacks
- The 40,000 sequences used for training and testing are considered “attack-free”, as they resulted from the filtering of the traffic by Snort.
- 22 attacks have been generated using the IDS-Informer software, which is used to test the detection capabilities of IDS

Experimental setup
[Figure: repeated 5 times: the 40,000 sequences are randomly sampled; 80% (32,000) form the training sequences and the remaining 8,000 the test sequences; the test set also includes the 22 attack sequences]

Experimental results - Dictionaries without NaS, 30 states
All values are avg (std dev).

                     AUC              FA% at 100% D.R.   D.R.% at 1% FA   FA(actual)% at 1% FA
100 HMM              0.873 (0.006)    89.77 (4.14)       58.72 (2.52)     0.72 (0.23)
Arithmetic Mean      0.874 (0.002)    97.27 (0.61)       63.63 (4.54)     0.31 (0.15)
Geometric Mean       0.878 (0.002)    96.19 (1.50)       76.36 (2.03)     0.71 (0.20)
Decision Templates   0.933 (0.004)    93.84 (8.40)       65.54 (3.80)     0.35 (0.16)

Experimental results - Dictionaries with NaS, 20 states
All values are avg (std dev).

                     AUC              FA% at 100% D.R.   D.R.% at 1% FA   FA(actual)% at 1% FA
100 HMM              0.967 (0.004)    59.60 (4.46)       90.94 (1.27)     0.74 (0.22)
Arithmetic Mean      0.974 (0.002)    79.23 (3.02)       92.72 (6.1)      0.33 (0.16)
Geometric Mean       0.972 (0.001)    52.27 (3.06)       95.45 (0)        0.89 (0.09)
Decision Templates   0.965 (0.002)    52.34 (9.69)       95.45 (0)        0.41 (0.18)

Experimental results - Dictionaries with NaS, 25 states
All values are avg (std dev).

                     AUC              FA% at 100% D.R.   D.R.% at 1% FA   FA(actual)% at 1% FA
100 HMM              0.969 (0.01)     55.15 (7.11)       93.54 (2.33)     0.78 (0.26)
Arithmetic Mean      0.973 (0.004)    54.65 (5.23)       92.72 (6.1)      0.44 (0.12)
Geometric Mean       0.971 (0.002)    54.70 (5.22)       95.45 (0)        0.97 (0.05)
Decision Templates   0.962 (0.003)    88.08 (11.16)      95.45 (0)        0.54 (0.11)

Experimental results - Dictionaries with NaS, 30 states
All values are avg (std dev).

                     AUC              FA% at 100% D.R.   D.R.% at 1% FA   FA(actual)% at 1% FA
100 HMM              0.969 (0.003)    57.85 (3.57)       92.97 (0.65)     0.74 (0.16)
Arithmetic Mean      0.974 (0.0004)   55.92 (0.62)       95.45 (0)        0.53 (0.13)
Geometric Mean       0.971 (0.0008)   55.00 (1.20)       95.45 (0)        1.01 (0.10)
Decision Templates   0.962 (0.004)    86.02 (8.12)       95.45 (0)        0.62 (0.17)

Experiments on SMTP traffic
- 5,500 sequences used for training
- 5,500 sequences used for testing
- 22 attacks (Nessus)
- Performance has been assessed by comparing the alarms raised by the HMM with those raised by the rules in Snort (2.6.0.2)

Snort configuration

    preprocessor smtp: \
        ports { 25 } \
        inspection_type stateful \
        ignore_tls_data \
        normalize cmds \
        valid_cmds { MAIL RCPT HELO EHLO HELP } \
        alert_unknown_cmds \
        normalize_cmds { EXPN VRFY RCPT } \
        max_command_line_len 500 \
        max_header_line_len 500 \
        max_response_line_len 500 \
        alt_max_command_line_len 500 { MAIL } \
        alt_max_command_line_len 500 { RCPT } \
        alt_max_command_line_len 500 { HELP HELO ETRN } \
        alt_max_command_line_len 555 { EXPN VRFY } \
        max_num_cmds 1000000000 \
        enable_cmd_order_hmm

Performance assessment

Alarms

Test set   FA, rules on validation   DR, rules on attacks   FA, anomaly on validation   DR, anomaly on attacks
I          54 (0.98%)                2 (9.09%)              162 (2.95%)                 7 (31.82%)
II         85 (1.55%)                2 (9.09%)              293 (5.33%)                 7 (31.82%)

Anomaly detection alarms

           Validation set                      Attack set
Test set   Unknown command   Command order     Unknown command   Command order
I          67                95                6                 1
II         111               182               6                 1

Conclusions
- The use of HMM for detecting anomalies in the use of TCP services showed a good capability of detecting attacks characterised by
  - the use of infrequent commands
  - an atypical order of commands
- This module is designed to work in combination with other anomaly-based modules (i.e., based on content analysis) to improve the detection capabilities and decrease the false alarm rate.
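
The evaluation step and the two dictionary policies from the talk (NaS substitution versus discarding unseen symbols) can be sketched as follows. This is a minimal illustration, not the authors' Snort plug-in: the two-state model, all probability values, and the names `encode` and `log_likelihood` are invented for the example. Only the forward pass is shown, since it is the part of the forward-backward procedure needed to compute P(O | λ).

```python
import math

# Dictionary from the talk's NaS example; the model below is a toy assumption.
DICTIONARY = ["a", "b", "c", "d", "NaS"]

def encode(sequence, dictionary, use_nas=True):
    """Map symbols to indices; unseen symbols become NaS or are discarded."""
    index = {s: i for i, s in enumerate(dictionary)}
    encoded = []
    for s in sequence:
        if s in index:
            encoded.append(index[s])
        elif use_nas:
            encoded.append(index["NaS"])
        # with use_nas=False the unseen symbol is simply dropped
    return encoded

def log_likelihood(obs, pi, A, B):
    """Scaled forward pass: log P(O | lambda), or -inf if O is impossible."""
    if not obs:
        return 0.0
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

# Toy 2-state HMM (hypothetical values). NaS never appears in training,
# so its emission probability is zero in every state.
PI = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.4, 0.3, 0.2, 0.1, 0.0],
     [0.1, 0.2, 0.3, 0.4, 0.0]]

observed = list("acgbgad")
with_nas = encode(observed, DICTIONARY)                    # [0, 2, 4, 1, 4, 0, 3]
discarded = encode(observed, DICTIONARY, use_nas=False)    # [0, 2, 1, 0, 3]
print(log_likelihood(with_nas, PI, A, B))
print(log_likelihood(discarded, PI, A, B))
```

With these toy values the NaS-substituted sequence gets zero probability, so it would always be flagged, whereas the discarded version is scored only on the order of its known symbols. This mirrors the trade-off described in the two dictionary slides.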