Thursday, September 07, 2006

Port Independent Protocol Identification

One of the Holy Grails of network security monitoring is Port Independent Protocol Identification (PIPI -- lousy acronym, but technically useful). PIPI allows inspection of protocols regardless of the port in use. PIPI has many security implications for discovery and (preferably) denial of covert channels, back doors, and other policy-violating channels. PIPI can also help network engineers better understand the legitimate use of protocols on their networks.

Some implementations exist. Last year after visiting Fidelis Security I mentioned their appliance uses port-neutral methods to identify protocols. Sourcefire's RNA also does PIPI. The Linux-only Application Layer Packet Classifier for Linux (L7-filter) and IPP2P projects use signatures to discover protocols on arbitrary ports. I'd like to hear of other approaches.
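
To make the signature approach concrete, here is a minimal sketch of matching payload bytes against per-protocol regular expressions while ignoring the port entirely. The patterns and the `identify_protocol` name are illustrative assumptions; they are not the actual L7-filter or IPP2P signatures.

```python
import re

# Illustrative patterns only (assumptions), not real L7-filter/IPP2P signatures.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]\r\n"),
    "ssh": re.compile(rb"^SSH-\d\.\d+-"),
    "smtp": re.compile(rb"^220[ -].*SMTP", re.IGNORECASE),
}


def identify_protocol(payload: bytes) -> str:
    """Return the first protocol whose signature matches the payload.

    The port number is deliberately not an input: identification relies
    only on the first bytes seen on the connection.
    """
    for name, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return name
    return "unknown"
```

An SSH banner arriving on port 80 would still be labelled "ssh" by this function, which is the point of PIPI. A pure pattern match like this is easy to fool, so treat the result as a hint and not a verdict.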

Today, thanks to geek00l, I read the paper Dynamic Application-Layer Protocol Analysis for Network Intrusion Detection by an all-star team from Technische Universität München and Berkeley's ICSI Center for Internet Research. From the abstract:

In this paper, we discuss the design and implementation of a NIDS extension to perform dynamic application-layer protocol analysis. For each connection, the system first identifies potential protocols in use and then activates appropriate analyzers to verify the decision and extract higher-level semantics. We demonstrate the power of our enhancement with three examples: reliable detection of applications not using their standard ports, payload inspection of FTP data transfers, and detection of IRC-based botnet clients and servers.

Even better, their implementation is scheduled for integration in the next release of Bro, perhaps next month.

On a related PIPI note, in the future I expect we will not create firewall policies using port numbers as a major component. A security policy enforcement system might instead allow an administrator to implement a policy like "deny all outbound HTTP except [real] HTTP on port 80 and HTTPS on port 443." In other words, network (i.e., traffic-centric) security policy will be decoupled from ports and instead focus on applications and data.
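
A policy like the quoted one could be expressed roughly as follows. This is a sketch under assumptions: the `Flow` type, the `detected_protocol` field (the output of some PIPI engine), and the `allow_outbound` function are all hypothetical names, not part of any real firewall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    dst_port: int
    detected_protocol: str  # assumed to come from a PIPI engine (hypothetical)


# The only (protocol, port) pairs the example policy accepts for web traffic.
ALLOWED_WEB = {("http", 80), ("https", 443)}
WEB_PROTOCOLS = {"http", "https"}
WEB_PORTS = {80, 443}


def allow_outbound(flow: Flow) -> bool:
    """Deny all outbound HTTP except real HTTP on 80 and HTTPS on 443."""
    touches_web = (flow.detected_protocol in WEB_PROTOCOLS
                   or flow.dst_port in WEB_PORTS)
    if touches_web:
        # Protocol and port must agree: no HTTP on odd ports, and nothing
        # other than the expected protocol on ports 80 and 443.
        return (flow.detected_protocol, flow.dst_port) in ALLOWED_WEB
    # Traffic unrelated to this rule is left to other rules (assumption).
    return True
```

The decision keys on what the traffic is and only secondarily on where it is going, so `Flow(8080, "http")` and `Flow(80, "ssh")` are both refused.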

14 comments:

Joao Barros said...

I once mentioned this subject on the pf mailing list. It's almost taboo...
I don't know if you noticed but there was some code added to CURRENT that can tag traffic based on protocol inspection:

Log:
A netgraph node that can do different manipulations with
mbuf_tags(9) on packets.

Submitted by: Vadim Goncharov


Check ng_tag(4)
Before the commit, the discussion started on current@

Oddly enough, Microsoft ISA Server supports L7 inspection.

Richard Bejtlich said...

Joao,

Thanks for your comment! I found the thread which includes this helpful example.

Anonymous said...

PADS?

http://taosecurity.blogspot.com/2004/08/passive-asset-detection-system.html

If not fully capable, the remaining functionality is in the works. Regex is pretty flexible for finding anything you may be looking for.

Richard Bejtlich said...

PADS-man (and others), please read the paper I mentioned -- it discusses limitations of these sorts of systems. PADS is still awesome though.

By the way, I appreciate your link to an earlier post!

Unknown said...

I can foresee something like this happening, especially as egress of data out of networks keeps trying to occur over known ports using obfuscated (encrypted or proprietary) protocols.

I admit I need to read up on this some more, but this almost sounds like you need signatures or samples to determine protocols. Does that mean things that evade AV or IDS signatures (slight changes) will become normal even in protocols for application communications?

Joao Barros said...

Yes, you will need signatures to determine protocols, much like Snort works, for example, with regex to find known patterns in the traffic.
But like you said, changes that would make the regex miss would make the marking/detection fail.
Actually this was the argument I got from the pf mailing list: this is fallible.
Well yes it is, but I'd rather have something than nothing.

It's fallible but the technology is used in IDS, firewalls, traffic shapers. Am I the one who's wrong?

Richard Bejtlich said...

The implementation in the paper does not rely on signatures alone. Bro-PIA uses signatures to signal additional inspection by an application-aware protocol inspection module. The decision is not strictly made on a signature match.
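
The two-stage idea (a cheap signature nominates a candidate, then a protocol-aware check confirms or rejects it) can be sketched like this. It is only an illustration of the general pattern, not Bro-PIA's code; the signatures, the `verify_http` and `verify_irc` checks, and the `classify` function are assumptions, and the IRC check only covers a handful of client-side commands.

```python
import re
from typing import Optional

# Stage 1: loose signatures that merely nominate a candidate protocol.
CANDIDATE_SIGNATURES = {
    "http": re.compile(rb"HTTP/1\.[01]"),
    "irc": re.compile(rb"(NICK|USER|PRIVMSG) "),
}

HTTP_METHODS = {b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS"}
IRC_COMMANDS = {b"NICK", b"USER", b"PRIVMSG", b"JOIN", b"PING", b"PONG", b"PASS"}


def verify_http(payload: bytes) -> bool:
    """Stage 2 for HTTP: the request line must be 'METHOD target VERSION'."""
    parts = payload.split(b"\r\n", 1)[0].split(b" ")
    return (len(parts) == 3
            and parts[0] in HTTP_METHODS
            and parts[2] in {b"HTTP/1.0", b"HTTP/1.1"})


def verify_irc(payload: bytes) -> bool:
    """Stage 2 for IRC: every non-empty line must start with a known command."""
    lines = [line for line in payload.split(b"\r\n") if line]
    return bool(lines) and all(
        line.split(b" ", 1)[0] in IRC_COMMANDS for line in lines)


ANALYZERS = {"http": verify_http, "irc": verify_irc}


def classify(payload: bytes) -> Optional[str]:
    """Accept a protocol only if its signature matches AND its analyzer agrees."""
    for name, signature in CANDIDATE_SIGNATURES.items():
        if signature.search(payload) and ANALYZERS[name](payload):
            return name
    return None
```

A chat message that happens to contain the string "HTTP/1.1" trips the signature but fails the structural check, so it is not reported as HTTP.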

Martin Roesch said...

"Signature matching" (regex) is not a great way of doing it IMO. In RNA we do full validation of the protocols we detect via programmatic methods; each protocol is validated by a stateful analyzer that knows the structure of the protocol and validates the traffic at hand. When we get a non-match, we have a multi-method system that uses three separate approaches to try to determine the protocol. As a result RNA has a very low false positive rate; once RNA identifies a protocol it's very rare for it to be wrong. The same can't be said of regex-based methods because there's typically no structural validation of the protocol, just a set of "keyphrase matches" that are subject to all the standard string matching problems. We also use a confidence model in RNA to give you an idea of the statistical certainty of the data, which is useful as a thresholding mechanism for automated systems to take advantage of.

I wish I could go into more detail but it's proprietary technology at this point. Maybe we ought to open source it... :)

Anonymous said...

Another product that performs application layer traffic classification is Qradar from Q1Labs: http://www.q1labs.com/content.php?id=175

Augusto Barros said...

Check Point already does something like that (to work with protocols instead of port numbers). I had a problem getting an application that uses simple HTTP over port 443 to pass through it because the firewall was reporting that the wrong protocol was being used on that port.

This approach can detect and block lots of tunnels, but we still have problems detecting and blocking, for example, tunnels like OzymanDNS (DNS tunnel - Dan Kaminsky) and httptunnel, where binary data from the protocol being tunneled is encoded with stuff like base64 or base32. For those cases I can only think of something like flow behaviour analysis for detection.

Anonymous said...

This is something near and dear to my heart, since I'm doing my master's on it. :-) I'm with Augusto, I think "Deep Packet Inspection" i.e. protocol parsing is going to hit a brick wall. It's quite useful where it can be used, but it's no good against encrypted data (https anyone?), and it's probably no good against my pet problem, hidden channels in http; http is simply too flexible in what it allows you to send to be able to parse everything. Base64 itself is no problem - if the parser can recognize it, it can decode and process it. However, if a hacker is doing a shell over HTTP and tunneling all their traffic in data blocks labelled as images or javascript literals or cookies or whatever, you're hosed.

I'll post references to some academic papers on classifying traffic on behavioural attributes when I have time, if there's any interest. Annie DeMontigny-LeBoeuf came up with a nice set of attributes, Mike Collins has a paper coming up at ESORICS, Zuev and Moore did some work in the area, Borders and Prakash did 'WebTap' in 2005 (I think), and the folks at Swinburne in Australia have an interesting set of papers on the subject. That's just off the top of my head, though, there's more that I'm forgetting. Again, those are mainly focused on classifying traffic based on attributes derived mainly from packet lengths, interpacket delays, data volumes, directional dynamics, and the like.
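
For readers unfamiliar with behavioural classification, here is a minimal sketch of deriving a few such attributes (packet lengths, interpacket delays, data volumes, direction changes) from one flow. The `Packet` type and `flow_attributes` function are hypothetical, and the attribute set is an assumption for illustration; it is not taken from any of the papers named above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Packet:
    timestamp: float  # seconds since the start of the capture
    length: int       # payload length in bytes
    outbound: bool    # True for client -> server


def flow_attributes(packets):
    """Summarise one flow by payload-independent behavioural attributes."""
    if not packets:
        raise ValueError("flow has no packets")
    ordered = sorted(packets, key=lambda p: p.timestamp)
    pairs = list(zip(ordered, ordered[1:]))
    gaps = [later.timestamp - earlier.timestamp for earlier, later in pairs]
    return {
        "mean_packet_length": mean(p.length for p in ordered),
        "mean_interpacket_delay": mean(gaps) if gaps else 0.0,
        "bytes_out": sum(p.length for p in ordered if p.outbound),
        "bytes_in": sum(p.length for p in ordered if not p.outbound),
        # How often the conversation switches direction.
        "direction_changes": sum(
            1 for earlier, later in pairs if earlier.outbound != later.outbound),
    }
```

None of these values depend on reading the payload, which is why this family of techniques still has something to work with when the traffic is encrypted or encoded. A real classifier would feed vectors like this into a trained model.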

Very interesting stuff, but I think it's still a ways off from being ready to go live. :-) Protocol parsing is much more feasible, but like I said, I think there's a lot of ground that it won't cover. That's not to say, of course, that it's not worth doing, just like I wouldn't suggest that it's not worth having a firewall because of its limitations.

Richard Bejtlich said...

Tim,

Is your work published on the Web?

Anonymous said...

Hi Richard,

No not yet, I'm still working on it. I'm aiming to be finishing up within a couple of months, at which point I'll make it available.

Here are some of the works I mentioned earlier, though:

Kevin Borders and Atul Prakash. WebTap: Detecting covert web
traffic. In Proceedings of the 11th ACM Conference on Computer
and Communications Security (CCS ’04), October 2004.
http://www.eecs.umich.edu/~aprakash/papers/borders-prakash-ccs04.pdf

T.T.T. Nguyen, G. Armitage, "Training on multiple sub-flows to optimise the use of Machine Learning classifiers in real-world IP networks," in (to be presented) IEEE 31st Conference on Local Computer Networks, Tampa, Florida, USA, November 2006.
http://caia.swin.edu.au/pubs/lcn2006-nguyen_armitage_marked.pdf

A. DeMontigny-LeBoeuf. Flow attributes for use in traffic characterization.
Technical report CRC-TN-2005-003, December 2005.
http://www.crc.ca/files/crc/home/research/network/system_apps/network_systems/network_security/publications/ADeMontigny_CRCTN2005003.pdf

Those should give you a flavor for what sort of work is going on, though it's by no means complete. IMHO, the field is still fairly immature, and should improve to the point of at least being a good complement to existing approaches.

Erik said...

I recently got a report published, in which I describe a new efficient algorithm for protocol identification. The report is called "The SPID Algorithm - Statistical Protocol IDentification" and can be downloaded from: www.iis.se/docs/The_SPID_Algorithm_-_Statistical_Protocol_IDentification.pdf.

The SPID algorithm can reliably detect/identify/classify the protocol based on just the first 5 TCP packets with payload. I do however need more training data in order to use the full potential of the SPID algorithm.
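
The general idea of statistical identification can be sketched by comparing an observed byte-frequency distribution against per-protocol models built from training payloads. This is not the SPID algorithm from the report; the single byte-frequency attribute, the Kullback-Leibler divergence as the distance measure, the smoothing constant, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def byte_distribution(payloads, smoothing=0.01):
    """Smoothed probability of each byte value (0-255) across the payloads."""
    counts = Counter()
    for payload in payloads:
        counts.update(payload)  # iterating over bytes yields ints
    total = sum(counts.values()) + smoothing * 256
    return [(counts.get(value, 0) + smoothing) / total for value in range(256)]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); smoothing keeps every term finite."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def closest_protocol(observed_payloads, models):
    """Return the name of the model that diverges least from the observation."""
    observed = byte_distribution(observed_payloads)
    return min(models, key=lambda name: kl_divergence(observed, models[name]))
```

Here `models` maps a protocol name to a distribution built with `byte_distribution` from known-good traffic, and `observed_payloads` would be the first few payload-carrying packets of a new session. More training data gives better models, which matches the need noted above.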

There is also a proof-of-concept application for the SPID algorithm available at SourceForge.