20061026

Output Filtering

Okay, I've delayed this as long as I could, just so that those who know me wouldn't accuse me of going straight for my soapbox. But this would be an appropriate time, considering that people have beaten the IE7 "vulnerability" (it's been there since 6) to death, and now the only security "news" is the "0day" on MySpace (What! A script injection vulnerability on a site that allows you to inject script!), and that hackers are now hacking for money, not just annoyance (this is news?).

Here's the soapbox - output filtering. It seems that people are 100% certain that input validation cures all script injection (XSS - which it's really not) vulnerabilities, and that if you still have injection vulnerabilities, you're obviously not doing your input validation right.

Do not take what I say from here on out as license to STOP doing input validation! You have been warned.

Here are several reasons why I believe output filtering is a more appropriate and complete approach to ALL injection vulnerabilities, not just script injection (XSS).

  • There's nothing inherently "malicious" about <, >, " or '. It's when they make it all the way to an HTML presentation layer that they do things the developer probably did not intend.
  • If dynamic user input is going to various outputs, it has to be encoded for those outputs. If the user input ends up making it into a PDF, &lt; is probably not what the user wanted, when what they truly wanted displayed was <.
  • There are a billion different ways to encode "malicious" HTML injection characters going in, but only one way to properly encode them going out.
  • Input validation is for verifying business rules, not semantic output rules.
  • If you actually make your output XHTML, and want to make it strict, you have to output filter anyway.
  • Your data may not come from users. It might come from a vendor feed, or some other outside data source - or even a DBA could put script directly into the database.
  • What's "malicious" when the data makes it to HTML is different from what's "malicious" when it makes it to LDAP, or SQL, or PDF, or XML, or command line, or...
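
To make that last point concrete, here is a minimal sketch in Python (my choice of language for illustration, not something the points above depend on). The helper name encode_for and the sample string are hypothetical; html.escape, urllib.parse.quote and shlex.quote are standard library functions. The idea is that the data is stored raw and encoded only at the moment it is written to a particular output.

```python
import html
import shlex
from urllib.parse import quote


def encode_for(context: str, raw: str) -> str:
    """Encode raw data for one specific output context (hypothetical helper)."""
    if context == "html":
        # Turns <, >, &, " and ' into HTML entities.
        return html.escape(raw, quote=True)
    if context == "url":
        # Percent-encodes everything except unreserved characters.
        return quote(raw, safe="")
    if context == "shell":
        # Wraps the value so a POSIX shell sees it as one literal word.
        return shlex.quote(raw)
    raise ValueError(f"no encoder registered for {context!r}")


# The same stored value, untouched on the way in, encoded per destination.
raw = "<b>Tom & 'Jerry'</b>"
for context in ("html", "url", "shell"):
    print(context, "->", encode_for(context, raw))
```

The same raw value comes out three different ways, which is why no single input filter can be "right" for every destination.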
Remember, I am not saying we should stop performing input validation. Rather, input validation can be quite effective when your rules are to ensure the data meets a proper format (as opposed to ensuring it doesn't meet a rotten format). However, the way you repair all types of injection attacks is at the presentation layer. E.g., when dynamic data is going to:
  • HTML - encode HTML meta characters to their entities - < to &lt;, > to &gt;, " to &quot;.
  • JavaScript - hrmm....I really recommend putting the dynamic stuff into a hidden HTML form field, performing HTML encoding on it, then pulling it from the HTML form in the JavaScript - much easier than determining if you need to escape ' or " or neither.
  • SQL - Don't do dynamic SQL - do prepared statements/parameterized queries. (No! Stored procedures are NOT sufficient!)
  • LDAP - hopefully your LDAP library allows you to perform parameterized LDAP queries, rather than a dynamically constructed LDAP query.
  • XML - see HTML
  • PDF - golly - I dunno - I use FO, in which case, see XML
  • HTTP headers - URL encoding
  • command-line - please tell me you have a better alternative
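
As a sketch of the SQL item in the list above, here is what a parameterized query can look like using Python's built-in sqlite3 module. The users table, its columns and the find_user helper are made up for illustration; other database drivers use different placeholder styles, so treat the ? marker as specific to this example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Look up users by name with a bound parameter (hypothetical helper)."""
    # The ? placeholder keeps the value out of the SQL text entirely,
    # so quotes in `name` are data, never syntax.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Throwaway in-memory database, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))
print(find_user(conn, "' OR '1'='1"))  # classic injection string, treated as a plain name
```

Note that nothing here inspects or rejects the hostile string. It simply never gets the chance to be interpreted as SQL.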
I've really only scratched the surface of this rant. Exceedingly few sites really intend for you to be able to write HTML, so if you don't intend for the user to be able to put in HTML, make sure all your dynamic output goes through an HTML filter. Your programming language probably has a really efficient function for doing exactly that.
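
In Python, for instance, that function is html.escape. A minimal sketch of pushing every dynamic value through it on the way out follows; render_comment and the markup around it are hypothetical, and a real template engine would usually do this escaping for you.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build an HTML fragment, escaping every dynamic value (hypothetical helper)."""
    return (
        '<div class="comment">'
        f"<cite>{html.escape(author)}</cite>"
        f"<p>{html.escape(body)}</p>"
        "</div>"
    )


# The script tag arrives in the page as visible text, not as markup.
print(render_comment("mallory", "<script>alert(1)</script>"))
```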

Feel free to pipe up.
