Why is XSS still a problem?
For the non-security people who read this blog: there is an attack vector called Cross-Site Scripting (XSS). I won't get into the details, but the main reason it is useful to an attacker is that it lets them run their code as you. Why is that bad? Among many other things, they could:
- Steal your login session
- Make requests as you (e.g. transfer $1,000 to an evil account)
- And on and on and on :)
The point of my post is this: why is XSS even still around, except for corner cases? In other words, why haven't the frameworks made it hard for a developer to create an XSS? XSS happens for only one reason: poor output encoding, period. I agree output encoding can be tricky in weird edge cases, but even then the problem might not be poor output encoding but rather poor coding.
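To make "output encoding" concrete, here is a minimal sketch in Python (my choice for illustration, not one of the frameworks discussed in this post). The function names and the payload are made up for the example; the only real API used is the standard library's html.escape.

```python
import html


def render_comment_unsafe(comment: str) -> str:
    # No output encoding: whatever the user typed becomes part of the page markup.
    return "<p>" + comment + "</p>"


def render_comment_encoded(comment: str) -> str:
    # Output encoding: characters that are special in HTML are turned into entities,
    # so the browser displays them as text instead of interpreting them.
    return "<p>" + html.escape(comment, quote=True) + "</p>"


# Hypothetical attacker-supplied input.
payload = "<script>alert(1)</script>"

print(render_comment_unsafe(payload))   # the script tag survives intact
print(render_comment_encoded(payload))  # the script tag is rendered harmless
```

This only covers the simplest context (text inside an HTML element). Attributes, URLs and JavaScript blocks each need their own encoding, which is where the weird edge cases come from.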
This question has been on my mind for the past few weeks. I think I have narrowed it down to a few reasons:
1) Frameworks are too scared to make this change because it would bust existing web applications
2) Frameworks expect it is up to the developer to do proper output encoding
3) The framework developers don't know any better
Since the third one has a simple solution (educate or fire the developers) I will only focus on the first two.
Frameworks are too scared to make this change because it would bust existing web applications. This case only arises when there is an existing framework and we want to retrofit uniform output encoding across the board. My example for this is the .Net framework. .Net is a decent framework and I can't knock it. However, its output encoding is not uniform, which makes it easy for developers to screw up: a developer has to remember the differences between 20+ web controls, plus the variations depending on which property is used. While it is possible to remember all of this, remembering it while your boss is yelling at you to get the feature done by 5PM or else your arse is out of there could be an issue :). The .Net framework just released v3, and it seems even more solid and polished, but the encoding is still not uniform. Why, though? My guess is that if they changed how the controls operate, developers would break out the pitchforks and storm Redmond, WA. I can understand this, because everyone loves to beat up on MS, but boy, I would love to see this change.
Another example is Ruby on Rails (RoR). They recently released 2.0, and my own little application had a few things busted by the new version. That is because RoR is known for being on the leading edge, trying to remove cruft, and always doing the best thing for the framework. If the devs have to work a bit harder, so be it. I am sure there are other examples, but these are the ones I thought of.
How could we fix the .Net framework and not have it bust existing web applications? How about a configuration value that says whether we should encode everything or not? Maybe "LegacyEncoding" could be the configuration name. Have it set to false by default, so that everything is encoded, but if someone wants to turn it on, let them hang themselves.
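Here is a rough sketch of how such a switch could behave. It is written in Python rather than .Net, and everything in it (FrameworkConfig, legacy_encoding, Label) is a hypothetical name I made up to illustrate the idea. None of it is a real framework API.

```python
import html
from dataclasses import dataclass


@dataclass
class FrameworkConfig:
    # Hypothetical setting modeled on the "LegacyEncoding" idea.
    # False (the default) means every control encodes its output.
    legacy_encoding: bool = False


class Label:
    """A made-up web control that renders some text inside a span."""

    def __init__(self, text: str, config: FrameworkConfig) -> None:
        self.text = text
        self.config = config

    def render(self) -> str:
        if self.config.legacy_encoding:
            # Old behavior: the developer is on their own.
            body = self.text
        else:
            # New behavior: encoded unless someone deliberately opted out.
            body = html.escape(self.text)
        return "<span>" + body + "</span>"


safe_by_default = Label("<b>hi</b>", FrameworkConfig()).render()
legacy_mode = Label("<b>hi</b>", FrameworkConfig(legacy_encoding=True)).render()
```

The point of the design is that the risky behavior requires a visible, searchable setting, so an auditor can find every application that turned it on.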
Frameworks expect it is up to the developer to do proper output encoding. Boy, I love this reason. Yes, there are quite a few good web developers out there who understand proper output encoding, when and why we need to set a charset, and everything else. However, there are even more developers out there who have no clue. I wish this were not the case, but my experience has shown me otherwise. Sure, a framework might initially be adopted only by the l33t coders who know all of this stuff, and you can stick your tongue out at me and say "nah-nah-nah you are wrong so pooh on you", but eventually the non-l33t coders will come and then there will be issues. At that point, if you didn't do proper output encoding from the beginning, you are stuck with the first issue I pointed out.
One example is, once again, RoR (I apologize for all the RoR examples, but I am a bit enamored with that framework at the moment). By default, SQL is done through methods, and all the examples show parameterized queries, etc. If you ever see dynamic SQL, you will also see warnings that it is bad, leads to possible security issues, and is a code smell. Great! Why? Because there are these attacks called SQL injections that were/are fairly common and can wreak havoc on database-backed systems. RoR is a good example of making it hard to do something that is known to be bad when there is a better alternative to it.
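For anyone who has not seen the difference, here is a small sketch of dynamic SQL versus a parameterized query. It uses Python's built-in sqlite3 module rather than RoR, and the table, the function names and the malicious input are all invented for the example.

```python
import sqlite3

# Throwaway in-memory database with two made-up users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_user_dynamic(db: sqlite3.Connection, name: str) -> list:
    # Dynamic SQL: the input is pasted into the statement, so it can change the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return db.execute(query).fetchall()


def find_user_parameterized(db: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the input is passed separately and is only ever treated as data.
    return db.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


malicious = "nobody' OR '1'='1"

print(len(find_user_dynamic(conn, malicious)))   # every row comes back
print(find_user_parameterized(conn, malicious))  # no rows: nobody has that literal name
```

The parallel with XSS is that the safe version is no harder to write than the unsafe one, which is exactly what I want from output encoding.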
Another example is Visual Studio and its warnings/errors when you use certain C++ functions that have safer alternatives. This keeps the user from hanging themselves with calls that are known to lead to certain types of attacks.
I think it would be possible to have a set of controls and/or a framework that output encodes everything by default. If you wanted something not output encoded, you would have to think about doing it, instead of the other way around. Would this solve all XSS? Heavens no, but as I said earlier, if we can kill off 80+% of the issues, that would go a long way.
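As a sketch of what "encode by default, think before you opt out" could look like, here is a tiny Python renderer. Raw and render are hypothetical names for this post only, and a real framework would need to handle attribute, URL and script contexts as well.

```python
import html


class Raw(str):
    """Marks a string the developer has deliberately chosen NOT to encode."""


def render(template: str, **values: object) -> str:
    # Every value is encoded unless it was explicitly wrapped in Raw.
    encoded = {
        key: value if isinstance(value, Raw) else html.escape(str(value))
        for key, value in values.items()
    }
    return template.format(**encoded)


# Default path: nothing to remember, the output is encoded.
default_output = render("<p>{msg}</p>", msg="<i>hi</i>")

# Opt-out path: the developer has to type Raw(...), which is easy to spot in review.
raw_output = render("<p>{msg}</p>", msg=Raw("<i>hi</i>"))
```

The lazy path and the safe path are now the same path, and every unencoded value in the codebase is one search for Raw away.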