XSS vulnerabilities

Why is Cross-Site Scripting (XSS) still among the most common web vulnerabilities? The theory of identifying XSS is pretty straightforward, and many static analysis tools have been created to detect it, yet so many vulnerabilities remain undiscovered. So, what gives?

Well, one of the reasons is that traditional program analysis methods often fail to identify the intent of a given piece of code. For example, a tool might struggle to figure out which objects in the program could contain user input.

In my previous post, I described how we addressed this problem by building a system that learns security specifications from thousands of open-source projects and uses them to find real vulnerabilities. I also promised to share some cool examples of what it learned.

I decided to start with an interesting and rather unexpected source of possible issues in projects using Django. This post is a guide to identifying and exploiting XSS vulnerabilities using validation errors in Django forms. Here’s a real example: https://github.com/mozilla/pontoon/pull/1175.

Let’s jump straight into it and start with a little quiz. How many times have you written/seen code similar to the following snippet?

What about this one?

Or this one?

All of them are widespread examples of how you usually let the user know that the provided input is invalid, right? The input is taken from the HTTP request parameters and neatly unmarshalled into a MyForm object. If any of the fields contains invalid input (e.g. someone entered the string "foobar" into a numeric field), a 400 Bad Request page is returned with a description of the error. The difference between the snippets is the format of the returned error: an HTML list, plain text or JSON.
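For reference, the three shapes can be sketched without any framework. Everything here is an illustrative assumption rather than the original snippets: the handler names, the (status, body) return convention and the StubErrors class with its fixed strings are made up, and only the calls str(form.errors), as_text() and as_json() mirror the API discussed in this post.

```python
from types import SimpleNamespace


class StubErrors:
    """Hypothetical stand-in for a form's error collection (fixed strings only)."""

    def __str__(self):
        return "<ul><li>age: Enter a whole number.</li></ul>"

    def as_text(self):
        return "* age\n  * Enter a whole number."

    def as_json(self):
        return '{"age": [{"message": "Enter a whole number."}]}'


def respond_html_list(form):
    """Shape 1: errors rendered through str(), i.e. as an HTML list."""
    if not form.is_valid():
        return 400, str(form.errors)
    return 200, "ok"


def respond_plain_text(form):
    """Shape 2: errors rendered as plain text."""
    if not form.is_valid():
        return 400, form.errors.as_text()
    return 200, "ok"


def respond_json(form):
    """Shape 3: errors rendered as JSON."""
    if not form.is_valid():
        return 400, form.errors.as_json()
    return 200, "ok"


# A stub form that always fails validation, e.g. "foobar" in a numeric field.
invalid_form = SimpleNamespace(is_valid=lambda: False, errors=StubErrors())

for handler in (respond_html_list, respond_plain_text, respond_json):
    status, body = handler(invalid_form)
    print(handler.__name__, status, body)
```

In a real Django view, the body would be wrapped in an HTTP response object instead of being returned as a tuple.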

Now the million-dollar question: which of these snippets will make your web app XSSable?

                                                       .    .    .

To answer it, let’s investigate the Django forms API from two points of view:

  1. Is the attacker able to inject malicious input into the web page displayed in the user’s browser?
  2. Will this malicious input always be properly escaped before it gets returned to the user?

According to Django’s documentation, the way to build dynamic error messages for field validation errors is to raise a django.core.exceptions.ValidationError exception with the corresponding message. Such an exception raised from any of the form’s validation functions (e.g. the clean() and clean_<fieldname>() methods of the django.forms.BaseForm class) causes the message to be stored in the form’s error dictionary (django.forms.utils.ErrorDict) and possibly shown to the user later.
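The flow described above can be pictured with a small framework-free model. ValidationError, MiniForm and SignupForm below are simplified stand-ins written for this sketch, not Django's classes (in Django, for instance, clean_<fieldname>() takes no argument). They only mimic the idea that a message raised inside a clean method ends up in a per-field error dictionary.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError (sketch only)."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message


class MiniForm:
    """Toy form: runs every clean_<fieldname>() method and collects failures."""

    def __init__(self, data):
        self.data = data
        self.errors = {}  # field name -> list of error messages

    def is_valid(self):
        self.errors = {}
        for name, value in self.data.items():
            cleaner = getattr(self, f"clean_{name}", None)
            if cleaner is None:
                continue
            try:
                cleaner(value)
            except ValidationError as exc:
                # The message is stored as-is, ready to be shown to the user.
                self.errors.setdefault(name, []).append(exc.message)
        return not self.errors


class SignupForm(MiniForm):
    def clean_age(self, value):
        if not value.isdigit():
            # The raw input is interpolated into the message unchanged.
            raise ValidationError(f"'{value}' is not a valid age.")


form = SignupForm({"age": "foobar"})
print(form.is_valid())  # False
print(form.errors)      # {'age': ["'foobar' is not a valid age."]}
```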

One way to exploit such an exception is to use the built-in form fields that conveniently reflect the faulty input into the exception message. I tried all the built-in Django form field types and found four that do this: ChoiceField, TypedChoiceField, MultipleChoiceField and FilePathField. Each of these generates an error message like "Select a valid choice. %(value)s is not one of the available choices.", where value is the faulty input. <script>alert(1)</script> for the win.
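To see why that message format is attacker-friendly, here is a minimal sketch of the interpolation step using plain Python string formatting. The helper name render_invalid_choice is made up for the illustration; only the message template comes from the text above.

```python
# Message template quoted in the text above.
TEMPLATE = "Select a valid choice. %(value)s is not one of the available choices."


def render_invalid_choice(value):
    """Interpolate the rejected value into the message, with no escaping."""
    return TEMPLATE % {"value": value}


payload = "<script>alert(1)</script>"
message = render_invalid_choice(payload)
print(message)
```

The payload lands in the message verbatim, so whether it is dangerous depends entirely on how the message is rendered later.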

The second option is to exploit custom fields and/or validation procedures. For example, consider the following snippet (taken from a real project and modified for brevity):

Here a good payload would be something like foo.<img src=x onerror=alert(1)>.
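A hypothetical validator of this kind (not the snippet from that project) might look like the sketch below. It assumes a file-extension check; the names validate_extension and ALLOWED_EXTENSIONS and the local ValidationError stand-in are all invented for the illustration.

```python
ALLOWED_EXTENSIONS = {"po", "xliff"}  # hypothetical allow-list


class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError (sketch only)."""


def validate_extension(filename):
    """Reject unknown extensions and echo the offending one in the error."""
    extension = filename.rsplit(".", 1)[-1]
    if extension not in ALLOWED_EXTENSIONS:
        # The attacker-controlled extension is reflected without escaping.
        raise ValidationError(f"Unsupported file extension: {extension}")
    return filename


try:
    validate_extension("foo.<img src=x onerror=alert(1)>")
except ValidationError as exc:
    error_message = str(exc)
    print(error_message)
```

Everything after the dot is treated as the "extension", which is why the payload starts with foo. and then carries the markup.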

                                                              .    .    .

Yep, you’re right, the ValidationError exceptions alone will not grant us an XSS. For a proper vulnerability we need one more ingredient — the ability to inject error messages into the final HTML page that gets returned to the user.

The aforementioned ErrorDict class has the following methods to extract the error messages:

  1. as_data(): no sanitization
  2. get_json_data(escape_html=False): no sanitization if escape_html == False (the default)
  3. as_json(escape_html=False): no sanitization if escape_html == False (the default)
  4. as_ul(): safe
  5. as_text(): no sanitization
  6. __str__() (calls as_ul()): safe
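The difference between these renderings can be modelled in a few lines of standard-library Python. This is a simplified model, not Django's implementation: the functions merely borrow the method names for readability, and the exact markup and JSON layout are assumptions.

```python
import html
import json

# Field name -> list of messages, with an attacker-controlled value reflected in.
errors = {
    "color": [
        "Select a valid choice. <script>alert(1)</script> "
        "is not one of the available choices."
    ]
}


def as_text(errors):
    """Plain-text rendering: messages are copied through unchanged."""
    lines = []
    for field, messages in errors.items():
        lines.append(f"* {field}")
        lines.extend(f"  * {message}" for message in messages)
    return "\n".join(lines)


def as_ul(errors):
    """HTML list rendering: every field name and message is HTML-escaped."""
    items = []
    for field, messages in errors.items():
        inner = "".join(f"<li>{html.escape(message)}</li>" for message in messages)
        items.append(f"<li>{html.escape(field)}<ul>{inner}</ul></li>")
    return "<ul>" + "".join(items) + "</ul>"


def as_json(errors, escape_html=False):
    """JSON rendering: messages are escaped only when escape_html is True."""
    payload = {
        field: [
            {"message": html.escape(message) if escape_html else message}
            for message in messages
        ]
        for field, messages in errors.items()
    }
    return json.dumps(payload)


print("<script>" in as_text(errors))                    # True
print("<script>" in as_ul(errors))                      # False
print("<script>" in as_json(errors))                    # True
print("<script>" in as_json(errors, escape_html=True))  # False
```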

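Now let’s go back to our little quiz. Snippet 1 is safe, because it uses the __str__() method, which calls as_ul() and escapes the input. Snippets 2 and 3, however, are dangerous and may result in XSS: as_text() and as_json() return the messages unescaped, so the payload executes if the response reaches the browser as HTML (text/html is the default content type of Django’s HttpResponse).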
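One way to defuse the two risky renderings, sketched with the standard library only, is to escape the text before it goes into an HTML response. The helper name safe_error_body and the sample error text are made up for this illustration. In a Django project the same job is done by the framework's own escaping utilities and, for JSON, by the escape_html=True flag listed above.

```python
import html


def safe_error_body(error_text):
    """Escape an unsanitized error rendering before placing it in HTML."""
    return html.escape(error_text)


# Hypothetical plain-text rendering that carries a reflected payload.
raw = "* name\n  * Invalid value: foo.<img src=x onerror=alert(1)>"
escaped = safe_error_body(raw)
print(escaped)
```

Serving the body with an explicit non-HTML content type (text/plain or application/json) is another option, assuming no client later injects the text into a page unescaped.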

                                                                 .    .    .

There are two main takeaway messages here. The one for developers is the mantra “always sanitize untrusted input”. The one for security researchers is: a simple grep -R "ValidationError" might broaden the attack surface for you.
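For researchers who prefer a script to grep, here is a rough equivalent in Python. It is only a sketch: the function name find_validation_errors is invented, and unlike grep -R it looks at .py files only.

```python
import os
import re

PATTERN = re.compile(r"\bValidationError\b")


def find_validation_errors(root):
    """Yield (path, line number, line) for each line mentioning ValidationError."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if PATTERN.search(line):
                        yield path, lineno, line.strip()


if __name__ == "__main__":
    # Scan the current directory and print grep-style hits.
    for path, lineno, line in find_validation_errors("."):
        print(f"{path}:{lineno}: {line}")
```

Each hit is a candidate worth reading by hand: the interesting ones interpolate user input into the message and later render it with one of the unsanitized methods.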

By the way, you have my respect if you aced the quiz without reading the full post.