Sanitizing CSS in Rails

As web developers, we all know the golden rule: never trust user input. Ignore it and you are digging your own grave. At SupportBee we display tickets that contain HTML and CSS, so sanitization is essential to securing the site. Rails provides an Action View helper, ActionView::Helpers::SanitizeHelper, which you can use when outputting HTML. However, we use a whitelist-based sanitization gem called Sanitize, by Ryan Grove, to produce harmless HTML output.

Sanitize lets you whitelist protocols, HTML tags, and even element attributes. It is very powerful because it also lets you write custom transformers to further process the sanitized output. For example, you can write a transformer to whitelist a YouTube video, as shown here by the author.
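As a rough sketch of what such a whitelist looks like, the hash below uses the :elements, :attributes and :protocols keys that Sanitize's configuration accepts. The specific tags, attributes and protocols are illustrative assumptions, not SupportBee's actual configuration.

```ruby
# Hypothetical whitelist; the tags, attributes and protocols are examples only.
WHITELIST = {
  # Tags that survive sanitization; everything else is stripped.
  :elements   => %w[a b i em strong p br ul ol li blockquote],
  # Attributes allowed per element.
  :attributes => { 'a' => %w[href title] },
  # URL schemes allowed in a given attribute of a given element.
  :protocols  => { 'a' => { 'href' => %w[http https mailto] } }
}

# With the gem loaded, this hash would be passed as the second argument:
#   Sanitize.clean(html, WHITELIST)
```

Anything not listed is dropped, which is what makes the whitelist approach safer than trying to enumerate every dangerous tag.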

Why Sanitize CSS?

Sanitize is amazing at what it does with HTML, but it does not sanitize CSS. CSS can be used to trick browsers into executing JavaScript, as happened to Myspace several years ago through CSS injection. Courtenay Gasking has written a well-tested library called css_file_sanatize (GitHub repo) that detects "Evil CSS" and can be included in your ActiveRecord model. To make it easier to use, we extracted the "Evil CSS" regexes from the library and built a transformer around them.

CSS transformer

Sanitize allows you to specify which transformers to execute and the order in which they run. In fact, the gem internally uses three transformers to do its job. You can pass the CSS transformer as an option to Sanitize's clean method.

Sanitize.clean(html, { :transformers => check_css, :elements => ... })

The transformer check_css is called on every element that Sanitize encounters, and its return value determines whether that element is whitelisted. In check_css we inspect the style element and the inline style attribute, and unlink any node whose CSS fails the check.

check_css = lambda { |env|
  node      = env[:node]
  node_name = env[:node_name]

  # Don't continue if this node is already whitelisted or is not an element.
  return if env[:is_whitelisted] || !node.element?

  # Only style elements and nodes with an inline style attribute matter here.
  return unless node_name == 'style' || node['style']

  # Pick the CSS to check: the element's content or the attribute's value.
  css = node_name == 'style' ? node.content : node['style']

  # Remove the whole node if its CSS looks evil.
  unless good_css?(css)
    node.unlink
    return
  end

  # Otherwise whitelist the node.
  { :node_whitelist => [node] }
}
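
To see the calling convention in isolation, the sketch below invokes a transformer by hand with an env hash carrying the :node, :node_name and :is_whitelisted keys that the transformer reads. FakeNode is a hypothetical stand-in for a Nokogiri node, and the one-line good_css? is a placeholder; both are made up so the example runs without the gem.

```ruby
# Hypothetical stand-in for a Nokogiri element; only the methods the
# transformer touches are implemented.
class FakeNode
  attr_reader :content, :unlinked

  def initialize(attrs = {}, content = '')
    @attrs    = attrs
    @content  = content
    @unlinked = false
  end

  def element?
    true
  end

  def [](name)
    @attrs[name]
  end

  # Record the removal instead of detaching from a real document.
  def unlink
    @unlinked = true
  end
end

# Placeholder check, NOT the real "Evil CSS" rules.
def good_css?(text)
  text !~ /expression|script/i
end

check_css = lambda { |env|
  node = env[:node]
  return if env[:is_whitelisted] || !node.element?

  css = env[:node_name] == 'style' ? node.content : node['style']
  return unless css

  unless good_css?(css)
    node.unlink
    return
  end
  { :node_whitelist => [node] }
}

safe = FakeNode.new('style' => 'color: red')
evil = FakeNode.new('style' => 'width: expression(alert(1))')

safe_result = check_css.call({ :node => safe, :node_name => 'p', :is_whitelisted => false })
evil_result = check_css.call({ :node => evil, :node_name => 'p', :is_whitelisted => false })
```

In this sketch the safe node comes back inside a :node_whitelist hash, while the evil node is unlinked and the lambda returns nil.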

The good_css? method checks the content of the style element or the value of the style attribute against Courtenay Gasking's "Evil CSS" patterns.

def good_css?(text)
  return false if text =~ /(\w\/\/)/      # a// comment immediately following a letter
  return false if text =~ /(\w\/\/*\*)/   # a/* comment immediately following a letter
  return false if text =~ /(\/\*\/)/      # /*/ --> hack attempt, IMO

  # Now, strip out any comments, and do some parsing.
  no_comments = text.gsub(/(\/\*.*?\*\/)/, "") # filter out any /* ... */
  no_comments.gsub!("\n", "")

  evil = [
    /(\bdata:\b|eval|cookie|\bwindow\b|\bparent\b|\bthis\b)/i, # suspicious javascript-type words
    /behaviou?r|expression|moz-binding|@import|@charset|(java|vb)?script|[\<]|\\\w/i, # script hooks, imports, backslash escapes
    /[\<>]/,                        # html tags
    /[\x7f-\xff]/,                  # high bytes -- suspect
    /[\x00-\x08\x0B\x0C\x0E-\x1F]/, # low bytes -- suspect
    /&\#/,                          # bad charset
  ]
  evil.each { |regex| return false if no_comments =~ regex }
  true
end
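
A few sample inputs make the rules easier to follow. The sketch below is a trimmed-down variant of the checker under a different, hypothetical name (css_looks_safe?); it leaves out the byte-range patterns, and the sample strings are made up for illustration.

```ruby
# Trimmed-down variant of good_css? for illustration only; the byte-range
# checks from the full method are omitted here.
def css_looks_safe?(text)
  return false if text =~ %r{\w//}    # "//" directly after a letter
  return false if text =~ %r{\w/+\*}  # "/*" directly after a letter
  return false if text =~ %r{/\*/}    # "/*/" comment trick

  # Strip /* ... */ comments and newlines before pattern matching.
  stripped = text.gsub(%r{/\*.*?\*/}, '').delete("\n")

  evil = [
    /\bdata:\b|eval|cookie|\bwindow\b|\bparent\b|\bthis\b/i,
    /behaviou?r|expression|moz-binding|@import|@charset|script|\\\w/i,
    /[<>]/,
    /&#/
  ]
  evil.none? { |regex| stripped =~ regex }
end

samples = [
  'color: #333; font-weight: bold',
  'color: red; /* note */ margin: 0',
  'width: expression(alert(1))',
  'background: url(javascript:alert(1))',
  '@import "evil.css";',
  'color: re/**/d'
]

results = samples.map { |css| css_looks_safe?(css) }
samples.zip(results).each { |css, ok| puts "#{ok ? 'ok  ' : 'EVIL'} #{css}" }
```

The first two samples pass and the remaining four are rejected. The last one is caught before comment stripping, because a comment glued to a letter is treated as an attempt to split a keyword.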

Here is a gist which has the above snippets.

Have you done cool stuff with Sanitize transformers? Do let us know.


