Encoding and Sanitizing Data

The Phantom Menace

Input from untrustworthy sources must be secured so that it cannot take control of the output


Encoding vs Sanitizing

Encode or escape content to make it structurally compatible with the output

Sanitize or create trustworthy content using a whitelist

There has to be some contextual understanding of the data

Exploitable data is escaped

Prohibited data is sanitized
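
A minimal sketch of the distinction, in Java. The class name `EncodeVsSanitize`, its methods, and the color whitelist are hypothetical and chosen only for illustration; a production system would normally rely on a vetted library rather than hand-written escaping.

```java
import java.util.Locale;
import java.util.Set;

final class EncodeVsSanitize {

    // Hypothetical whitelist: the only values this context accepts.
    private static final Set<String> ALLOWED_COLORS = Set.of("red", "green", "blue");

    // Encoding: every character survives, but in a form HTML treats as plain text.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Sanitizing: anything not on the whitelist is replaced with a safe default.
    public static String sanitizeColor(String input) {
        String candidate = input.trim().toLowerCase(Locale.ROOT);
        return ALLOWED_COLORS.contains(candidate) ? candidate : "black";
    }

    public static void main(String[] args) {
        // Exploitable data is escaped: the markup is kept, but rendered inert.
        System.out.println(escapeHtml("<b>Tom & Jerry</b>"));
        // Prohibited data is sanitized: the value is not on the list, so it is dropped.
        System.out.println(sanitizeColor("red; background:url(javascript:alert(1))"));
    }
}
```

Escaping preserves the input and changes its representation; sanitizing decides what is allowed to exist at all, which is why it needs to know what the data is for.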

The Problem Space

Your application has hundreds of HTML templates containing dynamic variables from mixed sources


A ticking time bomb of potential exploits that must be defused before the templates can be used to support token-based applications

XSS - Cross site scripting

Exploitation of XSS (temporary or persistent) results in the complete compromise of the targeted application

The attacker enters bobbie" onmouseover="alert(1) as their name
    The quote closes the attribute early, and the injected onmouseover handler runs in the victim's user agent
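
To make the mechanics concrete, the sketch below assumes the name is concatenated into the `value` attribute of an `<input>` tag. The source does not show the actual template, so that markup is an assumption, and the class `AttributeInjectionDemo` and its methods are hypothetical.

```java
final class AttributeInjectionDemo {

    // Naive template: the name is dropped straight into the attribute.
    public static String renderUnsafe(String name) {
        return "<input name=\"user\" value=\"" + name + "\">";
    }

    // Same template, but characters that can end the attribute are encoded.
    public static String renderEscaped(String name) {
        String encoded = name
                .replace("&", "&amp;")   // ampersand first, so entities added below are not re-encoded
                .replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        return "<input name=\"user\" value=\"" + encoded + "\">";
    }

    public static void main(String[] args) {
        String attack = "bobbie\" onmouseover=\"alert(1)";
        // Prints: <input name="user" value="bobbie" onmouseover="alert(1)">
        System.out.println(renderUnsafe(attack));
        // Prints: <input name="user" value="bobbie&quot; onmouseover=&quot;alert(1)">
        System.out.println(renderEscaped(attack));
    }
}
```

In the unsafe output the browser sees two attributes, `value` and `onmouseover`; in the escaped output the whole payload stays inside `value` as inert text.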

If an attacker can run code in your browser environment, then other security protections, such as XSRF defenses, can be overcome

Making Wrong Code Look Wrong

Exploits are only avoided by developers following best practices all the time

Some convention is needed to determine right from wrong

HTML Sanitizer

Allow HTML authored by third-parties into your web application while protecting against XSS

String unsafe = "<p>Can be anything <script>alert('Boo!')</script></p>";
HtmlSafe safe = HtmlSafe.from("<p>Only what we explicitly allow</p>");

Every value has a type that marks it as Safe or Unsafe

Safe or Unsafe

Enforce a whitelist of allowable content through Policies and produce trustworthy markup

String unsafe = "<p...<script...<a href=...";
PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);
HtmlSafe safe = HtmlSafe.from(policy.sanitize(unsafe));

Ensure HTML content is sanitized; otherwise, escape it!

This rule is baked into the templating engine: only Safe types can be rendered without being escaped
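
The source does not show how `HtmlSafe` or the templating engine is implemented, so the following is only a hypothetical sketch of the idea. Both classes here are simplified stand-ins written for illustration, and the `Renderer` name and its `render` method are assumptions.

```java
// Stand-in for the Safe type: the only way to obtain one is through from().
final class HtmlSafe {
    private final String html;

    private HtmlSafe(String html) {
        this.html = html;
    }

    // By convention, callers pass only sanitizer output here.
    public static HtmlSafe from(String sanitizedHtml) {
        return new HtmlSafe(sanitizedHtml);
    }

    public String html() {
        return html;
    }
}

// Stand-in for the templating engine's output step.
final class Renderer {

    // Only HtmlSafe values pass through untouched; everything else is escaped.
    public static String render(Object value) {
        if (value instanceof HtmlSafe) {
            return ((HtmlSafe) value).html();
        }
        return escape(String.valueOf(value));
    }

    private static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&#39;");
    }

    public static void main(String[] args) {
        // A plain String is treated as Unsafe and escaped.
        System.out.println(render("<script>alert('Boo!')</script>"));
        // An HtmlSafe value is rendered as markup.
        System.out.println(render(HtmlSafe.from("<p>Only what we explicitly allow</p>")));
    }
}
```

The design choice is that forgetting to sanitize fails safe: an unwrapped string is escaped by default, so the only code that needs review is the code that calls `HtmlSafe.from`.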

Keeping Tokens Secure

Like the money in your wallet, bearer tokens, once stolen, can be used by anyone without proof of where they came from

It is not enough to just say tokens are safe because XSS protection is in place

Requires continuous research, development, and testing with contextual escaping filters
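
"Contextual" here means the same value needs a different encoding depending on where it lands in the output. The sketch below illustrates three common contexts using only JDK classes; the class `ContextualEscaper` and its method names are hypothetical, and the filters are deliberately simplified rather than a complete defense.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ContextualEscaper {

    // HTML text or quoted attribute context: neutralize markup characters.
    public static String forHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // URL query parameter context: percent-encode via the JDK (Java 10+ overload).
    public static String forUrlParam(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // JavaScript string literal context: keep ASCII letters and digits,
    // and write every other character as a backslash-u hex escape.
    public static String forJsString(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            boolean plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (plain) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "a'b&c";
        System.out.println(forHtml(value));
        System.out.println(forUrlParam(value));
        System.out.println(forJsString(value));
    }
}
```

Using the HTML filter inside a script block, or the URL filter inside an attribute, leaves gaps, which is why each filter has to be tested against the context it is meant for.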