My defences, become fences, now i'm stumbling, i change my face and if you think i'm fake up wait around till i take off my make-up.
Tricky, Christiansands
Whether your server is shared or not, forms that allow e-mail to be sent can be the source of many security problems. When they are poorly designed, the scripts that run behind these forms can be abused by spammers as a relay for their mass mailings of unsolicited messages.
While there isn't much to do about the forms themselves, the scripts that process the data and send the e-mail have to be carefully crafted. A single leak or shortcoming can be exploited by black hats, users with a nasty twist and bad intentions.
Sendmail is a mail transfer agent that runs on Unix-like systems. In other words, it handles electronic mail for the operating system of your server. It is pretty easy to use, but it can be the victim of that simplicity. To use it, you open a system pipe to it and write the headers and the body of the e-mail into that pipe; closing the pipe causes the message to be sent.
The first measure to take when you call sendmail is to do it in -t mode. If you pass user-supplied parameters to a command-line system utility, you take a big risk of opening the door to anyone, which can lead to a complete disaster. Handing data to the system shell is a classic source of security problems. Even if you try to secure the parameters that you are going to pass on to the system, you are taking chances that people may try to exploit. When you make such a call, shell operators can be slipped into the parameters and allow users to execute system commands freely, thereby endangering your server.
That's why we recommend coding your pipe exclusively like this: open(SENDMAIL, "|/path/sendmail -t"). It is tempting to open the pipe to sendmail and pass the destination address of the message on the command line, but that is too big a risk to consider. The -t mode tells sendmail to gather the list of recipients from the headers of the message itself. This may sound like leaving the hard bit for later, but this syntax has the major advantage of not passing user parameters to the system, and so closes the door to an operator overload attack. Since the system pipe is hardwired, there is no risk of a system command being passed on maliciously.
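As an illustration of the hardwired pipe, here is a minimal sketch. The helper names build_message and send_message are ours, /path/sendmail is the same placeholder as above, and the header values are assumed to have been cleaned beforehand. We use the list form of open, which bypasses the shell entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: assemble headers and body into one message string.
# The header values are assumed to have been cleaned already.
sub build_message {
    my (%field) = @_;
    return "To: $field{to}\n"
         . "From: $field{from}\n"
         . "Subject: $field{subject}\n"
         . "\n"                          # a blank line ends the headers
         . "$field{body}\n";
}

# Hypothetical helper: send through a hardwired pipe. No user data ever
# reaches the command line; sendmail reads the recipients from the
# headers because of -t.
sub send_message {
    my ($message) = @_;
    open(my $sendmail, '|-', '/path/sendmail', '-t')
        or die "Cannot open pipe to sendmail: $!";
    print {$sendmail} $message;
    close($sendmail)                     # closing the pipe sends the message
        or die "sendmail reported a failure";
    return 1;
}
```

A script would call send_message(build_message(to => ..., from => ..., subject => ..., body => ...)) once every header value has been checked.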
On a more general level, try to avoid system calls that involve user variables, be it pipes, system() calls or backticks. You may put in place the best strategies for ensuring the safety of the data passed along to the system, but you never know what can happen or how reliable your code is. You have to understand that you can never check hard enough to fully trust your own code.
If you have to send mail and can't find a way to secure your data as system parameters, change your whole strategy and, for example, use the Net::SMTP module from CPAN to send your messages, if that's possible in your situation and doesn't make your code too messy. Module authors can be trusted most of the time. Don't reinvent the wheel; use the knowledge of others. There are so many useful modules out there, open source and therefore checked by many good coders, that you can find an alternative to a direct system call most of the time.
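For the record, a Net::SMTP version could look like the sketch below. The helper name send_via_smtp and the mail host localhost are assumptions of ours; the values passed in must still be cleaned, since a line break in an address is just as harmful here.

```perl
use strict;
use warnings;
use Net::SMTP;

# Hypothetical helper; 'localhost' is an assumed mail host.
# Returns 1 when the server accepted the message, 0 otherwise.
sub send_via_smtp {
    my ($from, $to, $subject, $body) = @_;
    my $smtp = Net::SMTP->new('localhost', Timeout => 30)
        or return 0;                          # could not connect
    my $ok = $smtp->mail($from)               # envelope sender
          && $smtp->to($to)                   # envelope recipient
          && $smtp->data()
          && $smtp->datasend("To: $to\n")
          && $smtp->datasend("From: $from\n")
          && $smtp->datasend("Subject: $subject\n\n")
          && $smtp->datasend("$body\n")
          && $smtp->dataend();
    $smtp->quit;
    return $ok ? 1 : 0;
}
```

No shell is involved at any point, which is the whole interest of the approach.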
As we've said, there isn't much to do about the HTML forms if you want to keep a certain amount of flexibility in your setup. However, it is important to check the identity of the form, that is, its location.
The environment variable HTTP_REFERER (beware of the spelling mistake in the official protocol) is supposed to provide the address of the form that calls the script. We say "supposed" because there are ways to cheat on this variable, especially when it is a script rather than a browser that calls your own script. If we assume for now that our attackers don't use those tricks, we can see how to check the referrer.
If we want to prevent any form that is not on our domain from accessing this script, we can start with a simple referrer check as in this code: unless (($ref =~ /^http:\/\/habett\.org\/.*/) or ($ref =~ /^http:\/\/www\.habett\.org\/.*/)). Doing so, we ensure that the URL of the page that calls the script is on our domain, with www or not. Note the ending slash and the http prefix. This offers a wide enough range of uses.
However, this formula can be tricked using a form whose address is shaped like http://habett.org/@spammer.net/form.html, a syntax used for username-protected web sites. It's an easy way to cheat a referrer check based only on the beginning of a URL. To counter this, we just have to check that there is no @ in the referrer by adding and !($ref =~ /@/)) to our unless test code.
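Put together, the two tests above can be sketched as a single sub. The name referrer_ok is ours; the domain is the one used in the examples of this article.

```perl
use strict;
use warnings;

# Returns 1 when the referrer starts with our domain (with or without
# www) and contains no @ sign, 0 otherwise.
sub referrer_ok {
    my ($ref) = @_;
    return 0 unless defined $ref && length $ref;
    return 0 if $ref =~ /@/;                  # refuse user@host tricks
    return $ref =~ m{^http://(?:www\.)?habett\.org/} ? 1 : 0;
}

# Typical use inside the CGI script:
#   my $ref = $ENV{'HTTP_REFERER'};
#   unless (referrer_ok($ref)) { ... refuse the request ... }
```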
We can now consider our referrer checking procedure to be OK, keeping in mind that the environment variable it relies on can be forged. If your script is only called by a limited number of HTML forms, then you should put the list of possible referrers inside your script, hard coding the permitted uses.
Along the same lines, always check for the presence of the required fields in the form, and the absence of any other ones. It is not always possible, but it's pretty effective.
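A field check of this kind might look like the following sketch. The field names email, subject and message are an assumption for a typical contact form, and the sub expects a hash reference such as the one your form parser builds.

```perl
use strict;
use warnings;

# Assumed field names for a contact form; adapt to your own forms.
my @REQUIRED = qw(email subject message);

# Returns 1 when every required field is present and non-empty and
# no other field was submitted, 0 otherwise.
sub fields_ok {
    my ($form) = @_;
    my %allowed = map { $_ => 1 } @REQUIRED;
    for my $name (@REQUIRED) {
        return 0 unless defined $form->{$name} && length $form->{$name};
    }
    for my $name (keys %$form) {
        return 0 unless $allowed{$name};      # unexpected extra field
    }
    return 1;
}
```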
What should you do once you've identified an incorrect referrer? We believe that the simplest thing to do is to send the user back to where he came from, that is, issuing a single print "Location: $ref \n\n"; and then terminating the program. It's pretty simple and still effective. If we want to dig deeper, we can keep a log of bad referrers in order to trace unauthorized calls.
If this operation proves ineffective, log all referrers, study where the leaks are and tweak your code accordingly.
This return-to-sender counterstrike can seem simplistic and futile, but it is effective in most cases. Some would rather kill the process straight away so that the visitor gets an HTTP error, but we believe it's good psychology to let the intruder believe he's had his way.
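The redirection and the log of bad referrers could be sketched as follows. Both helper names and the log line layout (date, IP, referrer, separated by tabs) are our own choices, and the referrer is assumed to be defined and non-empty.

```perl
use strict;
use warnings;

# Hypothetical helper: build the CGI redirect header.
sub redirect_header {
    my ($ref) = @_;
    $ref =~ s/[\r\n].*//s;        # never let a line break into a header
    return "Location: $ref\n\n";
}

# Hypothetical helper: append one line per bad referrer to a log file.
# Returns 1 on success, 0 when the log cannot be opened.
sub log_bad_referrer {
    my ($logfile, $ref, $ip) = @_;
    $ref =~ s/[\r\n]/ /g;         # keep each log entry on one line
    open(my $log, '>>', $logfile) or return 0;
    print {$log} scalar(localtime), "\t$ip\t$ref\n";
    close($log);
    return 1;
}

# Typical use:
#   log_bad_referrer('/path/to/bad_referrers.log', $ref, $ENV{'REMOTE_ADDR'});
#   print redirect_header($ref);
#   exit;
```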
When you're creating an e-mail form, certain elements of the message headers depend upon user input in the HTML form. Whenever possible, you should hard code as many elements as you can. However, if you want to use your script with many different forms, that is not always possible.
Consider a contact form that sends you a message from a visitor. You are going to hardwire the To: field, but you are going to insert the visitor's e-mail address in the From: field in order to ease your further discussion with him. This works if the user enters his e-mail address in the designated form field. Now consider a malicious user who enters an address, then a newline character, and then lines starting with CC: or BCC: followed by many other addresses. This way, the message gets sent to other people. By stuffing several lines into the variable, he can even modify the To: field and end up with an open e-mail relay and so, a free lunch.
We hope to have shown you the importance of securing all user variables you are going to paste into the message headers. If, in our example, this only applies to the From: field, you have to understand that any header field can be the victim of such hacks. Headers are very sensitive. The same manoeuvre can be achieved with the Subject:, To: or any other header field. The lesson is never to paste user variables into message headers without first inspecting their contents.
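To make the attack concrete, here is a small demonstration of what happens when the From: value is pasted in unchecked. All addresses are made up for the example.

```perl
use strict;
use warnings;

# What a malicious visitor might type in the e-mail field.
my $from = "visitor\@example.com\nBcc: victim1\@example.net, victim2\@example.net";

# Naive header construction: the user variable is pasted in unchecked.
my $headers = "To: webmaster\@habett.org\nFrom: $from\nSubject: Contact\n";

# Count the header lines that sendmail -t would actually see,
# and show the one the attacker smuggled in.
my @lines = split /\n/, $headers;
print scalar(@lines), " header lines instead of 3\n";
print "$_\n" for grep { /^Bcc:/i } @lines;
```

The script meant to write three header lines; the extra Bcc: line comes entirely from the form field.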
To secure an e-mail address before pasting it into the message headers, be it From: or To:, you must first process the data to ensure its safety. We propose this code:
$totreat = lc $totreat;
$totreat =~ s/[\r\n].*//s;
$totreat =~ s/,/ /g;
$totreat =~ s/[^a-z0-9_\-\.@]/ /g;
$totreat =~ s/^ +//;
$totreat =~ s/ .*//;
It turns the supposed e-mail address into lower case, gets rid of anything after a newline character, replaces commas with spaces, and forbids any character other than those supposed to occur in legitimate e-mail addresses, that is letters, numbers, underscores, dashes, dots and at signs. We finish by getting rid of the spaces before the address and of everything after it. We now have a secured e-mail address.
For other data that is meant to be pasted into the message headers, it only takes a simple $totreat =~ s/[\r\n].*//s; to ensure that the data is restricted to a single line (the s modifier lets .* run across any further line breaks, so nothing survives after the first one). That's how we'll treat, for example, the Subject: field of a message in order to prevent any overloading.
For data to be pasted into the message's body, you don't have to secure it, even though doing so can't hurt.
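Wrapped into two subs, the cleaning steps described above can be sketched like this. The names clean_email and clean_header_line are ours; the line-break handling also covers carriage returns and inputs with several line breaks.

```perl
use strict;
use warnings;

# Clean a supposed e-mail address before it goes into a header.
sub clean_email {
    my ($totreat) = @_;
    $totreat = lc $totreat;
    $totreat =~ s/[\r\n].*//s;              # keep the first line only
    $totreat =~ s/[^a-z0-9_\-\.@]/ /g;      # unexpected characters become spaces
    $totreat =~ s/^ +//;                    # drop leading spaces
    $totreat =~ s/ .*//;                    # drop everything after the first space
    return $totreat;
}

# For Subject: and similar fields: keep the first line only.
sub clean_header_line {
    my ($totreat) = @_;
    $totreat =~ s/[\r\n].*//s;
    return $totreat;
}
```

Note that clean_email only guarantees a harmless string, not a valid or existing address.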
In our messages, we like to insert a reverse lookup of the sender's IP address. It's a bit silly, but it can help to know who we're dealing with. As you know, the REMOTE_ADDR environment variable is supposed to contain the IP address of whoever submits the form. Like all environment variables, it can be tampered with by a script. Under certain conditions, it is possible to obtain the textual equivalent of this numerical address, which is composed of four 8-bit numbers as in 192.168.0.1. For example, our IP address at the time of writing this article could be resolved to ATuileries-XXXX-X-XX-XX.wXX-XX.abo.wanadoo.fr, thus giving indications about our provider and, in this example, our geographical location.
Having previously inserted a use Socket; and initialised the variable $ip = $ENV{'REMOTE_ADDR'};, we can resolve it with $name = gethostbyaddr(inet_aton($ip), AF_INET); and @addr = gethostbyname($name); then we can find the resolved name in $addr[0].
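A slightly more defensive version of this lookup is sketched below. The sub name resolve_ip is ours; it only handles dotted IPv4 addresses and returns undef instead of failing when the address is malformed or does not resolve.

```perl
use strict;
use warnings;
use Socket;

# Returns the host name for a dotted IPv4 address, or undef when the
# address is malformed or has no reverse DNS entry.
sub resolve_ip {
    my ($ip) = @_;
    return undef unless defined $ip && $ip =~ /^\d{1,3}(?:\.\d{1,3}){3}$/;
    return undef if grep { $_ > 255 } split /\./, $ip;
    my $packed = inet_aton($ip);
    return undef unless defined $packed;
    my $name = gethostbyaddr($packed, AF_INET);   # undef when not found
    return $name;
}

# Typical use:
#   my $ip   = $ENV{'REMOTE_ADDR'};
#   my $name = resolve_ip($ip);
#   my $line = "Sent from $ip (" . (defined $name ? $name : 'unresolved') . ")";
```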
We like to finish our messages with a line that gives the visitor's IP address and its resolved name. This is useful for identification, but it doesn't always work, far from it. Even if the IP hasn't been tampered with, the lookup is not guaranteed to succeed. While large ISPs have IPs that resolve well, corporate networks and specialized ISPs rarely do, if at all.
Note that you may be able to get the resolved name directly from the REMOTE_HOST environment variable, provided your web server is configured to perform the lookup.
You may wonder what this has to do with security at all. First, if the hacker doesn't deserve the name, his address will appear clearly. If he's a real black hat or a professional spammer, he won't feel comfortable having his message polluted with a homemade line. That can be achieved with any bit of text, the IP lookup being just a tailored example that has the advantage of not being static.
Finally, as a background check, create a log of the last IPs that have accessed your script and check for repetitions. A black hat finding a leak in your process would only be interested in it if it has a potential for spamming activities, and would thus try to use your script many times. A small log of the IPs that call your script (REMOTE_ADDR environment variable) and a pattern-checking bit of code can help. The malicious hacker would certainly rotate through a large number of IP addresses or even use random ones, but the log is still useful.
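The pattern checking can start very simply, for instance by counting how often each address appears among the last logged calls. The sub name repeated_ips and the idea of a fixed threshold are assumptions of ours.

```perl
use strict;
use warnings;

# Given a threshold and a list of logged IP addresses (the most recent
# calls), return the addresses that appear at least $limit times.
sub repeated_ips {
    my ($limit, @ips) = @_;
    my %count;
    $count{$_}++ for @ips;
    my @hits = grep { $count{$_} >= $limit } keys %count;
    return sort @hits;
}

# Typical use, with @last_ips read from your own log file:
#   my @suspects = repeated_ips(10, @last_ips);
```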
Spammers who try to exploit the weaknesses of your scripts won't do it by hand but will use robot-like script agents. That's why it can help to place an intermediary step in the procedure, where the visitor confirms the data. This complicates the hacker's robot-writing process. If this confirmation is combined with setting and checking a cookie, it can help to eliminate the use of a robot.
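One possible way to implement the cookie part, which is our own illustration rather than a prescribed method, is to hand out a signed, time-limited token with the confirmation page and to verify it when the form is finally submitted. The secret, the sub names and the token layout below are all assumptions.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

my $SECRET = 'change-this-secret';   # assumed server-side secret

# Token handed out (for instance in a cookie) with the confirmation page.
sub make_token {
    my ($ip, $time) = @_;
    return "$time-" . hmac_sha256_hex("$ip|$time", $SECRET);
}

# Accept the token only if it is genuine, was issued to the same IP
# and is not older than $max_age seconds.
sub token_ok {
    my ($token, $ip, $now, $max_age) = @_;
    return 0 unless defined $token && $token =~ /^(\d+)-([0-9a-f]{64})$/;
    my ($time, $mac) = ($1, $2);
    return 0 if $time > $now || $now - $time > $max_age;
    return $mac eq hmac_sha256_hex("$ip|$time", $SECRET) ? 1 : 0;
}

# Typical use:
#   my $token = make_token($ENV{'REMOTE_ADDR'}, time);        # step 1
#   ... later, token_ok($cookie, $ENV{'REMOTE_ADDR'}, time, 600)   # step 2
```

A robot then has to fetch the confirmation page, keep the cookie and come back within the time limit, which is more work than posting blindly.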
As we've said, all environment variables can be faked with a well-crafted script. That's why it is always important to check that they are present, not empty and at least contain plausible information. For example, check that your visitor is using a real browser, or at least a well-written script, by looking at the HTTP_USER_AGENT environment variable.
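Such a plausibility check might be sketched as follows. The sub name and the patterns are assumptions: we only accept a dotted IPv4 address and a user agent that starts with a product name and version, as in Mozilla/5.0, which you may want to loosen or tighten.

```perl
use strict;
use warnings;

# Returns 1 when the environment variables we rely on are present,
# non-empty and look plausible, 0 otherwise.
sub environment_plausible {
    my (%env) = @_;
    for my $name (qw(REMOTE_ADDR HTTP_REFERER HTTP_USER_AGENT)) {
        return 0 unless defined $env{$name} && length $env{$name};
    }
    # Assumed shape: dotted IPv4 address.
    return 0 unless $env{REMOTE_ADDR} =~ /^\d{1,3}(?:\.\d{1,3}){3}$/;
    # Assumed shape: "Product/version ..." as sent by common browsers.
    return 0 unless $env{HTTP_USER_AGENT} =~ /^[A-Za-z]+\/\d/;
    return 1;
}

# Typical use:
#   unless (environment_plausible(%ENV)) { ... refuse the request ... }
```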
A hacker's job is made much easier if he knows how your script works. That's why compiled scripts have a slight edge over interpreted ones, like Perl in our example. If you want to stick to an interpreted language, then make sure that under no circumstances can a visitor access the source of your script. It is very important that visitors are only granted the right to execute your script, never to read it. You must have a strict policy on that point. When it comes to setting permissions on your script with chmod, be very careful. A bad server configuration or a wrong chmod can lead to a total disaster.
In terms of security, you can never go far enough, but that doesn't mean you have to be a passive victim and not try to follow any lead. When your system is ready, try to hack it yourself and see what you can lock down while keeping the script effective at the very task it is supposed to achieve under normal circumstances. If you sense a weakness or think that you are the victim of bad clients, create a log of the suspicious elements and work it out. You can't forecast everything that can happen, but you have to do everything you can for your own sake and out of decency towards your server's host.
We are all something by birth, musician or murderer, but one must learn how to handle the harp or the knife.
Augustin Vidovic