When talking about security, the first thing that usually comes to mind is encryption. Spies secretly coding (or de-coding) some secret message that should not be revealed to the enemy. Encryption is this mysterious thing that turns all text into a part of the matrix. Developers generally like encryption. It’s kinda cool. You pass stuff into a function, get some completely scrambled output. Nobody can tell what’s in there. You pass it back through another function – the text is clear again. Magic.
Encryption is cool. It is fundamental to doing lots of things on the Internet. How could you pay with your credit card on Amazon without encryption? How can you check your bank balance? How can MI5 pass their secret messages without Al-Qaida intercepting it?
But encryption is actually not as useful as people think. It is often used in the wrong place. It can easily give a false sense of security. Why? People forget that encryption, by itself, is usually not sufficient. You cannot read the encrypted data. But nothing stops you from changing it. In many cases, it is very easy to change encrypted data, without knowledge of the encryption key.
Scoring a goal against google is never easy. Google analytics allows you to do some strange and wonderful things, but not without some teeth grinding. I was struggling with this for a little while, and it was a great source of frustration, since there’s hardly any info out there about it. Or maybe there is lots of info, but no solution to this particular problem. I think I finally nailed it.
Dynamic Goal Conversion Values
I was trying to get some dynamic goal conversion values into Analytics. I ended up reading about Ecommerce tracking and it seemed like the way to go. Not only would I be able to pick the goal conversion value dynamically, it gives you a breakdown of each and every transaction. Very nice. After implementing it, I was quite impressed to see each transaction, product, sku etc appear neatly on the ecommerce reports. So far so good. But somehow, goals – which were set on the very same page as the ecommerce tracking code – failed to add the transaction value. The goals were tracked just fine, I could see them adding up, but not the goal value. grrrr…
I’ve come across a small nuisance that seemed to appear occasionally with unicode urls. Some websites seem to encode/escape/quote urls as soon as they see any symbol (particularly % sign). They appear to assume it needs to be encoded, and convert any such character to its URL-Encoded form. For example, percent (%) symbol will convert to %25, ampersand (&) to %26 and so on.
This is not normally a problem, unless the URL is already encoded. Since all unicode-based urls use this encoding, they are more prone to these errors. What happens then is that a URL that looks like this:
http://www.frau-vintage.com/2011/%E3%81%95%E3%81%8F%E3%82%89 …
will be encoded again to this:
http://www.frau-vintage.com/2011/%25E3%2581%2595%25E3%25 …
So clicking on such a double-encoded link will unfortunately lead to a 404 page (don’t try it with the links above, because the workaround was already applied there).
A workaround
This workaround is specific to wordpress 404.php, but can be applied quite easily in other frameworks like django, drupal, and maybe even using apache htaccess rule(?).
<?php /* detecting 'double-encoded' urls * if the request uri contain %25 (the urlncoded form of '%' symbol) * within the first few characeters, we try to decode the url and redirect */ $pos = strpos($_SERVER['REQUEST_URI'],'%25'); if ($pos!==false && $pos < 10) : header("Status: 301 Moved Permanently"); header("Location:" . urldecode($_SERVER['REQUEST_URI'])); else: get_header(); ?> <h2>Error 404 - Page Not Found</h2> <?php get_sidebar(); ?> <?php get_footer(); endif; ?>
This is placed only in the 404 page. It then grabs the request URI and checks if it contains the string ‘%25’ within the first 10 characters (you can modify the check to suit your needs). If it finds it, it redirects to a urldecoded version of the same page…
On my previous post I talked about django memory management, the little-known maxrequests parameter in particular, and how it can help ‘pop’ some balloons, i.e. kill and restart some django processes in order to release some memory. On this post I’m going to cover some of the things to do or avoid in order to keep memory usage low from within your code. In addition, I am going to show at least one method to monitor (and act automatically!) when memory usage shoots through the roof.
A while ago I was working on optimizing memory use for some django instances. During that process, I managed to better understand memory management within django, and thought it would be nice to share some of those insights. This is by no means a definitive guide. It’s likely to have some mistakes, but I think it helped me grasp the configuration options better, and allowed easier optimization.
Does django leak memory?
In actual fact, No. It doesn’t. The title is therefore misleading. I know. However, if you’re not careful, your memory usage or configuration can easily lead to exhausting all memory and crashing django. So whilst django itself doesn’t leak memory, the end result is very similar.
Memory management in Django – with (bad) illustrations
Lets start with the basics. Lets look at a django process. A django process is a basic unit that handles requests from users. We have several of those on the server, to allow handling more than one request at the time. Each process however handles one request at any given time.
But lets look at just one.
cute, isn’t it? it’s a little like a balloon actually (and balloons are generally cute). The balloon has a certain initial size to allow the process to do all the stuff it needs to. Lets say this is balloon size 1.
timthumb vulnerability
About a month ago I posted about tweaking timthumb to work with CDN. Timthumb is a great script, but great scripts also have bugs. A recently discovered one is a rather serious bug. It can allow attackers to inject arbitrary php code onto your site, and from there onwards, pretty much take control over it.
Luckily no websites I know or maintain were affected, possibly since the htaccess change I used shouldn’t allow using remote URLs in the first place (and also it renamed timthumb.php from the url string, making it slightly obfuscated). I still very strongly advise anybody using timthumb to upgrade to the latest version to avoid risks.
ajaxizing
Following from my previous post, I’ve come across another issue related to caching in wordpress: dynamic content. There’s a constant trade-off between caching and dynamic content. If you want your content to be truly dynamic, you can’t cache it properly. If you cache the whole page, it won’t show the latest update. W3 Total Cache, WP Super Cache and others have some workarounds for this. For example, W3TC has something called fragment caching. So if you have a widget that displays dynamic content, you can use fragment caching to prevent caching. However, from what I worked out, all it does is essentially prevent the page with the fragment from being fully cached, which defeats the purpose of caching (especially if this widget is on the sidebar of all pages).
The best solution for these cases is using ajax, to asynchronously pull dynamic content from the server using Javascript. So whilst many plugins already support ajax, and can load data dynamically for you, many others don’t. So what can you do if you have a plugin that you use, and you want to ‘ajaxize’ it?? Well, there are a few solutions out there. For example this post shows you how to do it, and works quite well.
The thing is, I wanted to take it a step further. If I can do it by following this manual process, why can’t I use a plugin that, erm, ‘ajaxizes’ other plugins?? I tried to search for solutions, but found none. So I decided to write one myself. It’s my first ‘proper’ plugin, but I think it works pretty well.
thumbs up
[IMPORTANT: please check that you have the latest version of timthumb! older versions might have a serious security vulnerability. A little more about it here]
I’ve been recently trying to optimize a wordpress based site. It was running fine, but I wanted to run it even faster, and make the best use of resources. So I ended up picking W3 Total Cache (W3TC). It’s very robust and highly configurable, if perhaps a bit complicated to fully figure out. So eventually things were running fine, and my next task was to boost it even further by using a Content Delivery Network (CDN). In this case, the choice was Amazon Cloudfront. The recent release allowed managing custom origin from the console, which made things even easier. One of the remaining issues however, was trying to optimize timthumb.
Timthumb was already included with the theme, and I liked the way it works. It allowed some neat features, like fitting screenshots nicely, and also fitting company logos well within a fixed size (with zc=2 option). Google search has led me to a couple of sources. However, for some reason none of them worked, so I ended using a slightly different solution…
timing is everything
A quick-tip on the importance of timestamps and making sure your time zone is set correctly.
I was recently playing around with fail2ban. It’s a really cool little tool that monitors your log files, matches certain patterns, and can act on it. Fail2ban would typically monitor your authentication log file, and if for example it spots 5 or more consecutive failures, it would simply add a filter to your iptables to block this IP address for a certain amount of time. I like fail2ban because it’s simple and effective. It does not try to be too sophisticated, or have too many features. It does one thing, and does it very well.
I was trying to build a custom-rule to watch a specific application log-file. I had a reasonably simple regular expression and I was able to test it successfully using fail2ban-regex. It matched the lines in the log file, and gave me a successful result
Success, the total number of match is 6
However, when running fail2ban, even though it loaded the configuration file correctly, and detected changes in the log files, fail2ban, erm, failed to ban… I couldn’t work out what was the problem.
As it turns-out, the timestamps on my log file was set to a different time-zone, so fail2ban treated those log entries as too old and did not take action. Make sure your timestamps are correct and on the same timezone as your system!! Once the timezone was set, fail2ban was working just fine.
passwordless password manager
[Also published on testuff.com]
Most people I know tend to simply use the same password on ALL websites. Email, Paypal, Amazon, Ebay, Facebook, Twitter. This is obviously a very bad idea.
Passwords are always a problem. Difficult to remember, hard to think of a good one when you need a new one, tricky to keep safe. For the moderately-paranoid and the sufficiently-techie there are many good solutions out there. Password managers. Online, offline, commercial, free. So I usually suggest to my friends and colleagues to use a password manager.