Transparent Proxy as Adblock using Tinyproxy and Dansguardian
As I mentioned in my last post about the migration from Gentoo to Kubuntu, I’ll write about how to set up iptables, Tinyproxy and Dansguardian as an ad blocker. That said, the setup might be better with Squid instead of Tinyproxy: why not have a caching transparent proxy around? I don’t do that because I installed all of this on my notebook, and caching there wouldn’t make much sense, as the applications cache anyway (by default).
I won’t tell you how to install them; that can be found on the sites above. And it’s really easy: in Kubuntu you just need to select the packages, they are all available.
I left most of the Tinyproxy configuration unchanged, as the default values should do. The listen port is set to 8888. You should check the “Allow” setting for security reasons, and you might give your filter a name through “ViaProxyName”. Maybe you want to set “ConnectPort” to 0, which disables the CONNECT method. Or you may want to filter HTTPS traffic as well, but I doubt it.
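As a rough sketch, the settings mentioned above could look like this in the Tinyproxy configuration file. The file location and the filter name "adfilter" are assumptions for illustration; only the directive names and the port come from the text.

```conf
# tinyproxy.conf (excerpt; the path varies by distribution)
# Port that Tinyproxy listens on; Dansguardian will forward to it.
Port 8888
# Only accept connections from the local machine.
Allow 127.0.0.1
# Hypothetical name reported in the Via header.
ViaProxyName "adfilter"
# A single ConnectPort line with 0 turns off CONNECT requests.
ConnectPort 0
```

Everything else can stay at the packaged defaults.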
Then go ahead and start it, probably with something like “/etc/init.d/tinyproxy start”. You may also want to make it start automatically at boot.
The dansguardian configuration is a bit trickier, or let’s say more time-consuming. You will need to play around and adjust the filter settings after some testing, or maybe you can reuse existing rules from Firefox Adblock (I think these are regexps too?). First, have a look through dansguardian.conf (everything should be in “/etc/dansguardian/”). You can adjust a lot here; you can even add ClamAV, an anti-virus scanner. I didn’t, but I guess it’s not too hard to do.
For a start you should set the log level quite high so you can see in the log file what happens. The important options are:
“filterip” – I left that empty as I use my notebook in a safe environment. You may want to set this though…
“filterport” – set to 8080, but you can choose basically any port you want. The filter listens on this port for incoming requests from browsers etc.
“proxyip” – I have this set to 127.0.0.1 (localhost), as my proxy (tinyproxy) sits on the same machine.
“proxyport” – set to 8888 (as in the tinyproxy configuration). Dansguardian requests the files from the proxy on this port.
I deactivated “weightedphrasemode”. I think it can be really useful (probably mainly for dansguardian’s main purpose, which is web content filtering for children), but I haven’t used it yet.
There are a lot of options you can tune to get better performance or better results, but to get started the default configuration should be suitable. One more thing that I turned off is “virusscan”, as mentioned above.
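Put together, the options discussed above might look roughly like this in dansguardian.conf. This is a sketch, not a copy of the author's file: the numeric values for “loglevel” and “weightedphrasemode” are assumptions, so check the comments in your own configuration file for what each number means in your version.

```conf
# dansguardian.conf (excerpt)
# Assumed to be the most verbose setting.
loglevel = 3
# Empty: listen on all addresses. Set an IP to restrict this.
filterip =
# Port the filter listens on for browser requests.
filterport = 8080
# Where the real proxy (Tinyproxy) lives.
proxyip = 127.0.0.1
proxyport = 8888
# Assumed to mean "off".
weightedphrasemode = 0
```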
All the filter files are defined here. These should be all right. There is also the “Temporary Denied Page Bypass” option, called “bypass”; I activated it and set it to 300 (5 minutes). For this to work properly (so that you can click on a link in your browser to unblock the blocked content for these 5 minutes), you need to modify “/etc/dansguardian/languages/%YOURLANGUAGE%/template.html”. It’s just HTML with a few placeholders and very easy to adjust. The important part, which shows the link to unblock the content, is '…<a href="-BYPASS-">…'.
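For reference, the bypass setting itself is a single line. The file name below is an assumption (the filter group configuration is commonly called dansguardianf1.conf); the value is the one described above.

```conf
# dansguardianf1.conf (excerpt; file name assumed)
# Number of seconds a bypass link stays valid.
bypass = 300
```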
I will only mention the files that I changed and that seem important to me. The names of those files are pretty self-explanatory and they contain examples, so just go ahead and have a look!
The banned* files contain the rules that stop content from being delivered.
In this file I have three lines (you can create as many as you want, as these are regular expressions). Don’t ask why I have three lines. I created that stuff a few years ago and it still serves me well!
Very simple regexps.
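Before adding a regular expression to the list, it can help to try it against a few URLs on the command line. The sketch below uses a hypothetical pattern and made-up example.com URLs; it is not one of the three lines mentioned above. It also relies on grep's extended regular expressions, which may differ in details from the regexp flavour Dansguardian uses.

```shell
#!/bin/sh
# Hypothetical ad-URL pattern for illustration only.
PATTERN='/(ads?|banners?)/'

# Return 0 (true) if the given URL matches the pattern.
url_is_blocked() {
    printf '%s\n' "$1" | grep -Eq "$PATTERN"
}

# Try the pattern against a few made-up URLs.
for url in \
    "http://example.com/ads/top.gif" \
    "http://example.com/banner/468x60.png" \
    "http://example.com/articles/adsorption.html"
do
    if url_is_blocked "$url"; then
        echo "BLOCK $url"
    else
        echo "PASS  $url"
    fi
done
```

The third URL is there to check that the pattern does not catch ordinary pages that merely contain "ads" as part of a longer word.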
Here you list the URLs that you want to block. Whole sites will work here too, but you should put those in the bannedsitelist.
Mine looks like this:
Only these two lines, that’s enough.
(To be honest, I have these lines in the bannedurllist as well. It works without a problem, but the documentation says they should be in this file. I guess it’s a performance thing.)
If you want to block whole sites you put them in here. Like this:
Yes, I know, that’s not even a real regexp that I wrote.
Here you put in the sites that are allowed to send you every ad and crap that they want. So why should you want to do that? Well, I have just one line in there:
That’s a site where you can make free calls to and/or from Germany for half an hour. In return you have to watch their ads, so yes, you need to display that stuff.
And that’s all about the dansguardian configuration. Start it, check the log file for errors and fix them (if there are any).
Now there’s only one simple thing left.
There are only two rules to set up:
1. /sbin/iptables -A OUTPUT -t nat ! -d 127.0.0.1 -p tcp --dport 80 -m owner ! --uid-owner nobody -j REDIRECT --to-ports 8080
2. /sbin/iptables -A POSTROUTING -t nat -o lo -p tcp --dport 8080 -j SNAT --to 127.0.0.1
You should run them as part of your firewall or network if-up script.
One very important thing is to change the “nobody” in rule 1 to the user under which tinyproxy is running. This user needs to be allowed to talk directly to the outside world, as otherwise we would end up in an infinite loop!
So rule 1 redirects all locally generated traffic that’s going to port 80 somewhere else to port 8080 on localhost.
Rule 2 sets the source address of these packets to 127.0.0.1. That’s needed to get this working properly.
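If you keep the rules in a firewall or if-up script, it may be convenient to put the proxy user and the filter port into variables. The following is a minimal sketch under that assumption: the variable names and the print_rules function are made up for illustration, and the script only prints the two commands so you can review them before running them as root. It spells out the SNAT option as --to-source, which is its full name.

```shell
#!/bin/sh
# User that tinyproxy runs as; its traffic must NOT be redirected.
PROXY_USER="${PROXY_USER:-nobody}"
# Port Dansguardian listens on (filterport).
FILTER_PORT="${FILTER_PORT:-8080}"
IPTABLES="${IPTABLES:-/sbin/iptables}"

# Print the two NAT rules instead of applying them directly.
print_rules() {
    # Rule 1: send local HTTP traffic (except the proxy's own) to the filter.
    echo "$IPTABLES -t nat -A OUTPUT ! -d 127.0.0.1 -p tcp --dport 80 -m owner ! --uid-owner $PROXY_USER -j REDIRECT --to-ports $FILTER_PORT"
    # Rule 2: rewrite the source address of the redirected packets.
    echo "$IPTABLES -t nat -A POSTROUTING -o lo -p tcp --dport $FILTER_PORT -j SNAT --to-source 127.0.0.1"
}

print_rules
```

Once the printed commands look right, they can be piped into a root shell or pasted into your firewall script.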
Now it should be working. You just have to play around with your filters and check in the log files that everything works as it is supposed to!
I had this setup running on my Linux router for a few years. The setup was a bit different, as I used a caching proxy (Squid) and I didn’t filter the traffic from the local box. With a setup like that you can easily filter all computers in your network with no hassle, independent of platform. For Windows machines this is even more helpful, as you often have software (like the ICQ client) that shows ads. These are often requested through port 80, so it’s easy to block all of that!
And careful, this setup is not meant to be a Web Content Filter for children! If you want that you need to change your configuration and maybe check the dansguardian website!
Another thing you should consider is that many websites live on ads, banners and all that stuff. If you block them, the sites no longer get money for your visits. Depending on the sites you visit, you may want to add them to the grey lists or exception lists…
Sorry Glen, nothing interesting for you, again!