
Hello! This is my blog powered by Known. I post articles and links about coding and FOSS, among other things, in French (mostly) or in English (when I could not find anything on the topic beforehand).

phyks.me

github.com/phyks

phyks@phyks.me

 

Filtering ads with your Raspberry Pi

13 min read

TL;DR: Please have a look at the benchmark section below, to be aware of the limitations of this particular setup and decide whether to spend some time putting it in place or not.

I recently came across the Pi-Hole project, which claims to be "a black hole for Internet advertisements" (thanks nicofrand for making me discover this!). The idea was really attractive: having a simple Raspberry Pi on the network doing all the ad filtering for the whole network, rather than having to maintain a separate uBlock Origin install on each and every computer. It was also particularly attractive because having such an ad blocker on a smartphone requires a rooted device. Plus, there was a really nice web interface to control the whole ad-blocking device.

While looking at it more in depth, I realized it was actually very limited:

  1. First, it is built around specific software and does some magic with it, which makes it really painful to move away from that software. Basically, it uses dnsmasq to expose a DNS service, a standard hosts file to block the hosts serving ads, and a lighttpd web server. The problem is that I already have a DNS resolver (Unbound) on this Raspberry Pi, and a web server (nginx). I did not want to spend a lot of time integrating it into my existing setup only to find out it was not that powerful, so I decided to look at it in detail before installing.
  2. The second issue is that it relies on dnsmasq. dnsmasq is a simple program that answers DNS queries using the hosts defined in /etc/hosts and forwards every other request to another DNS server (typically your ISP's DNS server). Pi-Hole lets you configure two DNS servers to forward queries to, the default one being 8.8.8.8 (Google :/). I already have a resolver on this Raspberry Pi and I do want to do the resolution myself, especially since my ISP's DNS servers lie, and I do not want to use a public DNS server on another network. So I would have to hack on Pi-Hole to make it do the DNS resolution itself. About these issues, I'd like to point to two very interesting articles from Bortzmeyer: this one about Google DNS (in French) and this one about having your own DNS resolver (same). Also, since it works as a DNS resolver, it may be cumbersome to disable it temporarily to load a website that absolutely requires the ads to be loaded.
  3. The last issue is that, contrary to uBlock, which filters at the request level (and sometimes even at the HTML level), Pi-Hole is basically an alternative DNS resolver, which means it can only filter at the domain level. That is, you either whitelist (the default) or blacklist a domain that is serving ads or malware, but you cannot distinguish between different paths on a given domain. While browsing the Libération website, I can see uBlock blocking requests such as http://s3.amazonaws.com/files.wrapper.theadtech.com/native/placements/liberation.fr/pconfig?r=5a6f5d98b4d608. Such requests cannot be blocked by Pi-Hole without blacklisting the whole Amazon S3 network.
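
To make this last limitation concrete, here is a minimal shell sketch. The helper name host_of_url is hypothetical (it is not part of Pi-Hole or uBlock); it simply prints the only part of a URL that a DNS-level blocker ever gets to see.

```shell
# Print the host part of a URL: the only thing a DNS-level blocker ever sees.
# host_of_url is a hypothetical helper written for this illustration.
host_of_url() {
    printf '%s\n' "$1" | sed \
        -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' \
        -e 's|[/?#].*$||' \
        -e 's|^.*@||' \
        -e 's|:[0-9]*$||'
}

# Example:
#   host_of_url "http://s3.amazonaws.com/files.wrapper.theadtech.com/native/placements/liberation.fr/pconfig?r=5a6f5d98b4d608"
```

For the request quoted above, everything after s3.amazonaws.com is dropped, so a DNS-based blocker has no way to tell this ad request apart from any other file hosted on S3.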

Given these facts, I remembered Privoxy, which can be used as a filtering proxy, in a way similar to uBlock. Since it is a proxy, it can filter in detail, just as uBlock does, and you can very easily disable it (simply disable the proxy). Plus, almost any device offers a proxy setting, so it should work both on my Android phones and on my computers. In this article, I describe how I set up a Pi-Hole alternative based on Unbound (to have my own DNS resolver and block some things at the domain level) coupled with a Privoxy proxy to filter out ads.

Limitations: Contrary to Pi-Hole, the setup described here is able to remove ads in a way similar to uBlock/AdBlock. If you follow the whole article to the end, it will also have the element hiding features. Being a proxy setting, it is easy to toggle on and off if required (either with Privoxy's toggling features or by manually turning off the proxy on your device). However, be aware of the remaining limitations with regard to HTTPS streams (section 4.15). As AdBlock/uBlock runs in the browser, it can filter ads in HTTPS streams as well, whereas Privoxy will not be as efficient without HTTPS interception (which is generally not a good idea). Still, it should perform rather well in the vast majority of situations (also note that AdBlock for rooted Android devices is also a proxy, so for them it will not change anything).

I assume you already have a running Raspberry Pi with a basic install. If this is not the case, see this previous article.

Set up a DNS resolver

Let's install a DNS resolver on the Raspberry Pi, to answer DNS queries on the network. I am installing and configuring Unbound here.

$ sudo apt-get install unbound
$ sudo curl -o /etc/unbound/root.hints https://www.internic.net/domain/named.cache
$ # (Optional) Set crontask to download root.hints file every six months

The curl command fetches the root hints, which are needed to resolve hosts that are not cached. See this section of the ArchWiki page for more information.
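
For the optional periodic refresh, here is a minimal sketch of what the scheduled task could run. The function names are hypothetical, and the sanity check (a non-empty file mentioning ROOT-SERVERS.NET) is my own assumption about what a valid root hints file looks like, not something mandated by Unbound.

```shell
# Return success if the file looks like a root hints file
# (assumption: non-empty and mentioning at least one root server name).
looks_like_root_hints() {
    [ -s "$1" ] && grep -q 'ROOT-SERVERS\.NET' "$1"
}

# Download the root hints to a temporary file and install them only if
# they look sane, so that a failed download never clobbers the current file.
refresh_root_hints() {
    url="${1:-https://www.internic.net/domain/named.cache}"
    dest="${2:-/etc/unbound/root.hints}"
    tmp="$(mktemp)" || return 1
    if curl -sSf -o "$tmp" "$url" && looks_like_root_hints "$tmp"; then
        chmod 644 "$tmp"
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling refresh_root_hints as root every six months or so, followed by a reload of Unbound, should be enough.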

Then, you can create a basic configuration file for unbound.

$ cat /etc/unbound/unbound.conf.d/local.conf
server:
    username: "unbound"
    interface: 0.0.0.0  # Listen on all interfaces
    root-hints: "/etc/unbound/root.hints"
    access-control: 192.168.0.0/16 allow  # Only allow the local network, see "Example" section in https://www.unbound.net/documentation/unbound.conf.html

Then, start the Unbound service and enable it at startup:

$ sudo systemctl start unbound && sudo systemctl enable unbound

You can now change the resolver used on your Raspberry Pi and on your whole network. To change it on your Raspberry Pi, have a look at this wiki page (for Raspbian). To set it as the default DNS resolver on your network, have a look at your router configuration, and set the DNS resolver address to the address of your Raspberry Pi. Don't forget to open the ports in your firewall (53, both TCP and UDP).

To check everything is working fine, you can use dig (from dnsutils package on Debian-based distributions). Typically, dig google.fr should give you some results (in the ANSWER SECTION) and the IP address in the SERVER line should be the one of your Raspberry Pi.
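
As a small convenience, the check on the SERVER line can be scripted. This is only a sketch: dig_server is a hypothetical helper name, and 192.168.0.1 stands in for the address of your Raspberry Pi.

```shell
# Read dig output on stdin and print the address of the server that answered.
# dig_server is a hypothetical helper written for this illustration.
dig_server() {
    sed -n 's/^;; SERVER: \([^#]*\)#.*$/\1/p'
}

# Example (needs dig and a reachable resolver):
#   [ "$(dig google.fr | dig_server)" = "192.168.0.1" ] && echo "answered by the Pi"
```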

Note: At this point, we should emphasize that having an open DNS resolver (that is, a DNS resolver that answers anyone) can be a security risk, especially since some DDoS attacks rely on such resolvers. Therefore, you should make sure that your Raspberry Pi DNS server is only accessible from your local network, and that no third party has access to it. This should be done through the access-control line in the above configuration, but it can also be enforced by the firewall running on your Raspberry Pi and the firewall on your router (typically, most routers provided by your ISP block any incoming connections, but check this).

Block some domains based on hosts

Now, we would like Unbound to block some domains that are known to serve ads and malware, in a way similar to what Pi-Hole does. For this purpose, we will use the unbound-block-hosts script to import hosts files into the Unbound configuration. Basically, for every such domain, Unbound will return 127.0.0.1.
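
To give an idea of what such an import boils down to, here is a minimal sketch in shell and awk. It is not the unbound-block-hosts script and does not claim to reproduce its exact output; hosts_to_unbound is a hypothetical helper that turns hosts-file lines read on stdin into Unbound local-zone / local-data rules.

```shell
# Convert hosts-file lines (read on stdin) into Unbound blocking rules.
# $1: address to answer with for blocked domains (defaults to 127.0.0.1).
# hosts_to_unbound is a hypothetical helper, not the actual import script.
hosts_to_unbound() {
    addr="${1:-127.0.0.1}"
    awk -v addr="$addr" '
        /^[ \t]*#/ || NF < 2 { next }   # skip comments, blank and malformed lines
        $2 == "localhost"    { next }   # never redirect localhost itself
        {
            printf "local-zone: \"%s\" redirect\n", $2
            printf "local-data: \"%s A %s\"\n", $2, addr
        }'
}

# Example:
#   hosts_to_unbound 192.168.0.1 < hosts.txt > /etc/unbound/includes/example-blocking.conf
```

With the redirect zone type, subdomains of a blocked domain get the same answer as the domain itself.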

unbound-block-hosts is designed with Dan Pollock's hosts file in mind, whereas I wanted to be able to import any hosts file into Unbound. Here is a forked and patched version for this purpose (very ugly patch, as I am not fluent in Perl :/).

We will create an includes dir in the Unbound configuration directory (mkdir /etc/unbound/includes/), and include the rules in the main configuration by appending include: "/etc/unbound/includes/*.conf" to the /etc/unbound/unbound.conf.d/local.conf previously created.

Now, you can run ./unbound-block-hosts --url="SOME_URL" --file=/etc/unbound/includes/FOOBAR-blocking.conf to generate a matching configuration for a given hosts list. Typically, I have a script doing:

#!/bin/sh
set -e

cd "$(dirname "$0")"

echo "Fetch Malware domains list and append to unbound"
./unbound-block-hosts --url="http://www.malwaredomainlist.com/hostslist/hosts.txt" --file=/etc/unbound/includes/malwaredomainlist-blocking.conf --address="YOUR_RASPBERRY_PI_IP"
echo "Fetch Yoyo ad servers list and append to unbound"
curl "https://pgl.yoyo.org/adservers/serverlist.php?hostformat=unbound;showintro=0&mimetype=plaintext" > /etc/unbound/includes/yoyoadservers-blocking.conf

systemctl reload unbound

which is scheduled as a cron task to run every day. The default address is 127.0.0.1, which means the local host of the client machine. I do not want to have too many 404s on my local webservers, so I'd rather use the IP address of the Raspberry Pi and have a webserver answering with a 404 on it.

Install a webserver

As simple as

sudo apt-get install nginx

Install and configure Privoxy

Now, you can install Privoxy:

sudo apt-get install privoxy

The default configuration should be mostly OK. You can look at the /etc/privoxy/config file to adapt it to your needs (the file is a really good example of a well-documented config file). Two options you might be interested in changing are the debug option (to enable logging, which is disabled by default) and listen-address. You will want to set the latter to:

listen-address  127.0.0.1:8118
listen-address  YOUR_RASPBERRY_PI_IP:8118

so that the Privoxy proxy is accessible from the rest of your LAN. As always, do not forget to configure your firewall to let the Privoxy connections pass through. At this point, you should try to set the proxy in your browser's preferences and check that everything is working fine. You should be able to browse to any web page, but the proxy will not do anything else for the moment.
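
If you prefer checking from a terminal rather than from a browser, something along these lines should work. It is only a sketch: both function names are hypothetical, and the default test URL (http://example.com/) is an arbitrary choice of mine.

```shell
# Build the proxy URL for a Privoxy instance (the port defaults to 8118).
privoxy_url() {
    printf 'http://%s:%s\n' "$1" "${2:-8118}"
}

# Fetch a page through the proxy and print the HTTP status code.
# $1: proxy host, $2: proxy port (optional), $3: URL to fetch (optional).
# Needs curl and network access to the proxy.
fetch_status_via_proxy() {
    curl -sS -o /dev/null -w '%{http_code}\n' \
        -x "$(privoxy_url "$1" "$2")" "${3:-http://example.com/}"
}

# Example:
#   fetch_status_via_proxy YOUR_RASPBERRY_PI_IP
```

A 200 status means the request went through the proxy; a connection error usually points at the listen-address setting or at the firewall.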

Note: At this point, we should emphasize that having such a proxy is a security risk, as anyone with access to your proxy can browse the web with your IP address (and you may be held liable for anything illegal done with it). Therefore, you should make sure that Privoxy on your Raspberry Pi is only accessible from your local network, and that no third party has access to it. This should be enforced by the firewall running on your Raspberry Pi and the firewall on your router (typically, most routers provided by your ISP block any incoming connections, but check this).

Privoxy, as installed by the Raspbian package, enables a couple of filters out of the box. As we will be translating Adblock rules into Privoxy rules, we can disable them. Edit the /etc/privoxy/match-all.action file to get something like this:

#############################################################################
# Id: match-all.action,v
#
# This file contains the actions that are applied to all requests and
# may be overruled later on by other actions files. Less experienced
# users should only edit this file through the actions file editor.
#
#############################################################################
{ \
+change-x-forwarded-for{block} \
+client-header-tagger{css-requests} \
+client-header-tagger{image-requests} \
+filter{refresh-tags} \
+filter{webbugs} \
+filter{jumping-windows} \
+filter{ie-exploits} \
+hide-from-header{block} \
+hide-referrer{conditional-block} \
}
/ # Match all URLs

In particular, I disabled the img-reorder filter (which is really intensive for the Raspberry Pi, and takes a few hundred milliseconds to process a regular page) and banners-by-size, as we will be importing Adblock rules which should give better results. deanimate-gifs and session-cookies-only are a matter of taste (respectively, they prevent animated GIFs by replacing them with their last frame, and only allow temporary cookies).

Block ads using Privoxy

We will now import Adblock rules into Privoxy. One way to do it is to use this Haskell script.

To install it directly on your Raspberry Pi, provided you have a recent Raspberry Pi:

This was not my case, so I set up a builder on my server to run daily. Some rules are provided by the author and my builds are available here.

The way to set up the resulting files into Privoxy is very well detailed on the page of the project.

Note: My builds are made with the Element Hiding feature and example.com as the domainCSS parameter. You should replace every occurrence of example.com with the FQDN or IP address of your Raspberry Pi when importing them. Please do not put too much load on my hosted builds, and consider hosting your own.
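
The replacement of example.com can be scripted. Here is a minimal sketch, assuming the downloaded files all sit in one directory of your choice; set_css_domain is a hypothetical helper, and the host you pass is assumed to be a plain hostname or IP address (no slashes).

```shell
# Replace every occurrence of example.com with the given host
# in all regular files of a directory.
# $1: directory holding the downloaded files, $2: FQDN or IP address of the Pi.
# set_css_domain is a hypothetical helper written for this illustration.
set_css_domain() {
    dir="$1"
    host="$2"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        tmp="$(mktemp)" || return 1
        # Write back with cat so that the original file keeps its permissions.
        sed "s/example\.com/$host/g" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}

# Example:
#   set_css_domain ./privoxy-rules 192.168.0.1
```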

Benchmark

All the tests were made with a first-generation Raspberry Pi with 512MB of RAM (model 1B). The Raspberry Pi has wired access to the internet (its port is 100 Mbit/s only). My laptop is wired as well (gigabit Ethernet). The home internet connection is a fiber access (923 Mbps download, 250 Mbps upload, as reported by DSLReports).

Testing the DNS server

Here are the query times without and with the DNS server on the Raspberry Pi:

$ # Using my ISP resolver
$ dig @192.168.0.254 example.com
...
;; Query time: 7 msec
;; SERVER: 192.168.0.254#53(192.168.0.254)

$ # Using Google DNS
$ dig @8.8.8.8 example.com
...
;; Query time: 5 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)

$ # Using the DNS resolver on my Raspberry Pi
$ dig @192.168.0.1 example.com
...
;; Query time: 505 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)

$ # Using it another time, now that the domain is in cache
$ dig @192.168.0.1 example.com
...
;; Query time: 5 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)

These are typical times: each value is roughly the average of a few runs.

We can see that there is some overhead when first accessing a domain, as the Pi has to do the full DNS resolution. Afterwards, the domain is kept in cache and it is as fast to use the DNS server from the Pi as it is to use any other one.
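
If you want to reproduce this kind of measurement, averaging a few runs can be scripted. This is a sketch of one way to do it, not the exact procedure used for the numbers above; query_time_ms and mean are hypothetical helper names.

```shell
# Read dig output on stdin and print the query time in milliseconds.
query_time_ms() {
    sed -n 's/^;; Query time: \([0-9][0-9]*\) msec.*$/\1/p'
}

# Read one number per line on stdin and print their mean.
mean() {
    awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Example (needs dig): average of five lookups against the Pi.
#   for i in 1 2 3 4 5; do dig @192.168.0.1 example.com | query_time_ms; done | mean
```

Keep in mind that only the very first lookup of a domain pays for the full resolution, so it should be measured separately from the cached ones.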

Testing the Privoxy setup

Now, let us focus on the performance of Privoxy on the Raspberry Pi. I tested it with a few websites, and the results were roughly the same. Here is a detailed example with the website of Libération, a French newspaper. This example is interesting as the µBlock setup on my laptop blocks 23 different things when I use neither the DNS resolver nor the Privoxy proxy. It is also interesting because, out of the 23 blocked items, only 14 could be blocked by DNS (with the setup described above).

The main issue here is that Privoxy takes a very long time to process the page with all the filters, and it is way too heavy for my low-power first-generation Raspberry Pi.

The main HTML document for this page takes 7 seconds to load when passing through the proxy, mainly due to the processing time. When reloading the page, it only takes 400ms as it is already in cache. As a comparison, it takes only 24ms when loading it directly.

The complete setup looks equivalent to the µBlock setup on my laptop.

I don't have a more recent Raspberry Pi (typically a Raspberry Pi 3) to test the performance on a more powerful system. If you can try it, let me know: I am curious about how it handles the load, and I could publish an edit to this article.

 

Donation of the month for March: Jupyter

1 min read

I am continuing the donations of the month by giving $20 to Jupyter this month.

Jupyter (formerly iPython) came out of the split between iPython (the kernel) and the notebook part. Jupyter Notebook covers this second part, and supports various kernels in addition to Python (R, Haskell, Julia, Ruby, etc.). A good example of the rendering is available here (read-only).

A notebook is made of cells, and each cell can be either text (extended Markdown with LaTeX support for equations, rendered with MathJax) or code. It is a very good tool for a kind of literate programming where code and equations live together in a single coherent document. Moreover, Jupyter supports exporting to many formats (thanks to LaTeX and Pandoc), notably PDF, TeX and HTML, which makes it possible to generate a clean final document containing all the notes and simulations for a given project.

 

Donation of the month for February: i3wm

1 min read

I was quite swamped last month and just realized that my article about February's donation of the month had remained an unpublished draft :/

So I am continuing the donations of the month by giving €15 to i3wm this month.

i3wm is a window manager for X11 (Linux) that does tiling ("pavage" in proper French). It automatically arranges the windows on the screen so that they never overlap, but tile the available space.

I use it daily, and once you have tried it (and gotten used to it), it very quickly becomes indispensable.

Note that there is an alternative (unrelated to i3wm) for Wayland: sway.

 

Raspberry Pi install checklist

2 min read

This is a memo for myself, to use as a checklist whenever I set up a new Raspberry Pi that is meant to run continuously (typically as a webserver).

First, I start from the lite version of Raspbian.

After install:

  1. sudo apt-get update && sudo apt-get upgrade

  2. sudo raspi-config and tweak according to my needs.

  3. Install some useful tools:

sudo apt-get install ack-grep fail2ban git heirloom-mailx htop libxml2-dev libxslt1-dev libyaml-dev moreutils msmtp-mta python-dev python-pip python3 python3-dev python3-pip screen vim zlib1g-dev

  4. Install RPi-Monitor. First install its dependencies:

sudo apt-get install librrds-perl libhttp-daemon-perl libjson-perl libipc-sharelite-perl libfile-which-perl

  5. cd $HOME; git clone https://github.com/XavierBerger/RPi-Monitor; cd RPi-Monitor; sudo TARGETDIR=/ STARTUPSYS=systemd make install to install it. Be careful about a current bug with systemd install.

  6. Some useful bash config: echo 'export PATH=$HOME/.local/bin:$PATH' >> $HOME/.bashrc; echo 'export EDITOR=vim' >> $HOME/.bashrc.

  7. Use NTP to keep the system in sync with current time: sudo timedatectl set-ntp true.

  8. Load ip_conntrack_ftp module: echo "ip_conntrack_ftp" | sudo tee -a /etc/modules-load.d/modules.conf (a plain sudo echo with a redirection would not work, as the redirection is performed by the unprivileged shell).

  9. Set up an iptables systemd service à la Arch Linux. See this unit. Put iptables config in /etc/iptables/ip{6,}tables.rules.

  10. Remove the file in /etc/sudoers.d which prevents pi user from having to type its password.

  11. Configure msmtp to be able to send emails using the mailserver on my main server.

  12. Harden SSH configuration as you would do for a server.

  13. Set a MAILTO address in crontab and edit aliases.
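
Several steps of this checklist append a line to a configuration file, which results in duplicated lines if the checklist is run twice on the same system. A minimal sketch of an idempotent variant follows; append_once is a hypothetical helper, not a standard tool.

```shell
# Append a line to a file only if that exact line is not already there.
# $1: the line, $2: the file (created if missing).
# append_once is a hypothetical helper written for this illustration.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example:
#   append_once 'export EDITOR=vim' "$HOME/.bashrc"
```

For files owned by root, the function has to run in a root shell, since sudo does not apply to shell functions or redirections.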

 

Donation of the month for January: Framasoft

1 min read

I am continuing the donations of the month by giving €15 to Framasoft this month.

Framasoft is a network dedicated to promoting "free" culture in general and free software in particular. It offers many innovative services and projects made freely available to the general public, notably as part of their "de-googlification" campaign (free services, hosted by Framasoft, which offer alternatives to the services provided by Google / Doodle / Facebook / Github etc., and the list keeps growing!). Of course, the services can very easily be self-hosted, and they encourage this through their CHATONS campaign.

In particular, their list of alternatives is very well made and very relevant.

 

Donation of the month for December: EFF

2 min read

I came across the donations of the month by Sam & Max, who each month gave to an organization providing products and services that they used and that had been important to them over the past month, while writing a post on their blog to spread the word about the organization.

I recently migrated all the SSL certificates used on phyks.me and its subdomains from StartSSL to Let's Encrypt, mainly following this announcement from Mozilla. I have never paid for an SSL certificate since I got this domain name (StartSSL, just like Let's Encrypt, provides them for free), whereas certificate authorities charge up to $100 per certificate.

This month, $25 therefore goes to the EFF, mainly for their support of Let's Encrypt and for their certbot, which makes managing certificates much easier. The EFF is also committed to defending freedom of expression online, to fighting software patents and DRM, and to privacy issues. They are also behind a number of pieces of software and extensions such as HTTPS Everywhere.

 
 

Moving from URxvt to st

2 min read

I have been using the URxvt terminal for a while, but was suffering from many issues with it recently. In particular, I had a weird locale issue leading to Unicode encoding errors whenever I copied accented characters using the primary selection, some weird issues due to urxvt-tabbed, and it just blew up when I tried to get new Unicode characters (such as smileys) to display properly in it.

A friend told me about st, which may be quite daunting at first, especially since all the configuration is done statically in a C header file, but it works incredibly well and simply does the job.

I have a mirror repo with my own configuration in case you want to have a look at it. This reproduces most of my URxvt user experience, except for two things:

  1. I don't have any tabs in st. But this is not a real issue and I'd rather depend on another program to handle tabs, such as tmux or even i3.
  2. I don't have clickable URLs as I used to have in URxvt. But once again, after a few weeks without this feature, I prefer selecting and copy/pasting URLs rather than clicking on them. This way, I don't open links unintentionally.

I was relying on a hack to get local notifications for my Weechat running through SSH + screen, using an extended escape sequence. If you are also using it, this commit implements this behavior in st.
