How to Configure Plausible for Searx

Summary

Although Searx comes with its own built-in statistics, it doesn't natively support adding third-party analytics. This is largely by design, given the project's focus on privacy. However, I was curious to see whether my instance gets any traffic that isn't from me.

Trial and Error

To do this, I first had to find out where the base.html file was located. It was confusing to track down because the Searx config file resides in /etc/searx, but after some digging, I found base.html in the following directory...

/usr/local/searx/searx-src/searx/templates/oscar

Once in the directory, I tried adding the following to base.html...

  <!--Plausible Analytics-->
  <script defer data-domain="search.cc" data-api="/data/api/event" src="/data/js/script.js"></script>

This would allow me to proxy the tracking snippet through Cloudflare. I've already done this with most of the other services I manage, but for some reason, the tracking snippet kept returning a 404 error.

The URL was correct (https://search.cc/data/js/script.js), but it would not return the tracking script. After a lot of trial and error, I found that the script was available at https://www.search.cc/data/js/script.js instead. I checked the settings.yml file for Searx, as well as my configuration in Cloudflare, but could not find where the www was coming from.
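A quick way to compare what each hostname returns is to ask curl for the status code only. This is a minimal sketch, assuming curl is installed; http_status and compare_hosts are hypothetical helper names, not part of Searx or Plausible.

```shell
# Hypothetical helper: print the HTTP status code returned for a URL.
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Hypothetical helper: print the status of one path on each given hostname.
compare_hosts() {
    path="$1"
    shift
    for host in "$@"; do
        printf '%s: %s\n' "$host" "$(http_status "https://${host}${path}")"
    done
}
```

Running `compare_hosts /data/js/script.js search.cc www.search.cc` prints one line per hostname, which makes a mismatch like the 404 on the bare domain easy to spot.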

Resolution

Because I wasn't able to locate where the www was coming from in the tracking snippet, I decided to proxy the snippet through Nginx. Since I already use Nginx as the web server for Searx, it wasn't a big deal to modify the config file.

To modify the config file, I added the following:

# Only needed if you cache the plausible script. Speeds things up.
proxy_cache_path /var/run/nginx-cache/jscache levels=1:2 keys_zone=jscache:100m inactive=30d use_temp_path=off max_size=100m;

server {
    ...
    location = /js/script.js {
        # Change this if you use a different variant of the script
        proxy_pass https://plausible.io/js/plausible.js;

        # Tiny, negligible performance improvement. Very optional.
        proxy_buffering on;

        # Cache the script for 6 hours, as long as plausible.io returns a valid response
        proxy_cache jscache;
        proxy_cache_valid 200 6h;
        proxy_cache_use_stale updating error timeout invalid_header http_500;

        # Optional. Adds a header to tell if you got a cache hit or miss
        add_header X-Cache $upstream_cache_status;
    }

    location = /api/event {
        proxy_pass https://plausible.io/api/event;
        proxy_buffering on;
        proxy_http_version 1.1;

        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Host  $host;
    }
}
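Before reloading, it can help to have Nginx validate the edited file so that a typo does not take the site down. The sketch below assumes a systemd-based host and sudo access; reload_nginx is a hypothetical wrapper name.

```shell
# Hypothetical helper: reload Nginx only if the configuration parses cleanly.
# Assumes a systemd-based host; use your init system's reload command otherwise.
reload_nginx() {
    sudo nginx -t && sudo systemctl reload nginx
}
```

Calling `reload_nginx` leaves the running configuration untouched when `nginx -t` reports an error, because the reload only runs after a successful test.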

After reloading Nginx, I navigated back to /usr/local/searx/searx-src/searx/templates/oscar and added the following to base.html...

  <!--Plausible Analytics-->
  <script defer data-api="https://search.cc/api/event" data-domain="search.cc" src="https://search.cc/js/script.js"></script>

Once this was added, I navigated back to /usr/local/searx/searx-src and used the following command to update the Searx instance...

sudo -H ./utils/searx.sh update searx

During the update, I made sure to keep the same config file.

Testing

Once the update finished, I did the following...

  • Navigated back to my browser.
  • Opened the Developer Console.
  • Navigated to the Network tab.
  • Loaded https://search.cc
  • Confirmed the script appeared at https://search.cc/js/script.js
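The same check can be done from a terminal, which also shows whether the Nginx cache is being used. This is a rough sketch, assuming curl and awk are available and that the optional X-Cache header from the config above is enabled; cache_status is a hypothetical helper name.

```shell
# Hypothetical helper: fetch a URL and print the value of its X-Cache header.
# Header names are matched case-insensitively, since HTTP/2 lowercases them.
cache_status() {
    curl -s -o /dev/null -D - --max-time 10 "$1" \
        | tr -d '\r' \
        | awk -F': ' 'tolower($1) == "x-cache" { print $2 }'
}
```

Calling `cache_status https://search.cc/js/script.js` twice in a row would typically print MISS for the first request and HIT for the second, although other values of Nginx's $upstream_cache_status (such as EXPIRED) are possible.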

Outcome

Although it's not perfect, so far it seems to be giving me what I'm looking for. I'd like to figure out how to get insight into searches made from a browser address bar, but I suspect this is a limitation of either Plausible or Searx, most likely the latter. I think it has something to do with the Content Security Policy in Nginx, but I haven't dug far enough into it to be sure.

Edit: It turns out this was due to a misconfigured firewall rule on Cloudflare. Any API request with a Cloudflare threat score greater than 5 was being blocked. This was overly aggressive, so I have since changed the rule to block only scores greater than 10. A breakdown of how the Cloudflare threat score works can be found at the following link...

https://support.cloudflare.com/hc/en-us/articles/200170056-Understanding-the-Cloudflare-Security-Level

Once the rule was reconfigured, Plausible began picking up searches done through the browser address bar.
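One way to see whether requests to the event endpoint are rejected before they reach Plausible is to post a test event by hand and look at the status code. The sketch below assumes curl and the /api/event proxy location from the Nginx config; send_test_event is a hypothetical helper name, and the JSON fields (name, url, domain) follow Plausible's events API as I understand it.

```shell
# Hypothetical helper: POST a test pageview and print the HTTP status code.
# Caution: an accepted event is recorded as a real pageview in your stats.
send_test_event() {
    site="$1"   # for example: search.cc
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
        -H 'Content-Type: application/json' \
        --data "{\"name\":\"pageview\",\"url\":\"https://${site}/\",\"domain\":\"${site}\"}" \
        "https://${site}/api/event"
}
```

With `send_test_event search.cc`, a 202 suggests the event was accepted, while a 403 would point to something in front of Plausible blocking it. Because the threat score is tied to the requesting IP, a request from your own machine may pass even when other visitors are blocked.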

The important thing is that analytics are now in place and the tracking script is served from the search.cc domain.

Resources