$HTTP["host"] =~ "^(piwik|whatever)" + { server.document-root = "/path/to/piwik/website" }
The procedure below also shows the preliminary steps while searching for details about the database itself and the data I'm trying to delete. These steps are still there "for future reference" but also because I'm not sure so far this procedure is final and erased all the unnecessary information. I'll amend it if need be.
MariaDB [(none)]> use piwik; Database changed MariaDB [piwik]> show tables; +--------------------------------+ | Tables_in_piwik | +--------------------------------+ | piwik_access | | piwik_archive_blob_YYYY_MM | | piwik_archive_numeric_YYYY_MM | | piwik_goal | | piwik_log_action | | piwik_log_conversion | | piwik_log_conversion_item | | piwik_log_link_visit_action | | piwik_log_profiling | | piwik_log_visit | | piwik_logger_message | | piwik_option | | piwik_report | | piwik_segment | | piwik_sequence | | piwik_session | | piwik_site | | piwik_site_setting | | piwik_site_url | | piwik_user | | piwik_user_dashboard | | piwik_user_language | +--------------------------------+ 68 rows in set (0.00 sec) MariaDB [piwik]> DESCRIBE piwik_site; +--------------------------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------------------------+------------------+------+-----+---------+----------------+ | idsite | int(10) unsigned | NO | PRI | NULL | auto_increment | | name | varchar(90) | NO | | NULL | | MariaDB [piwik]> SELECT idsite, name FROM piwik_site; +--------+-----------+ | idsite | name | +--------+-----------+ | 1 | mySite_1 | | 2 | mySite_2 | | 5 | mySite_5 | | 7 | mySite_7 | +--------+-----------+ 4 rows in set (0.00 sec)
The sites with ids 1, 2, 5 and 7 are the sites that are registered and still appear in Piwik's web UI.
Sites with ids 4 and 6 (see below) have been deleted : they are not displayed anymore in the web UI, but they still appear in the database.
MariaDB [piwik]> SELECT idsite, count(idvisit) FROM piwik_log_visit WHERE idsite in (4, 6) GROUP BY idsite; +--------+----------------+ | idsite | count(idvisit) | +--------+----------------+ | 4 | 9022 | | 6 | 72890 | +--------+----------------+ 3 rows in set (0.19 sec) MariaDB [piwik]> DESCRIBE piwik_log_link_visit_action; +------------------------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------------------------+------------------+------+-----+---------+----------------+ | idlink_va | int(11) unsigned | NO | PRI | NULL | auto_increment | | idsite | int(10) unsigned | NO | MUL | NULL | | +------------------------------+------------------+------+-----+---------+----------------+ 27 rows in set (0.51 sec) MariaDB [piwik]> SELECT idsite, count(idlink_va) FROM piwik_log_link_visit_action WHERE idsite in (4, 6) GROUP BY idsite; +--------+------------------+ | idsite | count(idlink_va) | +--------+------------------+ | 4 | 281508 | | 6 | 4231128 | +--------+------------------+ 3 rows in set (1 min 42.07 sec)
At that point, you may be interested in the size of the tables :
dbName='piwik'; ls -Shl /var/lib/mysql/$dbName | head -20 -rw-rw---- 1 mysql mysql 956M Dec 16 17:50 piwik_log_link_visit_action.ibd -rw-rw---- 1 mysql mysql 120M Dec 16 17:28 piwik_archive_blob_2015_01.ibd -rw-rw---- 1 mysql mysql 120M Dec 16 17:28 piwik_archive_blob_2015_12.ibd MariaDB [piwik]> SELECT table_name, table_rows, ROUND(( data_length + index_length ) / 1024 / 1024, 0) AS table_size_MB FROM information_schema.tables WHERE table_schema="piwik" ORDER BY table_size_MB DESC, table_rows DESC; +--------------------------------+------------+---------------+ | table_name | table_rows | table_size_MB | +--------------------------------+------------+---------------+ | piwik_log_link_visit_action | 436605 | 365 | | piwik_archive_blob_YYYY_MM | 35384 | 41 | | piwik_archive_blob_YYYY_MM | 14581 | 22 | | piwik_log_visit | 35806 | 15 | | piwik_archive_numeric_YYYY_MM | 14217 | 12 | | piwik_archive_blob_YYYY_MM | 12322 | 12 | | piwik_archive_numeric_YYYY_MM | 6270 | 4 | | piwik_log_action | 5947 | 4 | +--------------------------------+------------+---------------+ 68 rows in set (0.72 sec) This is the "after" version.Time to flush out some data, now :
MariaDB [piwik]> DELETE FROM piwik_log_link_visit_action WHERE idsite in (4, 6); Query OK, 4512645 rows affected (1 hour 9 min 35.56 sec) MariaDB [piwik]> DELETE FROM piwik_log_visit WHERE idsite in (4, 6); Query OK, 81914 rows affected (3 min 53.47 sec)
Now, let's reclaim some HDD space to MySQL.
MariaDB [piwik]> OPTIMIZE TABLE piwik_log_link_visit_action; +-----------------------------------+----------+----------+-------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------------------------+----------+----------+-------------------------------------------------------------------+ | piwik.piwik_log_link_visit_action | optimize | note | Table does not support optimize, doing recreate + analyze instead | | piwik.piwik_log_link_visit_action | optimize | status | OK | +-----------------------------------+----------+----------+-------------------------------------------------------------------+ 2 rows in set (31 min 45.15 sec) MariaDB [piwik]> OPTIMIZE TABLE piwik_log_visit; +-----------------------+----------+----------+-------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------------+----------+----------+-------------------------------------------------------------------+ | piwik.piwik_log_visit | optimize | note | Table does not support optimize, doing recreate + analyze instead | | piwik.piwik_log_visit | optimize | status | OK | +-----------------------+----------+----------+-------------------------------------------------------------------+ 2 rows in set (3 min 46.07 sec)
You should create a dedicated "site" in the Piwik console to use the log analytics method to avoid counting twice the hits on dynamic content. Indeed, server-generated pages already carry the Javascript snippet that will hit our Piwik instance once displayed in a web browser, but sending those pages already created a hit on our webserver, which is already written in access logs. (source)
If you already have a running Piwik instance, this will be rather straightforward :
If you prefer to ignore such visits, just append this IP address to the blacklist on the Administration | Web sites page.
This is strongly inspired by this FAQ.
Here are some points you should check :
idsite
in the snippet matches the one from the web interfaceBefore : my webserver is running Lighttpd and Piwik and everything is going extremely well : visitors are identified (as well as their location) using their IP address.
After : I installed Varnish (on the same server) and made a basic configuration so that "it works", but now, everybody seems to connect from 127.0.0.1
sub vcl_recv { if (req.restarts == 0) { if (req.http.x-forwarded-for) { set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip; } else { set req.http.X-Forwarded-For = client.ip; } } }
server.modules = ( , "mod_extforward", )
extforward.headers = ("X-Forwarded-For") extforward.forwarder = ("127.0.0.1" => "trust")
To make sure this is working as expected :
HTTP request sent, awaiting response... HTTP/1.1 200 OK Cache-Control: public, max-age=600 Vary: Accept-Encoding Last-Modified: Mon, 20 Oct 2014 19:48:47 GMT Content-Type: text/html Expires: Tue, 21 Oct 2014 09:44:22 GMT Date: Tue, 21 Oct 2014 09:34:22 GMT Server: lighttpd/1.4.31 X-Varnish: 196645 Age: 0 Via: 1.1 varnish-v4 ETag: W/"3135140236" X-Cache: MISS Transfer-Encoding: chunked Connection: keep-alive Accept-Ranges: bytes Length: unspecified [text/html] Saving to: `/dev/null'
a.b.c.d myGreatWebsite - [21/Oct/2014:11:34:22 +0200] "GET /?13574 HTTP/1.1" 200 2221 "-" "Wget/1.13.4 (linux-gnu)"where a.b.c.d is the client IP address.
looks like there's nothing special to do. Forget about stuff below.
To do so, all we have to do is to instruct Piwik to get the client IP from the X-Forwarded-For HTTP header. This can be done in /pathTo/piwik/documentRoot/config/config.ini.php :
proxy_client_headers[] = HTTP_X_FORWARDED_FOR
When making tests, keep in mind that Piwik can not detect Wget hits since no Javascript is executed then.
This is actually a feature request.
As said on the Piwik forum, it's as simple as :Database changed
MariaDB [piwik]>
+--------+-------+---------------------------+ | idsite | name | main_url | +--------+-------+---------------------------+ | 1 | site1 | http://site1.example.com | | 2 | site2 | http://site2.example.com | +--------+-------+---------------------------+
+---------+ | idvisit | +---------+ | 1 | | 2 | | 3 | | 20 | +---------+