AWStats - Open source log file analyzer for advanced statistics

How to import old logs into AWStats ?

Situation

As a result : there's a "hole" in my stats for the 8th of October. Logs are still there, how can I import them ?

Details

As a summary, in reverse-chronological order, we have :

Solution

So the idea is to :
  1. reset AWStats statistics by putting aside datafiles :
    • all the GOOD datafiles (these will be restored)
    • and the BAD one (will be discarded)
    Don't actually delete files at this step : rename or move them so that AWStats can't see them.
  2. re-import ALL the logs of the BAD month, in chronological order. You'll have to run something like this up to 31 times :
    perl /var/www/awstats/wwwroot/cgi-bin/awstats.pl -update -config=virtualhost.myDomain.tld -LogFile=xxx
    See details below.
  3. restore the GOOD datafiles
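Steps 1 and 3 could be sketched with two small shell functions. This is only a sketch under assumptions : the function names, the data directory and the backup directory are hypothetical (the data directory is whatever DirData points to in your AWStats configuration). It relies on AWStats naming its monthly datafiles awstatsMMYYYY.<config>.txt.

```shell
# Sketch only: function names and directories are assumptions, adapt them to your setup.

# Step 1: move every datafile of one config out of AWStats' sight.
# $1 = data directory (DirData), $2 = config name, $3 = backup directory
put_aside_datafiles() (
    dataDir="$1"; config="$2"; backupDir="$3"
    mkdir -p "$backupDir"
    for f in "$dataDir"/awstats??????."$config".txt; do
        [ -e "$f" ] || continue              # nothing matched: skip the literal pattern
        mv -- "$f" "$backupDir"/
    done
)

# Step 3: bring back the GOOD datafiles, leaving the BAD month in the backup directory.
# $1 = data directory, $2 = config name, $3 = backup directory, $4 = BAD month as MMYYYY
restore_good_datafiles() (
    dataDir="$1"; config="$2"; backupDir="$3"; badMonth="$4"
    for f in "$backupDir"/awstats??????."$config".txt; do
        [ -e "$f" ] || continue
        case "$(basename "$f")" in
            "awstats$badMonth.$config.txt") ;;   # BAD datafile: stays aside
            *) mv -- "$f" "$dataDir"/ ;;         # GOOD datafile: restored
        esac
    done
)
```

With the hole in October 2019, this would be called as put_aside_datafiles /var/lib/awstats virtualhost.myDomain.tld /root/awstats.bak before the re-import, then restore_good_datafiles /var/lib/awstats virtualhost.myDomain.tld /root/awstats.bak 102019 after it (both directories being examples).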

Re-import logfiles in chronological order :

If you rotate logfiles more often than once per month, you'll end up with several logfiles :
  • logfile.1
  • logfile.2
  • logfile.3
  • logfile.n
The chronological order may be from 1 to n... or from n to 1. To find out, print the timestamps of the first and last lines of each logfile (here with n=3) :
for i in {1..3}; do logFile="logfile.$i"; echo -e "\n$logFile"; head -n 1 "$logFile" | awk '{print $4" "$5}'; tail -n 1 "$logFile" | awk '{print $4" "$5}'; done
logfile.1
[09/Oct/2019:02:17:21 +0200]
[09/Oct/2019:18:42:11 +0200]

logfile.2
[08/Oct/2019:02:16:25 +0200]
[09/Oct/2019:02:15:08 +0200]

logfile.3
[07/Oct/2019:22:41:23 +0200]
[08/Oct/2019:02:15:46 +0200]
The chronological order is 3 - 2 - 1. All that's left is to run the import command in a loop, in that order.
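Such an import loop could look like the sketch below. The function name and the AWSTATS variable are assumptions made for illustration ; the awstats.pl path, the -update, -config and -LogFile options and the config name are those of the import command shown earlier.

```shell
# Sketch only: AWSTATS and the function name are assumptions.
AWSTATS="${AWSTATS:-perl /var/www/awstats/wwwroot/cgi-bin/awstats.pl}"

# Import the given logfiles one by one, in the order they are listed (oldest first).
# $1 = AWStats config name, remaining arguments = logfiles
import_logs_in_order() (
    config="$1"; shift
    for logFile in "$@"; do
        echo "Importing $logFile"
        # $AWSTATS is left unquoted on purpose, so that "perl /path/awstats.pl" splits into words
        $AWSTATS -update -config="$config" -LogFile="$logFile" || exit 1
    done
)
```

For the example above : import_logs_in_order virtualhost.myDomain.tld logfile.3 logfile.2 logfile.1. The loop stops at the first failed import, so that a later logfile is never imported before an earlier one.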

AWStats

Global setup / initial conditions :

  • a Debian host running the Lighttpd web server
  • several sites are served via distinct virtualhosts
  • all sites (virtualhosts, actually) log their hits and errors in shared logfiles (splitting those will be our first task)
The goal is to analyze the logs of several of these virtualhosts with AWStats. This will imply creating an extra virtualhost for AWStats itself.
A single AWStats instance is enough for several analyzed websites : we'll just have to generate a distinct AWStats configuration for each virtualhost to be analyzed. Which website data is displayed in AWStats web interface is specified as a URL parameter : http://awstats.myDomain.tld/cgi-bin/awstats.pl?config=www.example.com
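To illustrate the "one configuration per virtualhost, one URL parameter per configuration" idea, here is a small sketch that lists the web interface URL of every configuration found in a directory. The function name is hypothetical, and it assumes the configuration files are named awstats.<config>.conf (as for the one generated later on this page).

```shell
# Sketch only: the function name is an assumption.
# Print the web-interface URL of every AWStats config found in a directory.
# $1 = directory holding the awstats.<config>.conf files
# $2 = base URL of the AWStats virtualhost
list_awstats_urls() (
    confDir="$1"; baseUrl="$2"
    for f in "$confDir"/awstats.*.conf; do
        [ -e "$f" ] || continue              # nothing matched: skip the literal pattern
        config=$(basename "$f" .conf)        # awstats.www.example.com
        config=${config#awstats.}            # www.example.com
        echo "$baseUrl/cgi-bin/awstats.pl?config=$config"
    done
)
```

Example : list_awstats_urls /etc/awstats http://awstats.myDomain.tld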

Prerequisite : Configure your webserver to split logs per virtualhost

AWStats (source) :

Install :

My first idea was to :
apt install awstats
But it turned out that some files / directories mentioned in the docs (Perl scripts, wwwroot, classes, ...) were missing. So I tried a different solution.
  1. cd /var/www/awstats/ && wget https://prdownloads.sourceforge.net/awstats/awstats-7.7.tar.gz
    find the latest version on the download page
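  2. tar zxf awstats-7.7.tar.gz --strip-components=1
    the archive unpacks into an awstats-7.7/ directory ; --strip-components=1 drops that level so that wwwroot/ and tools/ land directly in /var/www/awstats/, as the paths used below assume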
  3. chown -R www-data: *

Configuration :

  1. open a shell as the system user running the webserver (so that generated files have the right permissions) :
    su - www-data -s /bin/bash
  2. cd /var/www/awstats/tools && ./awstats_configure.pl
    ... and answer questions...
  3. You may see :
    Sorry, configure.pl does not support automatic add to cron yet.
    You can do it manually by adding the following command to your cron:
    /var/www/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.example.com
    Or if you have several config files and prefer having only one command:
    /var/www/awstats/tools/awstats_updateall.pl now
    We'll see this later

Build the statistics database :

  1. Since it says :
    You can then manually update your statistics for 'www.example.com' with command:
    > perl awstats.pl -update -config=www.example.com
    You can also read your statistics for 'www.example.com' with URL:
    > http://localhost/awstats/awstats.pl?config=www.example.com
    you can now build the database :
    perl /var/www/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.example.com
  2. At this step :
    • AWStats has analyzed /var/log/yourWebserver/www.example.com.log and created + populated its internal database
    • As new hits arrive continuously, this database has to be updated periodically with the provided command
    • After configuring your webserver (the next step), you'll also be able to browse the web interface at the given URL
  3. Have a look at the generated configuration file for fine tuning : /etc/awstats/awstats.www.example.com.conf
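As a starting point for this review, the sketch below prints a handful of directives that usually deserve a look. The function name is hypothetical and the selection of directives (LogFile, LogFormat, SiteDomain, HostAliases, DirData) is only a suggestion, not an exhaustive list.

```shell
# Sketch only: the function name and the choice of directives are assumptions.
# Print the directives most likely to need a review in an AWStats config file.
# $1 = path to the configuration file
show_key_settings() (
    # only uncommented lines starting with one of these directive names
    grep -E '^(LogFile|LogFormat|SiteDomain|HostAliases|DirData)=' "$1"
)
```

Example : show_key_settings /etc/awstats/awstats.www.example.com.conf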

Configure Lighttpd :

  1. Modules :
    server.modules = (
    	# ... (your other modules) ...
    	"mod_cgi",
    	# ... (your other modules) ...
    	)
  2. Virtualhost :
    $HTTP["host"] =~ "^awstats" + domainNameRegExp {
    	server.document-root	= "/var/www/awstats/wwwroot"
    	accesslog.filename	= logRoot + "awstats.access.log"
    	cgi.assign = (
    		".pl"   => "/usr/bin/perl",
    		".cgi"  => "/usr/bin/perl"
    		)
    	alias.url  = ( "/awstatsclasses"    => "/var/www/awstats/wwwroot/classes/" )
    	alias.url += ( "/awstatscss"        => "/var/www/awstats/wwwroot/css/" )
    	alias.url += ( "/awstatsicons"      => "/var/www/awstats/wwwroot/icon/" )
    	}
  3. Disable cache :
    $HTTP["host"] =~ "^awstats" + domainNameRegExp {
    	setenv.add-response-header = ( "Cache-Control" => "no-cache" )
    	}
  4. Check Lighttpd configuration
  5. Restart Lighttpd :
    systemctl restart lighttpd.service
  6. No need to let robots in : put this in /var/www/awstats/wwwroot/robots.txt :
    User-agent: *
    Disallow: /
  7. If you use a reverse web proxy / load balancer, update its settings accordingly (cache, ...). If you've been too impatient and already had a look at the AWStats web interface, consider purging its cache so that you see the updates next time you press F5.
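Steps 4 and 5 can be chained so that Lighttpd is only restarted when its configuration is valid. A minimal sketch, assuming the main configuration file is /etc/lighttpd/lighttpd.conf (the function name is hypothetical ; lighttpd -t -f checks the syntax of the given file without starting the server) :

```shell
# Sketch only: the function name and the default path are assumptions.
# Restart Lighttpd only if its configuration passes the syntax check.
# $1 = main configuration file (defaults to /etc/lighttpd/lighttpd.conf)
check_and_restart_lighttpd() (
    conf="${1:-/etc/lighttpd/lighttpd.conf}"
    if lighttpd -t -f "$conf"; then
        systemctl restart lighttpd.service
    else
        echo "Configuration check failed, Lighttpd not restarted" >&2
        exit 1
    fi
)
```

Run check_and_restart_lighttpd as root ; a non-zero exit status means the running server was left untouched.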

Periodically update the database with crontab :

*/15 * * * * perl /var/www/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.example.com

And finally : logrotate (source) :

  1. Create the configuration file :
    cat << EOF > /etc/logrotate.d/www.example.com
    /var/log/varnish/www.example.com.log
    	{
    	missingok
    	notifempty
    	daily
    	rotate 30
    	compress
    	sharedscripts
    
    	prerotate
    		/var/www/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.example.com
    	endscript
    
    	create
    
    	postrotate
    		kill -HUP \$(cat /run/varnishncsa/varnishncsa_www.example.com.pid)
    	endscript
    	}
    EOF
  2. check it (beware : -f forces an actual rotation ; use -d instead for a dry run that changes nothing) :
    /usr/sbin/logrotate -f /etc/logrotate.d/www.example.com; echo $?