| AWFFull Stuff |
|
After a meeting at our offices, a request was made to add some extra functionality to the WebStats service: a summary of the 404 errors from the access logs. After a quick search on the Internet, I found two ways of accomplishing this. The first was to apply some patches to the existing Webalizer software, and the second was to replace Webalizer with a new program called AWFFull. I tried both approaches, but eventually settled on AWFFull, as its development was ongoing and it provided several other potentially useful facilities over the aging Webalizer software.
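To illustrate what a 404 summary involves, here is a minimal Python sketch. It is not how Webalizer or AWFFull produce their reports; it assumes the access log is in Common Log Format, and the names `CLF_PATTERN` and `summarize_404s`, along with the sample records, are made up for this example.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
# (the pattern name and layout are assumptions for this sketch)
CLF_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)'
)


def summarize_404s(lines, top=10):
    """Return the `top` most frequently requested URLs that returned 404."""
    missing = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip records that do not parse
        request, status = match.group(5), match.group(6)
        if status != "404":
            continue
        parts = request.split()
        # A request is normally "METHOD URL PROTOCOL"; fall back to the raw text.
        url = parts[1] if len(parts) >= 2 else request
        missing[url] += 1
    return missing.most_common(top)


# Hypothetical sample records, not taken from any real log.
sample = [
    '10.0.0.1 - - [01/Sep/2007:10:00:00 +0100] "GET /missing.html HTTP/1.0" 404 209',
    '10.0.0.2 - - [01/Sep/2007:10:00:05 +0100] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.3 - - [01/Sep/2007:10:01:00 +0100] "GET /missing.html HTTP/1.1" 404 209',
    '10.0.0.1 - - [01/Sep/2007:10:02:00 +0100] "GET /old/page.html HTTP/1.0" 404 209',
]

for url, hits in summarize_404s(sample):
    print(hits, url)
```

Because the pattern is anchored only at the start of the record, the same sketch should also accept Combined Log Format lines, which carry extra trailing fields.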
At first I did a simple fix, using a combination of gunzip and sed to append a couple of dummy fields to each record and pipe the result into AWFFull. Although this worked, it put quite a bit of load on the statistics server, so I hacked the sources to reinstate CLF as 'Common Log Format', but then some of the extra functionality in AWFFull was lost. I later hacked the sources to make AWFFull work with either 'Common Log Format' or 'Combined Log Format', selected by a command line switch or a config file option, with 'Common Log Format' as the default.
Since March 2006, my version of AWFFull has been running on the WebStats service, and other feature requests have come in. The next was to count the Partial (206) requests for PDF files. Some web browsers let you read a PDF file in the browser, and the browser only requests as much of the PDF file as it needs to display at a time. As the user reads through the PDF file, the browser issues further requests to download a bit more of the file. This can lead to over-inflated statistics for PDF files, as there may be only one real request but the browser generates a repeat request for each new part. I have added a couple of extra columns to the file request tables to show the Partial requests separately.
For the past few months I've been helping out with the development and testing of AWFFull. Most of my mods are created on a Debian (Sarge) system, but that may change in the near future to either Debian (Etch) or Ubuntu (Dapper Drake). You can download the Debian (Sarge) and Ubuntu (Dapper) packages for either the standard copy of AWFFull or the one with my mods from here. A sample run of AWFFull with the logs of PharmWEB can be viewed by clicking on the image to the left.
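The idea of counting Partial (206) requests for PDF files separately can be sketched in a few lines of Python. This is only an illustration and not AWFFull's own implementation; the names `LOG_PATTERN` and `count_pdf_requests` and the sample records are assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches the start of a Common or Combined Log Format record and captures
# the request method, the URL and the status code (an assumed pattern).
LOG_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def count_pdf_requests(lines):
    """Return {path: {"full": n, "partial": m}} for requests to PDF files."""
    counts = defaultdict(lambda: {"full": 0, "partial": 0})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip records that do not parse
        _method, url, status = match.groups()
        path = url.split("?", 1)[0]  # ignore any query string
        if not path.lower().endswith(".pdf"):
            continue
        if status == "206":
            counts[path]["partial"] += 1  # one chunk of a file, not a new reader
        elif status == "200":
            counts[path]["full"] += 1
    return dict(counts)


# Hypothetical sample: one reader paging through a PDF, plus an unrelated hit.
sample = [
    '10.0.0.1 - - [01/Sep/2007:10:00:00 +0100] "GET /docs/guide.pdf HTTP/1.1" 200 32768',
    '10.0.0.1 - - [01/Sep/2007:10:00:02 +0100] "GET /docs/guide.pdf HTTP/1.1" 206 65536',
    '10.0.0.1 - - [01/Sep/2007:10:00:09 +0100] "GET /docs/guide.pdf HTTP/1.1" 206 65536',
    '10.0.0.2 - - [01/Sep/2007:10:01:00 +0100] "GET /index.html HTTP/1.1" 200 1043',
]

print(count_pdf_requests(sample))
```

Keeping the two counters apart means the three log records for the sample PDF show up as one full request and two partial ones, rather than as three hits.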
PharmWEB is one of the oldest sites in continual use on the Internet: 151 months to September 2007, which is why I increased the built-in limit from 5 to 20 years. There have been only two minor glitches: one way back in the early days when we switched server, and one in June/July 2006 when the NVRAM battery was flat and the system was down until it was replaced. |
The University of Manchester has been running a WebStats service based on the Webalizer software for several years. This has been working well but development on Webalizer seems to have been a bit stagnant recently.
