Introduction
Visitors is a very fast web log analyzer for Linux, Windows, and Unix-like operating systems. It takes a web server log file as input and outputs statistics in the form of several reports. Its design principles are very different
from those of other software of the same type:
- No installation required. Visitors can process up to 150,000 log lines per second on fast computers (about 20 MB/s at the average line length of my log files).
- Designed to be run from the command line, it outputs HTML and text reports.
The text report can be piped to less to check web stats over ssh.
- Support for real-time statistics with the Visitors Stream Mode, introduced in version 0.3.
- There is no need to specify the log format. It works out of the box with Apache and most other web servers that use a standard log format (see the documentation for more information on the format).
- It is a portable C program and can be compiled on many different systems. Binaries for Windows are in the Download section of this page.
- The generated HTML report contains no images or external CSS. It is self-contained, so you can send it to users by email.
- Visitors is free software under the terms of the GPL license (and of course freeware). You don't need to pay to use it. Visitors is supported: if you want a custom version made directly by the original author for a modest price, contact me at antirez (at) invece (dot) org. ISPs may take advantage of the high processing speed.
0.7 released
Changes for the 0.7 version include:
Added the --grep and --exclude options to process only the lines matching,
or not matching, the specified patterns. Patterns can include wildcard characters
such as * and ?, character ranges, and so on. This kind of matching
is called glob-style matching (see the online documentation for more
information). Multiple grep/exclude patterns can
be used at the same time and are processed sequentially.
Added --ignore-404 so that 404 errors are not processed
like other log lines when creating statistics. This is useful for
sites that receive many requests producing 404 error codes.
A new report about users' screen resolution and color depth was added,
but it only works after you include in your web pages the JavaScript code
found in the README file. Some other minor fixes.
0.61 released
This is just a bugfix release. The only difference from 0.6
is that stats about unique visitors are now much more accurate:
Visitors can better recognize (and ignore) bots other
than Google's and strip their accesses from the stats.
News! 0.6 released
Changes for the 0.6 version include:
Two new reports: AdSensed pages, listing the pages crawled by
the Google AdSense crawler, and Google Human Languages, showing
the human languages most commonly set in the Google Preferences
(information obtained from the 'hl' variable of Google referers).
New feature: referer spam filtering using a keyword-based blacklist.
Better text report formatting. More robust parsing, resulting in more
precise reports. Browser list updated, with the ability to report different
versions of Internet Explorer. HTML report generation modified to
avoid overly long lines, which makes it simpler to send reports via email.
Important bugs fixed: reports should be more accurate than before,
notably the unique visitors report.

The new day/hour combined map
Changes for 0.5:
- A fix required by Googlebot changes.
- New reports, including a new two-dimensional map that shows the traffic level over the whole year, and unique visitors for every month.
- Better generation of Graphviz graphs, including percentages on arcs and nodes for Google, external links, and no referer.
- 50% less memory used.
- Highlight color for weekends changed to be more visible.
- Most stats are now computed from unique visits rather than the number of accesses.
- Many bugs fixed.
- A real manual page.
Reports information
Visitors reports contain a number of useful statistics and other information:
- Requested pages.
- Requested images.
- Referers by hits and age.
- Unique visitors in each day.
- Page views per visit.
- Pages accessed by the Google crawler (and the date of Google's last access to each page).
- Percentage of visits originating from Google searches for each day.
- User navigation patterns (web trails).
- Keyphrases used in Google searches.
- User agents.
- Weekday and hour distributions of accesses.
- Weekday/hour combined two-dimensional map.
- Month/year combined two-dimensional map.
- Visual path analysis with Graphviz.
- Operating systems, browsers and domains popularity.
- 404 errors.
You can see a report in HTML format and the same report in text format.
Trivial to use
Visitors requires no database and no ability to write to disk. There is no installation, no configuration file, nor anything else to set up. Its arguments are options that control how many lines of each report to show and whether to include or exclude some optional stats, followed by a list of web log files. You can give more than one log file on the command line, and they don't need to be ordered by date, nor all be in the same format!
In Graphviz mode, Visitors processes the web log files and outputs a graph ready to be rendered with Graphviz. The generated graph is the visual equivalent of web trails, but it is much more interesting for complex sites. The focus of this feature is not to create a generic graph of the whole site, but a graph of the usage patterns that shows how users actually use it. Click on the image to see the graph for www.hping.org, or read how to generate it in the online documentation.
Examples
The simplest usage, handy when you want to check a web log interactively (for example over ssh on your web server), is to just type:
visitors access.log | less
This produces human-readable, text-only output.
To generate HTML web stats with much more information, you can use this instead:
visitors -A -m 30 access.log -o html > report.html
If you want information on the usage patterns of your site, you must provide the URL prefix of your web site and specify the --trails option:
visitors -A -m 30 access.log -o html --trails --prefix http://www.hping.org > report.html
Note that it is fine to specify multiple file names, or to provide the input on standard input, as in the following two examples:
visitors /var/log/apache/access.log.*
zcat access.log.*.gz | visitors -
Check the documentation for more information on how to use it.
Extending Visitors
Visitors is internally designed to be extended. If HTML and text output are not enough for you, you can develop additional output modules. Check the source code to see how the existing output modules work. You just have to define functions for your new output module, such as print_header, print_title, and so on.
Download Source Code
(check what's new)
Source code of version 0.7 (for Linux and all the other OSes) visitors-0.7.tar.gz
On line documentation doc.html
Htmlized source code here
Download Windows Binary
While Visitors is free software under the GPL license, we sell
precompiled Windows binaries for € 8 (a one-time fee: you'll receive all future versions for free). Of course, you can compile the source code
yourself if you want. So that you can test whether Visitors is useful
before buying it, we provide a demo version here:
Note that the demo version is limited and only outputs the
basic report, without the advanced features and reports.
Windows DEMO binary (free): visitors-demo.exe
Windows FULL version binary: buy it for € 8 using the following button to pay.
Note: after the payment you'll be automatically redirected to the download page.
Payment via PayPal. PayPal may provide Credit Card and Bank Account Transfer facilities.
Promotional Button
If you want to promote Visitors add the following antipixel button
to your website or blog. Thank you!

Bugs
Visitors is a new program, and because of the nature of the fast parsing technique it uses,
it is possible that there are bugs triggered by uncommon log entries.
If you find such a problem, please send me an email with a description
of the problem and, if possible, the part of access.log that causes trouble.
On the other hand, after 4 versions, 3000 downloads, and the inclusion
of Visitors as a Gentoo and
FreeBSD port, we have discovered only a
minor bug in the parser, which version 0.4 fixes, so we consider Visitors ready to
be used in production environments. Thank you for the help.
Security
This section will list security-related bugs found
in Visitors. Currently no bug of this kind has been reported or found.
VISITORS CODE QUALITY AND AUDITING
Visitors is tested with big log files and the valgrind program before every
release to ensure that there are no obvious memory violation problems
with normal log files, and it was written with care about security.
POSSIBLE PROBLEMS
Still, it is a C program that does a lot of pointer math in order to
be fast, and its job is to process untrusted input (what is written in
log files is in part client-driven). This section is here to hold
detailed information if bugs of this kind are found in the future.
LIST OF SECURITY BUGS FOUND
None for now. If you find one, you are cool and you will be acknowledged
here on the home page. Thanks!
See Also
PHP Interactive, a web based interactive shell for PHP.
xadsen, a Google Adsense (*) monitor for XFree.
Tcl IRCd, a simple IRC server written in Tcl.
Free Tcl Book, Tclwise is a book about the Tcl programming language with many chapters online for free.
The Jim Interpreter, a small footprint implementation of the Tcl programming language.
The Hping News Archives, newsgroups www archive service.
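Archivio Discussioni Newsgroups Italiani, an archive of Italian newsgroup discussions.
Archivio Newsgroups Scientifici, an archive of scientific newsgroups.
Blog di cucina siciliana, a blog about Sicilian cooking, recipes, and culinary traditions.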
(*) Google Adsense is a trademark of Google Inc.
Visitors is listed in the Big Webmaster Directory.
Authors
Salvatore Sanfilippo <antirez (at) invece (dot) org>.