Understanding Server Logs

Let me put it to you straight: The most difficult way to track traffic on your
Web site is through your server logs. Server logs are also the only way to get
certain types of in-depth detail about your site. Before I get too deep into
what you can do with them, though, you need to know what server logs are.
A server log — more accurately a Web server log — is a group of files automatically
generated by a server that tracks statistics about the traffic on your
Web site. This group of files might contain information on where a user came
from before reaching your site, what pages on your site she visited, how long
she spent on each page, and even more detailed information like what country she lives in
(or the country her Internet access account is registered in) and some of the
specifications about the browser she’s using.
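
To make that a little more concrete, here's a rough Python sketch of pulling those details out of a single log line. It assumes the server writes the widely used "combined" log format, which I haven't actually specified here, so check your own server's settings. The sample line and the names LOG_PATTERN and parse_line are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: the "combined" log format. Your server may be set up differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Turn the timestamp text into a real datetime so it can be compared later.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    return entry

# Made-up sample line, for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /products.html HTTP/1.1" 200 2326 '
          '"http://www.example.com/index.html" "Mozilla/5.0 (Windows NT 10.0)"')

entry = parse_line(sample)
print(entry["ip"], entry["path"], entry["referrer"])
```

The referrer field is the "where she came from" part, the path is the page she visited, and the agent string carries the browser details.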
Server logs are a complicated mess of facts and information that most people
just can’t read. Seriously. You have to be one step above a NASA geek to
understand all the gibberish contained in a server log.
Because most people won’t ever reach that level of geekiness, some programs
— log analyzers or log parsers — take all that data, analyze it, and then
spit out more understandable statistics. Programs like AWStats (which is
free, available at www.awstats.sourceforge.net) and Summary (which is
free to try but can be costly to own, available at www.summary.net) can give
you the information you seek from the raw data that the server collects.
Even though these programs are easier to use than trying to figure out server
logs on your own, they’re still not the easiest programs available. With
AWStats, for example, you get to track your Web site statistics, but you have
to have access to your Web server to use it. It also requires a little more
technical knowledge than some of the other Web site statistics programs
that are available — like Google Analytics. Still, if you’re ready to take on this
program, it can potentially provide very in-depth analyses of the data that is
collected in your server logs. I’m not ready to jump too deep into this pool
right now, though. You’ll find more information on AWStats in the “Installing
AWStats” section.
I’ll be honest with you. Working with log analyzers can sometimes seem
nearly as complicated as just trying to use the raw data coming from the
server. Most log analyzers require that code be added to your Web site or
Web server, and the reports then have to be programmed before you can
receive them.
On the flip side, server log analyzers can allow you to parse server data in
ways that some other programs won’t let you. With this technology, you can
design reports that meet very specific needs (if you know how). For example,
if you need a report that tells you not only which page of your Web site
visitors entered on but also what time of day they came to your site most
often, you can program a report to divulge that kind of information.
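
Here's a rough idea of what a report like that boils down to, sketched in Python. This is not how AWStats or any particular log analyzer builds its reports. It assumes the log has already been parsed into (IP address, timestamp, page) records, treats each IP address's first hit as that visitor's entry page, and uses made-up data and a hypothetical function name, entry_page_report.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, already-parsed hits: (ip, timestamp, path). Made-up data.
hits = [
    ("203.0.113.7", datetime(2023, 10, 10, 9, 5), "/index.html"),
    ("203.0.113.7", datetime(2023, 10, 10, 9, 7), "/products.html"),
    ("198.51.100.4", datetime(2023, 10, 10, 14, 30), "/products.html"),
    ("192.0.2.99", datetime(2023, 10, 10, 14, 45), "/products.html"),
]

def entry_page_report(hits):
    """Count each visitor's first page and the hour of that first hit."""
    first_hit = {}
    # Walk the hits in time order and keep only the earliest one per IP address.
    for ip, when, path in sorted(hits, key=lambda h: h[1]):
        first_hit.setdefault(ip, (when, path))
    entry_pages = Counter(path for _, path in first_hit.values())
    entry_hours = Counter(when.hour for when, _ in first_hit.values())
    return entry_pages, entry_hours

entry_pages, entry_hours = entry_page_report(hits)
print(entry_pages.most_common(1))  # [('/products.html', 2)]
print(entry_hours.most_common(1))  # [(14, 2)]
```

With this sample data, most visitors entered on the products page, and the busiest arrival hour was 2 p.m.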
If you’re using a program like AWStats, the first thing to understand is that
log analyzers count visitors differently than analytics programs such as
Google Analytics do. AWStats looks at the IP address of each site visitor (the
unique numerical address of a computer on the Internet, kind of like a street
address for your house). If one person visits your site a number
of different times, AWStats counts that as only a single visitor. By comparison,
a program like Google Analytics tracks computers by placing a cookie
on the hard drive. That means that if a user clears out his browser’s stored
cookies (the small files that sites leave on that computer) or if the
user logs in from another computer, Google Analytics counts him as more
than one visitor. Looking at IP addresses is a little more accurate because
even if a user clears his cookies, the IP address for his computer remains the
same. (Logging in from a different computer is still a problem, but as far as I
know, there’s no way around that kind of user being counted more than once
with any stats program.)
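
The difference between the two counting methods is easier to see with a toy example. The Python sketch below is my own simplification, not how AWStats or Google Analytics actually work internally. It takes a made-up list of hits, each with an IP address and a hypothetical cookie ID, and counts unique visitors both ways.

```python
# Hypothetical hits: (ip, cookie_id). Made-up data for illustration.
hits = [
    ("203.0.113.7", "cookie-a"),   # first visit
    ("203.0.113.7", "cookie-b"),   # same computer after the stored cookie was cleared
    ("198.51.100.4", "cookie-c"),  # a different computer
]

def count_by_ip(hits):
    """Count unique visitors by IP address."""
    return len({ip for ip, _ in hits})

def count_by_cookie(hits):
    """Count unique visitors by cookie ID."""
    return len({cookie for _, cookie in hits})

print(count_by_ip(hits))      # 2
print(count_by_cookie(hits))  # 3
```

The same person on the same computer shows up once in the IP count but twice in the cookie count.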
Next, understand that programs like AWStats are more about the numbers
than what can actually be extrapolated from those numbers. For example,
with AWStats, Web crawlers are identified according to a list of crawlers
defined by the log analyzer. Usually, a person creates the list, and the program
then compares data against that list to determine which visits are
from Web crawlers and which are from real people. The problem with this
approach is that if the list of Web crawlers is not all-inclusive, a crawler could
be counted as a visitor. The result, then, is that the number of visitors can be
skewed. Because AWStats doesn’t look at things like where a visitor comes
from, it’s hard to tell what’s a crawler and what’s a visitor if the crawler
doesn’t appear on the list of known crawlers.
On the other hand, Google Analytics does look at where visitors come from.
And Web crawlers have very specific origins, so it’s usually pretty easy to tell
which of your visitors are people and which are programs that are designed
to crawl a Web site.
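
To see how an incomplete crawler list skews the numbers, here's a small Python sketch of the list-matching idea. It assumes the analyzer checks each visit's browser identification string (the user agent) against a hand-made list. The list, the user-agent strings, and the is_known_crawler function are all hypothetical and aren't taken from AWStats itself.

```python
# Hypothetical, deliberately incomplete crawler list.
KNOWN_CRAWLERS = ["googlebot", "bingbot"]

def is_known_crawler(user_agent):
    """Return True if the user agent contains a name from the crawler list."""
    agent = user_agent.lower()
    return any(name in agent for name in KNOWN_CRAWLERS)

agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "ExampleCrawler/1.0",  # made-up crawler that is not on the list
]

# Anything the list doesn't recognize is treated as a real person.
visitors = [a for a in agents if not is_known_crawler(a)]
print(len(visitors))  # 2: the unlisted crawler is counted as a visitor
```

Only one of those three visits came from a person, but the count says two, because the list never heard of the third one.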
