Log File Analysis
Estimating user interest and motivation by just counting page requests
(or "hits") from a World Wide Web server log provides a distorted
metric of user activity. Log file analysis can determine how effectively
users can perceive content and navigational alternatives, since the poorly
designed structure and content of the documents themselves can inhibit
users from finding what they are looking for. Provided that simple human
factors principles have been applied, the amount of time users spend looking
at a page is a better estimate of user interest than the number of page hits.
The most common method of monitoring user interest is to count the number
of page accesses. There are numerous problems with using metrics like
the number of access requests made to a web server as the indicator of
user interest. One of these problems is that in a non-indexed hyperspace
users are forced to follow paths that they or others have previously forged--with
each leap being recorded as a request from some server. To assume that
"hits" reflect user interest is to assume that users are as
interested in not finding the information they need (the "misses,"
and "false alarms") as they are in finding what they are looking for.
The second problem with using hits as a metric of user interest is that
any inequality in a user's ability to access content--because of network
bottlenecks, user terminated transfer, excessive delays in downloading,
and server errors--will bias the resulting record of user activity. The
third reason why "hits" should not be used as a measure of user
interest is that the users might be constrained in their ability to view
content because the author failed to consider the human factors related
to navigation and ease of understanding in the design of the documents.
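The second problem can be made concrete with a minimal sketch that compares raw hit counts against requests the server actually delivered. It assumes the log uses the Common Log Format; the sample lines, the function name count_hits, and the choice to treat 2xx and 304 responses as delivered are illustrative assumptions, not part of any particular server setup.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)


def count_hits(lines):
    """Return (raw_hits, delivered_hits) as Counters keyed by page path."""
    raw, delivered = Counter(), Counter()
    for line in lines:
        match = CLF_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        path = match.group("path")
        status = int(match.group("status"))
        raw[path] += 1
        # Assumption: 2xx and 304 (cached copy still valid) count as delivered.
        if 200 <= status < 300 or status == 304:
            delivered[path] += 1
    return raw, delivered


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/1996:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/1996:13:56:01 -0700] "GET /papers.html HTTP/1.0" 500 0',
    '192.0.2.2 - - [10/Oct/1996:13:57:12 -0700] "GET /papers.html HTTP/1.0" 200 5120',
    '192.0.2.2 - - [10/Oct/1996:13:58:40 -0700] "GET /missing.html HTTP/1.0" 404 210',
]

raw_hits, delivered_hits = count_hits(SAMPLE_LOG)
```

In this sample, /papers.html receives two raw hits but only one delivered page, so a raw count overstates how often users could actually view it. A status code alone cannot reveal a transfer the user terminated partway through, so this remains a rough correction.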
Why Counting Page "Hits" Is Not an Adequate Measure
A simple count of the number of web browsers requesting a data transfer
from a web server is not enough to explain a user's interest
because it does not account for that user's ability to access information,
how effectively the information is organized and structured for comprehension,
and the appropriateness of the information to the user. It is not possible
to use unobtrusive measures to determine the appropriateness of the information
--one must survey users to determine if they find the information useful.
But if authors and system administrators can account for a user's access
to web pages, and for the design and structure of those pages, then one can
infer that the time spent by a user viewing information is an indicator of
that user's interest.
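One way to approximate viewing time from a log is to take the gap between a visitor's consecutive requests. The sketch below assumes the log has already been reduced to (visitor, time in seconds, path) records. The record layout, the function name viewing_times, and the 30-minute cutoff for discarding long gaps are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def viewing_times(requests, max_gap=1800):
    """Estimate typical seconds spent on each page.

    The time on a page is taken as the gap until the same visitor's next
    request. Gaps longer than max_gap seconds (an assumed cutoff) and each
    visitor's final page are skipped, because the log holds no reliable
    end time for them.
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, path in requests:
        by_visitor[visitor].append((timestamp, path))

    times = defaultdict(list)
    for visits in by_visitor.values():
        visits.sort()  # order each visitor's requests by time
        for (start, path), (end, _next_path) in zip(visits, visits[1:]):
            gap = end - start
            if gap <= max_gap:
                times[path].append(gap)
    # The median is less sensitive than the mean to a few very long views.
    return {path: median(gaps) for path, gaps in times.items()}


# Hypothetical records, invented for illustration only.
SAMPLE_REQUESTS = [
    ("visitor-a", 0, "/index.html"),
    ("visitor-a", 20, "/program.html"),
    ("visitor-a", 200, "/index.html"),
    ("visitor-b", 1000, "/index.html"),
    ("visitor-b", 1010, "/program.html"),
    ("visitor-b", 1250, "/registration.html"),
]

typical_seconds = viewing_times(SAMPLE_REQUESTS)
```

Here /index.html is requested more often than /program.html, yet visitors stay on /program.html far longer, which is the kind of difference a hit count hides. The estimate still includes download time, so it is only as good as the network conditions behind it.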
Useful Information For A Consistent Performance
The current method used for determining interest is to log the number
of "hits" that a page has received. This is inadequate because
the server will log "hits" not only on the page of
interest but also on every other page the user visited in getting there.
What is needed is a path-independent measure of user interest.
When a user accesses a page in the World Wide Web they could be doing
one of two things: searching for the information they need or processing
it. Both of these activities take up the user's time, so a measure of
time spent on a page will confound the two. One way to minimize this
confusion is to adopt a consistent style of document design.
If your users know how to navigate your pages to find what they need quickly,
then a measure of the time spent on various pages will more closely reflect
their interest in the content. Other methods of minimizing the time that
users spend looking for the content that they need are to provide an information
overview (or content summary), to utilize human factors principles to
improve user comprehension of the information accessed, to allow access
to search engines that offer more than keyword search, and to provide
easily understood navigational guides of the content.
By measuring the time users spend on a page, we monitor those areas
of the conference that our users find most interesting, and can predict
how they will spend their time. This same method can be used to predict how
motivated users are toward other content--if the authors and system administrators
utilize consistent document design, provide navigation guides to their
users, and construct their pages so that an estimate of network variability
can be extracted from the log. If this is done, the system administrator
can use this measure to predict which content their users find important.
What types of data you collect on your server depends on how it has been
set up and defined by the technical staff. The ideal log file for performance
and traffic analysis should contain the following data:
Types of log data to collect
Who is visiting your site? You want unique visitor identification so
you know whether a visitor is returning to your site.
The path visitors take through your pages. With knowledge of the order
in which a visitor viewed each page, you can identify trends in how visitors
navigate through your pages. You also want to know what element (link,
icon) a visitor clicked on each page to go to the next page.
How much time visitors spend on each page. A pattern of lengthy viewing
time on a page might lead you to deduce the page is very interesting or
Where visitors are leaving your site. The last page a visitor viewed
before leaving your site might be a logical place to end the visit, or
it might be a place where the visitor bailed out.
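The path and exit-page items in the list above can be derived from the same kind of records. This is a minimal sketch, assuming each request is already available as a (visitor, time in seconds, path) tuple with a unique visitor identifier; the record layout and the function name summarize_visits are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_visits(requests):
    """Return (paths, exit_pages) from (visitor, time, path) records.

    paths maps each visitor to the ordered list of pages viewed.
    exit_pages counts how often each page was the last one a visitor saw.
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, path in requests:
        by_visitor[visitor].append((timestamp, path))

    paths = {}
    exit_pages = Counter()
    for visitor, visits in by_visitor.items():
        visits.sort()  # order each visitor's requests by time
        ordered = [path for _timestamp, path in visits]
        paths[visitor] = ordered
        exit_pages[ordered[-1]] += 1
    return paths, exit_pages


# Hypothetical records, invented for illustration only.
SAMPLE_VISITS = [
    ("visitor-a", 30, "/products.html"),
    ("visitor-a", 0, "/index.html"),
    ("visitor-a", 90, "/order.html"),
    ("visitor-b", 500, "/index.html"),
    ("visitor-b", 540, "/products.html"),
    ("visitor-c", 800, "/index.html"),
    ("visitor-c", 815, "/products.html"),
]

visitor_paths, exit_counts = summarize_visits(SAMPLE_VISITS)
```

In this sample two of the three visitors leave from /products.html without reaching /order.html. The counts alone cannot say whether that page was a logical end to the visit or a place where visitors bailed out; they only show where to look.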
Benefits Of Log File Analysis
Measuring user interest improves the quality and delivery of information
services to the end user by providing tools for the site administrator
and author to determine the value of the data they are serving. The advantage
of measuring user interest from web server log files is that both administrators
and authors can use such metrics to allocate scarce resources according
to the value of the information they serve or author. The ability to monitor
user interest also facilitates the commercialization of the web by allowing
the owners of server logs to collect information of strategic importance
to the products and services they offer.
To be effective in a knowledge and information based society, individuals
need tools that allow them to collect, manipulate, and distribute the
products of their own or others' work. Included in this lifecycle of information
use is the need to gain access to a large variety of files, search diverse
resources, collect and summarize the information found, and finally redistribute
this information.
The success of user experiences at your site. Purchases transacted,
downloads completed, and information viewed are concrete indicators
of tasks accomplished.