Common Web Log File Format


A common log format file is created by the web server to keep track of the requests that occur on a web site.

        
sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" 200 1234

sandbox.sambar.com - - [09/Sep/1997:10:43:22 -0800] "GET /docs/index.htm HTTP/1.0" 304 0

sandbox.sambar.com - admin [09/Sep/1997:10:46:12 -0800] "GET /sysadmin/index.stm HTTP/1.0" 200 0

207.86.139.145 - - [09/Sep/1997:10:47:43 -0800] "GET /wwwping/index.htm HTTP/1.0" 200 954

207.86.139.145 - - [01/Jan/1997:13:06:51 -0600] "GET /session/wwwping HTTP/1.0" 200 0

The common log file format has the following fields:

remotehost Remote hostname, or the IP address if DNS is not enabled/available.
rfc931 The remote login name of the user. (This is not implemented by the Sambar Server.)
authuser The username of the authenticated user. This is available when using password protected WWW pages.
[date] Date and time of the request.
"request" The HTTP request line as it came from the client.
status The HTTP response code returned to the client. Indicates whether or not the file was successfully retrieved, and if not, what error message was returned.
bytes The number of bytes transferred. If the status is 200 and bytes are 0, the dynamic page size could not be determined.
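
The field list above can be turned into a small parser. The following is a minimal sketch in Python, not part of the original page: the pattern name CLF_PATTERN and the function parse_clf_line are hypothetical, and treating a "-" in the bytes field as 0 is an assumption (some servers write "-" when no body is sent). The month abbreviation is parsed with the default (English) locale.

```python
import re
from datetime import datetime

# One named group per field, in the order listed above (names are illustrative).
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # A lone "-" means the field was not available.
    for key in ("rfc931", "authuser"):
        if entry[key] == "-":
            entry[key] = None
    # Example timestamp: 09/Sep/1997:10:42:45 -0800
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Assumption: "-" in the bytes field is treated as 0 bytes.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

if __name__ == "__main__":
    sample = ('207.86.139.145 - - [09/Sep/1997:10:47:43 -0800] '
              '"GET /wwwping/index.htm HTTP/1.0" 200 954')
    print(parse_clf_line(sample))
```

Lines that do not follow the seven-field layout return None, so a caller can count or skip malformed entries instead of failing on them.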

Sometimes Web log files stored in the common log file format cannot provide enough information for analysis. In those cases, the extended common log file format can meet the analysis needs.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Created by Lan Man

Last Modified: Nov 11, 2002