=pod

People who run AWStats on large log files have most likely noticed: the data
files can grow quite large, resulting in both a waste of disk space and longer
page generation times for the AWStats pages. I wrote a small script that
analyzes these data files and can remove any information you think is
unnecessary.

B<Download:> L<awshrink|https://dev.yorhel.nl/download/code/awshrink> (copy to
/usr/bin to install).


=head2 Important

Do B<NOT> use this script on data files that are not yet complete (i.e. the
data file of the current month). Doing so will result in inaccurate sorting of
visits, pages, referers and whatever other lists you're shrinking. Also, keep
in mind that this is just a quickly written Perl hack: it is by no means fast
and may hog some memory while shrinking data files.


=head2 Usage

  awshrink [-c -s] [-SECTION LINES] [..] datafile
    -s     Show statistics
    -c     Overwrite datafile instead of writing to a backupfile (datafile~)
    -SECTION LINES
           Shrink the selected SECTION to LINES lines. (See example below)


=head2 Typical command-line usage

While awshrink is most useful for monthly cron jobs, here's an example of basic
command line usage to demonstrate what the script can do:

  $ wc -c awstats122007.a.txt
  29916817 awstats122007.a.txt

  $ awshrink -s awstats122007.a.txt
  Section                Size (Bytes)    Lines
  SCREENSIZE*                      74        0
  WORMS                           131        0
  EMAILRECEIVER                   135        0
  EMAILSENDER                     143        0
  CLUSTER*                        144        0
  LOGIN                           155        0
  ORIGIN*                         178        6
  ERRORS*                         229       10
  SESSION*                        236        7
  FILETYPES*                      340       12
  MISC*                           341       10
  GENERAL*                        362        8
  OS*                             414       29
  SEREFERRALS                     587       34
  TIME*                          1270       24
  DAY*                           1293       31
  ROBOT                          1644       40
  BROWSER                        1992      127
  DOMAIN                         2377      131
  UNKNOWNREFERERBROWSER          5439      105
  UNKNOWNREFERER                20585      317
  SIDER_404                     74717     2199
  PAGEREFS                     130982     2500
  KEYWORDS                     288189    27036
  SIDER                       1058723    25470
  SEARCHWORDS                 5038611   157807
  VISITOR                    23285662   416084
  * = not shrinkable

  $ awshrink -s -c -VISITOR 100 -SEARCHWORDS 100 -SIDER 100 awstats122007.a.txt
  Section                Size (Bytes)    Lines
  SCREENSIZE*                      74        0
  WORMS                           131        0
  EMAILRECEIVER                   135        0
  EMAILSENDER                     143        0
  CLUSTER*                        144        0
  LOGIN                           155        0
  ORIGIN*                         178        6
  ERRORS*                         229       10
  SESSION*                        236        7
  FILETYPES*                      340       12
  MISC*                           341       10
  GENERAL*                        362        8
  OS*                             414       29
  SEREFERRALS                     587       34
  TIME*                          1270       24
  DAY*                           1293       31
  ROBOT                          1644       40
  BROWSER                        1992      127
  SEARCHWORDS                    2289      100
  DOMAIN                         2377      131
  SIDER                          3984      100
  UNKNOWNREFERERBROWSER          5439      105
  VISITOR                        5980      100
  UNKNOWNREFERER                20585      317
  SIDER_404                     74717     2199
  PAGEREFS                     130982     2500
  KEYWORDS                     288189    27036
  * = not shrinkable

  $ wc -c awstats122007.a.txt
  546074 awstats122007.a.txt
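
The monthly cron use mentioned earlier might look roughly like the sketch
below. It targets the data file of the I<previous> month, which keeps it away
from the incomplete file of the current month. The data directory, the config
name C<a>, the C<awstatsMMYYYY.CONFIG.txt> naming (inferred from the example
file name above) and the helper names C<prev_month_stamp> and
C<shrink_last_month> are all assumptions, and the sketch presumes AWStats has
already processed the last log entries of that month.

```shell
#!/bin/sh
# Sketch of a monthly cron wrapper around awshrink. Paths, the config name
# and the function names are illustrative assumptions, not part of awshrink.

# Print the MMYYYY stamp of the month before the given year and month.
# Example: prev_month_stamp 2008 01 prints 122007
prev_month_stamp() {
    year=$1
    month=${2#0}    # drop a leading zero so "08" is treated as decimal 8
    if [ "$month" -eq 1 ]; then
        year=$((year - 1))
        month=12
    else
        month=$((month - 1))
    fi
    printf '%02d%04d\n' "$month" "$year"
}

# Shrink last month's data file for one AWStats config.
# $1 = data directory (assumed), $2 = config name (assumed)
shrink_last_month() {
    datadir=$1
    config=$2
    stamp=$(prev_month_stamp "$(date +%Y)" "$(date +%m)")
    file="$datadir/awstats$stamp.$config.txt"
    [ -f "$file" ] || return 0    # no data file for that month, nothing to do
    # Same section limits as in the command-line example above.
    awshrink -c -VISITOR 100 -SEARCHWORDS 100 -SIDER 100 "$file"
}

# Uncomment and adjust to your setup:
# shrink_last_month /var/lib/awstats a
```

Saved under a hypothetical path such as F</usr/local/bin/awshrink-monthly>,
with the last line uncommented, it could be scheduled with a crontab entry
like C<0 4 1 * * /usr/local/bin/awshrink-monthly>. Note that C<-c> overwrites
the data file in place, so leave it out if you would rather keep the original.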