Parsing Apache logs in Perl
updated 5-10-2013
Okay, so I can filter out the IP addresses no problem. For the next three things, I'd thought it could be done with sort($keys), but I was wrong, and trying the more complex approach below didn't seem to be the solution either. The next thing I need to accomplish is gathering the dates and the browser version. I'll provide a sample of the formatting of my log files and my current code.
Apache log
24.235.131.196 - - [10/mar/2004:00:57:48 -0500] "get http://www.google.com/iframe.php http/1.0" 500 414 "http://www.google.com/iframe.php" "mozilla/4.0 (compatible; msie 6.0; windows 98)"
My code

    #!usr/bin/perl -w
    use strict;

    my %seen = ();

    open(FILE, "< access_log") or die "unable to open file $!";

    while( my $line = <FILE> ) {
        chomp $line;

        # Regex for the IP address.
        if( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ ) {
            $seen{$1}++;
        }

        # Regex for the date; an example is [09\mar\2009:05:30:23]
        if( $line =~ /\[[\d]{2}\\.*[\d]{4}\:[\d]{2}\:[\d]{2}\]*/ ) {
            print "\n\n $line matched : $_\n";
        }
    }
    close FILE;
    my $i = 0;

    # The program bugs out if I uncomment the line below,
    # but to my understanding it is what I'm trying to do.
    # for my $key ( keys %seen ) (keys %date) {
    for my $key ( keys %seen ) {
        my ($ip) = sort {$a cmp $b}($key);
        # I'd also like to be able to sort the IP addresses, and if I do it
        # the proper numeric way it generates errors saying the contents are not numeric.
        print @$ip->[$i] . "\n";
        # print "The IPv4 address : $key , has accessed the server $seen{$key} times. \n";
        $i++;
    }
You're pretty close. And yes, I would use a hash. It's commonly called a "seen hash".

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $log  = "web.log";
    my %seen = ();

    open (my $fh, "<", $log) or die "unable to open $log: $!";

    # Read one line at a time and count each IP address seen.
    while( my $line = <$fh> ) {
        chomp $line;
        if( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ ){
            $seen{$1}++;
        }
    }
    close $fh;

    for my $key ( keys %seen ) {
        print "$key: $seen{$key}\n";
    }

Here's a sample log file and the output:

    $ cat web.log
    [mon sep 21 02:35:24 1999] msg blah blah
    [mon sep 21 02:35:24 1999] 192.1.1.1
    [mon sep 21 02:35:24 1999] 1.1.1.1
    [mon sep 21 02:35:24 1999] 10.1.1.9
    [mon sep 21 02:35:24 1999] 192.1.1.1
    [mon sep 21 02:35:24 1999] 10.1.1.5
    [mon sep 21 02:35:24 1999] 10.1.1.9
    [mon sep 21 02:35:24 1999] 192.1.1.1

    $ test.pl
    1.1.1.1: 1
    192.1.1.1: 3
    10.1.1.9: 2
    10.1.1.5: 1

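A few things to be careful of:

my @array = <FH>; will pull the entire file into memory, which isn't a great idea. This matters especially for log files, since they can grow pretty large, even more so if they are not rotated properly. Reading the file with for or foreach will have the same problem. while is the best practice for reading a file.

You should get in the habit of using the 3-arg, lexically scoped open, as in my example above.

Your die statement shouldn't be so "precise". See my message for die. The reason for the failure could be permissions, the file not existing, the file being locked, etc., and $! will report which one it was.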
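
The three points above can be sketched as one small routine. This is only an illustration: the helper name count_ips and the idea of passing the log path on the command line are assumptions made for the example, not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each IPv4-looking string appears in a log file.
# count_ips() is a hypothetical helper name used only for this sketch.
sub count_ips {
    my ($path) = @_;
    my %seen;

    # 3-arg open with a lexical filehandle. The message names the file and
    # leaves it to $! to report the actual reason (permissions, missing file, ...).
    open(my $fh, '<', $path) or die "Unable to open $path: $!";

    # while reads one line at a time, so memory use stays flat
    # no matter how large the log has grown.
    while (my $line = <$fh>) {
        chomp $line;
        $seen{$1}++ if $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
    }
    close $fh;

    return \%seen;
}

# Usage: perl count_ips.pl web.log
if (@ARGV) {
    my $seen = count_ips($ARGV[0]);
    print "$_: $seen->{$_}\n" for sort keys %$seen;
}
```

Returning a hash reference keeps the counting separate from the printing, so the same routine can feed whatever report is needed later.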
Update

This will work for dates.

    my $line = '[09\mar\2009:05:30:23]: plus message';

    # example: [09\mar\2009:05:30:23]
    if( $line =~ /(\[[\d]{2}\\.*\\[\d]{4}:[\d]{2}:[\d]{2}:[\d]{2}\])/ ){
        print "$line matched: $1\n";
    }

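
The pattern above targets the backslash-separated example from the question's code comment. The sample access-log line in the question uses forward slashes and a timezone offset instead ([10/mar/2004:00:57:48 -0500]), so something along these lines may be closer to what the real file needs. This is a sketch under that assumption; parse_timestamp is a made-up helper name, and the field layout is taken only from the one sample line shown in the question.

```perl
use strict;
use warnings;

# Pull the timestamp fields out of one access-log line.
# Returns an empty list when the line has no bracketed timestamp.
# parse_timestamp() is a hypothetical name used only for this sketch.
sub parse_timestamp {
    my ($line) = @_;
    if ($line =~ m{\[(\d{2})/(\w{3})/(\d{4}):(\d{2}:\d{2}:\d{2}) ([+-]\d{4})\]}) {
        return (day => $1, month => $2, year => $3, time => $4, zone => $5);
    }
    return;
}

# Shortened copy of the sample line from the question.
my $line = '24.235.131.196 - - [10/mar/2004:00:57:48 -0500] "get http://www.google.com/iframe.php http/1.0" 500 414';

my %ts = parse_timestamp($line);
print "$ts{day} $ts{month} $ts{year} at $ts{time} ($ts{zone})\n" if %ts;
```

Using m{...} as the delimiter avoids having to escape every forward slash in the date.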
Update 2

There are a few things you've done wrong.

I don't see you storing anything in a date hash.

    print "\n\n $line matched : $_\n";

That should look like your seen hash, although as it stands it doesn't make much sense. What are you trying to have stored with the date data?

    $data{$1} = "some value, you";

You cannot loop over two hashes in one for loop, so this is a syntax error:

    for my $foo (keys %h)(keys %h2) {
        # do stuff
    }

And for the last sorting bit, you should just sort the keys:

    for my $key (sort keys %seen ) {
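
Putting these points together, here is a minimal sketch that keeps one hash per kind of data, walks each hash in its own loop, and sorts the keys before printing. The numeric IP sort goes beyond the answer above: it is one possible way to address the "not numeric" errors mentioned in the question, by comparing the four octets packed into bytes. The names sort_ips and %dates and the three sample lines are invented for the example, and the sort assumes every octet is in the range 0 to 255.

```perl
use strict;
use warnings;

# Sort dotted-quad strings numerically by octet. Using <=> directly on
# "10.1.1.9" warns that the value isn't numeric, so this compares packed
# 4-byte strings instead. sort_ips() is a hypothetical helper name.
sub sort_ips {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ pack('C4', split /\./, $_), $_ ] } @_;
}

# Invented sample lines in the same shape as the question's log line.
my @lines = (
    '192.1.1.1 - - [10/Mar/2004:00:57:48 -0500] "GET / HTTP/1.0" 200 414',
    '10.1.1.9 - - [10/Mar/2004:01:02:03 -0500] "GET / HTTP/1.0" 200 414',
    '192.1.1.1 - - [11/Mar/2004:09:00:00 -0500] "GET / HTTP/1.0" 200 414',
);

# One hash per kind of data.
my (%seen, %dates);
for my $line (@lines) {
    $seen{$1}++  if $line =~ /^(\d{1,3}(?:\.\d{1,3}){3})/;
    $dates{$1}++ if $line =~ m{\[(\d{2}/\w{3}/\d{4}):};
}

# Two hashes, two loops.
for my $ip (sort_ips(keys %seen)) {
    print "$ip: $seen{$ip}\n";
}
for my $date (sort keys %dates) {
    print "$date: $dates{$date}\n";
}
```

With a plain sort keys %seen, 192.1.1.1 would not sort after 10.1.1.9 reliably for all addresses (for example, 9.0.0.1 sorts after 10.1.1.9 as a string), which is why the packed comparison is used for the IPs. The dates are still sorted as plain strings here, so they come out in day-of-month order rather than true chronological order across months.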