13

I need to monitor some large, noisy log files (500m/day) from a Java application (log4j). Right now I manually look at the files and grep for "ERROR" and so on. However, it should be possible for a tool to spot repeating patterns in the file, count them, and provide drill-down into the details of individual entries. Does anyone know of such a tool? A text or web-based UI would be nice.
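To illustrate, here is roughly what I could script myself if nothing exists (a minimal Python sketch; the masking rules are just an example of how variable fields might be collapsed):

    import re
    import sys
    from collections import Counter

    def normalize(line):
        """Collapse fields that usually vary so similar entries share one pattern."""
        line = re.sub(r'0x[0-9a-fA-F]+', '<HEX>', line)  # hex ids first
        line = re.sub(r'\d+', '<NUM>', line)             # numbers, timestamps
        return line.strip()

    # Count each normalized pattern and print the 20 most frequent.
    counts = Counter(normalize(line) for line in sys.stdin)
    for pattern, n in counts.most_common(20):
        print(f'{n:8d}  {pattern}')

Something like `python3 patterns.py < app.log` gives the counts, but I would still have to build the drill-down myself, which is why I'm hoping a tool already exists.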

  • 2
    To me this question absolutely screams perl. – John Gardeniers Dec 19 '11 at 09:38
  • Hmm, it's starting to look like I will have to write a bash script with lots of greps. I was hoping to have something figure out the patterns automatically. – David Tinker Dec 22 '11 at 06:43
  • seriously, this is exactly what perl was created for. You can write a self-learning script for those patterns, although that's obviously out of scope here. – John Gardeniers Dec 22 '11 at 20:43
  • http://stackoverflow.com/questions/2590251/is-there-a-log-file-analyzer-for-log4j-files has a solution called Chainsaw. – John K. N. Dec 14 '16 at 11:53
  • https://www.datadoghq.com/blog/log-patterns/ <-- highly recommend, but while not crazy expensive it's not super cheap either. – neoakris Jul 28 '19 at 21:03

9 Answers

6

Splunk works wonders for this sort of stuff. I use it internally to gather all the logs and do quick searches via its excellent browser-based interface.

3

syslog-ng has a feature named patterndb. You can define patterns and match log entries against them in real time, then send matching entries to separate log files.
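To give a feel for it, a pattern rule file looks roughly like this (a sketch from memory; the names, IDs and the example pattern are made up, so check the syslog-ng patterndb documentation for the exact schema of your version):

    <patterndb version="4" pub_date="2011-12-19">
      <ruleset name="myapp" id="ruleset-1">
        <rules>
          <rule provider="local" id="rule-1" class="application">
            <patterns>
              <!-- @ESTRING:...@ and @NUMBER:...@ are patterndb parsers
                   that match the variable parts of a message -->
              <pattern>Request @ESTRING:request_id: @failed with code @NUMBER:code@</pattern>
            </patterns>
          </rule>
        </rules>
      </ruleset>
    </patterndb>

Each matched entry then carries the extracted fields (request_id and code here), which you can use in filters to route it to its own log file.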

Stone
3

I've heard of people applying Bayesian filtering to log files to spot interesting entries among the routine ones. They used spam filters, where routine, uninteresting entries were considered "good" while the unusual ones were considered "spam", and using that classification they were able to sift through the noise.

It sounds a lot like machine learning stuff to me, but then again I've not seen it in action, only heard of it over beers.
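If you want to experiment with the idea, a toy version is not much code. A minimal sketch in Python; routine.log and interesting.log are hypothetical stand-ins for entries you have already triaged by hand:

    import math
    import re
    import sys
    from collections import Counter

    def tokens(line):
        return re.findall(r'[a-z]+', line.lower())

    # Word frequencies from hand-labelled samples: routine "good"
    # entries versus unusual "spam" entries.
    good = Counter(t for line in open('routine.log') for t in tokens(line))
    spam = Counter(t for line in open('interesting.log') for t in tokens(line))
    good_total, spam_total = sum(good.values()), sum(spam.values())

    def score(line):
        """Log-odds that a line is unusual, with +1 smoothing."""
        s = 0.0
        for t in tokens(line):
            p_spam = (spam[t] + 1) / (spam_total + 2)
            p_good = (good[t] + 1) / (good_total + 2)
            s += math.log(p_spam / p_good)
        return s

    for line in sys.stdin:
        if score(line) > 0:   # leans "unusual": worth a human look
            sys.stdout.write(line)

No idea how well this holds up at real log volume, but it matches what I heard described.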

adamo
2

While looking into syslog-ng and patterndb (+1 to that answer, above), I encountered a web-based tool called ELSA: http://code.google.com/p/enterprise-log-search-and-archive/. It's F/OSS written in Perl, with a web interface, and is supposed to be really fast.

I haven't tried it yet, but once I'm done filtering using patterndb, I'll be trying ELSA.

EdwardTeach
1

Try out petit.
I'm not sure if it will work with the log4j format, but you might be able to write a custom filter for that.
Petit has no web interface; it displays graphs right in your shell (ASCII art ftw!).
It's very useful for quickly spotting repeating messages and figuring out when they happened or started to happen more frequently.
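For example (from memory, so double-check against petit --help):

    petit --hash /var/log/app.log     # group similar lines and count them
    petit --hgraph /var/log/app.log   # ASCII histogram of activity per hour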

faker
0

If you are using Debian Squeeze on your server, have a look at log2mail: http://packages.debian.org/squeeze/log2mail

ThorstenS
0

Glogg is a very good log explorer: you can create filters based on strings, colorize the matching lines, and retrieve all occurrences of a string.

http://glogg.bonnefon.org/

0

Splunk is usually a good solution for this, but you mentioned that it is too expensive for you, so I recommend looking at Logstash or Graylog.

Raffael Luthiger
-1

You can try SEQREL's LogXtender, which automatically detects patterns and aggregates similar logs. It does this by creating regular expressions on the fly and using the cached regexes to match subsequent logs. With additional taxonomy detection, more granularity can be added. A free version can be downloaded from https://try.logxtender.net.
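To make that concrete, here is a toy illustration of the general idea in Python (not LogXtender's actual implementation; the file name and the single masking rule are made up):

    import re
    from collections import Counter

    def generalize(line):
        """Derive a regex from a concrete line by widening fields that vary."""
        rx = re.escape(line)
        rx = re.sub(r'\d+', r'\\d+', rx)  # digit runs become \d+
        return re.compile(rx)

    cache = []          # regexes learned so far
    counts = Counter()

    for raw in open('app.log'):
        line = raw.strip()
        for rx in cache:
            if rx.fullmatch(line):        # a cached pattern covers this line
                counts[rx.pattern] += 1
                break
        else:                             # nothing matched: learn a new pattern
            rx = generalize(line)
            cache.append(rx)
            counts[rx.pattern] += 1

    for pattern, n in counts.most_common(10):
        print(n, pattern)

The real product presumably does far more (smarter field detection, the taxonomy mentioned above), but the cached-regex matching is the core trick.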

Mihnea