Archive for May, 2016

Tail error logs to slack for fun and profit

12 May

We love Slack, so we thought: wouldn’t it be great if our error logs posted to an #errors channel in Slack? This is obviously a very bad idea if you have noisy error logs, but we try to keep the chatter in our logs down so that people will actually pay attention to the errors, and it’s working out nicely. It was a little complicated to get set up, though, so I thought I’d share.

First we installed and set up slackcat on our servers, following its installation instructions. It’s pretty simple and straightforward to get going, which is awesome!

Then we created two files on each server we wanted to monitor:

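trap "pkill -TERM -g $$; exit" INT TERM EXIT
while true; do
    # start tailtoslack in the background and remember its PID
    /path/to/ &
    PID=$!
    # block until a new file appears in the log directory
    inotifywait -e create /var/log/xxx/
    # stop the old tailtoslack and its children, then loop to start a new one
    pkill -TERM -P $PID
    kill $PID
done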

while true; do
    FILETOWATCH=$(ls -t /var/log/xxx/*.log | head -1)
    tail $FILETOWATCH -f -n1 | grep -v "DEBUG:\|^$" --color=never --line-buffered | /root/slackcat
    sleep 31
done

So let’s look at what this is actually doing line by line:
trap "pkill -TERM -g $$; exit" INT TERM EXIT
This says: if we receive INT or TERM, or when the script exits (EXIT is a bash pseudo-signal rather than a real signal), execute “pkill -TERM -g $$; exit”. Ok, well what does THAT mean? It uses the pkill command to send the TERM signal to all processes in the process group $$, and $$ means “my PID”. (This relies on the script being its own process group leader, so that its PID is also its process group ID.) So essentially, we send TERM to all the processes that we spawned, so if we get killed, we don’t leave orphaned children lying around.
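
To see the trap idea in isolation, here’s a minimal sketch. It’s a simplification and not our actual script: it remembers a single child’s PID and kills that, instead of using pkill on the whole process group, and the sleep is just a stand-in for a long-running child.

```shell
#!/bin/bash
# Minimal sketch: a script that cleans up its background child when it exits.
# Simplification (assumption): we kill one remembered PID rather than the
# whole process group with pkill -g, and "sleep 300" stands in for real work.

run_parent() {
    bash -c '
        sleep 300 >/dev/null 2>&1 &
        child=$!
        # Double quotes: $child is expanded now, when the trap is defined.
        trap "kill $child 2>/dev/null; wait $child 2>/dev/null; exit" INT TERM EXIT
        echo "$child"    # report the child PID, then fall off the end
    '
}

child_pid=$(run_parent)

# The parent has exited, so its EXIT trap should already have killed the child.
if kill -0 "$child_pid" 2>/dev/null; then
    echo "child $child_pid is still running"
else
    echo "child $child_pid was cleaned up"
fi
```

Without the trap line, the sleep would keep running after the parent had gone away, which is exactly the leftover we’re trying to avoid.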

while true; do
pretty self-explanatory: all the code inside this block, we will try to do forever until we are killed

/path/to/ &
this spawns tailtoslack and runs it as a background process. tailtoslack needs to basically always run forever and we don’t want it to block what we’re about to do next

PID=$!
this grabs the pid of tailtoslack ($! is the PID of the most recent background command) and stores it in a variable called PID

inotifywait -e create /var/log/xxx/
this utilizes inotifywait to watch the /var/log/xxx/ directory for new files added. Our log files are written in the format of /var/log/xxx/yyyymmdd.log so when the day changes, a new file is written and that means we want to start tailing that log file and ignore the old one. inotifywait is blocking, so our code will sit here waiting for the new log file to be written, doing nothing until that moment.

pkill -TERM -P $PID
if we got here that means we have a new file, so we want to kill the old tailtoslack and start the process over again. so this line sends the TERM signal to all the processes whose parent is the old tailtoslack (its tail, grep and slackcat children)

kill $PID
now we send TERM (the default signal for kill) to tailtoslack itself as well.
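
Here’s a small sketch of that two-step kill that you can run on its own. The bash -c parent with two sleep children is a hypothetical stand-in for tailtoslack and its pipeline, and pgrep is only there to count children for the demo.

```shell
#!/bin/bash
# Sketch of the "kill the children, then the parent" step.
# Assumption: the bash -c below is a stand-in for tailtoslack; its two
# sleeps play the role of the tail | grep | slackcat pipeline.

bash -c 'sleep 300 & sleep 300 & wait' >/dev/null 2>&1 &
PID=$!      # PID of the most recent background command
sleep 1     # give the stand-in a moment to start its children

children_before=$(pgrep -P "$PID" | wc -l)

pkill -TERM -P "$PID"               # TERM every process whose parent is $PID
kill "$PID" 2>/dev/null || true     # TERM the parent too (it may already be gone)
wait "$PID" 2>/dev/null || true     # reap it so it doesn't linger as a zombie

children_after=$(pgrep -P "$PID" | wc -l)
echo "children before: $children_before, after: $children_after"
```

Killing only the parent would leave its children running on their own, which is why the children get their TERM first.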

That’s it! on to the next file:

while true; do
this time we need a while loop because sometimes slackcat will exit (bad response from slack, or if you send it whitespace), but we don’t actually want to give up so we loop

FILETOWATCH=$(ls -t /var/log/xxx/*.log | head -1)
this finds the most recently modified log file: ls -t sorts by modification time, newest first, and “head -1” grabs the first line of output. then we store that in a variable called FILETOWATCH
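
A quick way to convince yourself that this picks the right file is to try it on a throwaway directory. This is just a sketch: the temp directory and the file names are made up for the demo.

```shell
#!/bin/bash
# Sketch: pick the most recently modified *.log file, as tailtoslack does.
# The directory and file names here are made up for the demo.

logdir=$(mktemp -d)
touch -t 201605110000 "$logdir/20160511.log"   # older mtime
touch -t 201605120000 "$logdir/20160512.log"   # newer mtime

# ls -t lists newest first; head -1 keeps only the first line.
FILETOWATCH=$(ls -t "$logdir"/*.log | head -1)
basename "$FILETOWATCH"    # prints 20160512.log

rm -r "$logdir"
```

If the glob matches nothing, FILETOWATCH ends up empty, so a more defensive script might want to check for that before calling tail.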

tail $FILETOWATCH -f -n1 | grep -v "DEBUG:\|^$" --color=never --line-buffered | /root/slackcat
here we tail the file that we just determined was the most recently modified one, then we strip out lines containing DEBUG: (since we don’t care about them) as well as empty lines (since empty lines crash slackcat). we also have to tell grep not to colorize the output (--color=never) and to flush its output after every line (--line-buffered), so that each complete line reaches slackcat right away, since slackcat sends a message per line.
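
To check what the filter actually keeps, here’s a sketch with a few made-up log lines. Note that the pattern as written matches DEBUG: anywhere in a line; the anchored variant shown next to it is an alternative (not what we run) in case you only want to drop lines that begin with DEBUG:.

```shell
#!/bin/bash
# Sketch: what the grep filter keeps and drops. The sample log lines are made up.

sample() {
    printf '%s\n' \
        'ERROR: disk full' \
        'DEBUG: cache miss' \
        '' \
        'WARN: saw DEBUG: in a message'
}

# Pattern from the script: drops any line containing "DEBUG:" and empty lines.
as_written=$(sample | grep -v "DEBUG:\|^$" --color=never --line-buffered)

# Anchored variant: only drops lines that *start* with "DEBUG:".
anchored=$(sample | grep -v "^DEBUG:\|^$" --color=never --line-buffered)

echo "as written:"; echo "$as_written"
echo "anchored:";   echo "$anchored"
```

With the pattern as written only the ERROR line survives; the anchored variant also keeps the WARN line.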

sleep 31
if we get to this line it means something in the tail pipeline on the previous line crashed. we don’t know why, but we’re hoping whatever condition caused the crash will pass soon, so we take a nap before we iterate through the loop again

done
we crashed, we took a nap, time to start over
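
The shape of that loop, stripped down to something you can run in a few seconds. Everything here is a stand-in: flaky_pipeline is a made-up function that returns right away instead of blocking like the real pipeline, and the loop stops after 3 rounds only so the demo ends.

```shell
#!/bin/bash
# Sketch of the "crash, nap, retry" loop. flaky_pipeline is a made-up
# stand-in for the tail | grep | slackcat pipeline: it returns immediately
# instead of blocking, and the loop stops after 3 rounds so the demo ends.

attempts=0

flaky_pipeline() {
    attempts=$((attempts + 1))
    echo "pipeline run $attempts exited"
}

while true; do
    flaky_pipeline
    if [ "$attempts" -ge 3 ]; then
        break           # demo only: the real loop never gives up
    fi
    sleep 1             # the real script naps for 31 seconds here
done

echo "ran $attempts times"
```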

that’s it! I am by no means a bash expert so it’s possible that some of this could be done better, but it works, and has been surprisingly robust!

For bonus points, here’s how I added it to systemd in CentOS 7:

Create file: /etc/systemd/system/logwatch.service :
[Unit]
Description=Watch web log files and pipe to slack

[Service]
ExecStart=/usr/bin/bash -c '/path/to/'


Then you just need to run:
systemctl start logwatch

and once everything looks good:
systemctl enable logwatch
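
One thing to keep in mind: systemctl enable works from the unit’s [Install] section, so the unit file needs one for the enable step to do anything. Here’s a sketch that writes out a complete unit file from the shell. The watcher path and the WantedBy=multi-user.target line are assumptions for illustration, not necessarily what we run, and the file is written to a temp directory so it’s safe to try.

```shell
#!/bin/bash
# Sketch: write a complete unit file, including an [Install] section.
# Assumptions: the watcher path and WantedBy=multi-user.target are
# placeholders, and the file goes to a temp dir so this is safe to try.

watcher="/path/to/watcher-script"     # hypothetical path, replace with yours
unit_file="$(mktemp -d)/logwatch.service"

cat > "$unit_file" <<EOF
[Unit]
Description=Watch web log files and pipe to slack

[Service]
ExecStart=/usr/bin/bash -c '$watcher'

[Install]
WantedBy=multi-user.target
EOF

cat "$unit_file"
# To install for real: copy it to /etc/systemd/system/logwatch.service
```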

