Rsyslog: Testing replace() and re_extract()

Please see Learning rsyslog for the introduction and index to this series of blog posts about rsyslog.

In my previous article on Rsyslog (Rainerscript vs. Old-Style/Legacy Configuration) I mentioned the excellent introductory article on Rsyslog by Selivan. This is required reading: go read it now. But I found myself stumbling on the Rsyslog local variables and set commands, particularly those using re_extract(). I know basic Regular Expressions, but there are about a dozen implementations (each of which behaves differently), and I needed a way to test what this one was doing. So I wrote this (initially testing replace()):

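# /etc/rsyslog.d/93-filtertest.conf

# Log line format: the original message followed by the filtered one.
template(
    name = "filtertemplate"
    type = "string"
    string = "<%PRI%>%timegenerated% %HOSTNAME% ORIGINAL MESSAGE: %msg% FILTERED MESSAGE: %$.tmp%\n"
)

ruleset(name="filtertest") {
    # The function under test: its result goes into the local variable $.tmp.
    set $.tmp = replace($msg, "...", "___");
    action(
        type = "omfile"
        dirCreateMode = "0700"
        FileCreateMode = "0644"
        Template = "filtertemplate"
        File = "/var/log/filtertest.log"
    )
}

# Only messages whose tag contains "filter" are sent to the test ruleset.
if ( $syslogtag contains "filter" )
then {
    call filtertest
}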

This won't affect any other logging on your machine - unless you use syslog Tags that include the word "filter," in which case you've already done some customisation and should understand the implications of installing this.

On Debian (at least 9 and 10, probably earlier) and Fedora (32 and 33, probably earlier) systems, run rsyslogd -N 2 to test that you've typed that in right. Then restart Rsyslog with my preferred magic formula: systemctl restart rsyslog ; systemctl status rsyslog | cat. Once that's done, you can use logger to test:

$ logger -t filter "Test $(date +%Y-%m-%d.%H%M.%S) ... .... ..... ...... ."
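When you're changing the filter line over and over, it helps to send a test message and read back the newest log line in one step. This is a minimal sketch, not part of the original setup: the function name send_filter_test is my own invention, the log path is the one from the config above, and the one-second pause is a guess at how long Rsyslog needs to write the line.

```shell
# Hypothetical helper: send one tagged test message, then show the most
# recent line of the test log. Assumes the filtertest config is installed
# and that /var/log/filtertest.log is readable by the current user.
send_filter_test() {
    logger -t filter "$1"
    sleep 1    # assumed to be long enough for rsyslog to write the entry
    tail -n 1 /var/log/filtertest.log
}
```

Call it with the message you want to push through the filter, for example send_filter_test "Test ... .... ..... ...... ."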

What the config does is to look for log messages that are tagged "filter", and send any it finds to the "filtertest" ruleset. This sends those messages to output file /var/log/filtertest.log. While doing so it creates a local variable $.tmp (this is one of many things Selivan taught me: variables start with a dollar sign, but local variables have a leading dot) and uses a template for the message. What is finally logged is both the original message you sent and the result of filtering that message:

ORIGINAL MESSAGE:  Test 2020-11-26.1234.10 ... .... ..... ...... . FILTERED MESSAGE:  Test 2020-11-26.1234.10 ___ ___. ___.. ______ .
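That output suggests replace() treats its search string as literal text, not as a pattern, and replaces non-overlapping occurrences from left to right. As a cross-check outside Rsyslog, the same substitution can be sketched with sed. This is only an analogy (sed is not Rsyslog), and the function name replace_dots is made up for the example.

```shell
# Analogy only: sed, not rsyslog's replace(). The dots are escaped so sed
# matches three literal dots instead of "any three characters"; the g flag
# replaces every non-overlapping occurrence, scanning left to right.
replace_dots() {
    sed 's/\.\.\./___/g'
}

printf '%s\n' 'Test ... .... ..... ...... .' | replace_dots
# prints: Test ___ ___. ___.. ______ .
```

Leave the backslashes out and sed turns almost the whole message into underscores, which is exactly the kind of difference between implementations that makes a test rig worthwhile.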

And so we begin to see how replace() works. I initially mentioned re_extract, and that was my next target. I changed the replace() line:

set $.tmp = re_extract($msg, "(.*)/([^/]*)", 0, 2, "NOMATCH")

If you've read the Selivan article (did I mention? You should read it!), you'll recognize this as their method of munging file paths:

ORIGINAL MESSAGE:  /var/log/messages FILTERED MESSAGE: messages
ORIGINAL MESSAGE:  /var/log/nginx/access.log FILTERED MESSAGE: access.log

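I admit I wasn't sure: I thought the second path would hold onto the directory name "nginx/" and return "nginx/access.log" as the final answer, but I was reading that wrong. The second group, ([^/]*), cannot contain a slash, and the match runs as far to the right as it can, so the only text that group can capture is whatever follows the last slash. And this is why I needed this test.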
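The same pattern can be tried from the command line before it goes anywhere near a config file. This is a rough sketch, assuming re_extract() uses POSIX extended regular expressions, which is also what sed -E uses; it is sed standing in for Rsyslog, not Rsyslog itself, and the function name last_component is made up. The sed version swaps the whole match for the second group, where re_extract() hands back the requested submatch.

```shell
# Analogy only: sed -E (POSIX extended regex), not rsyslog's re_extract().
# The whole match is replaced by group 2. Since [^/]* cannot contain a slash
# and the match extends to the end of the line, group 2 is the text after
# the last slash. Input without any slash is printed unchanged.
last_component() {
    printf '%s\n' "$1" | sed -E 's#(.*)/([^/]*)#\2#'
}

last_component /var/log/messages            # prints: messages
last_component /var/log/nginx/access.log    # prints: access.log
```

There is no NOMATCH fallback here: that default value is something re_extract() provides and sed does not.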