rsyslog: Choosing Your Remote Protocol

Please see Learning rsyslog for the introduction and index to this series of blog posts about rsyslog.

To get log messages from the transmitting server to the receiving log server, rsyslog has to use a protocol of some kind. The original default is UDP. UDP is fast and unreliable, the poster child for fire-and-forget. It sends its message and immediately forgets anything ever happened. Does it arrive? Who cares?
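To make "fire-and-forget" concrete, here is a minimal Python sketch (not rsyslog itself). It sends a UDP datagram to a local port where nothing is listening. The function name and the message are made up for illustration, and the sketch assumes the loopback interface is available.

```python
import socket


def fire_and_forget(message: bytes) -> int:
    """Send one UDP datagram to a local port that nobody is listening on."""
    # Reserve a free port, then close the socket so nothing is listening there.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    dead_port = probe.getsockname()[1]
    probe.close()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # sendto() returns the number of bytes handed to the network stack.
        # It says nothing about whether anyone received them.
        return sender.sendto(message, ("127.0.0.1", dead_port))
    finally:
        sender.close()


if __name__ == "__main__":
    sent = fire_and_forget(b"demo: hello from UDP")
    print(f"handed {sent} bytes to the network, no error reported")
```

The send succeeds even though there is no receiver at all, which is exactly the behaviour described above.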

Your next choice is TCP: TCP numbers the data it sends, the receiver acknowledges what it has received, and anything left unacknowledged is sent again. This is reliable - but not 100%. Edge cases occur when either software or a server crashes and the connection gets disrupted, meaning messages can get lost.
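One way to see where such an edge case can come from is the Python sketch below (again, not rsyslog code). It assumes a "server" that is listening but whose application never accepts the connection or reads from it, standing in for a hung or crashing log receiver. The function name is hypothetical.

```python
import socket


def send_to_unresponsive_server(message: bytes) -> bool:
    """Return True if sendall() succeeded although the server never read anything."""
    # A listening socket whose application never calls accept() or recv():
    # a stand-in for a log server process that has hung.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    try:
        # The operating system completes the handshake on the server's behalf.
        client.connect(server.getsockname())
        # sendall() returns once the bytes are in the local send buffer,
        # not once the remote application has processed them.
        client.sendall(message)
        return True
    finally:
        client.close()
        # Closing without ever reading discards the message.
        server.close()


if __name__ == "__main__":
    print(send_to_unresponsive_server(b"demo: hello from TCP"))
```

From the sender's point of view the message went out without error, yet the receiving application never saw it. TCP guarantees delivery between the two network stacks, not between the two programs.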

With this in mind, the author of rsyslog, Rainer Gerhards, designed a new protocol called RELP ("Reliable Event Logging Protocol") that uses TCP for transmission, but always confirms message reception at the application level. RELP was intended for rsyslog, but has found success in other places as well: apparently it's a good design. The quote I found most enlightening is from Wikipedia: "[RELP] is most often used in environments which do not tolerate message loss, such as the financial industry." And it's on this statement that I build a flimsy tower of assumptions.
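The general idea of confirming reception can be sketched as a toy model in Python. This is not the RELP wire format or any real RELP library: the class, its method names and the numbering scheme are all invented for illustration. The only point it makes is that the sender keeps each message until the receiver has explicitly confirmed it.

```python
from typing import Dict, List


class AckTrackingSender:
    """Toy sender that keeps every message until the receiver confirms it."""

    def __init__(self) -> None:
        self._next_id = 1
        self._pending: Dict[int, str] = {}

    def send(self, message: str) -> int:
        """Record a message as in flight and return its transaction id."""
        txn_id = self._next_id
        self._next_id += 1
        self._pending[txn_id] = message
        return txn_id

    def acknowledge(self, txn_id: int) -> None:
        """The receiver confirmed this id, so the message can be forgotten."""
        self._pending.pop(txn_id, None)

    def unconfirmed(self) -> List[str]:
        """Messages to resend after a reconnect, oldest first."""
        return [self._pending[i] for i in sorted(self._pending)]


sender = AckTrackingSender()
first = sender.send("login ok")
second = sender.send("payment accepted")
sender.acknowledge(first)
# Suppose the connection drops before the second confirmation arrives:
# the sender still knows exactly what it has to send again.
print(sender.unconfirmed())
```

With plain TCP the application has no such list: once the data is handed to the socket, it has no record of what the other side actually processed.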

UDP's unreliability is one reason to switch to TCP. Another reason is that you need TCP if you want your log messages encrypted in flight (recommended!). rsyslog supports TLS, which runs over TCP, but doesn't appear to support DTLS, its UDP counterpart, so TCP is needed.

TCP is very reliable: it takes a fairly severe disruption to cause data loss. Server crashes are the likeliest case - and at that point, are those last couple of log messages what's most important to you? Probably you just want to get things running again, and if you've got four nines worth of the logs that's good enough. But some people may see those last couple of log messages as critically important because they may indicate the cause of the crash. I would argue that in that case, you should log into the machine that crashed and examine the logs there anyway (most rsyslog installations that log remotely also log locally - you can change that, but it would be a bad idea in my opinion). Again, the exception is the financial industry: when you're logging financial transactions, dropping even one isn't acceptable.
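To put "four nines" into numbers, here is a small worked example in Python. The daily volume is a made-up figure, and 99.99% is just the figure of speech used above, not a measured delivery rate for TCP.

```python
def expected_losses(messages_sent: int, delivery_rate: float) -> float:
    """Expected number of lost messages at a given delivery rate."""
    return messages_sent * (1.0 - delivery_rate)


# Hypothetical volume: one million messages a day at "four nines" (99.99%).
lost_per_day = round(expected_losses(1_000_000, 0.9999))
print(f"about {lost_per_day} of 1,000,000 messages lost")
```

Whether a loss on that scale matters depends entirely on what the messages are: acceptable for routine system logs, not for a record of financial transactions.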

I was planning on changing from TCP to RELP after I'd got the rest of my rsyslog infrastructure working, but as I worked through the rest of the setup, I had time to think about just how infrequently I was going to lose messages using TCP. It looks like RELP is easy to set up with rsyslog, but its use must impose some overhead, some increased load on both the transmitting and receiving servers (I don't know how much and haven't attempted to research that). I already have a working TCP configuration, and until I see a strong reason to implement RELP I'm considering its use premature optimization and will stick with plain TCP.
