Welcome to It-Slav.Net blog
Peter Andersson
peter@it-slav.net

10 Dec

And so it has come. The day when Merlin outgrows its infancy is at hand.

The rite of passage into adulthood was smoother than expected, but not
without minor bumps. As with people, those bumps made the code stronger.

Testing would be most welcome.

Noteworthy bugfixes in 1.0.0 vs 0.9.0:
The protocol has been changed so that Merlin now adds a signature at
the start of each packet. If this signature isn’t found, the offending
node is disconnected. This will restore synchronization in case the
stream ever gets garbled. It also means that older Merlin daemons (0.9.x)
won’t be able to talk to newer ones (>= 1.0).
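
To illustrate the idea of a per-packet signature, here is a minimal
sketch in Python. It is not Merlin's actual wire format (Merlin itself
is written in C): the signature bytes, the header layout and the names
SIGNATURE, frame_packet and parse_packet are all assumptions made up for
this example.

```python
import struct

# Hypothetical values for illustration only; not Merlin's real wire format.
SIGNATURE = b"MRLNPKT1"
HEADER = struct.Struct("!8sI")  # 8-byte signature, 4-byte payload length


class BadSignature(Exception):
    """Raised when a packet does not start with the expected signature."""


def frame_packet(payload: bytes) -> bytes:
    """Prefix the payload with the signature and the payload length."""
    return HEADER.pack(SIGNATURE, len(payload)) + payload


def parse_packet(data: bytes) -> bytes:
    """Return the payload, or raise BadSignature so the caller can disconnect."""
    if len(data) < HEADER.size:
        raise ValueError("short header")
    signature, length = HEADER.unpack_from(data)
    if signature != SIGNATURE:
        # The stream is out of sync; the receiver should drop the connection.
        raise BadSignature("signature not found at start of packet")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload
```

A receiver built like this would catch BadSignature, close the
connection to the offending node and let it reconnect, which is what
brings the two ends back into sync.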

A serious issue has been found and fixed in the event-reading code,
where a node could crash after running for some time. The crash only
happened if the read-buffer got filled twice in a row and a packet was
broken in two each time the buffer got filled. Subsequent reads would
then overwrite other segments of memory, causing random crashes later
on (or sometimes none at all, depending on the load-order of
configuration items; yes, it was that weird). This issue has been rather
rare. In fact, only one customer that I know of has seen it, and even
they only occasionally (although still unacceptably often, at a few
times per week). The symptoms include daemon and module crashing a few
seconds apart at most, and usually within the same second.
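
The situation that triggered the crash, a packet being split across
reads several times in a row, can be sketched in Python as follows.
This is not the fixed Merlin code (which is C, where the bookkeeping
error led to memory being overwritten); it only shows the kind of
reassembly logic a reader needs. The 4-byte length prefix and the
PacketReader class are assumptions for the example.

```python
import struct

LENGTH = struct.Struct("!I")  # hypothetical 4-byte length prefix


class PacketReader:
    """Reassembles length-prefixed packets from arbitrarily sized reads."""

    def __init__(self):
        self._pending = bytearray()  # bytes of a packet that is not complete yet

    def feed(self, chunk: bytes) -> list:
        """Add one read's worth of bytes and return every complete packet."""
        self._pending += chunk
        packets = []
        while len(self._pending) >= LENGTH.size:
            (size,) = LENGTH.unpack_from(self._pending)
            end = LENGTH.size + size
            if len(self._pending) < end:
                break  # packet is split; keep the partial data for the next read
            packets.append(bytes(self._pending[LENGTH.size:end]))
            del self._pending[:end]  # drop only the bytes that were consumed
        return packets
```

Feeding a stream to feed() in chunks that never line up with packet
boundaries exercises the case described above: every read ends in the
middle of a packet, and the leftover bytes have to be carried over
correctly each time.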

An issue in the module has been corrected. This issue could sometimes
cause Nagios to hang indefinitely when exiting or reloading, and could
at other times lead to Nagios crashing on reload or exit. It appears
this only happened upon a hard restart (in which case everything worked,
although the exit-procedure resulted in a segfault rather than a clean
exit), or when the RESTART_PROGRAM command was submitted to the command
pipe, in which case the restart never completed because Nagios either
stalled or crashed.

Apart from those two issues, no flaws have been found in either module
or daemon during the month that Merlin 0.9.0 has been running at most
of our customers' sites and for a lot of users worldwide.

Some minor issues have also been fixed, such as:
import and showlog have been fixed to be sturdier when looking at
broken or borked logs.

A hexlog function has been added (although it is disabled by default).
Users having trouble on systems where I can’t test are welcome to enable
hexlog debugging of inbound (and outbound) events.

Excessive logging for disconnected or unreachable nodes has been fixed.

The binary backlog API has been fortified. There were no issues in it,
although I thought there were, so several checks now ensure that it will
always remain in good working order.

Several thousand new tests have been added to ensure receiver stability
when facing broken packets.

Some minor memory leaks have been fixed. They were not repetitive and
therefore never endangered the runtime stability, and hence they go
under the ‘minor issue’ list.
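
For readers who have not used this kind of debugging aid, the hexlog
function mentioned above dumps raw event bytes so that they can be
inspected by hand. A minimal Python sketch of such a dump is shown
below; the output layout and the function signature are assumptions for
this example and do not reflect Merlin's own hexlog output.

```python
def hexlog(data: bytes, width: int = 16) -> list:
    """Format data as rows of offset, hex bytes and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in row)
        # Non-printable bytes are shown as dots in the text column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        # Pad the hex column so the text column lines up on short rows.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return lines
```

Printing the returned lines for both inbound and outbound buffers makes
it possible to compare what one node sent with what the other received.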

Features added in 1.0.0:
One can now set "takeover = no" for a poller node and have the master
node ignore taking over checks for that node.

There are now more indexes on the database, which will significantly
improve Ninja’s performance. Some query analysis has been done to
verify this and to come up with the indexes now added.

Special thanks to the fine folk (yes, it’s autogenerated) for helping
us bring Merlin 1.0.0 out the door by testing, reporting bugs and
submitting patches:
Andreas Ericsson <ae@op5.se>
Ken Menzel <Ken.Menzel@fisglobal.com>
Martin Kamijo <mk@op5.com>
Mattias Bergsten <mattias@westbahr.com>
Ola Sandström <ola.sandstrom@WSPGroup.se>
Per Asberg <perasb@op5.se>
Robin Sonefors <robin.sonefors@op5.com>
Stephan Beal <sgbeal@googlemail.com>

For the full list of contributors, clone the git repository and run
  make thanks

— Andreas Ericsson

