RRD writing performance improvement

Addons for Cacti and discussion about those addons

kazuya
Posts: 2
Joined: Mon Oct 11, 2010 8:55 am

RRD writing performance improvement

#1 Post by kazuya » Mon Oct 11, 2010 9:06 am

Hi,

I have noticed that with my setup, where I store RRD files on an NFS mount point, polling performance was very bad.

After some investigation I found that SNMP/script polling itself performed very well (within 10s), but rrdtool would take a very long time (thousands of seconds) to finish writing the data.

The problem was that rrdtool writes one RRD file at a time, and opening files over NFS is slow. With about 7k RRD files this does not scale.

I solved the problem by parallelizing the RRD file writes with an rrdtool wrapper script:

Code:

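#!/usr/bin/python
# Drop-in rrdtool wrapper: fans "update" commands out to several rrdtool
# processes so that slow file opens (e.g. over NFS) overlap instead of
# happening one after another.

import os
import subprocess
import sys
import zlib

rrd = "/usr/bin/rrdtool"
process_no = 16

processes = []

def spawn():
    # Each worker is an rrdtool in pipe mode ("-"), reading commands on stdin.
    processes.append(subprocess.Popen([rrd, "-"], stdin=subprocess.PIPE))

def pipe(process, data):
    # Flush after every command so nothing sits in a buffer.
    processes[process].stdin.write(data)
    processes[process].stdin.flush()

if len(sys.argv) > 1 and sys.argv[1] == "-":
    for _ in range(process_no):
        spawn()
    # Read raw bytes (works on Python 2 and 3) so each line can be hashed
    # and forwarded unchanged.
    stdin = getattr(sys.stdin, "buffer", sys.stdin)
    while True:
        line = stdin.readline()
        if not line:
            break
        args = line.lstrip().split(b" ")

        if len(args) < 2 or args[0] != b"update":
            # Everything that is not an update goes to the first worker.
            pipe(0, line)
        else:
            filename = args[1]
            # Same file -> same worker, so no two workers write to one RRD.
            # The mask keeps the CRC unsigned on Python 2 as well.
            process = (zlib.crc32(filename) & 0xffffffff) % process_no
            pipe(process, line)
    # Close the pipes so the workers see EOF, then wait until they are done.
    for p in processes:
        p.stdin.close()
    for p in processes:
        p.wait()
else:
    # Not in pipe mode: hand over to the real rrdtool unchanged.
    os.execlp(rrd, *([rrd] + sys.argv[1:]))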

This script spawns 16 rrdtool processes and pipes update requests to them in parallel, making sure that no two processes write to the same file.
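
The routing rule used by the script above can be looked at in isolation. The following is only a minimal sketch: the function name `worker_for` and the file names are made up for illustration, and only the CRC32-modulo idea comes from the script itself.

```python
import zlib

PROCESS_NO = 16  # same worker count as in the wrapper script


def worker_for(path, workers=PROCESS_NO):
    """Return the index of the worker that handles updates for this RRD file."""
    # Mask to an unsigned 32-bit value so the result is stable across Python versions.
    return (zlib.crc32(path) & 0xFFFFFFFF) % workers


# Hypothetical file names, used only to show how updates spread over the workers.
paths = [b"/mnt/nfs/rra/host_%d_traffic_in_%d.rrd" % (i % 50, i) for i in range(7000)]
counts = [0] * PROCESS_NO
for p in paths:
    counts[worker_for(p)] += 1

# One counter per worker; a given file always lands on the same worker.
print(counts)
```

Because the worker index depends only on the file name, every update for a given RRD file goes through the same rrdtool process, so updates to that file stay in order and never collide.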

With this script used instead of rrdtool (set its path as the rrdtool path on the settings page), my poller run times went down to about 90s.
I believe it should be easy to rewrite this in PHP and include it in a Cacti release.
Regards,
Jakub Pastuszek

TheWitness
Developer
Posts: 14804
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: RRD writing performance improvement

#2 Post by TheWitness » Mon Oct 11, 2010 6:15 pm

Good solution. I had done the same in Cacti, but never got it into SVN. The other thing to do is to use boost. That will bring the run times down even more.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of MacTrack, Boost, CLog, SpikeKill, Platform RTM, DSStats, maintainer of Spine, lots of unpublished work and most of Cacti's bugs.

kazuya
Posts: 2
Joined: Mon Oct 11, 2010 8:55 am

Re: RRD writing performance improvement

#3 Post by kazuya » Wed Oct 13, 2010 8:35 am

Thanks for the boost tip. We may need it in the near future :)
Regards,
Jakub Pastuszek

npeltier
Posts: 27
Joined: Mon Aug 08, 2011 4:04 am

Re: RRD writing performance improvement

#4 Post by npeltier » Thu Apr 30, 2015 6:10 am

Hi kazuya,

I was very happy to test your script; the results are crazy, 10+ times faster!
The actual polling took ~10s and the RRD update was taking between 200s and 600s, depending on network and/or NFS activity.
Now it's usually around 20-30 seconds in total...

I couldn't use boost, as the RRD files are shared with other applications that need constantly up-to-date data.

@TheWitness: I strongly suggest that you add this kind of feature to a future release if you get a chance!

I'm currently using Cacti 0.8.8b.

Thank you for all the good work, guys!

NP

Hanseat
Posts: 17
Joined: Thu Sep 27, 2012 5:24 am

Re: RRD writing performance improvement

#5 Post by Hanseat » Tue Sep 15, 2015 9:12 am

As I/O is paramount in Cacti (in my experience), we just put all RRDs into RAM and use
a copy job script to periodically store the RRDs on the hard disk (something like this: https://www.a-netz.de/blog/2013/02/pers ... -ramdisks/).
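
A periodic copy job of this kind could look roughly like the sketch below. It is not the script from the linked article: the function name `sync_rrds` and the directories are assumptions for illustration, and the demo uses temporary directories in place of a real RAM disk and backup location.

```python
import os
import shutil
import tempfile


def sync_rrds(src_dir, dst_dir):
    """Copy every *.rrd file from src_dir to dst_dir; return the number copied."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".rrd"):
            continue
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        # Copy to a temporary name and rename afterwards, so an interrupted
        # run never leaves a truncated file under the final name.
        tmp = dst + ".tmp"
        shutil.copy2(src, tmp)
        os.replace(tmp, dst)
        copied += 1
    return copied


# Demo: temporary directories stand in for the RAM disk and the hard disk.
with tempfile.TemporaryDirectory() as ram, tempfile.TemporaryDirectory() as disk:
    for name in ("a.rrd", "b.rrd", "notes.txt"):
        with open(os.path.join(ram, name), "wb") as fh:
            fh.write(b"demo")
    # Copies the two .rrd files and skips notes.txt.
    print(sync_rrds(ram, os.path.join(disk, "rra_backup")))
```

Run from cron, something like this bounds the data lost in a crash to one copy interval. It does not coordinate with rrdtool, so a file copied while it is being updated may be captured mid-write; the reverse copy at boot (disk to RAM) is left out here.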

routenull0
Posts: 16
Joined: Tue Jul 09, 2013 5:32 pm

Re: RRD writing performance improvement

#6 Post by routenull0 » Wed Sep 23, 2015 5:30 am

How can one determine how long the RRD update process is taking? I can see how long our polling process takes, but I cannot find the duration of the overall RRD update process.

phalek
Cacti Guru User
Posts: 2710
Joined: Thu Jan 31, 2008 6:39 am
Location: Kressbronn, Germany

Re: RRD writing performance improvement

#7 Post by phalek » Wed Sep 23, 2015 9:10 am

You'll have to hack the poller.php file and add some additional statistics information to it :-)
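
The change to poller.php itself would be PHP and is not shown in this thread. For setups that use the Python wrapper from post #1, the same idea (take a timestamp before the update phase, another after it, and record the difference) could be sketched as follows. This is only an illustration: the helper `timed` and the label `RRDUpdateTime` are hypothetical and are not existing Cacti statistics.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Append '<label>:<seconds>' to log once the wrapped block has finished."""
    start = time.monotonic()
    try:
        yield
    finally:
        log.append("%s:%.4f" % (label, time.monotonic() - start))


stats = []
with timed("RRDUpdateTime", stats):
    # Stand-in for the real work: forwarding the update commands and
    # waiting for the rrdtool workers to finish.
    time.sleep(0.05)

print(" ".join(stats))
```

Measured inside the wrapper, the figure would cover the time from the first command until the last worker exits, which includes any time spent waiting for the poller to send more commands.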
Greetings,
Phalek

Code:

CereusServer Master:  SYSTEM STATS: Time:2.5621 Method:spine Processes:1 Threads:16 Hosts:446 HostsPerProcess:446 DataSources:14683 RRDsProcessed:7573
CereusServer Agent:   SYSTEM STATS: Time:27.4840 Method:spine Processes:1 Threads:8 Hosts:16 HostsPerProcess:16 DataSources:114576 RRDsProcessed:48061

Soup
Cacti User
Posts: 66
Joined: Tue Aug 07, 2012 10:02 am

Re: RRD writing performance improvement

#8 Post by Soup » Thu Apr 14, 2016 11:46 am

This was a HUGE help.

I was just forced to move my RRD files from SAN -> NFS. Boost was backlogged by about 17 million records and wasn't making any headway (but was keeping up).

Threw the wrapper in and the boost backlog cleared in a few hours.

Code:

Last Start Time:	2016-04-14 16:37:42
Last Run Duration:	12 minutes 13 seconds (5% of update frequency)
RRD Updates:	9254006
Peak Poller Memory:	418.46 MBytes
Detailed Runtime Timers:	RRDUpdates:9254006 TotalTime:734 get_records:441.96 results_cycle:258.5 rrd_filename_and_template:81.6 rrd_lastupdate:6.86 rrdupdate:85.42 delete:17.7 timer_overhead:~3
Max Poller Memory Allowed:	2048 MBytes
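
To see where the run time goes, the "Detailed Runtime Timers" line above can be split into its name:value pairs. The sketch below assumes that the timer values are seconds and that TotalTime is the right denominator (it matches the reported 12 minutes 13 seconds); the helper name `parse_timers` is made up for illustration.

```python
def parse_timers(line):
    """Split 'name:value' pairs into a dict of floats; a leading '~' is dropped."""
    timers = {}
    for pair in line.split():
        name, _, value = pair.partition(":")
        timers[name] = float(value.lstrip("~"))
    return timers


# The timer line exactly as posted above.
line = ("RRDUpdates:9254006 TotalTime:734 get_records:441.96 results_cycle:258.5 "
        "rrd_filename_and_template:81.6 rrd_lastupdate:6.86 rrdupdate:85.42 "
        "delete:17.7 timer_overhead:~3")

timers = parse_timers(line)
total = timers["TotalTime"]
for name in ("get_records", "results_cycle", "rrdupdate"):
    print("%-14s %5.1f%% of TotalTime" % (name, 100.0 * timers[name] / total))
```

Under those assumptions, rrdupdate accounts for roughly 12% of the run and get_records for roughly 60%, so in this run most of the time was spent fetching records rather than writing RRD files.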

routenull0
Posts: 16
Joined: Tue Jul 09, 2013 5:32 pm

Re: RRD writing performance improvement

#9 Post by routenull0 » Fri Jun 24, 2016 8:06 am

Soup wrote:This was a HUGE help.

I was just forced to move my RRD files from SAN -> NFS. Boost was backlogged by about 17 million records and wasn't making any headway (but was keeping up).

Threw the wrapper in and the boost backlog cleared in a few hours.

Code: Select all

Last Start Time:	2016-04-14 16:37:42
Last Run Duration:	12 minutes 13 seconds (5% of update frequency)
RRD Updates:	9254006
Peak Poller Memory:	418.46 MBytes
Detailed Runtime Timers:	RRDUpdates:9254006 TotalTime:734 get_records:441.96 results_cycle:258.5 rrd_filename_and_template:81.6 rrd_lastupdate:6.86 rrdupdate:85.42 delete:17.7 timer_overhead:~3
Max Poller Memory Allowed:	2048 MBytes

Where did you get those stats from? I'm guessing they are provided via the Boost plugin?
