|
|
| Author |
Message |
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Sun Nov 12, 2006 5:02 pm Post subject: CactiEZ performance problem |
|
|
Hello,
I am having some weird performance problems. Not sure what it could be, so maybe someone here can help.
I just migrated to CactiEZ to have a pure Cacti system. Before, I was running an FC4 server with Cacti, using cactid to poll: 50 hosts and over 6000 items collected in roughly 10 seconds. Also, the old system used RRDtool 1.0, and the new one uses RRDtool 1.2. I have no scripts in my poller cache.
Now on the CactiEZ system, I am polling the same 50 hosts, but with almost 8000 items, same poller settings, same hardware except more disk (HP DL360 G3, 4 Gig RAM, 146 Gig 10,000 RPM SCSI disks in RAID 1), and I am getting a 240 second poll time!
Also, server load on the old system is constantly under 2, and on the new I am up to 5 already.
This is the new server:
11/13/2006 11:18:55 AM - SYSTEM STATS: Time:234.2045 Method:cactid Processes:2 Threads:15 Hosts:43 HostsPerProcess:22 DataSources:8895 RRDsProcessed:3337
I get this when I put the old serverdisks into another DL360 with only 1 Gig Ram:
11/13/2006 11:22:17 AM - SYSTEM STATS: Time:9.0990 Method:cactid Processes:2 Threads:15 Hosts:51 HostsPerProcess:26 DataSources:6720 RRDsProcessed:2574
Could the added datasources really account for this?
Any ideas where I can start looking for what's causing this?
/Carl |
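For anyone comparing two pollers this way, here is a minimal Python sketch that parses the two SYSTEM STATS lines quoted above and works out the ratios. The helper names (`parse_stats`, `ms_per_rrd`) are made up for illustration and are not part of Cacti; the only inputs are the log lines from this post.

```python
import re

# The two SYSTEM STATS lines quoted in the post above.
NEW = ("11/13/2006 11:18:55 AM - SYSTEM STATS: Time:234.2045 Method:cactid "
       "Processes:2 Threads:15 Hosts:43 HostsPerProcess:22 "
       "DataSources:8895 RRDsProcessed:3337")
OLD = ("11/13/2006 11:22:17 AM - SYSTEM STATS: Time:9.0990 Method:cactid "
       "Processes:2 Threads:15 Hosts:51 HostsPerProcess:26 "
       "DataSources:6720 RRDsProcessed:2574")


def parse_stats(line):
    """Return the Key:Value pairs after 'SYSTEM STATS:' as a dict of strings."""
    _, _, tail = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", tail))


def ms_per_rrd(stats):
    """Average wall-clock milliseconds per processed RRD (hypothetical metric)."""
    return 1000.0 * float(stats["Time"]) / int(stats["RRDsProcessed"])


new, old = parse_stats(NEW), parse_stats(OLD)
time_ratio = float(new["Time"]) / float(old["Time"])
rrd_ratio = int(new["RRDsProcessed"]) / int(old["RRDsProcessed"])
ds_ratio = int(new["DataSources"]) / int(old["DataSources"])

print(f"poll time ratio:     {time_ratio:.1f}x")
print(f"RRDs processed:      {rrd_ratio:.2f}x")
print(f"data sources:        {ds_ratio:.2f}x")
print(f"ms per RRD new/old:  {ms_per_rrd(new):.1f} / {ms_per_rrd(old):.1f}")
```

The point of dividing by RRDsProcessed is to separate "more work" from "slower per unit of work"; if the per-RRD figure jumps, the extra data sources alone do not explain the poll time.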
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Mon Nov 13, 2006 10:00 am Post subject: |
|
|
Did some rrd performance testing, but I don't know what is a good or bad result:
Create 10 rrds 1 c/s (0.00186 sdv) Update 10 rrds 7522 u/s (0.00083 sdv)
Create 10 rrds 1 c/s (0.00183 sdv) Update 20 rrds 7724 u/s (0.00011 sdv)
Create 20 rrds 1 c/s (0.00400 sdv) Update 40 rrds 7117 u/s (0.00160 sdv)
Create 40 rrds 1 c/s (0.00765 sdv) Update 80 rrds 7154 u/s (0.00125 sdv)
Create 80 rrds 1 c/s (0.01544 sdv) Update 160 rrds 7191 u/s (0.00029 sdv)
Create 160 rrds 1 c/s (0.04560 sdv) Update 320 rrds 6419 u/s (0.00357 sdv)
Create 320 rrds 4 c/s (0.34117 sdv) Update 640 rrds 5717 u/s (0.00074 sdv)
Create 640 rrds 2 c/s (0.35059 sdv) Update 1280 rrds 3761 u/s (0.01511 sdv)
Create 1280 rrds 2 c/s (0.21801 sdv) Update 2560 rrds 270 u/s (0.10911 sdv)
Create 768 rrds 2 c/s (0.28093 sdv) Update 3328 rrds 6389 u/s (0.00082 sdv)
Can't do it on the old machine because I get an error, so I can't compare them:
Create 10 rrds 1 c/s (0.00278 sdv)conversion of '1163429843.49898' to long not complete: tail '.49898' at perftest.pl line 58.
/Carl |
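Without a baseline from the old machine, one thing that can still be done is to compare each row against the best rate seen in the same run. A rough Python sketch, assuming the output format shown above; the 50% cut-off and the function names are arbitrary choices for illustration, not anything perftest.pl defines.

```python
import re

# perftest output exactly as pasted in the post above.
PERFTEST_OUTPUT = """\
Create 10 rrds 1 c/s (0.00186 sdv) Update 10 rrds 7522 u/s (0.00083 sdv)
Create 10 rrds 1 c/s (0.00183 sdv) Update 20 rrds 7724 u/s (0.00011 sdv)
Create 20 rrds 1 c/s (0.00400 sdv) Update 40 rrds 7117 u/s (0.00160 sdv)
Create 40 rrds 1 c/s (0.00765 sdv) Update 80 rrds 7154 u/s (0.00125 sdv)
Create 80 rrds 1 c/s (0.01544 sdv) Update 160 rrds 7191 u/s (0.00029 sdv)
Create 160 rrds 1 c/s (0.04560 sdv) Update 320 rrds 6419 u/s (0.00357 sdv)
Create 320 rrds 4 c/s (0.34117 sdv) Update 640 rrds 5717 u/s (0.00074 sdv)
Create 640 rrds 2 c/s (0.35059 sdv) Update 1280 rrds 3761 u/s (0.01511 sdv)
Create 1280 rrds 2 c/s (0.21801 sdv) Update 2560 rrds 270 u/s (0.10911 sdv)
Create 768 rrds 2 c/s (0.28093 sdv) Update 3328 rrds 6389 u/s (0.00082 sdv)
"""

ROW = re.compile(r"Update\s+(\d+)\s+rrds\s+(\d+)\s+u/s")


def update_rates(text):
    """Return (rrd_count, updates_per_second) pairs, one per output row."""
    return [(int(count), int(rate)) for count, rate in ROW.findall(text)]


def slow_rows(rates, fraction=0.5):
    """Rows whose update rate is below `fraction` of the best rate in the run."""
    peak = max(rate for _, rate in rates)
    return [(count, rate) for count, rate in rates if rate < fraction * peak]


rates = update_rates(PERFTEST_OUTPUT)
for count, rate in slow_rows(rates):
    print(f"{count} rrds: {rate} u/s is under half of the peak rate")
```

This only flags rows worth a closer look; it does not say why the rate dropped at those sizes.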
|
TheWitness Developer
Joined: 14 May 2002 Posts: 9723 Location: MI, USA
|
Posted: Mon Nov 13, 2006 10:08 am Post subject: |
|
|
Did you add scripts? They can increase the load average, but not likely. You should run top and see what is going on.
TheWitness |
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Mon Nov 13, 2006 10:08 am Post subject: |
|
|
Also, I did what TheWitness said in another post and ran cactid with debugging on, and cactid gave me this result:
CACTID: Time: 37.0018 s, Threads: 15, Hosts: 43
So why does it take 240 seconds when it runs the polls every 5 minutes??
/Carl |
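A quick back-of-the-envelope check of that gap, as a Python sketch. It assumes the stand-alone cactid run is representative of what cactid does inside a scheduled poll, which is only an approximation since the two figures come from different runs.

```python
# Figures quoted in this thread; the two runs were taken at different times.
total_poll_s = 234.2045   # SYSTEM STATS Time on the CactiEZ box
cactid_s = 37.0018        # stand-alone cactid debug run

# Time in a poll cycle that the cactid run alone does not account for.
outside_cactid_s = total_poll_s - cactid_s
share = outside_cactid_s / total_poll_s

print(f"{outside_cactid_s:.1f} s ({share:.0%}) of the cycle "
      "is not explained by cactid")
```

Under that assumption, most of the cycle is spent somewhere other than SNMP collection, which is where to look next.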
|
TheWitness Developer
Joined: 14 May 2002 Posts: 9723 Location: MI, USA
|
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Mon Nov 13, 2006 1:27 pm Post subject: |
|
|
Yeah, top confirms that rrdtool is taking a looong time. But can anyone explain what the difference is from my last server? The extra 1000 data sources (that's like a 15% increase or so) can't make it 23 times slower, can they?
/Carl |
|
egarnel Cacti Pro User
Joined: 21 Nov 2002 Posts: 630 Location: Austin, TX
|
Posted: Mon Nov 13, 2006 1:48 pm Post subject: |
|
|
Perhaps it is due to a newer feature of rrdtool 1.2 - maybe the backwards compatibility to be able to read rrdtool 1.0 files
http://oss.oetiker.ch/rrdtool/newin12.txt |
|
streaker69 Cacti Pro User
Joined: 27 Mar 2006 Posts: 647 Location: Psychic Amish Network Administrator
|
Posted: Mon Nov 13, 2006 3:02 pm Post subject: |
|
|
| Is there a difference in Blocksize between your old system and your new system? |
|
TheWitness Developer
Joined: 14 May 2002 Posts: 9723 Location: MI, USA
|
Posted: Mon Nov 13, 2006 4:08 pm Post subject: |
|
|
Available Memory and Disk Cache Rule in the RRDtool world.
TheWitness |
|
streaker69 Cacti Pro User
Joined: 27 Mar 2006 Posts: 647 Location: Psychic Amish Network Administrator
|
Posted: Mon Nov 13, 2006 4:13 pm Post subject: |
|
|
| TheWitness wrote: | Available Memory and Disk Cache Rule in the RRDtool world.
TheWitness |
Right, from what N3NCY was telling me about the Boost plugin, that's going to make all the difference, but in the OP's case here, he's increased the RAM and hard drive size and now he's getting bad performance.
I was wondering if maybe a difference in Blocksize in the filesystem could be causing RRDTool to lag behind because of the larger drive.
I don't know much about *nix file systems, but after talking to someone that does, he thinks it could be a factor. |
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Mon Nov 13, 2006 4:20 pm Post subject: |
|
|
I didn't import any data from the old server. I just set it up completely new and configured all hosts manually.
| Quote: | Available Memory and Disk Cache Rule in the RRDtool world.
TheWitness |
But the only hardware difference between the two servers now is that the new, slower one has 4 Gigs of RAM and 146 Gig drives, while the old one has 1 Gig of RAM and 18 Gig drives. I would think there wouldn't be a x22 difference between the two, especially since the new server has WAY more RAM!
Block size is 4096 according to tune2fs.
/Carl |
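To compare the two boxes without tune2fs, the same kind of figure can be read from Python with `os.statvfs` (Unix only). A minimal sketch: the path is a placeholder to be pointed at wherever the RRD files live, and `fs_block_info` is a made-up helper name. Note that `f_bsize` is the filesystem's preferred I/O block size as reported by the kernel, which usually but not necessarily matches what tune2fs prints.

```python
import os


def fs_block_info(path):
    """Report block-size figures for the filesystem that contains `path`."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_bsize,        # preferred I/O block size
        "fragment_size": st.f_frsize,    # fundamental block size
        "total_bytes": st.f_frsize * st.f_blocks,
        "free_bytes": st.f_frsize * st.f_bavail,
    }


# "." is a stand-in; use the directory holding the .rrd files instead.
info = fs_block_info(".")
for key, value in info.items():
    print(f"{key}: {value}")
```

Running it on both servers and diffing the output is enough to rule block size in or out.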
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Sat Nov 18, 2006 2:23 pm Post subject: |
|
|
Something is way off.
I don't see why it would take 240 seconds to write 3337 RRDs compared to taking 10 seconds to write 2574. 24x the time to write 30% more RRDs. I have the same SQL tuning done, same hardware (except more RAM and disk in the 240 second server). Maybe wrong driver for the disks or something? Maybe my old Fedora box just interacts with the hardware better than CactiEZ?
Looking at the performance topic, I think I should be completing the poll a lot faster.
I'm considering reinstalling the server with Fedora 5 and installing everything again. Hopefully I can use the database from CactiEZ, or I'll have to manually enter every host again...
/Carl |
|
TheWitness Developer
Joined: 14 May 2002 Posts: 9723 Location: MI, USA
|
Posted: Sat Nov 18, 2006 2:38 pm Post subject: |
|
|
It has to be something associated with the disk subsystem. Keep looking at it. Also, just confirm that your cactid processes end right away and the only thing the poller is doing from that point on is RRD updates. Ok.
Larry |
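One crude way to poke at the disk subsystem independently of Cacti is to time a burst of small synchronous writes on each machine. The Python sketch below does that; it is not how rrdtool accesses its files, the sizes and counts are arbitrary, and `timed_small_writes` is a name invented here, so treat the number only as a relative figure between two boxes.

```python
import os
import tempfile
import time


def timed_small_writes(directory, files=20, writes_per_file=5,
                       payload=b"x" * 512):
    """Rewrite the start of several small files with fsync; return writes/sec."""
    total = files * writes_per_file
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        start = time.perf_counter()
        for i in range(files):
            path = os.path.join(tmp, f"probe_{i}.bin")
            with open(path, "wb") as fh:
                for _ in range(writes_per_file):
                    fh.seek(0)
                    fh.write(payload)
                    fh.flush()
                    os.fsync(fh.fileno())  # force the write down to the disk
        elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")


# The system temp directory is a stand-in; point this at the RRD volume.
rate = timed_small_writes(tempfile.gettempdir())
print(f"{rate:.0f} synced small writes per second")
```

If the same script gives wildly different numbers on otherwise similar hardware, that points at the controller, driver, or filesystem setup rather than at Cacti.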
|
cwahlin Cacti User
Joined: 10 Oct 2003 Posts: 87
|
Posted: Wed Nov 22, 2006 4:41 am Post subject: |
|
|
Ok,
I solved the problem, but I have no idea what caused it.
I reinstalled the machine with Fedora Core 6. Installed cacti and imported my sql tables. Not the exact number of RRDs as before, but almost, and this is what I got:
SYSTEM STATS: Time:16.3049 Method:cactid Processes:2 Threads:20 Hosts:43 HostsPerProcess:22 DataSources:8818 RRDsProcessed:3302
(no sql and php tuning yet, and no php accelerator)
MUCH better! I guess something in CactiEZ's way of talking to my hard drives wasn't right.
/Carl |
|
cigamit Developer
Joined: 07 Apr 2005 Posts: 946 Location: B/CS Texas
|
Posted: Wed Nov 22, 2006 11:57 pm Post subject: |
|
|
What version of RRDTool are you using? After version 1.2.11 (I believe that was the one) my performance has just gone downhill (on all 3 production machines) from 20-30 second polling times to 80-100. There were several other things I changed also, but I believe it to have been rrdtool that was causing the performance loss.
Thankfully we now have the boost plugin to help completely offset the lost performance. |
|
|