Cacti: official forums and support

All times are UTC - 5 hours




 Post subject: Migrated Cacti to new server; poll barely finishes in 1 min
Posted: Tue Jan 24, 2017 7:17 pm
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
Sorry for the long post, but I am hoping I have included everything necessary to get this situation resolved.

We are running Cacti 0.8.8h, Spine 0.8.8h, PHP 5.6.29, NET-SNMP 5.5, and RRDTool 1.4.8, and everything else on the Technical Support page looks normal. We had a spare server that we migrated Cacti to using the steps located here: http://xmodulo.com/migrate-cacti-server.html. The server is way overpowered (24 cores and 140 GB of RAM) and only shows 78% CPU utilization and 3.6 GB of RAM used. We are polling 476 hosts that have 17,788 data sources, yet even with the settings below, polling still usually comes in at 58 seconds and rarely goes below 39 seconds.

Attachment: poller.jpg [ 270.62 KiB ]


In addition, we keep seeing messages in the log such as "SPINE: Poller[0] ERROR: Spine Timed Out While Processing Hosts Internal" and "SPINE: Poller[0] ERROR: SS[0] PHP Script Server communications lost. Restarting PHP Script Server". The solutions we found on the internet for both were to adjust max_connections in my.cnf, which we did to no avail, and to check whether 2 pollers are running. We dug and dug, and there is only 1 poller running anywhere; it is in the cron of the cactiuser and looks like: * * * * * umask 000; /usr/bin/php /usr/local/cacti/poller.php > /dev/null 2>&1
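To see which of these errors dominates, the log can be tallied by message. This is a minimal Python sketch, assuming each log line contains the "SPINE: Poller[n] ERROR: ..." text exactly as quoted above; the timestamp prefix in the sample and the function name are made up for illustration.

```python
import re
from collections import Counter

# Matches the part of a log line quoted in the post, e.g.
# "SPINE: Poller[0] ERROR: Spine Timed Out While Processing Hosts Internal"
SPINE_ERROR = re.compile(r"SPINE: Poller\[(\d+)\] ERROR: (.+)$")


def tally_spine_errors(lines):
    """Count Spine ERROR messages, grouped by (poller id, message)."""
    counts = Counter()
    for line in lines:
        match = SPINE_ERROR.search(line.rstrip())
        if not match:
            continue
        poller_id = int(match.group(1))
        # Collapse script server indexes (SS[0], SS[1], ...) into one bucket.
        message = re.sub(r"SS\[\d+\]", "SS[n]", match.group(2).strip())
        counts[(poller_id, message)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; only the text after " - " comes from the post.
    sample = [
        "01/24/2017 07:17:01 PM - SPINE: Poller[0] ERROR: Spine Timed Out While Processing Hosts Internal",
        "01/24/2017 07:17:02 PM - SPINE: Poller[0] ERROR: SS[0] PHP Script Server communications lost. Restarting PHP Script Server",
    ]
    for (poller_id, message), count in tally_spine_errors(sample).most_common():
        print(f"poller {poller_id}: {count} x {message}")
```

If more than one poller id shows up in the tally, that would be worth checking against the "2 pollers" theory.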

In the my.cnf file, we have max_connections = 1000 and max-heap-table-size = 1G. According to this page http://www.cacti.net/downloads/docs/html/using_spine.html, "Maximum Concurrent Poller Processes" should be 1-2 times the number of cores, but setting this to 24 or 48 definitely didn't get us under 1 minute. Despite that page saying that the required settings for "Maximum Threads per Process" are 5-10, the only way we could get under a minute was setting that to 48 (2x the number of processors), as shown in the image below.
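The arithmetic behind these two settings can be sketched in Python. The 1-2x range is the reading of the Spine page given above. The assumption that one thread polls one host at a time, so that threads beyond the host count add nothing, is mine and is not confirmed in this thread. The function name and the example values are hypothetical.

```python
def summarize_spine_settings(cores, hosts, processes, threads_per_process):
    """Summarize the total concurrency implied by the poller settings."""
    total_threads = processes * threads_per_process
    return {
        # Range for "Maximum Concurrent Poller Processes" as read from the docs.
        "suggested_processes": (cores, 2 * cores),
        "total_threads": total_threads,
        # Assumption: a thread polls one host at a time, so threads beyond
        # the number of hosts cannot add any parallelism.
        "useful_threads": min(total_threads, hosts),
    }


if __name__ == "__main__":
    # 24 cores and 476 hosts are from the post; 24 x 48 is a hypothetical combination.
    print(summarize_spine_settings(cores=24, hosts=476, processes=24, threads_per_process=48))
```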

Attachment: settings.jpg [ 243.5 KiB ]


Running the repair_database script returned the following. I'm not sure whether those issues could be the cause of the slowness. If so, what are the possible risks of running that command with "--force"?
Code:
NOTE: Checking for Invalid Cacti Templates
NOTE: 1 Invalid CDEF Item Rows Found in Graph Templates
NOTE: 45950 Invalid Data Input Data Rows Found in Data Templates

WARNING: Cacti Template Problems found in your Database.  Using the '--force' option will remove
the invalid records.  However, these changes can be catastrophic to existing data sources.  Therefore, you
should contact your support organization prior to proceeding with that repair.


As you may have seen in the image above, we have Boost and hmib installed. We could never get hmib running, as it would never populate the RRDs. We had Boost running on the old system, but the polling still didn't finish in under a minute. On this new system, when we try to enable Boost, it doesn't process any records. It shows the next time they will be processed; that time comes, the Boost status shows no records processed, and the next scheduled processing time is bumped forward 30 minutes.

Are we really asking for too much? Is 40-58 seconds all that we can reasonably get with this setup? Any help would be greatly appreciated.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Mon Feb 06, 2017 9:47 am
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
I had Boost enabled; however, that didn't help the speed. We have made a new installation of 1.0.0, so we will see if that helps at all.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Mon Feb 06, 2017 9:55 am
Cacti User

Joined: Mon Jan 05, 2015 10:10 am
Posts: 202
That prior screenshot is with Boost disabled.

_________________
Before history, there was a paradise, now dust.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Mon Feb 06, 2017 10:05 am
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
After that post and before Larry's reply, we re-enabled Boost but I had not yet had time to update the post.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Mon Feb 06, 2017 1:30 pm

Joined: Thu Jul 21, 2011 6:33 am
Posts: 9
I'm updating from 0.8.8a to 1.0.1 and I'm having similar trouble with script server. Sometimes it works... sometimes not.

2017-02-06 14:45:15 - PHPSVR DEBUG: PID[1849] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'xxxxxx 409 2:161:500:3:50:COMMUNITY:::::: get usage '1''
2017-02-06 14:45:15 - PHPSVR DEBUG: PID[1849] CTR[1] RESPONSE:'0'
2017-02-06 14:45:16 - PHPSVR DEBUG: PID[1853] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'xxxxxx 409 2:161:500:3:50:COMMUNITY:::::: get usage '0''
2017-02-06 14:45:16 - PHPSVR DEBUG: PID[1853] CTR[1] RESPONSE:'1'
2017-02-06 14:45:16 - PHPSVR DEBUG: PID[1856] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'zzzzzzz 410 2:161:500:3:50:COMMUNITY:::::: num_indexes'
2017-02-06 14:45:16 - PHPSVR WARNING: PHP Script File to be included, does not exist
2017-02-06 14:45:16 - PHPSVR WARNING: Function does not exist INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'zzzzzzz 410 2:161:500:3:50:COMMUNITY:::::: num_indexes'
2017-02-06 14:45:20 - PHPSVR DEBUG: PID[1864] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'yyyyyyyy 423 2:161:500:3:50:COMMUNITY:::::: num_indexes'
2017-02-06 14:45:20 - PHPSVR WARNING: PHP Script File to be included, does not exist
2017-02-06 14:45:20 - PHPSVR WARNING: Function does not exist INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'yyyyyyyy 423 2:161:500:3:50:COMMUNITY:::::: num_indexes'

Look, it worked for the 'xxxxx' and 'zzzzz' hosts. For 'yyyyy' it disappeared? I can invoke the script_server CLI manually as cactiuser, and it works.
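To check which hosts actually hit the warning, the PHPSVR lines can be grouped by host. This is a minimal Python sketch, assuming the line layout shown in the excerpt above (INC, FUNC, then PARMS starting with the hostname and host id); the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Captures the level, script, function, hostname and host id from lines such as
# "... PHPSVR DEBUG: PID[1849] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'xxxxxx 409 ...".
ENTRY = re.compile(
    r"PHPSVR (DEBUG|WARNING): .*?INC: '([^']+)' FUNC: '([^']+)' PARMS: '(\S+) (\d+) "
)


def summarize_script_server(lines):
    """Count script server requests (DEBUG) and warnings (WARNING) per host."""
    summary = defaultdict(lambda: {"requests": 0, "warnings": 0})
    for line in lines:
        match = ENTRY.search(line)
        if not match:
            continue  # RESPONSE lines and warnings without INC/PARMS are skipped
        level, _script, _func, host, _host_id = match.groups()
        summary[host]["requests" if level == "DEBUG" else "warnings"] += 1
    return dict(summary)


if __name__ == "__main__":
    sample = [
        "2017-02-06 14:45:15 - PHPSVR DEBUG: PID[1849] CTR[1] INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'xxxxxx 409 2:161:500:3:50:COMMUNITY:::::: get usage '1''",
        "2017-02-06 14:45:16 - PHPSVR WARNING: Function does not exist INC: 'ss_host_cpu.php' FUNC: 'ss_host_cpu' PARMS: 'zzzzzzz 410 2:161:500:3:50:COMMUNITY:::::: num_indexes'",
    ]
    for host, counts in summarize_script_server(sample).items():
        print(host, counts)
```

Run against the ten log lines above, this kind of grouping would flag both 'zzzzzzz' and 'yyyyyyyy', since each has a "Function does not exist" warning in the excerpt.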


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Tue Feb 07, 2017 8:43 am
Cacti User

Joined: Mon Jan 05, 2015 10:10 am
Posts: 202
Well that is odd. You should log an issue on GitHub.

_________________
Before history, there was a paradise, now dust.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Tue Feb 07, 2017 9:03 am
Cacti Guru User

Joined: Thu Jan 31, 2008 6:39 am
Posts: 2573
Location: Kressbronn, Germany
Just to add to this discussion: there's more than just CPU and memory involved in polling times.

- I/O speed is essential for updating RRD files (hence the recommendation to use Boost)
- Network response is equally important. If you have devices that require more than 60 seconds to respond, you will never get a polling time below this

WMI is especially prone to the latter, but I also see it with a Linux system which heavily uses SNMP/scripts to query database parameters.

Cacti 1.0 does have the "Poll Time" field, which you can check. Earlier versions require a patch ( http://bugs.cacti.net/view.php?id=1550 )
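The point about slow devices can be put as a rough lower bound: a cycle can never finish faster than its slowest host, nor faster than the total work divided by the number of threads. This is a minimal Python sketch under the simplifying assumptions that hosts are polled independently and each host occupies one thread at a time; the per-host times are made-up numbers and the function name is hypothetical.

```python
def cycle_lower_bound(host_seconds, threads):
    """Best-case polling cycle time given per-host poll times in seconds."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not host_seconds:
        return 0.0
    times = list(host_seconds.values())
    # The slowest host and the evenly shared total work both bound the cycle.
    return max(max(times), sum(times) / threads)


if __name__ == "__main__":
    # Hypothetical poll times; one slow host dominates regardless of threads.
    poll_times = {"fast-switch": 1.0, "linux-db": 2.0, "slow-wmi-box": 65.0}
    print(cycle_lower_bound(poll_times, threads=16))
```

Adding threads only helps while the second term is the larger one; once a single host is the bottleneck, only fixing or excluding that host brings the cycle down.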

_________________
Greetings,
Phalek
---
Need more help ? Read the Cacti documentation or my new Cacti Book
Need on-site support ? Look here Cacti Workshop
Need professional Cacti support ? Look here CereusService
---
Plugins : CereusTransporter | CereusReporting | nmidWebService | nmidSmokeping | nmidWeb2 |

Code:
CereusServer Master:  SYSTEM STATS: Time:2.5621 Method:spine Processes:1 Threads:16 Hosts:446 HostsPerProcess:446 DataSources:14683 RRDsProcessed:7573
CereusServer Agent:   SYSTEM STATS: Time:27.4840 Method:spine Processes:1 Threads:8 Hosts:16 HostsPerProcess:16 DataSources:114576 RRDsProcessed:48061
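
SYSTEM STATS lines like the two above can be compared by parsing them into fields and dividing DataSources by Time. This is a minimal Python sketch, assuming the key:value layout shown in those lines; the function names are made up for illustration.

```python
import re


def parse_system_stats(line):
    """Parse the key:value pairs that follow "SYSTEM STATS:" into a dict."""
    _, marker, tail = line.partition("SYSTEM STATS:")
    if not marker:
        raise ValueError("not a SYSTEM STATS line")
    stats = {}
    for key, value in re.findall(r"(\w+):(\S+)", tail):
        try:
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value  # non-numeric fields such as Method:spine
    return stats


def data_sources_per_second(stats):
    """Polling throughput for one cycle."""
    return stats["DataSources"] / stats["Time"]


if __name__ == "__main__":
    line = ("SYSTEM STATS: Time:2.5621 Method:spine Processes:1 Threads:16 "
            "Hosts:446 HostsPerProcess:446 DataSources:14683 RRDsProcessed:7573")
    stats = parse_system_stats(line)
    print(stats["Method"], round(data_sources_per_second(stats), 1))
```

By the same arithmetic, the 17,788 data sources in 58 seconds from the first post work out to roughly 307 data sources per second.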


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Wed Mar 01, 2017 9:13 am
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
How do I see the time each specific host took to poll?

Things definitely aren't looking any better with various combinations of spine, cmd, or Boost and we only have 1/3 of our devices/data sources added into Cacti 1.0.4. From the images below, you can see how we have things configured:
- Boost/performance is on
- cmd is being used (spine 1.0.4 finished no quicker and would then leave massive holes in graphs until Boost processed)
- The majority of the "Partial Result: U" data sources are Unix/Windows hard drive data sources using the "Host MIB - Hard Drive Space" template and Windows CPU graphs using the "Host MIB - CPU Utilization" template. Am I correct that no matter how hard we try, we may never fully prevent those errors from happening?
- Server has 24 cores and 144 GB of RAM, so any performance tweaks that can be suggested would be appreciated

Attachment: boost.png [ 62.16 KiB ]
Attachment: poller.png [ 64.95 KiB ]
Attachment: poller104.png [ 42.95 KiB ]
Attachment: booststatus.png [ 42.83 KiB ]


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Wed Mar 01, 2017 11:36 am
Developer

Joined: Thu Apr 07, 2005 3:29 pm
Posts: 2517
Location: B/CS Texas
Just my thoughts...

1. You have 24 cores but are only running 4 poller cmd.php processes. You will want to increase that. (And if spine isn't faster than cmd.php, then you have big issues somewhere.)
2. You are mostly using "Host MIB - Hard Drive Space", which uses the script server to poll. You will want to increase the number of script servers to 10 to start.
3. Your last Boost update took 30 minutes (1801 seconds) to run for only 79204 entries. That is super slow and is probably related to disk access speeds / disk I/O issues. For comparison, my little production VM (only 4 CPUs and 4 GB of memory) had these results on its last update: "RRDUpdates:16158 TotalTime:144".
4. You're polling mostly Windows boxes, which are notoriously slow to begin with, so the only real fix is to poll more of them at once.
5. If spine is leaving holes in your graphs until Boost processes, this is because your RRDs are not writable by the webserver (spine inserts directly into the Boost poller table, but with Boost enabled, the webserver should try to pull any data and write it to disk when you view a graph).
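
The Boost comparison in point 3 comes down to updates per second. A small Python sketch of that arithmetic, using only the figures quoted above; the function name is made up for illustration.

```python
def updates_per_second(rrd_updates, total_seconds):
    """Boost throughput: RRD updates written per second of run time."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return rrd_updates / total_seconds


if __name__ == "__main__":
    big_server = updates_per_second(79204, 1801)  # the 24-core server
    small_vm = updates_per_second(16158, 144)     # the 4 CPU / 4 GB VM
    print(f"24-core server: {big_server:.1f} updates/s")
    print(f"4-CPU VM:       {small_vm:.1f} updates/s")
```

That is about 44 updates per second on the big server against about 112 on the small VM.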

Are both your poller_output and poller_output_boost tables set as MEMORY tables?

_________________
CactiEZ CD


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Wed Mar 01, 2017 12:45 pm
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
Wondering if the Spine slowness is due to us still running MySQL 5.1.73? We are in the process of updating that.

For the RRA directory permissions, our webserver runs as nobody and the permissions on the RRA directory are "drwxrwx--- 3 nobody wheel", so that should have that covered.
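
The mode shown above covers the directory itself, not the RRD files and subdirectories inside it, so it may be worth checking those too. This is a minimal Python sketch that lists anything under a directory the current user cannot write to. The path /usr/local/cacti/rra is an assumption based on the poller.php path in the cron entry, and the function name is hypothetical.

```python
import os


def unwritable_paths(rra_dir):
    """Return paths under rra_dir that the current user cannot write to."""
    blocked = []
    for root, _dirs, files in os.walk(rra_dir):
        # A directory needs both write and search permission to update files in it.
        if not os.access(root, os.W_OK | os.X_OK):
            blocked.append(root)
        for name in files:
            path = os.path.join(root, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked


if __name__ == "__main__":
    # Assumed location; adjust to wherever the RRA directory actually lives.
    for path in unwritable_paths("/usr/local/cacti/rra"):
        print("not writable:", path)
```

The result depends on who runs it, so it would need to be run as the webserver user (nobody in this case) to say anything about what the webserver can write.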

I'll try the other suggestions and post back.


 Post subject: Re: Migrated Cacti to new server; poll barely finishes in 1
Posted: Wed Mar 01, 2017 10:30 pm
Cacti User

Joined: Tue Mar 31, 2009 10:05 am
Posts: 216
I am a grade "A" idiot. Regarding Spine not graphing in 1.x, I had screwed up and entered the wrong path to the spine URL (we are currently running 0.8.8h and 1.x in parallel during migration), so Cacti 1.x was trying to use Spine 0.8.8h which was failing.

After entering the correct path, spine started graphing with Boost enabled and it was much quicker. Sorry for the confusion and thank you for the tuning tips.
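
When two installations run side by side, a quick check that the configured Spine matches the Cacti release can catch this kind of mix-up. This is a rough Python sketch that only compares version strings. The banner text in the example is an assumed format, not something confirmed in this thread, and both function names are hypothetical.

```python
import re

# Versions in this thread look like "0.8.8h" or "1.0.4".
VERSION = re.compile(r"\d+\.\d+\.\d+[a-z]?")


def extract_version(text):
    """Return the first version-looking token in text, or None."""
    match = VERSION.search(text)
    return match.group(0) if match else None


def versions_match(cacti_version, spine_banner):
    """True when the version found in the Spine banner equals the Cacti version."""
    return extract_version(spine_banner) == cacti_version


if __name__ == "__main__":
    # Assumed banner text, for illustration only.
    print(versions_match("1.0.4", "SPINE 0.8.8h"))
```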

