
All times are UTC - 5 hours




 Post subject: Spine 0.8.7d-pre2 Available for Testing
PostPosted: Wed Mar 25, 2009 7:30 am 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
All,

I have provided a pre-release of Spine 0.8.7d for user preview prior to release. There are multiple bug fixes in this release, related to pinging of hosts and other general performance issues.

I would like to target users of snmpv3 in this pre-release and users who periodically encounter script timeout issues. This release should correct the snmpv3 issues that multiple users have been experiencing and also provide a workaround for script timeouts.

For users who experience script timeouts, I would be interested to know whether your child script processes exit as well once spine exits. If they stay running and the PGID becomes 1, I would also like to know. The intent is that they terminate with their spine parent. If you encounter a PGID becoming 1, please post below, and then rerun configure with the --enable-nifty-popen option to resolve the issue.
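
One way to run that check, offered as a minimal sketch and not as anything shipped with spine: the Python helper below scans `ps` output for script processes whose parent PID or process group ID is 1. The function names and the `/cacti/scripts/` path filter are assumptions for illustration (the filter matches the script paths that appear in the logs later in this thread). The post speaks of the PGID; since orphaned processes are typically re-parented to PID 1, the sketch reports both columns.

```python
import subprocess


def find_detached(ps_text, needle="/cacti/scripts/"):
    """Return (pid, ppid, pgid, command) for processes whose command contains
    `needle` and whose parent PID or process group ID is 1."""
    hits = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ppid, pgid, args = parts
        if not (pid.isdigit() and ppid.isdigit() and pgid.isdigit()):
            continue
        if needle in args and (ppid == "1" or pgid == "1"):
            hits.append((int(pid), int(ppid), int(pgid), args))
    return hits


def snapshot():
    """Capture one row per process: PID, PPID, PGID and the full command."""
    return subprocess.run(
        ["ps", "-eo", "pid,ppid,pgid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
```

Calling `find_detached(snapshot())` shortly after a polling cycle ends should return an empty list if the children terminated with their spine parent.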

TheWitness


Attachments:
File comment: Linux/UNIX Source
cacti-spine-0.8.7d-pre2.tar.gz [708.22 KiB]
Downloaded 494 times
File comment: Windows Source
cacti-spine-0.8.7d-pre2.zip [810.1 KiB]
Downloaded 419 times

_________________
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of MacTrack, Boost, CLog, SpikeKill, Platform RTM, DSStats, maintainer of Spine, lots of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
Gandalfs Official Debugging Help
Central Plugin Repository
Central Templates Repository


Last edited by TheWitness on Thu Apr 09, 2009 7:58 pm, edited 4 times in total.
 Post subject: My results so far
PostPosted: Wed Mar 25, 2009 8:39 am 

Joined: Sat Aug 12, 2006 6:47 pm
Posts: 37
Code:
03/25/2009 09:27:32 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:27:14 AM - SYSTEM STATS: Time:132.3853 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9269
03/25/2009 09:27:09 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:59 AM - SPINE: Poller[0] Host[268] ERROR: Empty result [24.97.168.219]: '/usr/bin/php -q /usr/share/cacti/scripts/query_host_isdncalls.php 24.97.168.219 feedme 2 get calls_out 0'
03/25/2009 09:26:50 AM - SPINE: Poller[0] Host[268] ERROR: Empty result [24.97.168.219]: '/usr/bin/php -q /usr/share/cacti/scripts/query_host_isdncalls.php 24.97.168.219 feedme 2 get calls_in 0'
03/25/2009 09:26:41 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:33 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:26:31 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:21 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:11 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:01 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:58 AM - SPINE: Poller[0] Host[408] ERROR: The POPEN timed out
03/25/2009 09:25:51 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:46 AM - SPINE: Poller[0] Host[529] ERROR: The POPEN timed out
03/25/2009 09:25:41 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:36 AM - SPINE: Poller[0] Host[477] ERROR: The POPEN timed out
03/25/2009 09:25:36 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:25:31 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:25:30 AM - SPINE: Poller[0] Host[467] ERROR: The POPEN timed out
03/25/2009 09:25:30 AM - SPINE: Poller[0] Host[465] ERROR: The POPEN timed out
03/25/2009 09:25:29 AM - SPINE: Poller[0] Host[427] ERROR: The POPEN timed out
03/25/2009 09:25:29 AM - SPINE: Poller[0] Host[429] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[516] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[515] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[518] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[506] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[563] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[557] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[426] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[499] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[237] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[481] ERROR: The POPEN timed out
03/25/2009 09:25:23 AM - SPINE: Poller[0] Host[468] ERROR: The POPEN timed out
03/25/2009 09:25:23 AM - SPINE: Poller[0] Host[470] ERROR: The POPEN timed out
03/25/2009 09:25:20 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:25:16 AM - SPINE: Poller[0] Host[217] ERROR: The POPEN timed out
03/25/2009 09:25:16 AM - SPINE: Poller[0] Host[216] ERROR: The POPEN timed out
03/25/2009 09:25:12 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:25:12 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:20:33 AM - SPINE: Poller[0] Host[398] ERROR: The POPEN timed out
03/25/2009 09:20:28 AM - SPINE: Poller[0] Host[425] ERROR: The POPEN timed out
03/25/2009 09:20:26 AM - SPINE: Poller[0] Host[382] ERROR: The POPEN timed out
03/25/2009 09:20:25 AM - SPINE: Poller[0] Host[445] ERROR: The POPEN timed out
03/25/2009 09:20:17 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:20:17 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:20:15 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:20:15 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:12 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:15:39 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:15:32 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:15:28 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:15:26 AM - SPINE: Poller[0] Host[389] ERROR: The POPEN timed out
03/25/2009 09:15:24 AM - SPINE: Poller[0] Host[388] ERROR: The POPEN timed out
03/25/2009 09:15:22 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:15:18 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:15:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:15:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:15:00 AM - SYSTEM STATS: Time:299.1729 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:7145
03/25/2009 09:15:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
03/25/2009 09:10:28 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:10:23 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:10:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:10:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:10:00 AM - SYSTEM STATS: Time:298.8048 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:6890
03/25/2009 09:10:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
03/25/2009 09:05:26 AM - SPINE: Poller[0] Host[564] ERROR: The POPEN timed out
03/25/2009 09:05:24 AM - SPINE: Poller[0] Host[447] ERROR: The POPEN timed out
03/25/2009 09:05:23 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:05:21 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:05:17 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:05:09 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:05:09 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:03:53 AM - SYSTEM STATS: Time:230.9602 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9324


The first two runs might have been skewed because I did not properly kill the spine / php poller processes. When I killed it with the nifty-popen option enabled, I got a ton of errors, but it processed all RRDs and seemed to work properly. The run at 9:05 is the first after the change to spine 0.8.7d, and the run at 9:25 is the first after the change to popen.
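
To compare runs like these without reading the log by eye, a small sketch in Python can pull the fields out of each SYSTEM STATS line. The function names are hypothetical, and the 298-second default is simply the maximum runtime reported by the POLLER lines above, not a value read from any Cacti setting.

```python
import re

STATS_MARKER = "SYSTEM STATS: "


def parse_system_stats(line):
    """Split a 'SYSTEM STATS' log line into a dict of its Key:Value fields.

    Returns None for any other kind of log line.
    """
    if STATS_MARKER not in line:
        return None
    # Parse only the text after the marker so the timestamp is not
    # mistaken for a Key:Value pair.
    fields = line.split(STATS_MARKER, 1)[1]
    return dict(re.findall(r"(\w+):(\S+)", fields))


def near_runtime_limit(stats, limit_seconds=298.0):
    """True when the run used up the whole polling window."""
    return float(stats["Time"]) >= limit_seconds
```

All values are kept as strings because some fields are not numeric (cmd.php runs report `Threads:N/A`).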

My 9:30 run looks very promising though:

Code:
03/25/2009 09:31:39 AM - SYSTEM STATS: Time:97.2300 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9295
03/25/2009 09:30:41 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:30:35 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:30:33 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:30:25 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:30:25 AM - SPINE: Poller[0] Host[205] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[390] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[307] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[373] ERROR: The POPEN timed out
03/25/2009 09:30:20 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:30:20 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:30:19 AM - SPINE: Poller[0] Host[5] ERROR: The POPEN timed out
03/25/2009 09:30:19 AM - SPINE: Poller[0] Host[474] ERROR: The POPEN timed out
03/25/2009 09:28:17 AM - SYSTEM MACTRACK STATS: Time:62.5497 ConcurrentProcesses:7 Devices:1


However, the 9:35 run seemed to have a lot more errors:

Code:
03/25/2009 09:36:01 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:50 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:35:44 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:35:40 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:35:37 AM - SPINE: Poller[0] WARNING: SS[1] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:34 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:35:28 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:35:27 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:35:27 AM - SPINE: Poller[0] Host[33] ERROR: The POPEN timed out
03/25/2009 09:35:26 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:35:23 AM - SPINE: Poller[0] Host[427] ERROR: The POPEN timed out
03/25/2009 09:35:23 AM - SPINE: Poller[0] Host[419] ERROR: The POPEN timed out
03/25/2009 09:35:21 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:21 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:35:20 AM - SPINE: Poller[0] Host[475] ERROR: The POPEN timed out
03/25/2009 09:35:18 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:35:17 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:35:17 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:17 AM - SPINE: Poller[0] Host[33] ERROR: The POPEN timed out
03/25/2009 09:35:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:35:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:07 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:07 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
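
To see which hosts account for most of the POPEN timeouts in a listing like the one above, the entries can be tallied per host. This is a minimal sketch; the function name is hypothetical and the pattern is taken directly from the log lines in this post.

```python
import re
from collections import Counter

POPEN_TIMEOUT = re.compile(r"Host\[(\d+)\] ERROR: The POPEN timed out")


def popen_timeouts_by_host(log_lines):
    """Count 'The POPEN timed out' errors per host ID."""
    counts = Counter()
    for line in log_lines:
        match = POPEN_TIMEOUT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

`popen_timeouts_by_host(open(path))` followed by `.most_common(10)` lists the hosts worth checking first for slow scripts, where `path` is wherever your Cacti log lives.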


-ryan


 Post subject:
PostPosted: Wed Mar 25, 2009 9:05 am 
Cacti Guru User

Joined: Thu Sep 16, 2004 5:53 am
Posts: 5091
Location: United Kingdom
I had given up on spine after it started to take 300 seconds to poll when I switched to a 64-bit CPU, apparently related to the (small) number of down devices at the time. It sounds like this release might help me with that issue... is that likely?

_________________
Weathermap 0.98 is out! & QuickTree 0.2. Superlinks is over there now.
Some Other Cacti tweaks, including strip-graphs, icons and snmp/netflow stuff.
(Let me know if you have UK DevOps or Network Ops opportunities, too!)


 Post subject:
PostPosted: Wed Mar 25, 2009 9:21 am 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
Ryan (skinty),

I need some details on the hosts and scripts in question. My first observation is that for the popens, I need better logging. Beyond that, here are my questions:

SNMP Timeouts:
1) For SNMP Timeouts on Hosts 529 and 477, are these snmpv3 hosts?
2) For those same hosts, if you increase the timeout, does the problem go away?
3) If you reduce the Max OIDs, does the problem go away?

For Re-Index Segmentation Faults:
1) Is HostID 322 an SNMP Host?
2) Is HostID 322 an SNMPv3 Host?
3) For HostID 322, what is the Re-Index Method for the Query?

For POPEN Timeouts:
1) For each host, are the scripts for those hosts notoriously slow?

Once you have answered these, try to find out what is causing the HostID 322 problem so that we can eliminate the segmentation faults. It may be an orphaned item in the poller_reindex table, or you may have the reindex method set to something like Uptime Goes Backwards for a non-SNMP host. I need those details. It's still a bug, but I need to know the fault domain in order to correct it.

Regards,

Larry

 Post subject:
PostPosted: Wed Mar 25, 2009 9:23 am 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
Howie wrote:
I had given up on spine after it started to take 300 seconds to poll when I switched to a 64-bit CPU, apparently related to the (small) number of down devices at the time. It sounds like this release might help me with that issue... is that likely?


That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code:
*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1


If you have 1 minute polling, you may not want to do this though.

Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.


Last edited by TheWitness on Wed Mar 25, 2009 10:58 am, edited 1 time in total.

 Post subject:
PostPosted: Wed Mar 25, 2009 10:54 am 
Cacti Guru User

Joined: Thu Sep 16, 2004 5:53 am
Posts: 5091
Location: United Kingdom
TheWitness wrote:
That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code:
*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1


Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.


Ah, cool. I didn't realise this was possible. I'll try and get it going soon, time permitting of course :roll:

 Post subject:
PostPosted: Wed Mar 25, 2009 10:58 am 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
Howie wrote:
TheWitness wrote:
That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code:
*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1


Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.


Ah, cool. I didn't realise this was possible. I'll try and get it going soon, time permitting of course :roll:


Howie, make sure you change my bad syntax. Should be ">>" and not ">".

Larry

 Post subject:
PostPosted: Wed Mar 25, 2009 2:33 pm 

Joined: Sat Aug 12, 2006 6:47 pm
Posts: 37
For 529 with the popen fix switch
Quote:
03/25/2009 03:20:36 PM - SPINE: Poller[0] Host[529] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
... EDITED OUT BY THEWITNESS ...
03/25/2009 03:20:36 PM - SPINE: Poller[0] Host[529] DS[12657] SCRIPT: /usr/bin/php -q /usr/share/cacti/scripts/cisco_cpu_usage.php 172.16.1.212, feedme, 2, , , 161, 2000 get fiveMin 1, output: 19
03/25/2009 03:20:32 PM - SPINE: Poller[0] Host[529] DEBUG: The POPEN returned the following File Descriptor 29
03/25/2009 03:20:32 PM - SPINE: Poller[0] Host[529] DS[12657] SCRIPT: /usr/bin/php -q /usr/share/cacti/scripts/cisco_cpu_usage.php 172.16.1.212, feedme, 2, , , 161, 2000 get oneMin 1, output: 18


My ping (using SNMP) timeout and SNMP timeout are 400 and 2000 respectively. That just sounds wrong when I type it, so if it is, I can change them. Everything is SNMPv2 at this point, and most of the re-index methods are done through script servers.

322 is a different beast that is only monitored using a tcp ping and graphed with advanced ping 1.3.

I'll provide more info shortly, but I've been in the weeds today.


 Post subject:
PostPosted: Wed Mar 25, 2009 3:31 pm 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
Quote:
322 is a different beast that is only monitored using a tcp ping and graphed with advanced ping 1.3.


Well, that explains the segmentation faults anyway. There are two orphaned poller_reindex entries for host_id 322 that need to be dealt with. I will make a fix in spine to ignore a host that has a reindex query set and does not use SNMP. That will fix the segmentation faults, although there appears to be some database corruption.
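
For anyone who wants to look for such entries themselves, here is a hedged sketch of the idea using Python's built-in sqlite3 module so that it is self-contained. The two-column tables are a simplified stand-in for the real Cacti schema (which lives in MySQL), and treating `snmp_version = 0` as "SNMP not in use" is an assumption to verify against your own host table before relying on it.

```python
import sqlite3

SUSPECT_REINDEX_SQL = """
SELECT pr.host_id, pr.data_query_id
FROM poller_reindex AS pr
LEFT JOIN host AS h ON h.id = pr.host_id
WHERE h.id IS NULL          -- the host row no longer exists
   OR h.snmp_version = 0    -- assumed: 0 means SNMP is not in use
ORDER BY pr.host_id, pr.data_query_id
"""


def suspect_reindex_rows(conn):
    """Return (host_id, data_query_id) for re-index rows with no usable SNMP host."""
    return conn.execute(SUSPECT_REINDEX_SQL).fetchall()


def demo_connection():
    """Build an in-memory database with illustrative rows only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE host (id INTEGER PRIMARY KEY, snmp_version INTEGER);
        CREATE TABLE poller_reindex (host_id INTEGER, data_query_id INTEGER);
        INSERT INTO host VALUES (322, 0), (529, 2);
        INSERT INTO poller_reindex VALUES (322, 1), (322, 19), (529, 1), (999, 1);
    """)
    return conn
```

With the demo rows, the query flags both entries for host 322 (a host without SNMP) and the entry for host 999 (no matching host row), and leaves host 529 alone. The same SELECT should translate to MySQL if the column names match your installation.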

TheWitness

 Post subject:
PostPosted: Wed Mar 25, 2009 4:30 pm 
Cacti Guru User

Joined: Thu Sep 16, 2004 5:53 am
Posts: 5091
Location: United Kingdom
TheWitness wrote:
You can run in R/O mode for a while and simply log the messages. Cron this:

Code:
*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1


Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.


How much output should I expect when running this way? I have logging set to HIGH, and I only get:

Code:
No log handling enabled - turning on stderr logging
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)
(20 or so lines of the same)
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)


It runs very quickly (< 1 minute for sure), but it's not clear that anything is really happening.

 Post subject:
PostPosted: Wed Mar 25, 2009 4:32 pm 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
Add "--stdout" to the options. Sorry.

Larry

 Post subject:
PostPosted: Wed Mar 25, 2009 8:57 pm 
Developer

Joined: Tue May 14, 2002 5:08 pm
Posts: 14863
Location: MI, USA
All,

To those monitoring this thread: after today's feedback, I have made a few new changes to address the reindex issue, as well as a slight alteration to tcp_ping so that a host is designated as up if the TCP connection is refused by the host (in other words, it's up).
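
The tcp_ping rule can be illustrated with a short Python sketch. This is not spine's C implementation, only the same decision expressed with the standard socket module; the function name and the 2-second default timeout are assumptions.

```python
import socket


def tcp_ping(host, port, timeout=2.0):
    """Treat a host as up if the TCP connection succeeds or is actively refused."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True   # handshake completed, so the host is up
    except ConnectionRefusedError:
        return True       # the connection was rejected, so something answered
    except OSError:
        return False      # timeout, unreachable network, name lookup failure
```

A refusal can also come from a firewall sitting in front of the host, so this rule is a heuristic: it answers "did anything respond at that address", not "is the service healthy".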

More feedback please. Still waiting on an SNMPv3 tester.

TheWitness

 Post subject:
PostPosted: Thu Mar 26, 2009 5:27 am 
Cacti Guru User

Joined: Thu Sep 16, 2004 5:53 am
Posts: 5091
Location: United Kingdom
Code:
Time: 18.5671 s, Threads: 15, Hosts: 239


This is how it should be! I'll try dropping it into the live loop in a while, and see if it holds up there too..

 Post subject:
PostPosted: Thu Mar 26, 2009 5:59 am 
Cacti Guru User

Joined: Thu Sep 16, 2004 5:53 am
Posts: 5091
Location: United Kingdom
Howie wrote:
Code:
Time: 18.5671 s, Threads: 15, Hosts: 239


This is how it should be! I'll try dropping it into the live loop in a while, and see if it holds up there too..


Hmm. Nope :(

Code:
03/26/2009 10:47:33 AM - SYSTEM STATS: Time:151.3474 Method:cmd.php Processes:4 Threads:N/A Hosts:239 HostsPerProcess:60 DataSources:13143 RRDsProcessed:4936
03/26/2009 10:55:03 AM - SYSTEM STATS: Time:301.4584 Method:spine Processes:4 Threads:15 Hosts:239 HostsPerProcess:60 DataSources:13143 RRDsProcessed:2711


Am I reading that correctly as being an issue with the rrd updates rather than data collection? I thought the rrd-update process was the same for both cmd.php and spine though...

 Post subject:
PostPosted: Thu Mar 26, 2009 8:27 am 

Joined: Sat Aug 12, 2006 6:47 pm
Posts: 37
Howie,

I was getting some diffs between the two as well. You're looking at almost 50% loss in RRDs being processed; I was getting about a 20% loss.
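
As a quick worked check of that figure, using the RRDsProcessed values from Howie's two SYSTEM STATS lines (4936 with cmd.php, 2711 with spine) and a hypothetical helper name:

```python
def rrd_shortfall(baseline, observed):
    """Fraction of RRD updates missing relative to the baseline run."""
    return (baseline - observed) / baseline


loss = rrd_shortfall(4936, 2711)
print(f"{loss:.1%}")  # prints 45.1%
```

The two runs were a few minutes apart and the spine run hit the runtime limit, so this is a rough comparison and not a controlled measurement.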

TheWitness,

Quote:
That will fix the segmentation faults, although there appears to be some database corruption.


What do you suggest for checking into possible database corruption?

-ryan

