Cacti: official forums and support

All times are UTC - 5 hours




 Post subject: [Cacti 1.2.1] Either queries or sources but not both?!
Posted: Thu Jan 31, 2019 6:29 pm

Joined: Thu Jun 01, 2017 12:44 pm
Posts: 22
I am currently monitoring just over 60 remote Linux hosts. Many of them are queried both with custom scripts over ssh (think Percona ssh) and with standard SNMP (bandwidth) data queries and data input methods.

Many sites are monitored via these data queries: UPS (ssh/script) and bandwidth (SNMP).
They also use these data input methods: #users (SNMP), #processes (SNMP), and board temps (ssh/script).

The UPS data query gathers three values: battery %charge, line volts, and time left on battery

There is one site that I am having difficulty with. I can either monitor the UPS and board temps or I can monitor bandwidth - but not both. If I start by monitoring UPS and temps and then add bandwidth, monitoring of UPS and temps stops.

The log reveals that the poller runs the UPS data query but retrieves only two of the three values, so it doesn't plot. The poller does not execute the board temp script at all. Bandwidth, however, runs just fine and produces clean plots.

If I delete the bandwidth queries, UPS and temps resume.

There is nothing unusual about the setup for this particular host. There are several other hosts that have the same behavior. The network health is good, low latency, no bandwidth limitations. However, there are many other hosts that have all these data sources monitored without issue.

I have tried creating an entirely new instance of the device and adding new data sources, but I experience the same problem. I've made sure that the poller intervals/cron/spine are all in sync and have rebuilt the poller cache, but to no avail. I can run the scripts manually without issue.
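
A manual run that succeeds does not show how long the script takes, so one quick check is to time it. This is a minimal sketch, assuming Python 3 is available on the Cacti server. `time_command` and the 25-second budget are hypothetical names and values for illustration, not settings taken from this installation.

```python
import subprocess
import sys
import time


def time_command(argv, budget_seconds):
    """Run argv once; return (elapsed_seconds, within_budget, stdout_text)."""
    start = time.monotonic()
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    elapsed = time.monotonic() - start
    return elapsed, elapsed <= budget_seconds, completed.stdout.strip()


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. Replace it with the real
    # invocation, e.g. the query_ups.pl command line exactly as the log shows it.
    demo = [sys.executable, "-c", "print('boardtemp:68.3')"]
    # ASSUMPTION: 25 seconds is an illustrative budget, not a value from this thread.
    elapsed, ok, output = time_command(demo, budget_seconds=25.0)
    print(f"{elapsed:.2f}s within_budget={ok} output={output!r}")
```

Running it as the same user the poller runs as should give timings closer to what spine sees than an interactive shell would.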

Cron is set to 5 min; the spine poller, the data queries, and the input methods are all at 1 min intervals.

I appreciate any suggestions about how to sort out this problem.

Here is a snippet of the log showing UPS and temps executed successfully, before I added the network SNMP bandwidth query, followed by another snippet showing the log after I added the SNMP query:

>>>> UPS and Temps only <<<
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[2] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '192.168.100.2' 'get' 'bcharge' 'Back-UPS_RS_1500G', output: 100.0
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[2] Total Time: 46 Seconds
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[3] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '192.168.100.2' 'get' 'timeleft' 'Back-UPS_RS_1500G', output: 317.5
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[3] Total Time: 46 Seconds
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[1] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '192.168.100.2' 'get' 'linev' 'Back-UPS_RS_1500G', output: 230.0
2019/01/31 12:59:03 - SPINE: Poller[1] Device[75] HT[1] Total Time: 47 Seconds
2019/01/31 12:59:04 - SPINE: Poller[1] Device[75] HT[4] DS[513] SCRIPT: /bin/perl /var/www/html/cacti/scripts/delta_temp.pl '192.168.100.2', output: boardtemp:68.3
2019/01/31 12:59:04 - SPINE: Poller[1] Device[75] HT[4] Total Time: 47 Seconds
2019/01/31 12:59:22 - SPINE: Poller[1] Device[75] SNMP Result: Device responded to SNMP
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[4] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981505)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[4] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981505)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[4] NOTE: There are '1' Polling Items for this Device
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[3] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981532)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[3] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981532)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[3] NOTE: There are '1' Polling Items for this Device
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[2] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981532)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[2] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981532)
2019/01/31 12:59:23 - SPINE: Poller[1] Device[75] HT[2] NOTE: There are '1' Polling Items for this Device
2019/01/31 12:59:24 - SPINE: Poller[1] Device[75] HT[1] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981606)
2019/01/31 12:59:24 - SPINE: Poller[1] Device[75] HT[1] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403975169 < output: 403981606)
2019/01/31 12:59:24 - SPINE: Poller[1] Device[75] HT[1] NOTE: There are '1' Polling Items for this Device

>>> UPS, Temps, AND Bandwidth (note: the poller executes only two of the three UPS queries and ignores the Temps data input) <<<

2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[3] DS[514] SNMP: v3: 161.200.93.136, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, value: 1842333913
2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[3] DS[514] SNMP: v3: 161.200.93.136, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, value: 4265318135
2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[3] Total Time: 2.7 Seconds
2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[4] DS[515] SNMP: v3: 161.200.93.136, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.3, value: 4200248986
2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[4] DS[515] SNMP: v3: 161.200.93.136, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.3, value: 3595183233
2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[4] Total Time: 2.7 Seconds
2019/01/31 13:01:14 - SPINE: Poller[1] Device[75] HT[1] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403986138 < output: 403992582)
2019/01/31 13:01:14 - SPINE: Poller[1] Device[75] HT[1] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403986138 < output: 403992582)
2019/01/31 13:01:14 - SPINE: Poller[1] Device[75] HT[1] NOTE: There are '2' Polling Items for this Device
2019/01/31 13:01:56 - SPINE: Poller[1] Device[75] HT[2] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '161.200.93.136' 'get' 'timeleft' 'Back-UPS_RS_1500G', output: 425.0
2019/01/31 13:01:57 - SPINE: Poller[1] Device[75] HT[1] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '161.200.93.136' 'get' 'linev' 'Back-UPS_RS_1500G', output: 230.0
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] SNMP Result: Device responded to SNMP
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[2] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998672)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[2] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998672)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[2] NOTE: There are '2' Polling Items for this Device
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[4] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998672)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[4] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998672)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[4] NOTE: There are '2' Polling Items for this Device
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[6] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998679)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[6] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998679)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[6] Total Time: 1.7 Seconds
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[5] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998687)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[5] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998687)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[5] Total Time: 1.8 Seconds
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[3] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998691)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[3] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 403992582 < output: 403998691)
2019/01/31 13:02:15 - SPINE: Poller[1] Device[75] HT[3] NOTE: There are '2' Polling Items for this Device
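
To compare the two snippets without reading every line, the log can be summarized per data source. This is a rough sketch, assuming the line shapes shown above; other spine versions or log levels may format lines differently. `summarize` and the two regular expressions are hypothetical helpers, not part of Cacti.

```python
import re
from collections import defaultdict

# ASSUMPTION: these patterns match only the line shapes shown in the snippets above.
SCRIPT_RE = re.compile(
    r"Device\[(?P<device>\d+)\] HT\[\d+\] DS\[(?P<ds>\d+)\] SCRIPT: "
    r"(?P<cmd>.+), output: (?P<output>.*)$"
)
TOTAL_RE = re.compile(
    r"Device\[(?P<device>\d+)\] HT\[\d+\] Total Time: (?P<secs>[\d.]+) Seconds"
)


def summarize(lines):
    """Return (script outputs per (device, DS), slowest Total Time per device)."""
    outputs = defaultdict(list)
    slowest = {}
    for line in lines:
        line = line.rstrip("\n")
        match = SCRIPT_RE.search(line)
        if match:
            key = (int(match["device"]), int(match["ds"]))
            outputs[key].append(match["output"])
            continue
        match = TOTAL_RE.search(line)
        if match:
            device = int(match["device"])
            slowest[device] = max(slowest.get(device, 0.0), float(match["secs"]))
    return dict(outputs), slowest


# Three lines copied from the second snippet above.
SAMPLE = [
    "2019/01/31 13:01:13 - SPINE: Poller[1] Device[75] HT[3] Total Time: 2.7 Seconds",
    "2019/01/31 13:01:56 - SPINE: Poller[1] Device[75] HT[2] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '161.200.93.136' 'get' 'timeleft' 'Back-UPS_RS_1500G', output: 425.0",
    "2019/01/31 13:01:57 - SPINE: Poller[1] Device[75] HT[1] DS[512] SCRIPT: /bin/perl /var/www/html/cacti-1.2.1/scripts/query_ups.pl '161.200.93.136' 'get' 'linev' 'Back-UPS_RS_1500G', output: 230.0",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A data source that lists fewer outputs than it has fields (here DS 512 with two of three) is the one to look at first.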


 Post subject: Re: [Cacti 1.2.1] Either queries or sources but not both?!
Posted: Sun Feb 03, 2019 8:05 am
Cacti Pro User

Joined: Mon Jan 05, 2015 10:10 am
Posts: 753
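Your scripts are taking 40+ seconds per run (see the "Total Time: 46 Seconds" lines in your log). You have to figure that out first, because it is never going to scale. Maybe collect the data asynchronously.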
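
One way to read "get the data async" is to split collection from polling: a separate job (cron, for example) does the slow ssh work and writes the values to a file, and the script the poller calls only reads that file. This is a minimal sketch of that idea, not the actual query_ups.pl logic. `write_cache`, `read_cache`, the cache file name, and the 120-second staleness limit are all hypothetical.

```python
import json
import os
import tempfile
import time


def write_cache(path, values):
    """Collector side: atomically store values together with a timestamp."""
    payload = {"collected_at": time.time(), "values": values}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)
    # Rename into place so a reader never sees a half-written file.
    os.replace(tmp_path, path)


def read_cache(path, field, max_age_seconds):
    """Poller side: return the cached value, or None if missing or stale."""
    try:
        with open(path) as handle:
            payload = json.load(handle)
    except (OSError, ValueError):
        return None
    if time.time() - payload["collected_at"] > max_age_seconds:
        return None
    return payload["values"].get(field)


if __name__ == "__main__":
    # Hypothetical cache location; the values are the ones from the first log snippet.
    cache = os.path.join(tempfile.gettempdir(), "ups_cache_demo.json")
    write_cache(cache, {"bcharge": 100.0, "linev": 230.0, "timeleft": 317.5})
    # ASSUMPTION: 120 seconds is an illustrative staleness limit.
    print(read_cache(cache, "linev", max_age_seconds=120))
```

With this split the poller-side call returns almost immediately, and a stale or missing cache shows up as no value instead of a 40-second stall.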

_________________
Before history, there was a paradise, now dust.

