Allow script to overrule poll time

Gamut
Posts: 3
Joined: Mon Sep 17, 2012 4:07 am

#1 Post by Gamut » Mon Sep 17, 2012 7:45 am

Situation: We gather performance statistics from the perfstat schema in our Oracle databases. In each database the perfstat user takes snapshots at regular intervals (15 minutes is our standard), and each statistic of course carries the time of its snapshot. We sync all of these per-database snapshots to one "mother" database, which is loaded incrementally every 6 hours or so.

Problem: when Cacti polls (via our script), Cacti itself determines the poll time, in cmd.php:

Code:

                        $host_update_time = date("Y-m-d H:i:s"); // for poller update time
(...)
                                        db_execute("insert into poller_output (local_data_id, rrd_name, time, output) values (" . $item["local_data_id"] . ", '" . $item["rrd_name"] . "', '$host_update_time', '" . addslashes($output) . "')");
So there is no way for our script to say that the values it returns belong to a different poll time.

Suggested solution: allow the script to overrule the poll time set by Cacti.

Suggested implementation: switching to a more direct input for rrdupdate (which does accept a timestamp, by the way) seems intrusive, i.e. I do not know what would break if I patched function validate_result (in site/lib/functions.php). Giving special meaning to one output field, say "poll_time", should have low impact and should not break any existing script, I hope. It is conceivable that someone already uses "poll_time" as a field name, though, so the special meaning should only apply when the user has not defined that field.
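To illustrate the idea, here is a minimal standalone sketch of that rule. It assumes the usual "name:value" pairs separated by spaces in the script output; the function name split_poll_time and its parameters are made up for this example and are not part of Cacti.

```php
<?php
// Hypothetical helper, not part of Cacti: split "name:value" pairs from a
// script's output and pull out the reserved "poll_time" field if present.
function split_poll_time(string $output, array $known_fields, int $now): array
{
    $values = [];
    $time = $now; // default: the poll time chosen by the poller

    foreach (preg_split('/\s+/', trim($output), -1, PREG_SPLIT_NO_EMPTY) as $pair) {
        $parts = explode(':', $pair, 2);
        if (count($parts) !== 2) {
            continue; // not a name:value pair, ignore it
        }
        [$name, $value] = $parts;

        if (in_array($name, $known_fields, true)) {
            // A user-defined field always wins, even if it is called poll_time.
            $values[$name] = $value;
        } elseif ($name === 'poll_time' && ctype_digit($value)) {
            // Reserved name not claimed by the user: treat it as a Unix timestamp.
            $time = (int) $value;
        }
    }

    return ['time' => $time, 'values' => $values];
}

// Example: the script reports two statistics plus the snapshot time.
$result = split_poll_time('hits:42 misses:7 poll_time:1347860700', ['hits', 'misses'], time());
echo $result['time'], "\n"; // prints 1347860700
```

If the user has defined a field called poll_time themselves, it ends up in 'values' like any other field and the poller's own time is kept.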

Suggested patch:

Code:

/usr/share/cacti# diff -u site/lib/poller.php-dd20120912BD site/lib/poller.php
--- site/lib/poller.php-dd20120912BD	2012-04-23 04:00:14.000000000 +0200
+++ site/lib/poller.php	2012-09-17 14:42:40.806829762 +0200
@@ -394,6 +394,16 @@
 							}
 
 							$rrd_update_array{$item["rrd_path"]}["times"][$unix_time]{$rrd_field_names{$matches[0]}} = $matches[1];
+						}else{
+							/* This field name is not in use. Perhaps it is special. */
+
+							/* Consider the case where there is an intermediate data gatherer. If the intermediate gatherer does not give real time results (as is highly expected), the poll time 'now' does not make sense. Allow overrides then. */
+							if ($matches[0] === "poll_time") {
+								if (read_config_option("log_verbosity") >= POLLER_VERBOSITY_DEBUG || $debug) {
+									cacti_log ("Found a non-user match, but cacti special match: $matches[0]=|$matches[1]| (name=" . $item["rrd_name"] . ", rrd_path=" . $item["rrd_path"] . ")");
+								}
+								$rrd_update_array{$item["rrd_path"]}["times"][$unix_time]['__actual_unix_time'] = $matches[1];
+							}
 						}
 					}
 				}
@@ -424,6 +434,15 @@
 						$k = 0;
 						$data_ids = array();
 					}
+
+					/* Overwrite poll time if output indicated so. This is a good place, as neither $unix_time nor $item["unix_time"] is used after this point. */
+					if (isset($rrd_update_array{$item["rrd_path"]}["times"][$unix_time]['__actual_unix_time'])) {
+						$actual_unix_time = $rrd_update_array{$item["rrd_path"]}["times"][$unix_time]['__actual_unix_time'];
+						unset($rrd_update_array{$item["rrd_path"]}["times"][$unix_time]['__actual_unix_time']);
+
+						$rrd_update_array{$item["rrd_path"]}["times"][$actual_unix_time] = $rrd_update_array{$item["rrd_path"]}["times"][$unix_time];
+						unset ($rrd_update_array{$item["rrd_path"]}["times"][$unix_time]);
+					}
 				}else{
 					unset($rrd_update_array{$item["rrd_path"]}["times"][$unix_time]);
 				}
Please consider the patch or the idea as a feature request.

phalek
Cacti Guru User
Posts: 2724
Joined: Thu Jan 31, 2008 6:39 am
Location: Kressbronn, Germany

Re: Allow script to overrule poll time

#2 Post by phalek » Mon Sep 17, 2012 8:32 am

If your polling interval is 15 minutes, you could just as well return the same values every time the script is polled. The RRD graphs would then show the same value throughout that 15-minute span.

I understand that you created a custom script that uses the perfstat schema.

What's the issue you are trying to solve here?

FYI, I've already built a solution for a client that updates the RRD file (using rrdupdate) only once every 24 hours, without changing anything in Cacti's code...

There's a topic for updating rrd files externally in the docs section:

http://docs.cacti.net/manual:088:8_rrdt ... dated_rrds
Greetings,
Phalek

Gamut
Posts: 3
Joined: Mon Sep 17, 2012 4:07 am

Re: Allow script to overrule poll time

#3 Post by Gamut » Tue Sep 18, 2012 2:15 am

You understand correctly that I created a custom script to get details from the perfstat schema. (I use it via a Data Input Method.)



Perfstat takes snapshots every 15 minutes, but Cacti does not know __when__ perfstat did that. So when I query the database for the latest snapshot, my timings are inherently off. Consider the case where I poll a vast number of databases: the last poll might start as late as 14m59s after the snapshot was taken. The results would be skewed in time, which in turn might make me misinterpret the graphs.

To add insult to injury, I don't query the perfstat schema itself; I query a database that collects all the perfstat schemas. Since that mother database is loaded incrementally every six hours, the difference between time and values can be even greater: up to six hours :-(.



The problem I am trying to solve is that I want the output of the Cacti script to control what the actual poll time was, instead of Cacti dictating that the values were from timestamp "time()". Solving this will lead to graphs where the time axis is in sync with the value axis.

This post wants something similar btw: http://forums.cacti.net/about6889.html



As for your pointer to "Externally Updated RRDs": I am aware of that document. Unfortunately, I think it is too cumbersome and labour-intensive. A perfstat schema has some 600 "sysstat" statistics per snapshot per database. I could perhaps script the creation of the data template, but binding 600 data sources per database is ridiculous. I'd like to monitor some 100 databases, so I would have to do some 60000 clicks. Arghhh... Externally updated RRDs are a fine hack for smaller graphing environments, but they do not suit my situation. Nevertheless, thanks for the pointer :-).



FWIW1:
The observant reader will notice another problem with my setup: I would like the script to be able to reply that no data was available (__not__ to be counted as unknown), with Cacti disregarding that poll. Furthermore, I'd like a reply to be able to carry more than one result: once the mother database has been incrementally loaded, lots of snapshots will be available. Once Cacti has slurped this data, no more will be available until the next incremental load. No worries: I am working on a patch for this as well.

FWIW2:
The patch I supplied suits my needs just fine :-). (Although I consider it more a hack than a true solution.)

FWIW3:
The patch was done against cacti 0.8.8a-3 (from debian, wheezy/sid).

gandalf
Developer
Posts: 22375
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: Allow script to overrule poll time

#4 Post by gandalf » Tue Sep 18, 2012 2:01 pm

Well, I consider this use case a valid one; as mentioned, this was already asked for.

The solution is a pragmatic one; its drawback has already been mentioned. Another drawback, at least as I see it, is the restriction to scripts. For SNMP polling (e.g. an SNMP Data Query), we could think about a specific <xml_field> that would be handled in a similar way.
People would still have to take care of the correct RRD step size and heartbeat to avoid drops in graphs.
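
As a rough illustration of that last point, here is a hypothetical pre-check (not part of Cacti or RRDtool; the function name and parameters are assumptions). It relies on two RRDtool behaviours: an update must carry a timestamp later than the previous update, and an interval longer than the data source's heartbeat is stored as unknown.

```php
<?php
// Hypothetical pre-check, not part of Cacti or RRDtool: inspect the timestamps
// a script wants to report before they are handed to rrdtool update.
function find_timestamp_problems(array $timestamps, int $last_update, int $heartbeat): array
{
    $problems = [];
    $previous = $last_update;

    foreach ($timestamps as $ts) {
        if ($ts <= $previous) {
            // rrdtool update refuses timestamps that do not move forward.
            $problems[] = "$ts is not later than $previous";
            continue;
        }
        if ($ts - $previous > $heartbeat) {
            // Intervals longer than the heartbeat become unknown in the RRD.
            $problems[] = 'gap of ' . ($ts - $previous) . "s before $ts exceeds heartbeat of {$heartbeat}s";
        }
        $previous = $ts;
    }

    return $problems;
}

// Example: 15-minute snapshots with an assumed heartbeat of 30 minutes.
print_r(find_timestamp_problems([900, 1800, 4500, 4500], 0, 1800));
```

In the example, the jump from 1800 to 4500 is reported as a gap and the repeated 4500 as a timestamp that does not move forward.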

But honestly, I do not like those "magic strings". I see a need to make them user-configurable, e.g. via a pseudo-variable, a checkbox or the like within the Data Input Script definitions (or the XML, respectively), much like host_hostname (which is weird by itself, hehe) ...

R.

Gamut
Posts: 3
Joined: Mon Sep 17, 2012 4:07 am

Re: Allow script to overrule poll time

#5 Post by Gamut » Wed Sep 19, 2012 1:38 pm

I admit that introducing the magic strings is both pragmatic and ugly. No offence taken on your comment :-).

Solving it more elegantly and in a more user-interface-friendly way is a bit out of my league: my patch was my first introduction to both PHP and the Cacti code (although I have been a happy Cacti user for years!). I will give it a shot, but I do not expect to succeed easily, if at all. If I go silent, it is up to you :-P.

Thanks for acknowledging the use case is valid!

gandalf
Developer
Posts: 22375
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: Allow script to overrule poll time

#6 Post by gandalf » Sat Sep 22, 2012 3:42 am

Well, it's not primarily the code that is required, though I honestly admit that code would make accepting this feature much faster. What is needed are the conceptual ideas to make this fit the most general approaches.
To be honest, I have thought about this issue quite often, but your approach never came to my mind. So, given the starting point you have already introduced, I suppose that adding only a bit more to get away from the "fixed string magic" might be enough.
R.
