|
|
| Author |
Message |
tekbot
Joined: 07 Jun 2005 Posts: 45 Location: Venice, CA
|
Posted: Fri Dec 07, 2007 12:40 pm Post subject: |
|
|
Settings Amiss (as mentioned above):
Your Cisco CPU usage Data Template appears to be the default, 5 minute averages all around with a Step of 300 and a Heartbeat (presumably) of 600. This is fine, but the fact that you're not getting data on this graph in addition to your interface graphs says to me that the issue is not in the fact that you created NEW RRAs as described in my first post on this thread.
So, the issue lies elsewhere. A few questions I'd like you to answer:
Are you using any plugins (namely boost)?
Are you using the 0.8.7a Spine binary or cmd.php for your poller?
Why are you running the poller as root? (this shouldn't really matter, but it's best practice to run the poller as a cacti user, and make sure the log file and RRA directory are owned by this cacti user).
How many devices in your infrastructure are you polling on this system? A lot? A few? Routers only? Windows / Linux boxes, switches, etc...?
Is this the only cacti installation on this server?
Please answer these questions and post the output from the tail | grep I requested previously and we should have a better idea of what's going on.
Thanks,
tekbot
|
|
| Back to top |
|
 |
schef4711
Joined: 12 Jul 2005 Posts: 19 Location: Argentina
|
Posted: Fri Dec 07, 2007 1:14 pm Post subject: |
|
|
Hi,
| tekbot wrote: | | There are a few things that looked amiss on your Data Template that I'll comment on shortly. Sorry about the delay in getting back to you, it's been a very busy week for me. |
No need to hurry. The delay is not a problem, because at the moment I only use the 1min avg in "test mode", so other work may well be more important.
| tekbot wrote: | | I don't think that's it, Schef, but zoom in on the gaps to verify. If you see 0, it's fine, if you see NaN (which I expect) it's a bug with the poller / Data Template / Configuration. |
You're right. I see "NaN" for Current/Average/Maximum, and 0 Bytes (in the Bytes graph) or 0 mbit in+out (in the 95% graph), when I look inside the gaps.
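One way NaN can end up in an RRD is when two updates arrive further apart than the data source heartbeat, in which case rrdtool treats the interval between them as unknown. The sketch below only illustrates that rule; it is not a diagnosis of this case, and the timestamps and the function name are made up for illustration.

```python
def unknown_intervals(update_times, heartbeat):
    """Return (start, end) pairs where consecutive updates are further
    apart than the heartbeat, i.e. intervals rrdtool would mark unknown."""
    gaps = []
    for prev, cur in zip(update_times, update_times[1:]):
        if cur - prev > heartbeat:
            gaps.append((prev, cur))
    return gaps


# Hypothetical update times in seconds with a 300 s step:
# the updates expected at 900 and 1200 never arrived.
updates = [0, 300, 600, 1500, 1800]
print(unknown_intervals(updates, heartbeat=600))
```

With a heartbeat of 600, a single missed 300 second poll is still tolerated; two missed polls in a row are not.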
| tekbot wrote: |
Do me one other favor when you have a chance: go to your cacti.log directory and run the following for 5-10 polling intervals. |
I can do this, but it will take a little time to pick out the right graphs, because at the moment each router/switch has many graphs. To reduce the output, I will create a new device with only 2 interfaces that shows the problem.
I will be back soon
thx a lot
alex
|
|
 |
schef4711
Joined: 12 Jul 2005 Posts: 19 Location: Argentina
|
Posted: Fri Dec 07, 2007 2:03 pm Post subject: |
|
|
| tekbot wrote: | Settings Amiss (as mentioned above):
Your Cisco CPU usage Data Template appears to be the default, 5 minute averages all around with a Step of 300 and a Heartbeat (presumably) of 600. This is fine, but the fact that you're not getting data on this graph in addition to your interface graphs says to me that the issue is not in the fact that you created NEW RRAs as described in my first post on this thread. So, the issue lies elsewhere. |
Sure, maybe there is another problem, since I changed nothing in that data template, but the real question is where we should look for it. I am thinking more and more that there is a bug in the poller.
| tekbot wrote: |
A few questions I'd like you to answer:
Are you using any plugins (namely boost)?
Is this the only cacti installation on this server?
|
No, it is a fresh installation on a new server with nothing installed before, so no cacti file was upgraded, added, or changed.
| tekbot wrote: |
Are you using the 0.8.7a Spine binary or cmd.php for your poller?
Why are you running the poller as root? (this shouldn't really matter, but it's best practice to run the poller as a cacti user, and make sure the log file and RRA directory are owned by this cacti user).
|
I use cmd.php as my poller. Spine isn't installed on that machine. Sure, the poller shouldn't run as root, and I can change that, but I don't think it is the problem or that changing it will have any effect.
| tekbot wrote: |
How many devices in your infrastructure are you polling on this system? A lot? A few? Routers only? Windows / Linux boxes, switches, etc...?
|
At the moment "localhost" is disabled. I have two Cisco switches (one with 63 graphs, the other with 35), two Cisco routers (23 and 29 graphs), and 3 ZyXEL modems (4 graphs each). So in total 4 Ciscos with 150 graphs and 3 modems with 12 graphs. That is next to nothing for a Dual Opteron 848 with 4GB and 1TB storage.
| tekbot wrote: |
Please answer these questions and post the output from the tail | grep I requested previously and we should have a better idea of what's going on.
|
The "tail -f" output is at http://www.buenosair.es/mrtg/20071207cactilog.txt and covers the last 13 polling runs. DS[71] is the 5min CPU graph; DS[74] (95% graph) and DS[88] (Bytes Total graph) are the graphs with the gaps whose images I posted before.
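To narrow a long poller log down to the three data sources named above, something like the following Python sketch could be used. It assumes only that each relevant line contains a `DS[<id>]` token; the sample lines are invented for illustration and are not taken from the linked log, and the function name is hypothetical.

```python
import re

DS_TOKEN = re.compile(r"DS\[(\d+)\]")


def lines_for_data_sources(log_lines, ds_ids):
    """Keep only the lines that mention one of the given DS[...] ids."""
    wanted = set(ds_ids)
    kept = []
    for line in log_lines:
        match = DS_TOKEN.search(line)
        if match and int(match.group(1)) in wanted:
            kept.append(line)
    return kept


# Invented sample lines, NOT the real cacti.log format.
sample = [
    "Host[3] DS[71] value: 4",
    "Host[3] DS[12] value: 9",
    "Host[5] DS[88] value: U",
]
for line in lines_for_data_sources(sample, [71, 74, 88]):
    print(line)
```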
If you need some other output please let me know.
thx a lot for helping
alex
Last edited by schef4711 on Fri Dec 07, 2007 2:15 pm; edited 1 time in total |
|
 |
krap_rz
Joined: 18 May 2006 Posts: 26 Location: Cyberjaya, Malaysia
|
Posted: Tue Jan 08, 2008 9:03 pm Post subject: |
|
|
Hi tekbot
After reading your guide, I am quite interested in updating my cacti from 0.8.6j to 0.8.7a (waiting for a stable version), and I am looking to change the polling time to less than 5 minutes (maybe 3 or 4 minutes).
But your post has got me thinking about whether to go for 1min polling instead.
It would be much appreciated if you could share some screen captures of your RRAs, console setup, and the adjustments made to the data sources, poller, etc.
I hope you have time for this, as I believe many would like to see how you do it.
I appreciate your help on this. Please advise. Thanks again.
|
|
 |
tekbot
Joined: 07 Jun 2005 Posts: 45 Location: Venice, CA
|
Posted: Thu Jan 24, 2008 8:23 am Post subject: |
|
|
Sorry about the delay in getting back to you guys. Here's a handful of screenshots. The first is of my custom RRA settings. The next is of a modified CPU Data Template. I threw in one of my 10 second graphs as well to show the granularity -- this is a 12 hour view of two 10 second data sources with a CDEF that calculates the Net Gain and Loss. For more detailed information, refer to my earlier posts in this thread.
Hope all this helps!
Attachment 1 (61 KB): Custom RRA Settings for storing 10s, 1m, and 5m graph data as per my first post in this thread.
Attachment 2 (90.76 KB): Modified Data Template for standard CPU Data Source. Note the selected RRAs, Step and Heartbeat values.
Attachment 3 (49.89 KB): 12 hour view of a 10s graph. This graph includes two 10 second data sources, and a CDEF that calculates the Net Gain / Loss.
|
|
 |
marcmo
Joined: 21 Sep 2005 Posts: 25
|
Posted: Mon Feb 25, 2008 2:47 pm Post subject: |
|
|
| If one wanted to experiment with 10 second polling would the Data Source step be 10 and the heartbeat be 20?
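The step/heartbeat pairs quoted elsewhere in this thread (300/600 and 60/120) both use a heartbeat of twice the step. Assuming the same 2x convention is kept, which is an assumption and not something confirmed in this thread for 10 second polling, the arithmetic looks like this in Python (the function name is made up for illustration):

```python
def heartbeat_for(step_seconds, factor=2):
    """Heartbeat under the assumed 'heartbeat = factor x step' convention."""
    return step_seconds * factor


for step in (300, 60, 10):
    print(f"step={step}s -> heartbeat={heartbeat_for(step)}s")
```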
|
|
 |
soloslinger
Joined: 19 Jan 2007 Posts: 25
|
Posted: Tue Mar 25, 2008 12:10 pm Post subject: |
|
|
The part I don't understand about the 1 minute polling is this: if the poller is scheduled in cron to run every 5 minutes, how is data gathered in between those intervals? In other words, if the poller isn't gathering data every 60 seconds, what is? Where do the other 4 samples come from?
soloslinger
|
|
 |
agreusel
Joined: 25 Mar 2008 Posts: 2
|
Posted: Tue Mar 25, 2008 3:33 pm Post subject: |
|
|
| soloslinger wrote: | The part I don't understand about the 1 minute polling is this: if the poller is scheduled in cron to run every 5 minutes, how is data gathered in between those intervals? In other words, if the poller isn't gathering data every 60 seconds, what is? Where do the other 4 samples come from?
soloslinger |
This is confusing me as well...
From what I've gathered, for this to work, you need the following:
- The poller.php entry in the crontab set to */5 (every 5 minutes).
- [Settings -> Poller -> Cron Interval] set to "Every 5 Minutes".
- [Settings -> Poller -> Poller Interval] set to "Every 1 Minute".
If tek, or someone else, could confirm this for me, I'd greatly appreciate it.
|
|
 |
TheWitness Developer
Joined: 14 May 2002 Posts: 9723 Location: MI, USA
|
Posted: Wed Apr 02, 2008 4:41 am Post subject: |
|
|
Quite simply, if you set the cron interval to 5 minutes and the poller interval to 1 minute, the poller will run 5 times and exit.
If you set the cron interval to 5 minutes and the poller interval to 10 seconds, the poller will run 30 times and exit.
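The run counts in the two examples above are simply the cron interval divided by the poller interval. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration and is not part of Cacti, and the divisibility check is an assumption of the sketch rather than documented Cacti behaviour.

```python
def polls_per_cron_run(cron_interval_s, poller_interval_s):
    """Number of polling passes made during one cron-launched poller run."""
    if cron_interval_s % poller_interval_s != 0:
        # Sketch-only assumption: the cron interval is a whole
        # multiple of the poller interval.
        raise ValueError("cron interval must be a multiple of poller interval")
    return cron_interval_s // poller_interval_s


print(polls_per_cron_run(300, 60))  # 5 minute cron, 1 minute poller
print(polls_per_cron_run(300, 10))  # 5 minute cron, 10 second poller
```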
If you ever change a poller interval for an existing data source, you have to delete the corresponding RRD files (sorry, that's an rrdtool limitation).
If you change a poller interval for a data template, you should likely repopulate your poller cache to re-distribute the polling of data sources.
If you were previously polling at 1 minute with a 5 minute RRD to compensate for not having a 64bit counter available, then you have a problem as that was not considered as a part of the design. What I mean by that is that if you have 32bit counters and you poll a device 5 times in 5 minutes to allow RRDtool to store the average of those 5 samples, the design of the poller interval did not take that into account. I suspect that is a corner case as most "high bandwidth" devices are "modern" (net-snmp 5.2++) and otherwise are network electronics which typically support snmpv2/3 and 64bit counters.
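For context on why 32bit counters lead people to poll more often than they store: a counter that wraps more than once between two polls cannot be interpreted correctly. The Python sketch below estimates how long a byte counter takes to wrap at a sustained line rate. It assumes an octet counter (such as ifInOctets) and a constant rate; the function name is hypothetical.

```python
def seconds_to_wrap(bits_per_second, counter_bits=32):
    """Seconds until an octet counter of the given width wraps
    at a constant traffic rate (assumption: the counter counts bytes)."""
    counter_range_bytes = 2 ** counter_bits
    return counter_range_bytes * 8 / bits_per_second


for label, rate in (("100 Mbit/s", 100_000_000), ("1 Gbit/s", 1_000_000_000)):
    print(f"{label}: 32bit counter wraps after about {seconds_to_wrap(rate):.1f} s")
```

At a sustained 1 Gbit/s the 32bit counter wraps in well under a minute, whereas a 64bit counter at the same rate would take centuries.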
TheWitness
|
|
 |
CPF
Joined: 28 Aug 2005 Posts: 25
|
Posted: Thu Apr 17, 2008 5:58 am Post subject: |
|
|
Hi tekbot,
Just a quick post to say thanks. This has proved to be a really useful thread for understanding how the 1 minute poller works.
The last post from TheWitness also really made it click for me.
This 1 minute polling is already giving me greater visibility of my network. See picture for how easy it is to miss some short-lived traffic spikes.
To The Forum Admins - I know it's already a Sticky, but could/should tekbot's post be moved or linked to the 'How To' Section of the Forum?
(I'm also sure that we're all eagerly awaiting the release of version 0.8.8 too)
Thanks to all the Cacti Team.
Graham.
Attachment (69.98 KB): image, no description given.
|
|
 |
niobe Cacti User
Joined: 10 Mar 2008 Posts: 125 Location: Australia
|
Posted: Mon Apr 21, 2008 1:07 pm Post subject: |
|
|
I have my cron and poller intervals set to 1 minute as well as my crontab. I have also created most data sources with interval 60 and heartbeat 120.
After reading this, I'm still unclear on what I am actually missing. Are the rrds being updated every minute for five minutes with the same number?
One reason I do this is so that weather maps are recreated every minute, which works a treat, but I'm confused, as I am pretty sure the numbers change every minute.
|
|
 |
phila
Joined: 07 Mar 2008 Posts: 7
|
Posted: Mon May 12, 2008 4:13 pm Post subject: Not have to recreate graphs |
|
|
Hi all,
Is there a way to avoid recreating the 1500+ graphs that I have if I want 1 min resolution? Of course the old data would stay at the old resolution, but could new data be added at 1 min resolution?
Recreating them all by hand would be such a waste.
Thanks,
|
|
 |
gandalf Developer
Joined: 02 Dec 2004 Posts: 12295 Location: Muenster, Germany
|
Posted: Tue May 13, 2008 2:39 pm Post subject: |
|
|
I'm quite positive that there's no chance. The main problem is the step size/heartbeat. Step size cannot be changed for existing rrd files.
Reinhard
|
|
 |
tmnewton
Joined: 21 May 2008 Posts: 1
|
Posted: Wed May 21, 2008 9:01 am Post subject: Completely confused |
|
|
Please forgive my ignorance. I have read this thread over and over, trying to understand how to get one minute polling working and reflected in my graphs. I understand leaving cron to run every five minutes and setting the poller interval to one minute: cron starts the poller every five minutes, the poller polls the devices once a minute for five minutes, and then the poller process ends. My understanding ends there. In tekbot's long post about the custom RRAs and templates, he states that step is defined as how many polls are required to average the data and enter it into the rrd. So, to get the 10 second granularity he describes, his polling interval has to be set to 10 seconds, right? For his 1 minute average, his step is defined as 1; shouldn't that be 6 (6 polls x 1 minute)? For his five minute average, his step is still defined as 1; shouldn't that be 30 (6 polls x 5 minutes)?
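In rrdtool terms, the time covered by one stored row is the data source step (in seconds) multiplied by the number of steps the RRA consolidates per row. This thread does not show which combination tekbot's templates actually use, so the Python sketch below only lays the two readings side by side; the function name is made up for illustration.

```python
def seconds_per_row(ds_step_seconds, rra_steps):
    """Time span covered by one consolidated row in an RRA."""
    return ds_step_seconds * rra_steps


# Reading 1: one 10 s data source, consolidated by the RRAs.
for steps in (1, 6, 30):
    print(f"step=10s, RRA steps={steps:>2} -> {seconds_per_row(10, steps)}s per row")

# Reading 2: separate data sources whose step already matches the
# target resolution, each stored with RRA steps=1.
for step in (10, 60, 300):
    print(f"step={step}s, RRA steps= 1 -> {seconds_per_row(step, 1)}s per row")
```

Both readings give rows of 10, 60, and 300 seconds, so an RRA step count of 1 is consistent with a 1 minute or 5 minute average whenever the data source step itself is 60 or 300.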
|
|
 |