[HOWTO] Define very BIG rra "without data loss"
gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Sun Oct 02, 2005 1:39 pm

Please find the most current version here
Reinhard
Preface
This tutorial is based on cacti (http://cacti.net), written by Ian Berry, Larry Adams and Tony Roman. Cacti itself is based on rrdtool (http://www.rrdtool.org), written by Tobias Oetiker. The tutorial was written by Reinhard Scheck to help you figure out how to configure cacti for use with a single round robin archive (rra) without consolidation (i.e. without averaging out any data points).

Be warned!
You probably don't really want to do that! Why? One of the inherent features of rrds is that they never grow in size. In other words: when a new rrd is created, all the space it will ever need is allocated at once. See the rrd-beginners tutorial at http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/tut/rrd-beginners.en.html. As usual, you use the information given here at your own risk.

Here we go!
Cacti's logic to generate rrd files works as follows:
  1. create a device (the host that shall be queried)
  2. create a graph for this host (using a graph template or a data query that refers to graph templates)
  3. each graph template refers to a data template
  4. each data template defines one or more data sources
  5. each data template uses one or more round robin archives (rra)
  6. each of the data sources uses the same set of rra's


This tutorial works its way backwards through this list.

Defining a new round robin archive (rra)
For the following, let's assume you are logged in with admin permissions and are using the “console” tab.
1. Go to Management -> Data Sources -> RRAs
2. Click Add to add a new rra

Now fill in the data as follows and SAVE:

Name: choose your own
Consolidation function: AVERAGE (required)
X-Files Factor: always 0.5
Steps: 1 (the number of data points used for consolidation; 1 means no consolidation at all)
Rows: 115200 = 400 days * 24 hours * 12 data points per hour (= 5 min interval)
Timespan: 33,053,184 seconds = about 382 days (used for display only; taken from another cacti rra)
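
The Rows and Timespan values above are plain arithmetic. Here is a minimal sketch in Python, assuming the 300 second polling interval used throughout this tutorial. The function name rows_needed is made up for illustration and is not part of cacti or rrdtool.

```python
def rows_needed(days, step_seconds=300, steps_per_row=1):
    """Number of rra rows needed to cover `days` days of data."""
    seconds_per_row = step_seconds * steps_per_row
    return days * 24 * 3600 // seconds_per_row


# 400 days at one row per 5 min poll (Steps = 1, no consolidation)
print(rows_needed(400))        # 115200

# The Timespan value expressed in days (about 382.56)
print(33053184 / 86400)
```

With Steps set to something other than 1, pass it as steps_per_row: each row then covers several polls, and fewer rows are needed for the same period.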

Define a new data template
For ease of use (yes, I'm lazy), please copy an existing template. Go to Data Templates and check the box to the right of Interface – Traffic:

Then scroll to the bottom of the page, select Duplicate and Go.

You will be prompted for a new name of this template:

Of course, you may choose your own name here. Now it is time to modify this template:

1. You may change the name of the template
2. Select the RRA you just created (don't worry about the other RRAs in this list; they are needed for the next tutorial ...)
Please leave the rest as is and SAVE. Of course, you may define a new data template from scratch. The only thing to keep in mind is to select the appropriate RRA. The data template is now done.

Define a new graph template
Well, you can imagine what comes next. Again, I decided to copy the appropriate graph template. So go to Graph Templates and repeat the steps above for the template Interface – Traffic (bits/sec). It will look like this:

Please pay attention to the next steps! You will have to delete both Graph Item Inputs, as they refer to the “wrong” data source. Please select the red X to the right of Inbound Data Source as well as Outbound Data Source.
Then you will have to add the newly generated data sources. In order to do that, please select each item of the list of Graph Items, one after the other. This will look like:

As Data Source you will choose the appropriate data source you generated in the previous step. Don't forget to do this for each and every item of the Graph Item list. When you're done, scroll to the bottom of the Graph Template definition and SAVE.

Modify Data Query to add Graph Template
This example uses Interface – Traffic Graph Template. This is referenced by the Data Query SNMP – Interface Statistics. Now we're going to add the newly defined Graph Template to this very Data Query.
If you have chosen some other Graph Template, e.g. ucd/net – Load Average, you will skip this step.
The Data Query is modified like this: go to Data Queries and select SNMP – Interface Statistics.

Now Add to see this:

Define a new name for this Associated Graph and CREATE.

Finally: Create Graphs for this Host
Go to Devices and select your favorite device to see the rra in action. If you have modified the SNMP – Interface Statistics Data Query, you may immediately select Create Graphs for this Host to see the following:

Select the interface as you would have done for any Traffic Graph. Then select a graph type from the dropdown list (of course our newly defined Graph Template!) and CREATE. As usual, you will have to wait at least two polling cycles for the graph to be generated and filled with the first value. Don't be impatient! Let it run for a while.
Under the Graphs tab you will notice something like

Well, this looks the same as usual, doesn't it? You may wonder why the Outbound traffic is displayed as negative. This is done by a little CDEF, but it does not matter here. And of course, for the first two days you will not notice anything unusual. This is because the default cacti rra configuration keeps all data points without consolidation for 600 intervals (about 2 days).
Some advice:
Please do not click on the graph too soon. I had to wait some time (I don't remember exactly how long) before clicking gave a result like the next one:

This is already a zoomed image. You will notice that my personal laptop isn't online for the whole day.

Now, where is the trick?
At first, you may wonder why only this one graph is displayed. This is because only one single rra exists, and cacti associates the time interval of a graph with each rra. Only one rra defined gives only one image displayed.
But you may zoom in at any place and will reach down to the 5 min intervals. This is what had to be proved (q.e.d., as the old Romans said).

Something to keep in mind
Space allocation with rrdtool
The space needed is calculated from

  • the number of data sources needed (e.g. traffic in and traffic out form two data sources)
  • the number of rra's needed (e.g. one archive for storing original data points, a second one to hold averaged data points for some weeks, a third for holding averaged data points for some months ...)
  • the number of data points to be stored in each rra
  • some header space

If you omit consolidation (that is: averaging out some data points), you won't “lose data”. But you will lose space!
Example: Store data every 300 seconds for a whole year. This leads to 12 (data points each hour) * 24 (hours per day) * 365 (days per year) = 105120 data points. Each data point occupies 8 bytes, so the whole rrd will take up about 840,960 bytes (plus some header space) for each single data source.
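
The same estimate can be written down as a small calculation. This is only a sketch of the rule of thumb given above (all rows of all rras, times the number of data sources, times 8 bytes). It ignores header space, and the name rrd_payload_bytes is made up for illustration.

```python
BYTES_PER_VALUE = 8  # each stored data point occupies 8 bytes


def rrd_payload_bytes(rows_per_rra, data_sources=1):
    """Rough rrd size without header space.

    rows_per_rra is a list holding the row count of every rra in the file.
    """
    return sum(rows_per_rra) * data_sources * BYTES_PER_VALUE


one_year_rows = 12 * 24 * 365
print(one_year_rows)                                        # 105120
print(rrd_payload_bytes([one_year_rows]))                   # 840960
print(rrd_payload_bytes([one_year_rows], data_sources=2))   # 1681920
```

The last line shows the case of a traffic rrd with two data sources (traffic in and traffic out), which doubles the space needed.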

A closer look at rrd file properties
Please have a look at the file sizes on my computer:
Code:
-rw-r--r--  1 cactiuser cactiuser   94660 Oct  2 19:40 gandalf_traffic_in_17.rrd
-rw-r--r--  1 cactiuser cactiuser 1844056 Oct  2 19:40 gandalf_traffic_in_71.rrd


They belong to the following rrd definitions (see Data Source Debug of that data source):
Code:
/usr/bin/rrdtool create \
/var/www/html/cacti-0.8.6f/rra/gandalf_traffic_in_17.rrd \
--step 300  \
DS:traffic_in:COUNTER:600:0:100000000 \
DS:traffic_out:COUNTER:600:0:100000000 \
RRA:AVERAGE:0.5:1:600 \
RRA:AVERAGE:0.5:6:700 \
RRA:AVERAGE:0.5:24:775 \
RRA:AVERAGE:0.5:288:797 \
RRA:MIN:0.5:1:600 \
RRA:MIN:0.5:6:700 \
RRA:MIN:0.5:24:775 \
RRA:MIN:0.5:288:797 \
RRA:MAX:0.5:1:600 \
RRA:MAX:0.5:6:700 \
RRA:MAX:0.5:24:775 \
RRA:MAX:0.5:288:797 \
RRA:LAST:0.5:1:600 \
RRA:LAST:0.5:6:700 \
RRA:LAST:0.5:24:775 \
RRA:LAST:0.5:288:797


and respectively:
Code:
/usr/bin/rrdtool create \
/var/www/html/cacti-0.8.6f/rra/gandalf_traffic_in_71.rrd \
--step 300  \
DS:traffic_out:COUNTER:600:0:100000000 \
DS:traffic_in:COUNTER:600:0:100000000 \
RRA:AVERAGE:0.5:1:115200


As you will notice, the newly generated rrd is about 20 times the size of the original one (and the original one spans about two years, not only 400 days). So please think twice before using this widely. The performance impact of updating and displaying such rrds in a large installation may be undesirable.
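
To see how much time each rra in the two definitions above covers, multiply steps by rows by the step size. A short sketch, assuming the 300 second step from the listings; coverage_days is a hypothetical helper name.

```python
STEP = 300  # seconds, from "--step 300"


def coverage_days(steps, rows, step_seconds=STEP):
    """Time span covered by one rra, in days."""
    return steps * rows * step_seconds / 86400


# (steps, rows) of the four AVERAGE rras in the default definition
default_rras = [(1, 600), (6, 700), (24, 775), (288, 797)]
for steps, rows in default_rras:
    resolution_min = steps * STEP // 60
    print(f"RRA:AVERAGE:0.5:{steps}:{rows} -> "
          f"{coverage_days(steps, rows):.1f} days at {resolution_min} min resolution")

# The single big rra of the new definition
print(f"RRA:AVERAGE:0.5:1:115200 -> {coverage_days(1, 115200):.1f} days at 5 min resolution")
```

The default definition reaches 797 days, but only at one value per day, while the big rra keeps every 5 min value for 400 days.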

Coming next
This tutorial shows the use of large rras without consolidation and how to define and use them with cacti.
But it does not show how to do this for already existing rrds. Well, there is a way. You will have to use rrdtool resize to change the size of an already defined rra. I'm developing a shell script (yes, linux guy) and some help on how to use it. This will hopefully fill another tutorial.
Done, see http://forums.cacti.net/viewtopic.php?p=43564#43564
You will have noticed some rras in the example above that were not yet explained. I will show how to define another set of rras that use consolidation but hold more data than the default configuration. Done, see below
And last: There may be a more sophisticated approach for using existing Data Templates. A quick introduction is given in the cacti HOWTO forum. Some samples with a real life cacti configuration may go into another tutorial. Done, see http://forums.cacti.net/viewtopic.php?t=9826
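
As a rough idea of what the rrdtool resize call mentioned above looks like, here is a sketch that only builds the command line. It is not the shell script announced above. The file name is taken from the earlier listing, and the helper name resize_command as well as the choice of rra index 0 (the first rra as listed by rrdtool info) are assumptions for illustration.

```python
def resize_command(rrd_file, rra_index, current_rows, wanted_rows):
    """Build the rrdtool call that grows one rra from current_rows to wanted_rows."""
    extra_rows = wanted_rows - current_rows
    if extra_rows <= 0:
        raise ValueError("wanted_rows must be larger than current_rows")
    return ["rrdtool", "resize", rrd_file, str(rra_index), "GROW", str(extra_rows)]


# Grow the first rra (RRA:AVERAGE:0.5:1:600) to 115200 rows
cmd = resize_command("gandalf_traffic_in_17.rrd", 0, 600, 115200)
print(" ".join(cmd))

# To actually run it (needs rrdtool on the PATH; work on a copy of the rrd):
# import subprocess
# subprocess.run(cmd, check=True)
```

rrdtool resize leaves the original file untouched and writes the result to a file named resize.rrd in the current directory, so the new file has to be moved into place afterwards. Each rra has to be resized in a separate call.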

I appreciate any feedback to improve this document.
You will find this document as print optimized PDF here: http://www.team-scheck.de/cacti/big-rra/howto.pdf

Reinhard Scheck aka lvm

[edit]
Graph Template with dependencies added for convenience. You will still have to follow the steps for Data Queries. You must have cacti 0.8.6g at minimum.
[/edit]



cacti_graph_template_1_interface_traffic_bitssec.xml
 Description:
Graph Template based on cacti 0.8.6g, including rra, CDEF, data template and Graph Template

Download
 Filename:  cacti_graph_template_1_interface_traffic_bitssec.xml
 Filesize:  14.26 KB
 Downloaded:  399 Time(s)



Last edited by gandalf on Wed Jun 18, 2008 1:30 pm; edited 11 times in total

BSOD2600
Cacti Moderator
Joined: 08 May 2004
Posts: 6767
Location: WA, USA
Posted: Sun Oct 02, 2005 7:24 pm

Nice guides, you slated to do some FAQ writing?

disorganizer
Joined: 04 Aug 2005
Posts: 47
Posted: Tue Oct 04, 2005 7:24 am

Nice FAQ!
One point:
When using a function other than AVERAGE, the graphs don't work, as they try to access the consolidation function AVERAGE.

So it would be best to leave the function as AVERAGE.

gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Tue Oct 04, 2005 11:59 am

disorganizer wrote:
Nice FAQ!
One point:
When using a function other than AVERAGE, the graphs don't work, as they try to access the consolidation function AVERAGE.

So it would be best to leave the function as AVERAGE.

Thank you for your comment. I've corrected this issue.
Reinhard

gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Thu Oct 13, 2005 12:07 pm

For ease of use I've added another Graph Template similar to the first one.
This time, it includes 3 RRAs (you'll see them already in the examples above).
Reinhard



cacti_graph_template_2_interface_traffic_bitssec.xml
 Description:
Graph Templates referring to 3 newly defined RRAs (see above)

Download
 Filename:  cacti_graph_template_2_interface_traffic_bitssec.xml
 Filesize:  17.7 KB
 Downloaded:  322 Time(s)


dtpiller
Joined: 17 Oct 2005
Posts: 2
Location: Brussels, Belgium
Posted: Mon Oct 17, 2005 8:39 am    Post subject: rrdtool space allocation

Thanks for posting that section about allocating space with rrdtool. We are planning to make an rra with 10 min averages for 1 year, 1 hour averages for 2 years and 30 min averages for 2 years, but before we start with that, we want to know how big the rrd files will be with this rra.
Your example is simple and direct, but I just want to be sure how to calculate the total data points if I use consolidation. Example rrd:

/var/www/html/cacti/rra/test_host_traffic_in_183.rrd \
--step 300 \
DS:traffic_in:COUNTER:600:0:100000000 \
DS:traffic_out:COUNTER:600:0:100000000 \
RRA:AVERAGE:0.5:6:35040 \
RRA:AVERAGE:0.5:2:105120 \
RRA:AVERAGE:0.5:12:17520 \
RRA:MAX:0.5:6:35040 \
RRA:MAX:0.5:2:105120 \
RRA:MAX:0.5:12:17520 \ --> do i calculate the total data points in this RRA by 12 steps*17520? (1 hour average in 2 years)

Help will be much appreciated!

gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Mon Oct 17, 2005 3:14 pm    Post subject: Re: rrdtool space allocation

dtpiller wrote:
Thanks for posting that section about allocating space with rrdtool. We are planning to make an rra of 10min average in 1 year, 1hour average in 2 years and 30min average in 2years but before we start with that, we want to know how big will the rrd files be with this rra.
Your example is simple and direct, but i just want to be sure how to calculate the total data points if i use consolidation. Example rrd:

/var/www/html/cacti/rra/test_host_traffic_in_183.rrd \
--step 300 \
DS:traffic_in:COUNTER:600:0:100000000 \
DS:traffic_out:COUNTER:600:0:100000000 \
RRA:AVERAGE:0.5:6:35040 \
RRA:AVERAGE:0.5:2:105120 \
RRA:AVERAGE:0.5:12:17520 \
RRA:MAX:0.5:6:35040 \
RRA:MAX:0.5:2:105120 \
RRA:MAX:0.5:12:17520 \ --> do i calculate the total data points in this RRA by 12 steps*17520? (1 hour average in 2 years)

Help will be much appreciated!

First: The definition above will not lead to the desired results. Or am I missing something: do you plan to poll every 10 minutes? Or do you want to poll every 5 minutes but consolidate every two polls into one result?
RRA:AVERAGE:0.5:6:35040 reads:
Build an RRA, use AVERAGE as the consolidation function, use 0.5 as x-files-factor, store only the AVERAGE of 6 primary data points as one consolidated data point, and do this for 35040 stored values. If you're polling every 5 min, this will lead to averaging over 30 min. Is this the desired behaviour?
Second: For estimating the size, please add up ALL data points of ALL RRAs (in your case 35040 + 105120 + 17520 + 35040 + 105120 + 17520) and multiply by the number of data sources used (2 in your case). Finally multiply by 8 (the number of bytes each data point occupies) and you'll get a rough estimate (this ignores all header overhead). Or simply create that rrd and you'll see. Well, my calculation is deduced from my findings on Linux platforms; surely Tobi Oetiker knows better.
Third:
Quote:
1hour average in 2 years and 30min average in 2years
I do not understand why you want to store data for the same timeframe in two different resolutions. RRDTOOL graph will always use the best fitting one (see the docs at http://www.rrdtool.org). AFAIK, the 1h average data will simply be ignored ...
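
The reading of an RRA line and the size estimate described above can be checked with a few lines of Python. This is only a sketch: the RRA lines and the 300 second step are taken from the quoted definition, while parse_rra and the variable names are made up for illustration.

```python
STEP = 300            # seconds, from "--step 300"
DATA_SOURCES = 2      # traffic_in and traffic_out
BYTES_PER_VALUE = 8   # bytes per stored data point

rra_lines = [
    "RRA:AVERAGE:0.5:6:35040",
    "RRA:AVERAGE:0.5:2:105120",
    "RRA:AVERAGE:0.5:12:17520",
    "RRA:MAX:0.5:6:35040",
    "RRA:MAX:0.5:2:105120",
    "RRA:MAX:0.5:12:17520",
]


def parse_rra(line):
    """Split 'RRA:CF:xff:steps:rows' into (cf, xff, steps, rows)."""
    _, cf, xff, steps, rows = line.split(":")
    return cf, float(xff), int(steps), int(rows)


total_rows = 0
for line in rra_lines:
    cf, xff, steps, rows = parse_rra(line)
    minutes = steps * STEP // 60            # time covered by one stored value
    days = steps * rows * STEP / 86400      # time covered by the whole rra
    print(f"{line}: {cf} over {minutes} min, covering {days:.0f} days")
    total_rows += rows

estimate = total_rows * DATA_SOURCES * BYTES_PER_VALUE
print(total_rows, estimate)   # 315360 5045760
```

So the rough estimate for this definition is about 5 MB per rrd, header space not included. The loop also shows that each of the three resolutions, as written, covers 730 days.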

And concerning the detailed data, please read my post at http://forums.cacti.net/viewtopic.php?t=9826. Perhaps you'll be happy with the defaults anyway ...
Reinhard

schef4711
Joined: 12 Jul 2005
Posts: 19
Location: Argentina
Posted: Sun Dec 04, 2005 5:48 pm    Post subject: 5min monthly with 95

Hello,

Is there also an easy way to create a template with the 95th Percentile on 5 min averages for monthly and yearly graphs?

Because in the SNMP Data Queries there will not be an Interface for that.

bye alex

gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Wed Dec 07, 2005 3:39 pm    Post subject: Re: 5min monthly with 95

schef4711 wrote:
is there also an easy way to create a template with 95th Percentile in 5min averrage for monthly and yearly ??
The 95th Percentile is not (only) a function of rrdtool but a feature of the cacti code. So I cannot answer this question.
Reinhard

rony
Developer/Forum Admin
Joined: 17 Nov 2003
Posts: 5441
Location: Wisconsin, USA
Posted: Wed Dec 07, 2005 3:43 pm

If the 5 minute data exists for the selected time range, month, year, it will use the 5 minute data to calculate the 95th percentile. Otherwise, it will use the consolidated data for that range.
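
For readers unfamiliar with the term, here is a generic sketch of a 95th percentile calculation over 5 minute samples. It uses the common nearest-rank method and is not taken from the cacti code, which may differ in detail. The function name and the sample values are made up.

```python
def percentile_95(samples):
    """Nearest-rank 95th percentile: the smallest sample with at least 95% of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), done in integer arithmetic
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# 100 made-up 5 minute traffic values: 1, 2, ..., 100
five_minute_values = list(range(1, 101))
print(percentile_95(five_minute_values))   # 95
```

The result depends on which values are fed in: with consolidated (averaged) data, the peaks are flattened before the percentile is taken, which is why it matters whether the 5 minute data still exists for the selected time range.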

thisted
Cacti User
Joined: 15 Apr 2005
Posts: 64
Location: Bay City, MI
Posted: Mon Jan 09, 2006 3:41 pm

Quote:
Timespan: used for displaying 3,305,184 seconds = about 382 days (taken from other cacti rra)


Should be 33,053,184

gandalf
Developer
Joined: 02 Dec 2004
Posts: 12186
Location: Muenster, Germany
Posted: Tue Jan 10, 2006 12:22 pm

thisted wrote:
Should be 33,053,184
I've corrected that issue. Thanks for pointing this out
Reinhard

james0
Joined: 25 Oct 2006
Posts: 1
Posted: Thu Oct 26, 2006 7:36 am

I followed the directions listed on this thread for creating a one year RRA. It seems like it works since I can zoom all the way in to 5 min intervals, but there is one thing I don't understand:

When I first look at the RRA, it shows a whole year, and the Average and MAX are X and Y. When I zoom in on today's data (and today's data is the only data in the RRA so far), the average and max increase. Why is this?

I was hoping that after doing these things I could look at a one year graph and see a realistic average and max that has not been averaged down to nothing. Have I missed a step? Or a concept?


Another example of this is:
I took a clean cacti install and changed all of the default RRAs to use a step of 1, and then multiplied the Rows by the original step value. I realize that this may not seem like an effective use of disk space, but I want to be able to quickly look at what is happening today on the graph without zooming in, and disk space is not an issue for me.
What I ended up with was something with the same problem I first described. Today's graph shows accurate data, and the average and max are X and Y. The next graph, the weekly graph, has a lower average and MAX. Monthly and yearly say NAN, but I can zoom in on today and then values do show up. Where is this averaging happening? I do not understand this behavior.


-James
All times are GMT - 5 Hours. Page 1 of 2.