Cacti: official forums and support



[ HOWTO ] Graphing Holt-Winters Predictive Analysis

 
koaps



Joined: 15 Feb 2007
Posts: 9

Posted: Thu Nov 20, 2008 4:37 pm    Post subject: [ HOWTO ] Graphing Holt-Winters Predictive Analysis

So, I finally worked out tick support using a nice little hack JasperJ worked out for VDEFs.

What follows is for graphing Holt-Winters (HW) enabled RRDs.

I personally use external scripts to create and update RRDs, so I haven't messed with having Cacti actually create them.

First off, you will need to edit include/global_arrays.php and add three new consolidation functions (CFs): HWPREDICT, DEVPREDICT and FAILURES.

It should look like this:

Code:

$consolidation_functions = array(1 =>
        "AVERAGE",
        "MIN",
        "MAX",
        "LAST",
        "HWPREDICT",
        "DEVPREDICT",
        "FAILURES",
        );


Once you have that done, you will be able to create the graph templates and CDEF's.

The CDEF's you will need are:

Upper bounds: cdef=b,c,2,*,+
Lower bounds: cdef=b,c,2,*,-
Ticks: cdef=CURRENT_DATA_SOURCE TICK:CURRENT_DATA_SOURCE#ffffa0:1.0:"Failures\n"

Note that I have hardcoded the math in the bounds to use DEFs b and c, which in my case are HWPREDICT and DEVPREDICT respectively. Your setup could be different if you have more than one DS in your RRD. For network traffic, for instance, you have an in DS and an out DS. In that case you will need to create Upper_In/Lower_In and Upper_Out/Lower_Out CDEFs and put the correct DEF letters in place in each.
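To make the arithmetic behind the two bound CDEFs concrete, here is a minimal sketch in Python that evaluates the same RPN strings. It only illustrates the math (upper = b + 2*c, lower = b - 2*c) and is not how rrdtool or Cacti evaluates CDEFs; the function name eval_rpn and the sample values are made up for the example.

```python
# Minimal RPN evaluator covering only the operators used in the bound CDEFs.
# Illustration of the arithmetic only, not rrdtool's implementation.

def eval_rpn(expr, values):
    """Evaluate a comma-separated RPN expression such as 'b,c,2,*,+'.

    `values` maps DEF names (here b = HWPREDICT, c = DEVPREDICT) to numbers.
    """
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    stack = []
    for token in expr.split(","):
        if token in ops:
            y = stack.pop()              # right operand is on top of the stack
            x = stack.pop()
            stack.append(ops[token](x, y))
        elif token in values:
            stack.append(float(values[token]))
        else:
            stack.append(float(token))   # numeric literal such as "2"
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expr)
    return stack[0]


# Made-up sample values: a prediction of 100 and a predicted deviation of 5.
sample = {"b": 100.0, "c": 5.0}
upper = eval_rpn("b,c,2,*,+", sample)    # b + 2*c
lower = eval_rpn("b,c,2,*,-", sample)    # b - 2*c
print(upper, lower)                      # prints: 110.0 90.0
```

For a two-DS RRD the same expressions apply with whatever DEF letters Cacti assigns to the HWPREDICT and DEVPREDICT items of each DS; check the graph debug output for the actual letters rather than guessing them.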

The Ticks CDEF uses the graph injection hack that Jasper figured out. This could stop working in future versions of Cacti, since it is technically a vulnerability, though I don't think it's a severe one.

Here's some debugging output from the graph so you can see what's going on:

Code:

DEF:a="my.rrd":bps:AVERAGE \
DEF:b="my.rrd":bps:HWPREDICT \
DEF:c="my.rrd":bps:DEVPREDICT \
DEF:d="my.rrd":bps:FAILURES \
CDEF:cdefc=b,c,2,*,+ \
CDEF:cdefd=b,c,2,*,- \
CDEF:cdefe=d TICK:d#ffffa0:1.0:"Failures\n" \
AREA:a#000000FF:"\n"  \
LINE3:b#FFF200B2:""  \
LINE2:cdefc#FF0000B2:""  \
LINE2:cdefd#FF0000B2:""  \
 \
LINE1:d:""


To get the DEF for FAILURES I made a LINE1 item with no color, which gives me DEF d.
After that, I made a comment item with the CF set to FAILURES and the CDEF set to Ticks.

That sets CURRENT_DATA_SOURCE to DEF d, and the tick gets graphed.

Currently the ticks are plotting correctly for me. I have graphs generated both within Cacti and externally, and so far they match. Something is putting my graphs in a failure state right now, so I need to wait until that recovers to see whether the failure ticks stop once I no longer violate my confidence bands.

Let me know if you guys need some more info. I have posted how I did my confidence bands here:

http://forums.cacti.net/viewtopic.php?t=25930&postdays=0&postorder=asc&highlight=hwpredict&start=30

When I get a chance I'll dump my graph templates and post them.



Attachment: S1_A1.png (externally generated graph)
Attachment: graph_image.php.png (Cacti graph with ticks plotted)


brian.nz



Joined: 08 May 2008
Posts: 4
Location: Wellington NZ

Posted: Thu Nov 20, 2008 6:23 pm    Post subject: Holt-Winter

Awesome work.

One question: you mentioned you use custom scripts to create and update RRDs for use with HW.

I have around 4800 RRDs that need to be made HW aware. Any chance of sharing those scripts, or pointing me in the direction of them?

Regards
koaps



Joined: 15 Feb 2007
Posts: 9

Posted: Thu Nov 20, 2008 7:26 pm

There's a perl script to do something like that.

http://internap.dl.sourceforge.net/sourceforge/rrfw/rrdman-1.0.1.tar.gz

http://rrfw.sourceforge.net/rrdman/rrd_hwreapply.pod.html

I have never used it but it might work for ya.

My script does something completely different: it copies 2 GB of random data to our SAN, runs an md5sum at the end to check for corruption, then saves the transfer rate into an RRD and makes a graph.

Just remember a few things. With 4800 RRDs, even if each has only one data source, you will be adding 3 more RRAs on top of any already there (AVERAGE, MAX, MIN or LAST). Updates can take a long time, and with that many files you can easily go past 5 minutes, causing updates to fail.

The other thing is that HW is for trending. If you use the default thresholds and window lengths, you will be looking for 35 minutes of (not necessarily continuous) violations in a 45 minute window, which is not exactly realtime. In my experience it also takes several days of data collection before confidence bounds even show up.

You can see this clearly in the image I posted from Cacti: there are 6 violations (values outside my confidence bounds) before it marks a failure on the 7th (35 minutes total).
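The 35-in-45-minutes figure can be sketched as a sliding window count. The snippet below is a rough Python illustration, assuming a 5 minute step and rrdtool's default failure threshold of 7 violations within a window of 9 observations; the function name failures is made up, and the real bookkeeping lives inside rrdtool's FAILURES RRA.

```python
from collections import deque


def failures(violations, threshold=7, window=9):
    """Return a 0/1 failure flag for each step.

    A step is flagged when at least `threshold` of the last `window`
    observations (including the current one) violated the confidence band.
    """
    recent = deque(maxlen=window)        # keeps only the newest `window` entries
    flags = []
    for v in violations:
        recent.append(1 if v else 0)
        flags.append(1 if sum(recent) >= threshold else 0)
    return flags


STEP_MINUTES = 5                         # assumed polling interval

# Eight consecutive violations: the failure flag first appears on the 7th.
flags = failures([1] * 8)
print(flags)                             # prints: [0, 0, 0, 0, 0, 0, 1, 1]
print(7 * STEP_MINUTES, 9 * STEP_MINUTES)  # prints: 35 45
```

The violations do not have to be consecutive: a pattern such as 1,1,1,0,1,1,1,0,1 also reaches 7 violations within 9 observations and flags a failure on its last step.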

I'm still working on making my script react when it detects a failure. Most likely it will tell Nagios directly, causing a page to go out.

My scripts are in Perl, so I use the RRDs module directly, and it's almost as easy as working with rrdtool on the command line.

To give you an idea, this is how my script creates RRDs:

Code:

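        # Create the RRD only if it does not exist yet.
        if (not -e "$rrd_path/$k.rrd") {
#           print "Creating RRD $rrd_path/$k.rrd\n";
            RRDs::create ("$rrd_path/$k.rrd",
                "DS:$ds:GAUGE:1800:U:U",              # gauge, 1800 s heartbeat, no min/max
                "RRA:AVERAGE:0.5:1:2016",             # 2016 unconsolidated rows
                "RRA:HWPREDICT:1440:0.1:0.0035:288",  # rows:alpha:beta:seasonal period
                );
            my $err = RRDs::error;
            if ($err) {print "problem creating the RRD: $err\n";}
        }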


It first checks whether the RRD exists and, if not, creates it with default HW values:

RRA:HWPREDICT:1440:0.1:0.0035:288

Which I got from:

http://www.usenix.org/events/lisa2000/full_papers/brutlag/brutlag_html/index.html
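For anyone wondering what the numbers in that RRA line mean, here is a small Python sketch that splits the definition into its fields (rows, alpha, beta, seasonal period, in the order rrdcreate expects) and converts the row counts into wall-clock time. The 300 second step is an assumption: it is rrdtool's default when no --step is given, which matches the create call above, but your RRDs may differ. The helper name describe_hwpredict is made up for the example.

```python
STEP_SECONDS = 300  # assumption: rrdtool's default step (no --step given)


def describe_hwpredict(rra, step_seconds=STEP_SECONDS):
    """Split an 'RRA:HWPREDICT:rows:alpha:beta:period' string into its fields."""
    parts = rra.split(":")
    if len(parts) < 6 or parts[0] != "RRA" or parts[1] != "HWPREDICT":
        raise ValueError("not an HWPREDICT RRA definition: " + rra)
    rows = int(parts[2])       # number of predictions kept
    alpha = float(parts[3])    # adaptation rate of the baseline (intercept)
    beta = float(parts[4])     # adaptation rate of the slope
    period = int(parts[5])     # seasonal period, in steps
    return {
        "rows": rows,
        "alpha": alpha,
        "beta": beta,
        "seasonal_period": period,
        "history_days": rows * step_seconds / 86400,
        "season_hours": period * step_seconds / 3600,
    }


info = describe_hwpredict("RRA:HWPREDICT:1440:0.1:0.0035:288")
print(info["history_days"], info["season_hours"])  # prints: 5.0 24.0
```

So with a 5 minute step, 288 steps make a one day seasonal cycle, and 1440 rows hold 5 days of predictions.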

I still have a lot of testing to do to see how the system works, but so far it's working.

We experienced a drive failure in a SAN disk tray, and HW detected the RAID rebuild and marked it as a failure. That rebuild is why my bps dropped from over 200MB/s to less than 100MB/s.

Let me know if that script works or not, I can probably modify my script to do something like what you need.
gandalf
Developer


Joined: 02 Dec 2004
Posts: 12642
Location: Muenster, Germany

Posted: Sat Nov 22, 2008 4:43 am

I will have to do a wrapup of everything you've mentioned here. The ugly one might be the VDEF issue. Then, there will be a good chance to get it into 088.
Reinhard
koaps



Joined: 15 Feb 2007
Posts: 9

Posted: Mon Nov 24, 2008 4:38 pm    Post subject: Another HW behavior note.

On a side note about failure graphs: failures are only applied to current data.

So if you go back and look for a past failure, it will not be marked on the graph.

I have tested this directly with rrdtool, so it's not a side effect of the "hack" per se.
All times are GMT - 5 Hours