New Cacti Architecture (0.8.8) - RFC Response Location

Anything that you think should be in Cacti.

mrnoodle
Cacti User
Posts: 59
Joined: Sun Apr 02, 2006 3:56 pm

#46 Post by mrnoodle » Wed Dec 31, 2008 9:35 am

The distributed poller design will have to take into account time discrepancies between the machines. The RRD files are updated with one-second accuracy, so either each remote poller will have to be in complete sync with the main storage host, or the timestamp used for the RRD data will have to be generated by the main host and used by the remote pollers for their inserts.
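
A minimal sketch of the second option, assuming the main host hands each remote poller the epoch timestamp for the current cycle. `build_update_arg` is a hypothetical helper, not part of Cacti or Spine; the only existing convention relied on is rrdtool's `timestamp:value[:value...]` update argument.

```python
def build_update_arg(master_timestamp, values):
    """Format an rrdtool-style update argument stamped with the master's time.

    Hypothetical helper for illustration; not an existing Cacti function.
    """
    fields = [str(int(master_timestamp))] + [str(v) for v in values]
    return ":".join(fields)


# Timestamp chosen by the main host for this polling cycle (example value).
master_timestamp = 1230715800

# The remote poller's own clock is 7 seconds fast, but it is never used.
skewed_local_clock = master_timestamp + 7

update_arg = build_update_arg(master_timestamp, [1200, 3400])
print(update_arg)
```

With this approach a skewed clock on a remote poller cannot shift samples in the RRD file, at the cost of one extra value passed from the main host per cycle.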

Along with multiple pollers by location, why not try to distribute the poller load across the polling interval? Currently, every 5 minutes, the poller starts and collects the data for all the hosts in the system, causing a large spike in traffic and system utilization. Once the polling process is completed, the system sits idle until the next polling cycle starts and the process is repeated.

We could even out the load on a single system by dividing all of the hosts among multiple pollers and having each poller run on its own 5-minute interval, offset from the others. If the polling interval is 5 minutes and we have 500 hosts split across five pollers, the first poller would start at 0:00 and collect data from its assigned 100 hosts. The second poller would start at 0:01 and collect data from its assigned 100 hosts, etc.

This would allow the peak I/O load on the poller to be reduced and allow all of the updates, network access and RRD writes to be distributed across the entire polling cycle.
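
The 500-host example can be sketched roughly as follows. This is an illustration only: `stagger_hosts` and the host names are hypothetical, and the five groups are inferred from 500 hosts at 100 hosts per poller.

```python
def stagger_hosts(hosts, interval_seconds=300, group_count=5):
    """Split hosts into groups, each with its own start offset in the interval.

    Hypothetical helper for illustration; not an existing Cacti function.
    """
    slot = interval_seconds // group_count  # seconds between group start times
    groups = []
    for index in range(group_count):
        groups.append({
            "offset_seconds": index * slot,
            # Every group_count-th host, so group sizes differ by at most one.
            "hosts": hosts[index::group_count],
        })
    return groups


hosts = [f"host{n:03d}" for n in range(500)]
schedule = stagger_hosts(hosts)
for group in schedule:
    minutes, seconds = divmod(group["offset_seconds"], 60)
    print(f"start at {minutes}:{seconds:02d} with {len(group['hosts'])} hosts")
```

Each group still polls its hosts once per 5 minutes; only the start time within the cycle differs.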

dononeil
Cacti User
Posts: 194
Joined: Wed Aug 06, 2008 4:45 pm

#47 Post by dononeil » Tue Apr 07, 2009 3:04 pm

tianye wrote: Hello all,

distributed polling was one of the topics discussed during the 3rd European Cacti Community Conference this weekend.
I tried to put together the key points of our brainstorming. Most ideas have already been around in this thread, but I thought it best to give a complete wrap-up of our discussion today, even if that means repeating a lot of things.
The input below came from all participants of the conference, but of course I take full responsibility for errors or missing points :-)

Scenarios
The overall idea would be to have one central server and a number of agents that are just polling hosts.
The central server is used for the graphic interface, that means configuring Cacti and accessing graphs, and for keeping the rrd files.
The polling agents are just polling data (as the name already suggests) and report the results back to the central server.
This mechanism can be useful in some scenarios, for example
- Remote probes
Imagine a data center with your application server and your cacti server. The application is accessed from remote branches and you have to monitor your application from the user's point of view and not just the central system.
A polling agent can be placed in the remote branch to achieve that. This can be helpful for troubleshooting or may even be covered by your SLAs.
- Latency issues
Imagine a central cacti server monitoring devices that are far away, or, in network lingo, where the connection has a high latency.
For only a few devices this might be possible, but with a growing number of devices this can slow down the overall performance of your cacti.
- Remote networks
Sometimes there's a need to monitor devices in a network that is not easily reachable due to routing constraints, e.g. networks with overlapping IP address schemes when using private IP addresses.
Accessing a small number of devices can be achieved through NAT or a tunnel, but this is getting difficult with a large number of devices.
I'm late to this discussion too... What about extending the Boost plugin's 'RRD Server' application to be the interface for accessing the remote RRDs?

Right now Cacti accesses the RRDs directly, but if the underlying architecture were changed to allow it to connect to a socket and interact with the RRDs for creating graphs, like Boost does when it updates the RRDs, then we're halfway there... Other things to do would be to create the extension to add/delete graphs on remote collection 'agents', and to configure them remotely. OR a 'graph only' host could be created that does nothing but query the remote host for its devices/settings and present the graphs through interaction with the RRD server from Boost.

TheWitness
Developer
Posts: 14834
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

#48 Post by TheWitness » Sun May 24, 2009 10:54 pm

All,

Just an update. There have been a few things from my first diagram that have come to light recently. Here are the summaries:

1) Spine 0.8.7d supports the concept of a poller_id
2) Cacti 0.8.8 SVN now supports the concept of a poller

However, there is still quite a bit to do to finish things off. Here are the finishers:

1) QA the Poller concept to confirm operation for plugin developers. Per Reinhard, some plugins may require access to either RRDfiles or poller_output data or have a poller bottom on the remote poller machine.
2) Toy with the concept of a local poller cache and poller output in cases where the master server is not available.
3) Create a remote poller package that includes a poller.php, database.php, config.php, global.php, etc. If deployed as a daemon, how to reconcile Windows vs. Linux. Daemons in PHP are dangerous. (version dependent)
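
As one way to picture item 2, here is a minimal sketch of a local buffer that holds poller output while the master server is unreachable and drains it in order once the master is back. All names (`PollerOutputBuffer`, the `send` callable, the row layout) are hypothetical and not taken from the Cacti or Spine code.

```python
from collections import deque


class PollerOutputBuffer:
    """Queue poller output locally and forward it when the master is reachable.

    Hypothetical class for illustration; `send` is any callable that delivers
    one row to the master and raises ConnectionError when it cannot.
    """

    def __init__(self, send):
        self._send = send
        self._pending = deque()

    def record(self, row):
        """Store a new row, then try to deliver everything queued so far."""
        self._pending.append(row)
        return self.flush()

    def flush(self):
        """Deliver queued rows oldest first; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False  # master still down, keep the rows for later
            self._pending.popleft()
        return True

    def pending(self):
        return len(self._pending)


# Simulated master that is down for the first two samples.
delivered = []
master = {"up": False}


def fake_send(row):
    if not master["up"]:
        raise ConnectionError("master server not available")
    delivered.append(row)


buffer = PollerOutputBuffer(fake_send)
buffer.record((1, "traffic_in", 1230715800, 1200))
buffer.record((1, "traffic_out", 1230715800, 3400))
queued_while_down = buffer.pending()

master["up"] = True
flushed = buffer.flush()
```

A real remote poller would also need the buffer to survive a restart (for example in a local table), which this in-memory sketch does not attempt.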

Larry
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of MacTrack, Boost, CLog, SpikeKill, Platform RTM, DSStats, maintainer of Spine, lots of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Central Plugin Repository
Central Templates Repository


I'm still out there people. Getting excited for Cacti 1.2. I think it will be a great release.

Dyr
Posts: 23
Joined: Mon Sep 10, 2007 5:21 am

#49 Post by Dyr » Tue Dec 08, 2009 6:05 am

Sorry if this is already in the RFC, but is there any chance of seeing an InnoDB database structure with foreign keys and clearer table descriptions/names? Right now it looks too confusing (I was shocked when I found the SQL query for changing from 32-bit to 64-bit counters: about 6-8 JOINs, which looks crazy to me).
Would you like to use the new RRD Caching Daemon?

TheWitness
Developer
Posts: 14834
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

#50 Post by TheWitness » Tue Dec 08, 2009 7:32 am

Caching daemon is in. I am taking a break in mid December (vacation actually) and will be doing some cacti things related to polling and trees.

More details to follow.

TheWitness

cigamit
Developer
Posts: 2790
Joined: Thu Apr 07, 2005 3:29 pm
Location: B/CS Texas

#51 Post by cigamit » Tue Dec 08, 2009 11:10 am

Caching daemon - Looks almost exactly like boost, just written in C.....

Dyr
Posts: 23
Joined: Mon Sep 10, 2007 5:21 am

#52 Post by Dyr » Wed Dec 09, 2009 4:34 am

Nice to hear it!

And what about the DB? ;)

xefil
Cacti User
Posts: 229
Joined: Tue Jun 20, 2006 2:48 am
Location: Italy

#53 Post by xefil » Thu Jan 28, 2010 3:44 am

Good morning! (GMT+2) :)

I'm waiting for this great work because I have to grow my Cacti installation with frontends, a centralized DB, and distributed RRD files, and I think it's better to wait for this kind of better-supported solution than to create something in-house.

Waiting for info or betas to try :)

Is there any news about the scenario?

Simon

[email protected]
Posts: 4
Joined: Thu Feb 14, 2008 11:54 am

Cacti clustered on version 0.8.7e

#54 Post by [email protected] » Tue Feb 09, 2010 7:45 pm

Hi all,

I have adapted some older clustering code to run under the latest version of Cacti. It load balances and should fail over properly. Since it hasn't been tested for a long time or under different conditions, it might not be 100% reliable.

It has been running for a few months now on a production instance I deployed, and overall it is running properly.

Both servers are exchanging the RRD file data properly. We haven't completely tested the fail-over, but it should be working.

After some more testing, I would be happy to have this code included in the core code of Cacti. Since this job was done for the company I am currently working for, it would be nice, if you decide to incorporate this code into the main Cacti code, to mention both me and the company as contributors to the Cacti project.

If anybody is interested, let me know.

Well, looking for testers!!!

Cheers,
Fausto

xefil
Cacti User
Posts: 229
Joined: Tue Jun 20, 2006 2:48 am
Location: Italy

#55 Post by xefil » Wed Feb 10, 2010 11:48 am

Of course it could be interesting. BTW, if this could be implemented in an official release, it would prevent problems with further updates of Cacti itself. I'm searching/waiting for a distributed HA solution to be applied ASAP. Our environment will grow a lot in the next months.

Simon

[email protected]
Posts: 4
Joined: Thu Feb 14, 2008 11:54 am

#56 Post by [email protected] » Thu Feb 11, 2010 5:00 am

Hi All,

Yes, I agree, that's why I am happy to share it.

I will arrange to send a trial version for testing in the coming week or the next.

Then we can start discussing how to integrate it into the core code of Cacti, if this is of interest to the main developers. Could any of the main developers of Cacti comment on it?

Cheers,
Fausto

gandalf
Developer
Posts: 22375
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

#57 Post by gandalf » Thu Feb 11, 2010 3:43 pm

[email protected] wrote: Could any of the main developers of Cacti comment on it?
Try.
R.

[email protected]
Posts: 4
Joined: Thu Feb 14, 2008 11:54 am

#58 Post by [email protected] » Mon Feb 15, 2010 9:29 pm

Hi,

Here is the clustering patch I created, for testing. The attachment contains the patch and a guide for setting up the environment.

Please have a go and let me know if this is something that you are interested in.

There is still room for improvement, as I know, but it is a good starting point for proper cluster support in Cacti.

Cheers,
Fausto
Attachments
cacti-0.8.7e-clustered.zip
(119.67 KiB) Downloaded 2330 times

X-dark
Cacti User
Posts: 111
Joined: Wed Jun 18, 2008 10:21 am
Location: France

#59 Post by X-dark » Mon Aug 30, 2010 9:07 am

What is the progress on that topic since last year? It's really a feature that is lacking when you compare Cacti to other metrology solutions.

I'm currently thinking about a solution where I have several Cacti instances. I would need to develop some kind of frontend to try to make this distribution more or less transparent to the end user. But this solution is obviously not optimal.

Are you still looking for comments on this RFC, or have the technical choices already been made?
Cedric Girard

cewood
Posts: 13
Joined: Thu Mar 25, 2010 11:09 pm
Location: Sydney, Australia


#60 Post by cewood » Wed Oct 27, 2010 12:00 am

X-dark wrote: What is the progress on that topic since last year? It's really a feature that is lacking when you compare Cacti to other metrology solutions.

I'm currently thinking about a solution where I have several Cacti instances. I would need to develop some kind of frontend to try to make this distribution more or less transparent to the end user. But this solution is obviously not optimal.

Are you still looking for comments on this RFC, or have the technical choices already been made?
+1. I too would like to know where this is at. I have a few large environments where I would like to have remote pollers that report back to a centralised instance.

Is this currently possible, and if so could someone point me in the right direction?

Any help is appreciated.

Cheers
Cameron.
