rony Developer/Forum Admin
Joined: 17 Nov 2003 Posts: 5365 Location: Wisconsin, USA
|
Posted: Wed Jan 16, 2008 10:09 am Post subject: |
|
|
| I know from past discussions that this is an issue we have identified, but I'm glad someone brought it up in this thread.
|
|
|
 |
psyber
Joined: 02 Aug 2005 Posts: 46
|
Posted: Sun Jan 27, 2008 9:08 pm Post subject: |
|
|
I can see that you put some thought into this. I'm attempting to as well.
I work in what is essentially a NOC environment, and for us alerting and prediction are everything.
A couple of questions/comments/observations about the proposed architecture:
Alerting: Where would be the best place to tap in to do alerting in your prescribed layout? I would imagine a tap into the rrd data pipeline could provide threshold and aberrant-behavior triggers.
Do you have any specific existing protocols in mind for the rrd update service? XML-RPC, SOAP, ...
I'm running a patch-day scenario in my head, where each machine is taken out of the equation one at a time, and everything looks OK until we get to the RRD update servers. Does the DB cache the polled data until the update server comes back online, then serialize the data to the RRD server to fill in the rrd? How could alerting work in this scenario? Maybe tee some of the data in a sort of RAID scenario, where you could lose one rrd server and still have updates and alerting continue. You would need a sync mechanism to bring the short rrds up to par, and a redistribution mechanism to allow adding another server to the cluster. On the httpd side, you would need a way to determine which of the two rrds to display based on last update.
| Description: rrd1-3 are the servers; colors represent the individual rrd files housed on each server. |
|
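To make the "which of the two rrds to display" question concrete, here is a minimal Python sketch of choosing between mirrored copies by last update. Everything in it is an assumption for illustration: the `RrdCopy` type, the server names, and the idea that the web front end already knows each copy's last-update timestamp (in practice this could come from something like `rrdtool last`). It is not part of the proposed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RrdCopy:
    """One mirrored copy of an rrd file (hypothetical type for this sketch)."""
    server: str       # hypothetical server name, e.g. "rrd1"
    last_update: int  # Unix timestamp of the last successful update


def freshest_copy(copies):
    """Return the copy with the most recent update, for the httpd side to graph."""
    if not copies:
        raise ValueError("no copies available")
    return max(copies, key=lambda copy: copy.last_update)


def stale_copies(copies, step_seconds):
    """Return copies that are at least one polling step behind the freshest one.

    These are the "short" rrds that a sync mechanism would need to catch up.
    """
    newest = freshest_copy(copies).last_update
    return [copy for copy in copies if newest - copy.last_update >= step_seconds]


if __name__ == "__main__":
    # rrd1 was down for one 300-second polling cycle; rrd2 kept updating.
    copies = [RrdCopy("rrd1", 1201482000), RrdCopy("rrd2", 1201482300)]
    print("display from:", freshest_copy(copies).server)
    print("needs sync:", [copy.server for copy in stale_copies(copies, 300)])
```

The same comparison could drive alerting: if every copy of a file is stale, that is itself worth a trigger.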
|
|
|
 |
bbice Cacti User
Joined: 13 May 2002 Posts: 66
|
Posted: Mon Feb 04, 2008 9:51 pm Post subject: |
|
|
I'm glad this topic is coming up for discussion. I definitely like the idea of multiple pollers, because we would like to place pollers in multiple geographic locations to keep as much of the polling traffic "local" as possible (less SNMP traffic going over WAN/VPN connections and/or crossing high-latency paths).
One thing we'll want to be careful about, particularly with as many separate DB instances as I see in the diagram is that SQL database performance (at least in my experience) suffers horribly over high-latency links. This is especially true for apps doing lots of small DB queries instead of few big queries. (The last time I tested this, the flavor of DB server didn't matter either: accessing a server over a WAN circuit with 30 ms of latency dropped the number of transactions per second I could do from tens of thousands to hundreds.)
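A back-of-the-envelope check of those numbers, as a small Python sketch: if every small query has to wait one full network round trip before the next one is sent, a single connection cannot do more than 1/RTT transactions per second, no matter how fast the server is. The 30 ms figure is from the post; the 0.1 ms LAN round trip and the connection counts are assumptions for illustration only.

```python
def max_serial_tps(round_trip_seconds: float, connections: int = 1) -> float:
    """Upper bound on transactions per second when each transaction
    waits for one full network round trip on its connection."""
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    return connections / round_trip_seconds


if __name__ == "__main__":
    # 0.1 ms LAN round trip is an assumed figure; 30 ms is the WAN latency from the post.
    for label, rtt in (("LAN, 0.1 ms", 0.0001), ("WAN, 30 ms", 0.030)):
        for connections in (1, 10):
            tps = max_serial_tps(rtt, connections)
            print(f"{label}, {connections} connection(s): at most {tps:,.0f} tps")
```

At 30 ms that bound is roughly 33 transactions per second per connection, so a handful of parallel connections lands in the "hundreds" range, while the same workload on a LAN is limited by the server rather than by the wire. Batching many small queries into a few large ones sidesteps the bound because the round trip is paid once per batch.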
I'd initially thought the way to spread my polling load out over multiple areas would be to have separate stand-alone Cacti instances (say, one in each major office monitoring nearby hosts), but with hooks in the code (maybe a new plugin?) to be able to add references to graphs, hosts, and trees from a remote Cacti server.
But provided having poller groups separated from the rest of the architecture by medium to high latency links doesn't cause problems, this new design looks even better.
|
|
|
 |
andrew2 Cacti User
Joined: 26 Jul 2004 Posts: 64 Location: Cincinnati, OH
|
Posted: Thu Feb 14, 2008 2:06 pm Post subject: |
|
|
Wow, I'm late to the party on this one.
bbice wrote: "One thing we'll want to be careful about, particularly with as many separate DB instances as I see in the diagram is that SQL database performance (at least in my experience) suffers horribly over high-latency links. This is especially true for apps doing lots of small DB queries instead of few big queries."
I agree -- this was the concern that jumped out at me when looking at the diagram. I think the remote poller stations need to be as autonomous as possible, and need to not communicate directly with the central database servers. Not only is latency a major concern, but security is an issue as well. It would be great to have the ability to locate a poller somewhere and not have to worry about setting up a VPN to avoid exposing MySQL logins.
A slight tweak to the proposed design that would help address those issues:
Add a new logical service, let's call it "Poller Control." The poller controller could exist on its own hardware, or hitch a ride on existing hardware with one of the other logical services.
Whenever Cacti configuration changes are made (whether through the CLI or the web interface) one file per poller is created from the database and provided to the poller controller. The poller controller then pushes those config files out to the relevant pollers. To save bandwidth, something like rsync might even be employed to keep poller configs up to date.
The individual pollers then read the config files passed to them by the poller controller and execute their polling as specified.
Instead of passing the results of the polling directly to the database, the pollers write out the results to a CSV file (or whatever is appropriate) which is then passed back to the poller controller. In the event the poller controller can't be reached, the CSV files are cached on the poller until the poller controller becomes reachable again.
The poller controller takes the output (CSV) from the pollers and inserts the data into the database.
That should remove any need for true "real-time" communication between the remote pollers and the rest of the infrastructure, which should solve any latency or link-related problems. Further, all communication between the pollers and the central infrastructure is limited to file transfers of one kind or another, which makes managing the security aspects very simple.
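The poller side of that flow could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the spool directory layout, the timestamp-based file names, the CSV columns, and the `send` callable (a stand-in for whatever file transfer is used, such as rsync or scp) are all hypothetical and not taken from Cacti.

```python
import csv
import tempfile
from pathlib import Path


def spool_results(spool_dir: Path, poll_time: int, rows) -> Path:
    """Write one polling cycle to a CSV file named after its Unix timestamp."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    path = spool_dir / f"{poll_time}.csv"
    with path.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


def flush_spool(spool_dir: Path, send) -> int:
    """Deliver spooled files oldest first and delete each one once delivered.

    `send` is a hypothetical transport callable that raises OSError when the
    poller controller cannot be reached. On the first failure we stop, so the
    undelivered file and all newer ones stay cached on the poller.
    """
    delivered = 0
    for path in sorted(spool_dir.glob("*.csv"), key=lambda p: int(p.stem)):
        try:
            send(path)
        except OSError:
            break
        path.unlink()
        delivered += 1
    return delivered


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        spool = Path(tmp)
        # Assumed columns: data source id, rrd field name, timestamp, value.
        spool_results(spool, 1202999100, [(1, "traffic_in", 1202999100, 42)])
        spool_results(spool, 1202999400, [(1, "traffic_in", 1202999400, 57)])

        def controller_down(path):
            raise ConnectionError("poller controller unreachable")

        print("delivered while down:", flush_spool(spool, controller_down))
        print("delivered after recovery:", flush_spool(spool, lambda path: None))
```

Delivering oldest first keeps the results in time order on the controller side, which matters because rrd updates have to be applied in increasing timestamp order.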
This is an interesting direction to me. I've been tackling the "remote poller" issue by using a Nagios server with remote NRPE clients to poll data and insert it into RRD files, from which I use Cacti to generate graphs and present the data. This project may allow me to greatly simplify that setup while increasing the overall reliability. I'm excited!
Andrew
|
|
|
 |
ldrolez
Joined: 12 Nov 2007 Posts: 9
|
Posted: Thu Feb 21, 2008 9:26 am Post subject: master/slaves cacti patch |
|
|
Hi !
Here's my small patch for Cacti 0.8.7b that implements a master/slaves arch:
http://forums.cacti.net/viewtopic.php?p=127122
It modifies only poller.php. The slaves send data by inserting poller_output lines in the master database using TCP port 3306.
Cheers,
Ludo.
|
|
|
 |
|
|
|