Cacti: official forums and support



New Cacti Architecture (0.8.8) - RFC Response Location
rony
Developer/Forum Admin
Joined: 17 Nov 2003
Posts: 5398
Location: Wisconsin, USA
Posted: Wed Jan 16, 2008 10:09 am

I know from past discussions that this is an issue we have identified, but I'm glad someone brought it up in this thread.

psyber
Joined: 02 Aug 2005
Posts: 46
Posted: Sun Jan 27, 2008 9:08 pm

I can see that you put some thought into this. I'm attempting to as well.
I work in what is essentially a NOC environment, and for us alerting and prediction are everything.
A couple of questions, comments, and observations about the proposed architecture:

Alerting: Where would be the best place to tap in to do alerting in your proposed layout? I would imagine a tap into the rrd data pipeline could provide threshold and aberrant-behavior triggers.

Do you have any specific existing protocols in mind for the rrd update service? XML-RPC, SOAP, ...?

I'm running the scenario of patch day in my head, where each machine is taken out of the equation one at a time, and everything looks OK until we get to the RRD update servers. Does the db cache the polled data until the update server comes back online, then serialize the data to the RRD server to fill out the rrd? How could alerting work in this scenario? Maybe tee some of the data in a sort of RAID scenario, where you could lose one rrd server and still have updates and alerting continue. You would need a sync mechanism to bring the short rrds up to par, and a redistribution mechanism to enable adding another server to the cluster. On the httpd side, you would need a way to determine which of the two rrds to display based on last update.
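
The tee-and-failover idea above could be sketched roughly as follows. This is only an illustration under assumptions: the server names, the replica count of two, the hash-based placement, and the in-memory `last_update` table are all hypothetical and are not part of any Cacti design.

```python
import hashlib

# Hypothetical RRD update servers (placeholder names).
SERVERS = ["rrd1", "rrd2", "rrd3"]
REPLICAS = 2  # assumption: every rrd file is kept on two servers


def replicas_for(rrd_name, servers=SERVERS, copies=REPLICAS):
    """Pick `copies` distinct servers for one rrd file, deterministically."""
    digest = hashlib.md5(rrd_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(copies)]


def tee_update(rrd_name, timestamp, alive, last_update):
    """Send one update to every replica that is up; return who received it."""
    written = []
    for server in replicas_for(rrd_name):
        if server in alive:
            last_update[(server, rrd_name)] = timestamp
            written.append(server)
    return written


def freshest_replica(rrd_name, last_update):
    """What the httpd side could do: display the most recently updated copy."""
    candidates = replicas_for(rrd_name)
    return max(candidates, key=lambda s: last_update.get((s, rrd_name), -1))


def stale_replicas(rrd_name, last_update):
    """Replicas that missed updates and would need the sync mechanism."""
    candidates = replicas_for(rrd_name)
    newest = max(last_update.get((s, rrd_name), -1) for s in candidates)
    return [s for s in candidates if last_update.get((s, rrd_name), -1) < newest]
```

With one of the two replicas down, `tee_update` still writes to the other, `freshest_replica` points the web side at it, and `stale_replicas` names the server that has to catch up once it returns. Redistribution when a server is added is not covered here.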



[Attachment: rrd raid.jpg. Description: rrd1-3 are the servers; colors represent individual rrd files housed on each server.]


bbice
Cacti User
Joined: 13 May 2002
Posts: 71
Posted: Mon Feb 04, 2008 9:51 pm

I'm glad this topic is coming up for discussion. I definitely like the idea of multiple pollers, because we would like to place pollers in multiple geographic locations to keep as much of the polling traffic "local" as possible (less SNMP traffic going over WAN/VPN connections and/or crossing high-latency paths).

One thing we'll want to be careful about, particularly with as many separate DB instances as I see in the diagram, is that SQL database performance (at least in my experience) suffers horribly over high-latency links. This is especially true for apps doing lots of small DB queries instead of a few big queries. (The last time I tested this, it didn't matter what flavor of DB server I used either: accessing a server over a WAN circuit with 30 ms latency dropped the number of transactions per second I could do from tens of thousands to hundreds.)
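
A rough back-of-the-envelope calculation illustrates why small queries hurt so much over a slow link. It assumes a single connection that waits for each reply before sending the next query, and it ignores server execution time and bandwidth; the batch size of 500 rows is an arbitrary, hypothetical figure.

```python
def max_sequential_queries_per_second(round_trip_seconds):
    """Upper bound for one connection that waits for every reply."""
    return 1.0 / round_trip_seconds


def rows_per_second(round_trip_seconds, rows_per_query):
    """The same bound expressed in rows, when each query carries several rows."""
    return rows_per_query * max_sequential_queries_per_second(round_trip_seconds)


wan_rtt = 0.030  # the 30 ms latency mentioned above

# One small query per round trip: about 33 queries/s per connection.
print(round(max_sequential_queries_per_second(wan_rtt), 1))  # 33.3

# Batching 500 rows per query (assumed batch size) over the same link.
print(round(rows_per_second(wan_rtt, 500)))  # 16667
```

Under these assumptions, a handful of parallel connections doing small queries would land in the hundreds of transactions per second, while a few big queries amortize the same round trip over many rows.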

I'd initially thought the way to spread my polling load out over multiple areas would be to have separate stand-alone Cacti instances (say, one in each major office monitoring nearby hosts), but with hooks in the code (maybe a new plugin?) to be able to add references to graphs, hosts, and trees from remote Cacti servers.

But provided having poller groups separated from the rest of the architecture by medium to high latency links doesn't cause problems, this new design looks even better.

andrew2
Cacti User
Joined: 26 Jul 2004
Posts: 64
Location: Cincinnati, OH
Posted: Thu Feb 14, 2008 2:06 pm

Wow, I'm late to the party on this one.

bbice wrote:
One thing we'll want to be careful about, particularly with as many separate DB instances as I see in the diagram is that SQL database performance (at least in my experience) suffers horribly over high-latency links. This is especially true for apps doing lots of small DB queries instead of few big queries.


I agree -- this was the concern that jumped out at me when looking at the diagram. I think the remote poller stations need to be as autonomous as possible, and need to not communicate directly with the central database servers. Not only is latency a major concern, but security is an issue as well. It would be great to have the ability to locate a poller somewhere and not have to worry about setting up a VPN to avoid exposing MySQL logins.

A slight tweak to the proposed design that would help address those issues:

Add a new logical service, let's call it "Poller Control." The poller controller could exist on its own hardware, or hitch a ride on existing hardware with one of the other logical services.

Whenever Cacti configuration changes are made (whether through the CLI or the web interface) one file per poller is created from the database and provided to the poller controller. The poller controller then pushes those config files out to the relevant pollers. To save bandwidth, something like rsync might even be employed to keep poller configs up to date.

The individual pollers then read the config files passed to them by the poller controller and execute their polling as specified.

Instead of passing the results of the polling directly to the database, the pollers write out the results to a CSV file (or whatever is appropriate) which is then passed back to the poller controller. In the event the poller controller can't be reached, the CSV files are cached on the poller until the poller controller becomes reachable again.

The poller controller takes the output (CSV) from the pollers and inserts the data into the database.

That should remove any need for true "real-time" communication between the remote pollers and the rest of the infrastructure, which should solve any latency or link-related problems. Further, all communication between the pollers and the central infrastructure is limited to file transfers of one kind or another, which makes managing the security aspects very simple.
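
The poller-side caching and controller-side insert described above might look something like this minimal sketch. Everything named here is an assumption for illustration: the spool directory, the `results` table and its columns, SQLite standing in for the real database, and the `send` callable standing in for whatever file transfer (rsync, scp, ...) would actually be used.

```python
import csv
import os
import sqlite3
import time
import uuid


def spool_results(spool_dir, rows):
    """Poller side: write one polling cycle to a CSV file in the local spool."""
    name = "%d_%s.csv" % (time.time_ns(), uuid.uuid4().hex)
    path = os.path.join(spool_dir, name)
    tmp = path + ".tmp"
    with open(tmp, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    os.rename(tmp, path)  # publish only complete files
    return path


def flush_spool(spool_dir, send):
    """Poller side: hand spooled files to the controller, oldest first.

    `send(path)` is a hypothetical transport. If it raises OSError, the
    file stays cached on the poller and is retried on the next flush.
    """
    delivered = 0
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".csv"):
            continue
        path = os.path.join(spool_dir, name)
        try:
            send(path)
        except OSError:
            break  # controller unreachable: keep this file and all later ones
        os.remove(path)
        delivered += 1
    return delivered


def ingest_csv(db, path):
    """Controller side: load one CSV file into the database in one transaction."""
    with open(path, newline="") as handle:
        rows = [tuple(row) for row in csv.reader(handle)]
    with db:
        db.executemany(
            "INSERT INTO results (data_id, ts, value) VALUES (?, ?, ?)", rows
        )
    return len(rows)
```

A poller would call `spool_results` after every polling cycle and `flush_spool` whenever it gets a chance; if the controller is down, the files simply accumulate until it is reachable again.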

This is an interesting direction to me. I've been tackling the "remote poller" issue by using a Nagios server with remote NRPE clients to poll data and insert it into RRD files, from which I use Cacti to generate graphs and present the data. This project may allow me to greatly simplify that setup while increasing the overall reliability. I'm excited!

Andrew

ldrolez
Joined: 12 Nov 2007
Posts: 9
Posted: Thu Feb 21, 2008 9:26 am    Post subject: master/slaves cacti patch

Hi!

Here's my small patch for Cacti 0.8.7b that implements a master/slaves architecture:

http://forums.cacti.net/viewtopic.php?p=127122

It modifies only poller.php. The slaves send data by inserting poller_output rows into the master database over TCP port 3306.

Cheers,

Ludo.

driden91
Joined: 20 Jan 2006
Posts: 42
Posted: Thu Oct 02, 2008 7:48 am

Hello,

I would like your opinion on this architecture.

Do you think it is possible to implement this architecture?

For the master/slave poller, I want to use this patch: http://forums.cacti.net/viewtopic.php?p=127122

killshoot
Joined: 26 Mar 2008
Posts: 32
Posted: Fri Oct 03, 2008 4:50 am

Well... performance for RRD updates over NFS is very poor (tried with a NetApp 3050).

@All

My spec: 2,200 appliances, 50,000 ds, and 40,000 graphs
Polling time with spine: ~130 seconds
Dual quad-core Xeon 3 GHz, 4 GB RAM

A distributed architecture is a good thing, BUT FIRST we need to optimize Cacti. I use a dual quad-core Xeon, and my problem is that one core is always at 100% while the others sleep.

Spine is multithreaded, but the other processes (and plugins) are single-threaded. That means MySQL is always accessed by one PHP process, and since MySQL allows at most one CPU per process, one core does the job while the others sleep.

Vince

cigamit
Developer
Joined: 07 Apr 2005
Posts: 934
Location: B/CS Texas
Posted: Tue Oct 07, 2008 3:01 pm

Such is the problem with PHP. PHP is supposedly thread safe, but most of its modules are not.


http://www.php.net/manual/en/install.unix.apache2.php

Quote:
Warning

We do not recommend using a threaded MPM in production with Apache 2. Use the prefork MPM instead, or use Apache 1. For information on why, read the related FAQ entry on using Apache2 with a threaded MPM

gandalf
Developer
Joined: 02 Dec 2004
Posts: 11725
Location: Muenster, Germany
Posted: Wed Oct 08, 2008 1:43 pm

killshoot wrote:
A distributed architecture is a good thing BUT FIRST we need to optimize Cacti. In fact I use a dual quad core xeon and my problem is that one core is always at 100% and the other sleeps.

There will be an optimization in 0.8.7c to distribute data sources equally between different threads. It does not yet take into account the different runtimes of different data sources.
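
To illustrate the difference between an equal-count split and one that takes runtimes into account, here is a small sketch. It is not taken from spine or from the Cacti poller: the function names and the per-source runtime measurements are hypothetical, and the runtime-aware variant is a plain greedy "longest first, least loaded thread" heuristic.

```python
import heapq


def split_evenly(data_sources, threads):
    """Equal-count split (round robin), ignoring how long each source takes."""
    buckets = [[] for _ in range(threads)]
    for index, source in enumerate(data_sources):
        buckets[index % threads].append(source)
    return buckets


def split_by_runtime(runtimes, threads):
    """Greedy split: longest source first, always onto the least loaded thread.

    `runtimes` maps a data source id to its last measured runtime in
    seconds (an assumed measurement, for illustration only).
    """
    heap = [(0.0, i) for i in range(threads)]  # (load so far, thread index)
    heapq.heapify(heap)
    buckets = [[] for _ in range(threads)]
    ordered = sorted(runtimes.items(), key=lambda item: item[1], reverse=True)
    for source, seconds in ordered:
        load, i = heapq.heappop(heap)
        buckets[i].append(source)
        heapq.heappush(heap, (load + seconds, i))
    return buckets


def slowest_thread(buckets, runtimes):
    """A polling cycle takes as long as its most loaded thread."""
    return max(sum(runtimes[s] for s in bucket) for bucket in buckets)
```

When a few data sources are much slower than the rest, the equal-count split can leave one thread with most of the work, which is the case the runtime-aware variant is meant to smooth out.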
Reinhard
All times are GMT - 5 Hours