aldobert
Joined: 04 Sep 2007 Posts: 2 Location: Tallinn, Estonia
|
Posted: Thu Feb 07, 2008 8:59 am Post subject: boost_server.php forgets config |
|
|
I'm trying to set up and test a kind of Cacti cluster. My live setup runs Cacti 0.8.7a+plugininfra with 27k data sources on dual dual-core Opterons, 8 GB of memory, a SAS disk array and so on. The main problem is iowait during RRD updates.
In the first phase I separated MySQL from the main server, but the impact on the main server's load was next to nothing.
Next, I built a test environment where boost_server, boost_poller and the RRD files live on a separate server, which shares the RRD directory with the main (web+poller) server via NFS. This setup seems to work very well: the test environment contains 2k RRDs, the boost poller runs from crontab every 15 minutes, and each run takes less than 2 seconds. However, boost_server's on-demand update breaks after about 20 minutes of running. It starts rejecting connections from the main server:
| Code: | 2008:02:07 12:55:29 - Host Connected '192.168.25.120'
02/07/2008 01:36:07 PM - BOOST SERVER: Poller[0] WARNING: Host '192.168.25.120' Not Authorized. |
I did some debugging, added variable dumps to the authorization code, and found that boost_server.php simply loses the configured value of boost_server_clients: in the dumps below it changes from '192.168.25.120' to '127.0.0.1'.
$config before:
| Code: |
array (
'url_path' => '/cacti_test/',
'cacti_server_os' => 'unix',
'php_snmp_support' => true,
'base_path' => '/home/cacti/cacti_test',
'library_path' => '/home/cacti/cacti_test/lib',
'include_path' => '/home/cacti/cacti_test/include',
'rra_path' => '/home/cacti/cacti_test/rra',
'boost_version' => '1.8',
'config_options_array' =>
array (
'availability_method' => '2',
'ping_method' => '2',
'ping_port' => '23',
'ping_timeout' => '1000',
'ping_retries' => '0',
'snmp_ver' => '1',
'snmp_community' => '',
'snmp_username' => '',
'snmp_password' => '',
'snmp_auth_protocol' => 'MD5',
'snmp_priv_passphrase' => '',
'snmp_priv_protocol' => 'DES',
'snmp_port' => '161',
'snmp_timeout' => '500',
'max_get_size' => '50',
'boost_server_effective_user' => 'cacti',
'boost_server_listen_port' => '9051',
'path_rrdtool' => '/usr/bin/rrdtool',
'boost_path_rrdupdate' => '/usr/bin/rrdupdate',
'path_php_binary' => '/usr/bin/php',
'path_cactilog' => '/home/cacti/cacti_test/log/cacti.log',
'poller_verbosity' => NULL,
'log_verbosity' => '5',
'boost_server_multiprocess' => '1',
'boost_server_clients' => '192.168.25.120',
'log_destination' => '1',
),
'cacti_version' => '0.8.7a',
) |
$config after:
| Code: |
array (
'url_path' => '/cacti_test/',
'cacti_server_os' => 'unix',
'php_snmp_support' => true,
'base_path' => '/home/cacti/cacti_test',
'library_path' => '/home/cacti/cacti_test/lib',
'include_path' => '/home/cacti/cacti_test/include',
'rra_path' => '/home/cacti/cacti_test/rra',
'boost_version' => '1.8',
'config_options_array' =>
array (
'availability_method' => '2',
'ping_method' => '2',
'ping_port' => '23',
'ping_timeout' => '1000',
'ping_retries' => '0',
'snmp_ver' => '1',
'snmp_community' => '',
'snmp_username' => '',
'snmp_password' => '',
'snmp_auth_protocol' => 'MD5',
'snmp_priv_passphrase' => '',
'snmp_priv_protocol' => 'DES',
'snmp_port' => '161',
'snmp_timeout' => '500',
'max_get_size' => '50',
'boost_server_effective_user' => 'cacti',
'boost_server_listen_port' => '9051',
'path_rrdtool' => '/usr/bin/rrdtool',
'boost_path_rrdupdate' => '/usr/bin/rrdupdate',
'path_php_binary' => '/usr/bin/php',
'path_cactilog' => '/home/cacti/cacti_test/log/cacti.log',
'poller_verbosity' => NULL,
'log_verbosity' => '5',
'boost_server_multiprocess' => '1',
'boost_server_clients' => '127.0.0.1',
'log_destination' => '1',
),
'cacti_version' => '0.8.7a',
)
|
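A quick way to spot the difference between two dumps like these is PHP's array_diff_assoc(). This is only a minimal sketch: it assumes the two config_options_array dumps are available as PHP arrays, and the $before/$after/$changed names and the shortened key list are just for illustration.

```php
<?php
// Shortened, illustrative copies of the two config_options_array dumps above.
$before = array(
    'boost_server_listen_port'  => '9051',
    'boost_server_multiprocess' => '1',
    'boost_server_clients'      => '192.168.25.120',
    'log_destination'           => '1',
);
$after = array(
    'boost_server_listen_port'  => '9051',
    'boost_server_multiprocess' => '1',
    'boost_server_clients'      => '127.0.0.1',
    'log_destination'           => '1',
);

// array_diff_assoc() keeps the entries of the first array whose
// key/value pair does not appear in the second one.
$changed = array();
foreach (array_diff_assoc($before, $after) as $key => $old) {
    $changed[$key] = array(
        'before' => $old,
        'after'  => isset($after[$key]) ? $after[$key] : null,
    );
}

print_r($changed);
```

With the values from the dumps, the only key reported is boost_server_clients.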
Another bad thing is that the main server also loses the data from poller_output_boost after an unsuccessful RRD update against boost_server.
I can work around this with some custom hacks, but it would be good to know whether anyone else has built a similar setup and hit the same problems. |
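The kind of workaround meant here could look roughly like the sketch below: a queued row is deleted only after its RRD update is confirmed, and failed rows stay queued for the next run. This is not code from the boost plugin. flush_rows(), $update and $delete are hypothetical names, and an in-memory array stands in for the poller_output_boost table.

```php
<?php
// Sketch only: flush_rows(), $update and $delete are hypothetical,
// not functions from Cacti or the boost plugin.
function flush_rows(array $rows, $update, $delete)
{
    $kept = array();
    foreach ($rows as $row) {
        if (call_user_func($update, $row)) {
            // The update was confirmed, so the queued row can go.
            call_user_func($delete, $row);
        } else {
            // The update failed: keep the row for the next attempt.
            $kept[] = $row;
        }
    }
    return $kept;
}

// In-memory stand-in for the queued rows.
$queue = array(
    array('local_data_id' => 27529, 'rrd_name' => 'rb_traf_rx'),
    array('local_data_id' => 27529, 'rrd_name' => 'rb_traf_tx'),
);

$deleted = array();
// Pretend the update for rb_traf_tx fails.
$update = function ($row) {
    return $row['rrd_name'] === 'rb_traf_rx';
};
$delete = function ($row) use (&$deleted) {
    $deleted[] = $row['rrd_name'];
};

$kept = flush_rows($queue, $update, $delete);
print_r($kept);
```

The point of the design is the ordering: the delete is driven by the result of the update, so a boost_server that is down costs a retry instead of the data.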
|
aldobert
Joined: 04 Sep 2007 Posts: 2 Location: Tallinn, Estonia
|
Posted: Fri Feb 08, 2008 10:35 am Post subject: |
|
|
From other logs I found that boost_server goes completely haywire after that, and I also found the root cause: the MySQL connection was dropped by the idle timeout. I had the MySQL variable wait_timeout set to 900 seconds.
I suggest adding automatic reconnect to the boost server. When it is used only for on-demand RRD updates, it can hit the timeout at any wait_timeout value if nobody uses the Cacti web UI for that long.
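The reconnect idea could be sketched like this. It is a minimal illustration, not the boost plugin's code: query_with_reconnect(), DbError, $connect and $run are hypothetical names, and a fake connection object replaces a real MySQL link so the example runs on its own. The only MySQL facts it relies on are the client error codes 2006 (server has gone away, the code that shows up in cacti.log) and 2013 (lost connection during query).

```php
<?php
// Hypothetical exception type carrying the MySQL client error code.
class DbError extends Exception {}

// Run a query; if the connection turns out to be dead, reconnect and retry.
function query_with_reconnect($connect, $run, &$link, $sql, $maxRetries = 1)
{
    for ($attempt = 0; ; $attempt++) {
        try {
            return call_user_func($run, $link, $sql);
        } catch (DbError $e) {
            // 2006 = server has gone away, 2013 = lost connection during query.
            $lost = in_array($e->getCode(), array(2006, 2013), true);
            if (!$lost || $attempt >= $maxRetries) {
                throw $e;
            }
            // Idle timeout hit: open a fresh connection and try again.
            $link = call_user_func($connect);
        }
    }
}

// Fake connection layer so the sketch is self-contained.
$connects = 0;
$connect = function () use (&$connects) {
    $connects++;
    return (object) array('alive' => true);
};
$run = function ($link, $sql) {
    if (!$link->alive) {
        throw new DbError('MySQL server has gone away', 2006);
    }
    return 'ok: ' . $sql;
};

$link = $connect();
$link->alive = false; // simulate wait_timeout expiring while the server sat idle

$sql = "select value from settings where name='poller_verbosity'";
$result = query_with_reconnect($connect, $run, $link, $sql);
echo $result, "\n";
```

In a real server the same check would have to sit in whatever database layer the process uses, so that every query, including the settings lookups, goes through it.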
Some examples of what happened after the timeout:
boost_server.log (boost_server.php -d -f > $LOG):
| Code: |
2008:02:08 12:05:51 - Host Connected '192.168.25.120'
2008:02:08 12:05:51 - Host Validated '192.168.25.120'
2008:02:08 12:05:51 - RRD Command 'update /home/cacti/cacti_test/rra/cactibe_rb_traf_tx_27529.rrd --template rb_traf_rx:rb_traf_tx 1202472316:64956675963:59680022319'
02/08/2008 12:05:51 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/bin/php, ARGS: plugins/boost/boost_rrdupdate.php 518 /usr/bin/rrdupdate /home/cacti/cacti_test/rra/cactibe_rb_traf_tx_27529.rrd --template rb_traf_rx:rb_traf_tx 1202472316:64956675963:59680022319]
02/08/2008 12:06:01 PM - BOOST SERVER: Poller[0] WARNING: Broken connection detected '192.168.25.120'
|
cacti.log@boost_server:
| Code: |
02/08/2008 12:05:51 PM - CMDPHP: Poller[0] ERROR: SQL Row Failed!, Error:'2006', SQL:"select value from settings where name='poller_verbosity'"
02/08/2008 12:05:51 PM - BOOST: Poller[0] PROGERR: ERRNO:'2' TYPE:'Warning' MESSAGE:'pclose(): 57 is not a valid stream resource' IN FILE:'/home/cacti/cacti_test/lib/rrd.php' LINE NO:'57'
... and thousands of such rows until the disk fills up
|
and cacti.log@frontend server:
| Code: |
02/08/2008 12:06:01 PM - BOOST: Poller[0] ERROR: Timeout detected. Boost server is down. Contact support immediately!!
02/08/2008 12:06:01 PM - BOOST: Poller[0] WARNING: RRD Update Warning 'FAILED' for Local Data ID '27529'
02/08/2008 12:06:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "INSERT INTO `rcaston_1202472351` (local_data_id, rrd_name, time) VALUES ('27529', 'rb_traf_rx', '2008-02-08 12:05:16'), ('27529', 'rb_traf_tx', '2008-02-08 12:05:16') ON DUPLICATE KEY UPDATE local_data_id=VALUES(local_data_id)"
02/08/2008 12:06:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "DELETE FROM `poller_output_boost` USING `poller_output_boost`, `rcaston_1202472351` WHERE rcaston_1202472351.local_data_id=poller_output_boost.local_data_id AND rcaston_1202472351.rrd_name=poller_output_boost.rrd_name AND rcaston_1202472351.time=poller_output_boost.time"
02/08/2008 12:06:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "DROP table `rcaston_1202472351`"
02/08/2008 12:06:01 PM - BOOST: Poller[0] PROGERR: ERRNO:'2' TYPE:'Warning' MESSAGE:'socket_write(): 55 is not a valid Socket resource' IN FILE:'/home/cacti/cacti_test/plugins/boost/setup.php' LINE NO:'1193'
02/08/2008 12:06:01 PM - BOOST: Poller[0] PROGERR: ERRNO:'2' TYPE:'Warning' MESSAGE:'socket_close(): 55 is not a valid Socket resource' IN FILE:'/home/cacti/cacti_test/plugins/boost/setup.php' LINE NO:'1194'
|
|
|
TheWitness Developer
Joined: 14 May 2002 Posts: 9671 Location: MI, USA
|
Posted: Fri Feb 08, 2008 7:09 pm Post subject: |
|
|
Holy crap, nasty dependencies. This is one of my "projects" to get this thing working again.
TheWitness |
|