Cacti: official forums and support

spine 0.8.7-SVN segfaults

 
frankfegert



Joined: 16 Nov 2007
Posts: 26
Location: Stuttgart, Germany

Posted: Sat Feb 02, 2008 6:57 pm    Post subject: spine 0.8.7-SVN segfaults

Hello,

I don't know whether this is related to the other spine segfault reports, so I'm starting a new topic.

Running:
- On Solaris 10 w/ net-snmp-5.3.1
- spine-0.8.7 from SVN with the following patches:
Code:

--- ping.c.orig Sat Jan 19 13:48:42 2008
+++ ping.c      Sat Jan 19 14:49:37 2008
@@ -226,7 +226,7 @@
        struct sockaddr_in fromname;
        char   socket_reply[BUFSIZE];
        int    retry_count;
-       char   *cacti_msg = "cacti-monitoring-system";
+       char   *cacti_msg = "cacti-monitoring-system\0";
        int    packet_len;
        int    fromlen;
        int    return_code;
@@ -750,8 +788,11 @@
                sum += *w++;
                nleft -= 2;
        }
-       if (nleft == 1)
-               sum += *(unsigned char*)w;
+    if (nleft == 1) {
+        *(unsigned char *)(&answer) = *(unsigned char *)w ;
+        sum += answer;
+    }
+
        sum = (sum >> 16) + (sum & 0xffff);
        sum += (sum >> 16);
        answer = ~sum;                          /* truncate to 16 bits */


--- poller.c.orig       Sun Jan  6 18:40:14 2008
+++ poller.c    Sun Jan  6 18:40:33 2008
@@ -126,7 +126,7 @@
        char last_snmp_password[50];
        char last_snmp_auth_protocol[5];
        char last_snmp_priv_passphrase[200];
-       char last_snmp_priv_protocol[6];
+       char last_snmp_priv_protocol[7];
        char last_snmp_context[65];

        /* reindex shortcuts to speed polling */
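
For readers following the patches, here is a minimal sketch of what the two less obvious changes address. It is not the spine source. The checksum function shows the odd-trailing-byte handling from the ping.c hunk: the leftover byte is copied into the first byte in memory of a zeroed 16-bit word (assuming `answer` starts at zero, as in the classic BSD ping code), which gives the same result on big-endian SPARC and on little-endian machines. The copy helper shows why a six-character string needs a seven-byte array, as in the poller.c hunk. The names `inet_checksum`, `copy_priv_protocol` and `PRIV_PROTOCOL_LEN` are hypothetical, and "AES128" is only an assumed example value; the post does not say which string overflowed the field.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Internet checksum (RFC 1071 style) over len bytes, summed as native
 * 16-bit words. Intended for packet-sized inputs. Store the result into
 * the packet with memcpy, without byte swapping. */
static uint16_t inet_checksum(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t sum = 0;
    uint16_t word;

    while (len > 1) {
        memcpy(&word, p, 2);    /* safe even if p is not 2-byte aligned */
        sum += word;
        p += 2;
        len -= 2;
    }
    if (len == 1) {
        word = 0;               /* zero padding for the missing byte */
        memcpy(&word, p, 1);    /* odd byte becomes the first byte in memory */
        sum += word;
    }
    sum = (sum >> 16) + (sum & 0xffff);   /* fold the carries back in */
    sum += (sum >> 16);
    return (uint16_t)~sum;                /* truncate to 16 bits */
}

/* Hypothetical field size: six visible characters plus the NUL terminator. */
#define PRIV_PROTOCOL_LEN 7

/* Copy a protocol name into a fixed-size field.
 * Returns 0 on success, -1 if the name did not fit and was truncated. */
static int copy_priv_protocol(char dst[PRIV_PROTOCOL_LEN], const char *src)
{
    int n = snprintf(dst, PRIV_PROTOCOL_LEN, "%s", src);
    return (n >= 0 && n < PRIV_PROTOCOL_LEN) ? 0 : -1;
}
```

With the eight example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7), the in-memory bytes of the returned checksum are 22 0d on either byte order. An odd-length buffer also gives the same checksum as the same buffer padded with one zero byte, which is the property the patched branch restores on big-endian hosts.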


spine bombs out with the following truss output, but only if the ICMP&SNMP downed detection method is chosen and a script query is run. It happens only on every other run, and never with SNMP queries. If I switch the downed detection method to SNMP-only, everything seems fine.

Code:

truss -f -wall -u ::snmp_shutdown /usr/local/bin/spine -R -S -f 6 -l 6
...
17479/1:        waitid(P_PID, 17481, 0xFFBFF490, WEXITED|WTRAPPED) = 0
17479/1:        brk(0x0003ADC8)                                 = 0
17479/1:        _exit(0)
17478/3:        waitid(P_PID, 17479, 0xFE977E88, WEXITED|WTRAPPED) = 0
Host[6] DS[195] SCRIPT: <script> <host>:<port> <parameters> valid, output: status:1 qm:10

17478/3:        write(1, 0xFECF7238, 162)                       = 162
17478/3:           H o s t [ 6 ]   D S [ 1 9 5 ]   S C R I P T :   <script>
17478/3:           <host>:<port> <parameters>
17478/3:              v a l i d ,   o u t p u t :   s t a t u s : 1   q m : 1 0
17478/3:          \n\n
17478/3:        brk(0x000F63E0)                                 = 0
17478/3:        brk(0x000FE3E0)                                 = 0
17478/3:        close(6)                                        = 0
17478/3:        fcntl(7, F_SETFL, FWRITE|FNONBLOCK)             = 0
17478/3:        read(7, 0x000EEC70, 8192)                       Err#11 EAGAIN
17478/3:        fcntl(7, F_SETFL, FWRITE)                       = 0
17478/3:        write(7, "01\0\0\001", 5)                       = 5
17478/3:        shutdown(7, SHUT_RDWR, SOV_DEFAULT)             = 0
17478/3:        close(7)                                        = 0
17478/3:        lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
17478/3:        lwp_exit()
17478/1:        nanosleep(0xFFBFF240, 0x00000000)               = 0
17478/1:        nanosleep(0xFFBFF240, 0x00000000)               = 0
17478/1@1:      -> libnetsnmp:snmp_shutdown(0x1f0e8, 0x0, 0xff000000, 0xff000000)
17478/1:            Incurred fault #6, FLTBOUNDS  %pc = 0xFEC56178
17478/1:              siginfo: SIGSEGV SEGV_MAPERR addr=0x001775F8
17478/1:            Received signal #11, SIGSEGV [caught]
17478/1:              siginfo: SIGSEGV SEGV_MAPERR addr=0x001775F8
17478/1:        lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
17478/1:        sigaction(SIGSEGV, 0xFFBFE3E0, 0xFFBFE480)      = 0
FATAL: Spine Encountered a Segmentation Fault (Spine parent)

17478/1:        write(1, 0xFECF7238, 62)                        = 62
17478/1:           F A T A L :   S p i n e   E n c o u n t e r e d   a   S e g m e
17478/1:           n t a t i o n   F a u l t   ( S p i n e   p a r e n t )\n\n
17478/1:        llseek(3, 0, SEEK_CUR)                          = 2463
17478/1:        _exit(11)


The spine processes started from cron keep hanging around and start piling up. Attaching to them with truss only shows them sleeping.

Has anyone else experienced this? The bugs related to net-snmp's snmp_shutdown all seem to be fixed in my version (5.3.1). I guess I'll try updating net-snmp tomorrow and report back.

Regards,

Frank

frankfegert

Posted: Sun Feb 03, 2008 5:47 pm

Hello,

I updated to net-snmp-5.4.1 today. The problem remains the same: the spine thread bombs out with a segfault from snmp_spine_close()/snmp_shutdown().
Does anyone have any idea what, besides a bug in net-snmp, could be the cause?

Regards,

Frank

fmangeant
Cacti Guru User


Joined: 19 Sep 2003
Posts: 2324
Location: Sophia-Antipolis, France

Posted: Mon Feb 04, 2008 2:48 am

Moving to "Unstable Development Versions".

frankfegert

Posted: Wed Feb 06, 2008 2:35 pm

Rebuilt net-snmp-5.4.1 with debugging symbols and found this in the core dump:

Code:

# gdb /usr/local/bin/spine ../../core.spine.12453
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10"...
Reading symbols from /usr/local/lib/libnetsnmp.so.15...done.
Loaded symbols for /usr/local/lib/libnetsnmp.so.15
Reading symbols from /usr/local/lib/mysql/libmysqlclient_r.so.14...done.
Loaded symbols for /usr/local/lib/mysql/libmysqlclient_r.so.14
Reading symbols from /lib/librt.so.1...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libgen.so.1...done.
Loaded symbols for /lib/libgen.so.1
Reading symbols from /lib/libthread.so.1...
warning: Lowest section in /lib/libthread.so.1 is .dynamic at 00000074 done.
Loaded symbols for /lib/libthread.so.1
Reading symbols from /usr/local/lib/libssl.so.0.9.8...done.
Loaded symbols for /usr/local/lib/libssl.so.0.9.8
Reading symbols from /usr/local/lib/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/local/lib/libcrypto.so.0.9.8
Reading symbols from /lib/libkstat.so.1...done.
Loaded symbols for /lib/libkstat.so.1
Reading symbols from /usr/lib/libz.so.1...done.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libpthread.so.1...
warning: Lowest section in /lib/libpthread.so.1 is .dynamic at 00000074 done.
Loaded symbols for /lib/libpthread.so.1
Reading symbols from /lib/libm.so.2...done.
Loaded symbols for /lib/libm.so.2
Reading symbols from /lib/libsocket.so.1...done.
Loaded symbols for /lib/libsocket.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libc.so.1...done.
Loaded symbols for /lib/libc.so.1
Reading symbols from /usr/local/lib/libgcc_s.so.1...done.
Loaded symbols for /usr/local/lib/libgcc_s.so.1
Reading symbols from /lib/libaio.so.1...done.
Loaded symbols for /lib/libaio.so.1
Reading symbols from /lib/libmd.so.1...done.
Loaded symbols for /lib/libmd.so.1
Reading symbols from /lib/libdl.so.1...
warning: Lowest section in /lib/libdl.so.1 is .hash at 000000b4 done.
Loaded symbols for /lib/libdl.so.1
Reading symbols from /platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1...done.
Loaded symbols for /platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1
Reading symbols from /usr/lib/locale/en_US.ISO8859-1/en_US.ISO8859-1.so.3...done.
Loaded symbols for /usr/lib/locale/en_US.ISO8859-1/en_US.ISO8859-1.so.3
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
Core was generated by `/usr/local/bin/spine -R -S -f 6 -l 6 -V 5'.
Program terminated with signal 11, Segmentation fault.
#0 0xfec56178 in realfree () from /lib/libc.so.1
(gdb) bt
#0 0xfec56178 in realfree () from /lib/libc.so.1
#1 0xfec569a0 in _free_unlocked () from /lib/libc.so.1
#2 0xfec568dc in free () from /lib/libc.so.1
#3 0xff2a60dc in free_enums (spp=0xe19a0) at parse.c:5065
#4 0xff2a6258 in free_partial_tree (tp=0xe2d50, keep_label=0) at parse.c:853
#5 0xff2a63a4 in free_tree (Tree=0xe27c0) at parse.c:878
#6 0xff2a8880 in unload_module_by_ID (modID=48, tree_top=0xe2d50) at parse.c:3999
#7 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0xe2c70) at parse.c:3989
#8 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0xe2c00) at parse.c:3989
#9 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0xe2700) at parse.c:3989
#10 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0xe2880) at parse.c:3989
#11 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x94420) at parse.c:3989
#12 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x943b0) at parse.c:3989
#13 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x94340) at parse.c:3989
#14 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x942d0) at parse.c:3989
#15 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x941c8) at parse.c:3989
#16 0xff2a88d0 in unload_module_by_ID (modID=48, tree_top=0x452d0) at parse.c:3989
#17 0xff2a8ba8 in unload_all_mibs () at parse.c:4071
#18 0xff29adb8 in shutdown_mib () at mib.c:2716
#19 0xff2bbbfc in snmp_shutdown (type=0x1eb08 "spine") at snmp_api.c:872
#20 0x0001586c in snmp_spine_close () at snmp.c:127
#21 0x000139e8 in main (argc=5000, argv=0xffbff320) at spine.c:608
(gdb) frame 3
#3 0xff2a60dc in free_enums (spp=0xe19a0) at parse.c:5065
warning: Source file is more recent than executable.
5065 free(pp->label);
(gdb) list
5060 *spp = NULL;
5061
5062 while (pp) {
5063 npp = pp->next;
5064 if (pp->label)
5065 free(pp->label);
5066 free(pp);
5067 pp = npp;
5068 }
5069 }
(gdb) print pp->next
$1 = (struct enum_list *) 0x0
(gdb) print pp->label
$2 = 0xe19b8 "readOnly"
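
The loop that gdb lists at parse.c:5062-5068 is an ordinary singly linked list teardown. Below is a minimal, self-contained sketch of that pattern, using a simplified stand-in struct rather than the real net-snmp definition (the names `enum_node`, `enum_push` and `enum_free_all` are hypothetical). In the dump, `pp->next` is NULL and `pp->label` points at a readable string, so the node itself looks intact. That would be consistent with the allocator's bookkeeping having been damaged earlier, for example by a buffer overrun or a double free somewhere else, rather than with a bug in the loop, but the trace alone does not prove this.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an enum list node; the field set is an assumption. */
struct enum_node {
    struct enum_node *next;
    int               value;
    char             *label;   /* heap-allocated, owned by the node */
};

/* Prepend a node holding a private copy of label.
 * Returns 0 on success, -1 on allocation failure (list left unchanged). */
static int enum_push(struct enum_node **headp, int value, const char *label)
{
    struct enum_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->label = malloc(strlen(label) + 1);
    if (n->label == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->label, label);
    n->value = value;
    n->next = *headp;
    *headp = n;
    return 0;
}

/* Free every node and its label. The caller's head pointer is cleared
 * first so no dangling pointer to the list survives the call.
 * Returns the number of nodes freed. */
static int enum_free_all(struct enum_node **headp)
{
    struct enum_node *pp = *headp, *npp;
    int freed = 0;

    *headp = NULL;
    while (pp) {
        npp = pp->next;     /* save the link before the node goes away */
        free(pp->label);    /* free(NULL) is a no-op, so no check is needed */
        free(pp);
        pp = npp;
        freed++;
    }
    return freed;
}
```

Calling `enum_free_all` a second time on the same head pointer is harmless because the pointer was reset to NULL. A crash inside `free()` on a node that looks valid, as in the backtrace, usually means the damage happened before this code ran, so a memory checker run over the whole program is the more promising next step.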

I posted this to the net-snmp-users mailing list. Maybe someone there has a clue why this is breaking.

Regards,

Frank

TheWitness
Developer


Joined: 14 May 2002
Posts: 9255
Location: MI, USA

Posted: Wed Feb 06, 2008 4:57 pm

I have some code to commit. This issue is platform specific. My apologies; I've simply been too busy.

TheWitness

TheWitness

Posted: Wed Feb 06, 2008 5:16 pm

Just for laughs, comment out the two functions:

init_snmp() and snmp_shutdown() and see what happens.

TheWitness

frankfegert

Posted: Fri Feb 08, 2008 6:50 am

I commented out three occurrences of init_snmp() and one of snmp_shutdown(). The spine binary seems to work now, without dumping core.
Just in case I misinterpreted the changes: this effectively disables SNMP support within spine, doesn't it?

TheWitness

Posted: Fri Feb 08, 2008 7:01 pm

No. It simply disables some of the internals that slow it down a bit. This could have been a permissions problem. However, please test using SNMP devices and provide feedback.

TheWitness

All times are GMT - 5 Hours