pierre-luc
Joined: 10 Aug 2005 Posts: 8 Location: Montreal, Canada
Posted: Mon Jun 02, 2008 11:45 am Post subject: NetApp Filer: graphing Performance Stats and IO's (template)
Hello,
Here are the host template and scripts I wrote to graph storage performance for a NetApp filer using the Manage ONTAP SDK 3.0 Perl API.
Graph list:
- LUN: IOPS, latency, data throughput
- Volume: IOPS, latency
- Target interfaces: IOPS
- Filer total IOPS per protocol (FC/iSCSI/NFS/CIFS/...)
See the screenshot.
Combined with the Network-Appliance host template using SNMPv1 that is available on this forum, gathering NetApp SAN performance statistics with Cacti is quite complete.
Requirements:
- Manage ONTAP SDK 3.0 Perl API installed on the Cacti host
- NetApp filer: HTTP enabled
Tested on Cacti version 0.8.7b.
Attachment (image): graph sample of Netapp-ontapsdk template.
Attachment (download): NetApp_OnTap-SDK_cacti-20080602.tgz (256.95 KB), cacti template, scripts and .xml files.
pflaherty
Joined: 17 Mar 2008 Posts: 2
Posted: Mon Jun 02, 2008 2:41 pm Post subject: Good work
I've been playing with the SDK for a few weeks and had a half-working implementation of this when I saw your post. The templates all installed with no trouble. Everything seems to be graphing correctly. I wanted to make a dig at the color scheme, but it's growing on me -=]
Awesome work, you saved me a ton of time.
wazoqaz
Joined: 24 May 2006 Posts: 15 Location: md, us
Posted: Thu Jun 05, 2008 7:18 am
Beautiful!! This was the first I'd heard of the SDK. The installation was simple and the results are great. It is nice to have another view of what is going on inside my filer.
Just a question: would it make sense to replace the other NetApp graphs done via SNMP with similar ones done via the SDK?
Thanks for your hard work.
evilensky
Joined: 05 Jun 2008 Posts: 1
Posted: Thu Jun 05, 2008 2:50 pm
This looks great. Thank you!
o_dupuis
Joined: 18 Mar 2005 Posts: 18 Location: Paris/France
Posted: Wed Jun 11, 2008 6:28 am Post subject: Problem
Hi,
Those templates look great, but I must confess that I couldn't make them work.
First, I discovered that with cactid the full Perl path must be provided in the XML files (query-netapp-ontapsdk-lun.xml...).
I fixed this, but I keep getting a partial result error:
Code:
06/11/2008 12:22:05 AM - CACTID: Poller[0] Host[42] DS[2160] WARNING: Result from SCRIPT not valid. Partial Result: ...
06/11/2008 12:22:05 AM - CACTID: Poller[0] Host[42] DS[2160] SCRIPT: /usr/bin/perl /opt/apache/php/cacti-0.8.7b/scripts/netapp-ontapsdk-perf.pl 10.12.2.3 "cacti" "cacti2008" volume get avg_latency voloracle_1_archive, output: U
06/11/2008 12:22:05 AM - CACTID: Poller[0] Host[42] DEBUG: The POPEN returned the following File Descriptor 16
06/11/2008 12:22:05 AM - CACTID: Poller[0] Host[42] ERROR: Empty result [10.12.2.3]: '/usr/bin/perl /opt/apache/php/cacti-0.8.7b/scripts/netapp-ontapsdk-perf.pl 10.12.2.3 "cacti" "cacti2008" volume get avg_latency voloracle_1_data'
But if I run the script manually, I get the correct answer:
Code:
# /usr/bin/perl /opt/apache/php/cacti-0.8.7b/scripts/netapp-ontapsdk-perf.pl 10.12.2.3 "cacti" "cacti2008" volume get avg_latency voloracle_1_data
8363761585
#
Thx for any help,
Olivier
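One thing that may be worth checking (an assumption, not something confirmed in this thread) is whether the script behaves differently under the poller's environment than in an interactive shell, for example because of a missing library path. The Python sketch below runs a command with a stripped-down environment and checks that the output is a single numeric value rather than "U" or an empty string. The helper names `is_numeric`, `run_like_poller` and `check` are hypothetical, and the PATH-only environment is only a guess at what a poller might provide.

```python
import os
import subprocess
import sys


def is_numeric(text):
    """Return True if text is a single value that parses as a number."""
    try:
        float(text.strip())
        return True
    except ValueError:
        return False


def run_like_poller(argv, env=None, timeout=30):
    """Run argv with a minimal environment; return (stdout, stderr, returncode).

    The default environment keeps only PATH. This is an assumption about
    what the poller provides, not a description of cactid itself.
    """
    if env is None:
        env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    proc = subprocess.run(
        argv, env=env, capture_output=True, text=True, timeout=timeout
    )
    return proc.stdout.strip(), proc.stderr.strip(), proc.returncode


def check(argv):
    """Run argv like the poller would and report non-numeric output on stderr."""
    out, err, rc = run_like_poller(argv)
    if not is_numeric(out):
        print(f"not numeric: out={out!r} err={err!r} rc={rc}", file=sys.stderr)
        return False
    return True
```

Calling `check([...])` with the same argument list as the failing poller command would show whether the script still prints a number, and any error text it writes, when most of the shell environment is absent.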
mark@eatonkaye.com
Joined: 25 Jan 2006 Posts: 6
Posted: Wed Jun 11, 2008 11:24 am Post subject: Cool NetAPP Query
Hi,
This looks cool. I have got the attached template working; thank you for this great use of the SDK (which is apparently very little known in many NetApp circles, according to one of their SEs I spoke to). However, it does not seem to include a lot of the graphs I can see in your pictures. The script queries, systems, etc. seem to be present, but I only get the following graphs when I use the host template.
I could quite easily concede I am missing something.
All Nics+
cache age
CIFS Ops
CPU % Busy
NFS Ops
Many thanks for posting this info; it's nice to be able to put NetApp performance in our common dashboard, i.e. Cacti, and not just use DFM OppMan.
Kind regards,
Mark Kaye
mark@eatonkaye.com
Joined: 25 Jan 2006 Posts: 6
Posted: Wed Jun 11, 2008 11:39 am Post subject: plz Ignore previous post - user error (mine)
Sorry
Mark
pierre-luc
Joined: 10 Aug 2005 Posts: 8 Location: Montreal, Canada
Posted: Mon Jun 16, 2008 1:00 pm
Hi Mark,
Regarding the missing graphs: protocol-specific graphs would be very easy to add, since NFS and CIFS IOPS are already provided for the "Per Protocol" graph.
As for all NICs and cache age, for now we are monitoring them using SNMP and another Cacti template provided elsewhere in these forums.
It could be a good idea to add these features to the SDK template and use only the SDK to gather stats... a future project...
Thanks all for your comments.
eschoeller Cacti User
Joined: 13 Dec 2004 Posts: 139
Posted: Tue Aug 05, 2008 12:22 pm
This is great data! It seems to come with a heavy cost for us however. Is anyone else noticing severe performance issues after using this template?
I initially added over 300 data sources using this template in my development environment. It ran fine for a little while, until I noticed that performance had degraded so badly that my poller was timing out. Before adding these data sources my poller runtime was around 15s, then it was timing out after 56s.
I trimmed this down to only 60 data sources, but I'm still seeing terrible performance. I am using a 1 minute poller so I don't have much flexibility to run a polling interval longer than 45s. I attached several charts to indicate the issues I am seeing.
This is running on a Dell Precision 450 desktop. It only has one CPU and one disk, so take that into account, but please don't completely blame it on the hardware. If I had added an additional 60 SNMP data sources I would have never seen such performance loss.
I've tried tweaking #threads, #processes, #script servers but it hasn't improved anything. I am already running the latest version of spine. From the logs I can always see that the netapp-ontapsdk-perf.pl script is running towards the end of the polling cycle, so I know that's what is prolonging the runtime.
Running netapp-ontapsdk-perf.pl by hand while my poller isn't running usually takes about half a second, but while the poller is running it can take as long as 4-5 seconds. This leads me to believe it's possibly a system issue.
But why is this data collection script so resource intensive?
I have also noticed that on many of the context menus, the Netapp graphs show up first in the list and not in alphabetical order, but this is probably an entirely different issue.
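For anyone wanting to repeat the by-hand timing comparison described above, here is a minimal Python sketch that runs a command several times and reports the wall-clock durations. The helper names `time_command` and `summarize` are hypothetical; the command to time is whatever you would otherwise type at the shell.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of a non-empty list of durations."""
    return min(durations), statistics.median(durations), max(durations)
```

Running it once while the poller is idle and once during a polling cycle gives two sets of numbers to compare, which is a little more robust than a single manual run.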
Attachment (image): poller runtime before and after using this template
Attachment (image): number of objects before and after using this template
Attachment (image): load of cacti host before and after using this template
Attachment (image): cpu usage of cacti host before and after using this template
eschoeller Cacti User
Joined: 13 Dec 2004 Posts: 139
Posted: Thu Aug 07, 2008 11:15 am
I also noticed that Interface traffic went through the roof as well. It seems that this script is pulling in LOTS of data and doing some sort of computational work to come up with the figures it needs. I haven't had a chance to look closely at the script to see if there is any room for optimization.
Attachment (image): interface traffic before and after using this template
pierre-luc
Joined: 10 Aug 2005 Posts: 8 Location: Montreal, Canada
Posted: Fri Aug 08, 2008 9:17 am
Hello eschoeller,
Yes, the netapp-ontapsdk-perf.pl script is not optimized! I hit a bug with the Manage ONTAP SDK while I was developing the template: the SDK was unable to return a specific value for a specific object (for example, avg_latency for a single LUN). So the API call that actually works is to query avg_latency for all LUNs and then pick out the selected one. This means that if there are 300 LUNs on your filer, the API returns 300 values to netapp-ontapsdk-perf.pl each time the script is called. This is the HUGE overhead of this template.
There is a very short thread on the NetApp forum regarding this issue: http://communities.netapp.com/thread/1405?tstart=0
So, because the "perf-object-get-instances" API wasn't working, I used the "perf-object-get-instances-iter-*" APIs instead. It is almost like querying the universe to grab a mosquito.
I hope a future release of the SDK will fix this issue, which would improve performance...
Regarding the graph sorting, I didn't try to sort in alphabetical order. The current sorting is based on the index provided by the API, which follows object creation date. I'm not sure whether changing the sort index in the query-netapp-ontapsdk-*.xml files would fix it or create another issue.
P-L
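One possible way to reduce this overhead, offered only as a sketch and not as something the posted script does, is to cache the bulk result so that a single "all instances" query serves every data source polled in the same cycle. The Python below uses a hypothetical `fetch_all` callable as a stand-in for the real SDK call and a JSON file as the cache. The function names, the cache location and the 50-second lifetime are all assumptions.

```python
import json
import os
import tempfile
import time


def cached_bulk_query(cache_path, fetch_all, max_age=50):
    """Return {instance: value}, calling fetch_all() only when the cache is stale."""
    try:
        if time.time() - os.path.getmtime(cache_path) < max_age:
            with open(cache_path) as fh:
                return json.load(fh)
    except (OSError, ValueError):
        pass  # missing or unreadable cache: fall through and refetch

    data = fetch_all()  # hypothetical stand-in for the expensive bulk API call
    # Write to a temp file and rename it, so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(cache_path) or ".")
    with os.fdopen(fd, "w") as fh:
        json.dump(data, fh)
    os.replace(tmp, cache_path)
    return data


def get_value(cache_path, fetch_all, instance, max_age=50):
    """Look up one instance; return "U" (as seen in the poller log) if absent."""
    return cached_bulk_query(cache_path, fetch_all, max_age).get(instance, "U")
```

With this shape, 300 data sources polled within the cache lifetime would trigger one bulk query instead of 300. Two pollers starting at the same moment could still both refetch; a lock file would be needed to rule that out.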
eschoeller Cacti User
Joined: 13 Dec 2004 Posts: 139
Posted: Fri Aug 15, 2008 1:14 pm
I read the short post mentioned above. I have another member on our team looking into optimizing the code. In the meantime we upgraded to a Dell 2950 with a quad-core 3 GHz Xeon, 8 GB of RAM, and a 4-column RAID 10 disk.
Here are the performance metrics of the Cacti server before and after the upgrade, in case anyone is interested.
But, long story short, these templates will work OK on a fast enough system, despite the fact that there is a lot of room for performance improvement. I still have around 750 data sources and 550 RRDs.
Attachment (image): poller runtime before and after upgrade.
Attachment (image): Load before and after upgrade
Attachment (image): CPU usage before and after upgrade
kkoduru
Joined: 25 Jun 2008 Posts: 7
Posted: Wed Sep 17, 2008 6:03 pm Post subject: Not discovering objects
Hi Gurus
I was able to get the ONTAP SDK and import the template. The first issue I faced was with Perl, which complained about "\N", so I had to give the entire path with double backslashes:
use lib "C:\\manage-ontap-sdk-1.6\\lib\\perl\\NetApp";
Now, when I discover the filer, it cannot find any objects and shows the message below:
This data query returned 0 rows, perhaps there was a problem executing this data query. You can run this data query in debug mode to get more information.
Running in verbose mode gives the output below:
+ Running data query [17].
+ Found type = '4 '[script query].
+ Found data query XML file at 'C:/Inetpub/wwwroot/cacti/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ XML file parsed ok.
+ Executing script for list of indexes 'perl C:\Inetpub\wwwroot\cacti\scripts\netapp-ontapsdk-perf.pl nfiler2.rws.ad.ea.com "xxxx" "xxxx" system index'
+ Executing script query 'perl C:\Inetpub\wwwroot\cacti\scripts\netapp-ontapsdk-perf.pl nfiler2.rws.ad.ea.com "xxxx" "xxxx" system query index'
+ Found data query XML file at 'C:/Inetpub/wwwroot/cacti/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ Found data query XML file at 'C:/Inetpub/wwwroot/cacti/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ Found data query XML file at 'C:/Inetpub/wwwroot/cacti/resource/script_queries/query-netapp-ontapsdk-system.xml'
If I run the script manually, it works just fine, returning the information I expect to see (the LUN info, etc.).
Could you please point out what I am doing wrong?
Thanks in advance
KK
Last edited by kkoduru on Thu Sep 18, 2008 3:58 pm; edited 2 times in total
eschoeller Cacti User
Joined: 13 Dec 2004 Posts: 139
Posted: Wed Sep 17, 2008 7:21 pm
This is what mine looks like:
Code:
+ Running data query [14].
+ Found type = '4 '[script query].
+ Found data query XML file at '/usr/local/cacti-0.8.7b/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ XML file parsed ok.
+ Executing script for list of indexes 'perl /usr/local/cacti-0.8.7b/scripts/netapp-ontapsdk-perf.pl 255.255.255.123 "USER" "PASSWORD" system index'
+ Executing script query 'perl /usr/local/cacti-0.8.7b/scripts/netapp-ontapsdk-perf.pl 255.255.255.123 "USER" "PASSWORD" system query index'
+ Found item [index='system'] index: system
+ Found data query XML file at '/usr/local/cacti-0.8.7b/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ Found data query XML file at '/usr/local/cacti-0.8.7b/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ Found data query XML file at '/usr/local/cacti-0.8.7b/resource/script_queries/query-netapp-ontapsdk-system.xml'
+ Found data query XML file at '/usr/local/cacti-0.8.7b/resource/script_queries/query-netapp-ontapsdk-system.xml'
I have 4 lines of data query XML; you only have 3. Another thing: since you're using Windows, you may have to specify the full path to your perl binary.
Hope this helps!
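Comparing the two debug outputs, the one just above also contains a "Found item" line that the earlier one lacks. Assuming (this is a reading of the logs in this thread, not something confirmed here) that Cacti prints one such line per index the script returns, a small Python sketch like the following could pull those lines out of pasted verbose output. The name `found_items` and the regular expression are hypothetical and based only on the single line format shown above.

```python
import re

# Matches lines such as: + Found item [index='system'] index: system
ITEM_RE = re.compile(
    r"^\+ Found item \[(?P<field>[^=]+)='(?P<value>[^']*)'\] index: (?P<index>.+)$"
)


def found_items(debug_text):
    """Return a (field, value, index) tuple for each 'Found item' line."""
    items = []
    for line in debug_text.splitlines():
        match = ITEM_RE.match(line.strip())
        if match:
            items.append(
                (match.group("field"), match.group("value"), match.group("index"))
            )
    return items
```

An empty list for a given output would suggest that the index call returned nothing Cacti could use, which would point at how the script is launched rather than at the XML file.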
kkoduru
Joined: 25 Jun 2008 Posts: 7
Posted: Thu Sep 18, 2008 1:02 pm
I am not sure where the 3-line vs 4-line output is controlled from, as it should be coming from the script itself. Also, the Perl path is embedded in the Cacti installer; I couldn't find a way to change it.