Thursday, 8 April 2010

Plugin : Update RRD Tool with Kettle (Cacti, MRTG …).

Hi all,
This post gives more details about my new Kettle plugin for feeding RRDtool databases.

What is RRDTool ?

According to Tobias Oetiker, RRDtool is the OpenSource industry standard, high performance data logging and graphing system for time series data. Use it to write your custom monitoring shell scripts or create whole applications using its Perl, Python, Ruby, TCL or PHP bindings.
You can learn more about RRDtool and Tobias Oetiker's fantastic work on his homepage HERE.
With RRDtool, you can easily store time series data and create real-time graphs like the one below. For instance, I currently use it for one of my clients in Paris to monitor various real-time business / IT indicators : travel bookings, passenger reservations, search engine requests, XML proxy load and mainframe usage.
image

The libs I used

I used JRobin, a Java port of Tobias Oetiker's RRDtool, written by the talented Sasa Markovic. The JRobin home pages are HERE and HERE. I recommend a visit to get a full picture of JRobin's features.
According to Sasa Markovic, “JRobin is a 100% pure java implementation of RRDTool's functionality. It follows the same logic and uses the same data sources, archive types and definitions as RRDTool does. JRobin supports all standard operations on Round Robin Database (RRD) files: CREATE, UPDATE, FETCH, LAST, DUMP, XPORT  and GRAPH. JRobin's API is made for those who are familiar with RRDTool's concepts and logic, but prefer to work with pure java. If you provide the same data to RRDTool and JRobin, you will get exactly the same results and graphs.”
I confirm everything.
The graphical rendering looks very good, as you can see below.


 

The plugin

First, I recommend carefully reading everything related to RRDtool and JRobin. You must be familiar with this technology before using the plugin.
The Kettle plugin is quite simple : a single user interface to create an RRD file, add archives and feed the file.
image
image
A short description of this window :
  • Nom étape (the label is still in French and will be translated) : Step name
  • RRD File : the RRD file to be created. This file will hold all your time series data and archives.
  • Datasource : an RRD file can have one or more datasources. For the moment, my plugin is limited to a single datasource, which is enough most of the time.
  • Type :
    • Gauge : Does not store a rate of change; it saves the actual value itself.
    • Counter : Stores the rate of change of the value over a step period (assumes the value is always increasing). Ex : traffic counters.
    • Derive : The same as Counter, but the value may also decrease, so the stored rate can be negative. Ex : free disk space.
    • Absolute : Stores the rate of change, assuming the previous value is 0, i.e. the counter is reset each time it is read.
  • Heartbeat : The maximum number of seconds allowed between two updates. For example, if the RRD file expects a value (PDP) every 300 seconds and the heartbeat is 600 seconds, a missed update is tolerated for another 300 seconds (total = 600 seconds). If no value has arrived after 600 seconds, the flag UNKNOWN is stored.
  • Starttime : The starting point of the RRD file, given as a unix timestamp. In a future release, I will code a converter and place it into the user interface. You can easily compute unix timestamps by using this web page or this one. This timestamp must be lower than the earliest timestamp coming from your data.
  • Min and Max : The minimum value and the maximum value, if predictable.
  • The combo zone : This combo box is used to define RRAs (Round Robin Archives). An RRA defines how the consolidated data is stored. It has 4 major parameters :
    • CF, for Consolidation Function :
      • AVERAGE : Store the average value
      • MIN : Store the minimum value
      • MAX : Store the max value
      • LAST : Store the last known value
    • xff : Xfiles factor. This is the fraction of the values in a consolidation interval that can be unknown without the consolidated value being flagged as UNKNOWN. Must be between 0 and 1, in 0.1 increments.
    • Steps : Number of values to be consolidated with the chosen CF. Must be an integer.
    • Rows : Number of consolidated values to keep. Must be an integer.
The Add button adds the RRA (round robin archive) to the combo list; the listed RRAs are then used when the RRD file is created.
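To make the datasource settings above more concrete, here is a minimal Java sketch of how a raw reading could be turned into a stored value for each type, taking the heartbeat and the Min / Max bounds into account. It is only an illustration under simplifying assumptions, not the code of the plugin or of JRobin: the class and method names are hypothetical, NaN stands for UNKNOWN, and a decreasing Counter is simply treated as UNKNOWN instead of being corrected for counter wrap.

```java
final class DataSourceSketch {

    enum DsType { GAUGE, COUNTER, DERIVE, ABSOLUTE }

    /**
     * Returns the value a datasource of the given type would store for a raw
     * reading, or NaN (standing for UNKNOWN) when the reading is rejected.
     * Times are unix timestamps in seconds.
     */
    static double toStoredValue(DsType type,
                                long prevTime, double prevRaw,
                                long time, double raw,
                                long heartbeat, double min, double max) {
        long elapsed = time - prevTime;
        // Too long since the previous update: the heartbeat is exceeded.
        if (elapsed <= 0 || elapsed > heartbeat) {
            return Double.NaN;
        }
        double value;
        switch (type) {
            case GAUGE:
                value = raw;                        // the reading itself
                break;
            case COUNTER:
                if (raw < prevRaw) {
                    return Double.NaN;              // simplification: no wrap handling
                }
                value = (raw - prevRaw) / elapsed;  // rate per second, never negative
                break;
            case DERIVE:
                value = (raw - prevRaw) / elapsed;  // rate per second, may be negative
                break;
            case ABSOLUTE:
                value = raw / elapsed;              // previous value assumed to be 0
                break;
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
        // Values outside the Min / Max bounds are stored as UNKNOWN.
        if (value < min || value > max) {
            return Double.NaN;
        }
        return value;
    }

    public static void main(String[] args) {
        long t0 = 1270728000L;
        long t1 = t0 + 300;
        System.out.println(toStoredValue(DsType.GAUGE, t0, 5, t1, 10, 600, 0, 2000));
        System.out.println(toStoredValue(DsType.COUNTER, t0, 1000, t1, 1600, 600, 0, 2000));
        System.out.println(toStoredValue(DsType.GAUGE, t0, 5, t0 + 900, 10, 600, 0, 2000));
    }
}
```

In this sketch a Gauge reading of 10 is kept as is, a Counter going from 1000 to 1600 in 300 seconds becomes a rate of 2 per second, and a reading arriving 900 seconds after the previous one is rejected because the 600-second heartbeat has expired.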
Once the RRD file has been successfully created, a short message appears at the bottom of the user interface.
image
Let’s have a look at some RRD file internals, using another nice tool from Sasa Markovic : RRD inspector. I’m sure you will easily understand the RRD structure, even if you are not already familiar with RRDtool.
The screen above shows an RRD file created with one datasource called Speed, of type GAUGE, with a heartbeat of 600 seconds and minimum and maximum values set to 0 and 2000.
image
This RRD file also has a single RRA (round robin archive), using the AVERAGE consolidation function, with xff set to 0.5, 1 step (each value is stored as is, so no averaging actually takes place) and 24 rows. This file was created using the Kettle plugin on my C:\ hard drive.
image
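The RRA parameters can be illustrated with a small Java sketch. It shows, under simplifying assumptions, how one group of values (PDPs) could be consolidated with a given CF and xff, and how much history an archive covers. The class and method names are hypothetical, NaN stands for UNKNOWN, and the 300-second base step used in the example is an assumption that matches the 5-minute sample data later in this post; none of this is taken from the plugin or from JRobin.

```java
final class ConsolidationSketch {

    enum Cf { AVERAGE, MIN, MAX, LAST }

    /**
     * Consolidates one group of primary data points into a single value.
     * Returns NaN (UNKNOWN) when the share of unknown points exceeds xff.
     */
    static double consolidate(Cf cf, double xff, double[] pdps) {
        int unknown = 0;
        double sum = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double last = Double.NaN;
        for (double p : pdps) {
            if (Double.isNaN(p)) {
                unknown++;
                continue;
            }
            sum += p;
            min = Math.min(min, p);
            max = Math.max(max, p);
            last = p; // last known value in the group
        }
        int known = pdps.length - unknown;
        if (known == 0 || (double) unknown / pdps.length > xff) {
            return Double.NaN;
        }
        switch (cf) {
            case AVERAGE: return sum / known;
            case MIN:     return min;
            case MAX:     return max;
            case LAST:    return last;
            default:      throw new IllegalArgumentException("Unknown CF: " + cf);
        }
    }

    /** Seconds of history held by an archive: base step x steps x rows. */
    static long coverageSeconds(long baseStepSeconds, int steps, int rows) {
        return baseStepSeconds * steps * rows;
    }

    public static void main(String[] args) {
        double[] group = {10, 20, Double.NaN, 30};
        System.out.println(consolidate(Cf.AVERAGE, 0.5, group));
        System.out.println(consolidate(Cf.MAX, 0.5, group));
        // Assumed base step of 300 s, with 1 step and 24 rows as in the file above.
        System.out.println(coverageSeconds(300, 1, 24) + " seconds");
    }
}
```

With an assumed 300-second base step, the archive described above (1 step, 24 rows) would hold 300 x 1 x 24 = 7200 seconds, i.e. the last 2 hours of data.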
If we then select the “Archive data” panel, we can see all the values currently stored in the RRD file.
image
RRD inspector is a fantastic little tool, very useful when creating RRD files and checking that everything went well.

How to use the plugin ?

Very simple. First, create an RRD file using the user interface shown above. Then check that the file has actually been created, just to be sure. Finally, connect the step to a previous one in Kettle. In my example, I used a flat file containing some simple timestamps and values.
Here is my flat file : unix timestamps at 5-minute intervals (starting Thu, 8 Apr 2010 12:00:00 UTC) and some simple values from 5 to 140.
TimeStamp;Value
1270728000;5
1270728300;10
1270728600;15
1270728900;20
1270729200;25
1270729500;40
1270729800;50
1270730100;60
1270730400;70
1270730700;80
1270731000;90
1270731300;100
1270731600;120
1270731900;140
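If you prefer to compute such unix timestamps in code rather than with a web page, the following Java sketch shows one possible way using the standard java.time API. The class and method names are hypothetical and only serve the illustration; the date and the 5-minute interval are the ones from the flat file above.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class UnixTimeSketch {

    /** Converts a UTC calendar date and time to a unix timestamp in seconds. */
    static long toUnixSeconds(int year, int month, int day, int hour, int minute) {
        return LocalDateTime.of(year, month, day, hour, minute)
                .toEpochSecond(ZoneOffset.UTC);
    }

    /** Converts a unix timestamp back to an ISO-8601 UTC string, handy for checking. */
    static String toUtcString(long unixSeconds) {
        return Instant.ofEpochSecond(unixSeconds).toString();
    }

    /** Builds 'count' timestamps spaced 'intervalSeconds' apart, starting at 'first'. */
    static long[] series(long first, int count, long intervalSeconds) {
        long[] result = new long[count];
        for (int i = 0; i < count; i++) {
            result[i] = first + i * intervalSeconds;
        }
        return result;
    }

    public static void main(String[] args) {
        // Thu, 8 Apr 2010 12:00:00 UTC, then one timestamp every 5 minutes.
        long first = toUnixSeconds(2010, 4, 8, 12, 0);
        for (long t : series(first, 14, 300)) {
            System.out.println(t + "  " + toUtcString(t));
        }
    }
}
```

The same conversion can be used for the Starttime field, as long as the result is lower than the first timestamp of the data.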
And here is my sample transformation.
image
Hit play, and voilà … the plugin feeds the RRD file and gives you a nice output log for each value. In short : an RRD file only expects a unix timestamp and a value.
image
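Since each update boils down to a unix timestamp and a value, it can be useful to check a flat file before sending it through the transformation. Here is a minimal Java sketch that parses "TimeStamp;Value" lines and verifies that every timestamp is higher than the RRD start time and than the previous row (RRD updates must arrive in strictly increasing time order). The class and method names are hypothetical; this is not how the plugin itself reads its input, since Kettle handles that part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FlatFileSketch {

    /** One row of the flat file: a unix timestamp and its value. */
    static final class Point {
        final long timestamp;
        final double value;

        Point(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Parses "timestamp;value" lines, skipping the header and blank lines.
     * Throws IllegalArgumentException if a row is malformed or if a timestamp
     * is not strictly higher than the start time and the previous row.
     */
    static List<Point> parse(List<String> lines, long startTime) {
        List<Point> points = new ArrayList<>();
        long previous = startTime;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("TimeStamp")) {
                continue;
            }
            String[] fields = trimmed.split(";");
            if (fields.length != 2) {
                throw new IllegalArgumentException("Expected timestamp;value but got: " + line);
            }
            long timestamp = Long.parseLong(fields[0].trim());
            double value = Double.parseDouble(fields[1].trim());
            if (timestamp <= previous) {
                throw new IllegalArgumentException("Timestamp not increasing: " + timestamp);
            }
            points.add(new Point(timestamp, value));
            previous = timestamp;
        }
        return points;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "TimeStamp;Value",
                "1270728000;5",
                "1270728300;10",
                "1270728600;15");
        // Hypothetical start time, 5 minutes before the first sample.
        for (Point p : parse(lines, 1270727700L)) {
            System.out.println(p.timestamp + " -> " + p.value);
        }
    }
}
```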

Generating a graph

Well, this is not really Kettle-related, but here is some code to create graphs from your RRD file, previously loaded with Kettle.
This simple Java snippet …
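import java.awt.Color;
import java.io.IOException;
import org.jrobin.core.RrdException;
import org.jrobin.graph.RrdGraph;
import org.jrobin.graph.RrdGraphDef;

public static void renderRrdGraph(long timeStart, long timeStop, String consolFun, String imageFormat) throws IOException, RrdException {
    // Describe the graph: axis label and time window (unix timestamps)
    RrdGraphDef graphDef = new RrdGraphDef();
    graphDef.setVerticalLabel("m/s");
    graphDef.setTimeSpan(timeStart, timeStop);
    // Read datasource "speed" from the RRD file with the given consolidation function (e.g. "AVERAGE")
    graphDef.datasource("myspeed", "C:\\testRRD", "speed", consolFun);
    // Draw it as a red line, 2 pixels wide, without a legend
    graphDef.line("myspeed", new Color(0xFF, 0, 0), null, 2);
    // Use the same format (e.g. "png") for the image encoding and the file extension
    graphDef.setImageFormat(imageFormat);
    graphDef.setFilename("C:\\testRRD." + imageFormat);
    // Constructing the RrdGraph renders the image and writes it to the file
    new RrdGraph(graphDef);
}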
testRRD
… will create this PNG.
Very simple, as you can see. This example is really basic compared to what can be done, but I can’t show you any snapshot of the graphs I made for my client, as I am under NDA. You can now imagine creating real time graphics (RTG) using this technology.

The package

Let’s go back to Pentaho and Kettle : I created a package for you. You will find the plugin itself (compiled and archived under Eclipse using Fat Jar in order to embed the JRobin library), the icon, the xml file, the flat file and a sample transformation (the one described above).
This package can be downloaded from its Google Code page.
Please keep me informed about your testing, and feel free to contact me if further features (or fixes !) are needed.

3 comments:

Alexandre Dumont said...

This plugin looks really great! I was however looking for the other way to integrate Kettle & RRD, that is extracting data FROM RRD files.

Do you know if there is such plugin out there (using Kettle PDI CE)?

Fabrice Bacchella said...

I see you use jrobin, which is not very active any more. Did you have a look at http://code.google.com/p/rrd4j/, which is a fork of jrobin, is more active and got very nice performance improvements ?

Anonymous said...

Never, never, never, never give up

-----------------------------------