Creating a portable RRD format

Current Situation

The RRD data format is native: it depends on the architecture of the machine as well as on the OS used to write the data. This has the advantage of being fast and simple. The downside is that data can, in general, not be shared between two different HW/OS combinations.

There are some provisions in RRD to detect such cross-architecture access, but detection is not perfect. Some combinations even almost work: many CPUs with the same byte order differ only in their representation of NaNs.
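One way such a check could work is to store a known double in the file and compare its bytes when reading. The sketch below is an assumption about how this might be done, not a description of RRDtool's code: it uses the value 8.642135E130 and the byte patterns from the dumps further down this page, and the names `classify_probe` and `native_layout` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte layouts of the double 8.642135E130 as dumped on this page. */
typedef enum {
    LAYOUT_BIG_ENDIAN,     /* SPARC, PPC, PA-RISC, MIPS, ARMv5b */
    LAYOUT_LITTLE_ENDIAN,  /* x86, IA64, Alpha */
    LAYOUT_ARMV4L_MIXED,   /* ARMv4l: the two 32-bit halves are swapped */
    LAYOUT_UNKNOWN
} double_layout;

/* Hypothetical helper: classify the 8 bytes of a probe double read from a file. */
double_layout classify_probe(const uint8_t bytes[8])
{
    static const uint8_t big[8]    = {0x5b, 0x1f, 0x2b, 0x43, 0xc7, 0xc0, 0x25, 0x2f};
    static const uint8_t little[8] = {0x2f, 0x25, 0xc0, 0xc7, 0x43, 0x2b, 0x1f, 0x5b};
    static const uint8_t mixed[8]  = {0x43, 0x2b, 0x1f, 0x5b, 0x2f, 0x25, 0xc0, 0xc7};

    assert(bytes != NULL);
    if (memcmp(bytes, big, 8) == 0)
        return LAYOUT_BIG_ENDIAN;
    if (memcmp(bytes, little, 8) == 0)
        return LAYOUT_LITTLE_ENDIAN;
    if (memcmp(bytes, mixed, 8) == 0)
        return LAYOUT_ARMV4L_MIXED;
    return LAYOUT_UNKNOWN;
}

/* Layout used by the running machine, found by dumping the probe value itself.
   Assumes sizeof(double) == 8. */
double_layout native_layout(void)
{
    const double probe = 8.642135E130;
    uint8_t bytes[8];

    memcpy(bytes, &probe, sizeof bytes);
    return classify_probe(bytes);
}
```

A reader would compare `classify_probe()` on the stored bytes with `native_layout()` and refuse, or convert, when the two differ. A probe like this says nothing about NaN patterns or struct alignment, so it can only be part of the detection.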

Figuring Out 64bit Floating Point Numbers

The portable RRD format should work on all platforms transparently. My first idea was to leverage Sun's XDR format for RRD. Unfortunately, XDR does not handle NaNs, which are pretty essential for RRDtool. So I investigated the binary representation of IEEE 754 floating point data and found that it is actually pretty simple to bridge the gap between the different architectures. The following program helped a lot in this task. It shows the binary representation of a few 'interesting' floating point values.

#include <stdio.h>
#include <inttypes.h>
typedef union INSPECTOR {
    uint8_t   b[8];
    uint64_t  l;
    double    f;
} INSPECTOR;

int main(
    int argc,   
    char *argv[])
{
    int i,ii;
    double number[] = {
        0,1,-1,0.0/0.0,1.0/0.0,-1.0/0.0,2,4,8,16,8.642135E130
    };
    for (i=0;i<11;i++){
        INSPECTOR native;
        native.f = number[i];
        printf("%16e -> ",native.f);
        for (ii=0;ii<8;ii++)
            printf(" %02x",native.b[ii]);
        printf("\n");
    }
    return 0;
}

I used this to figure out the binary representation of double precision floating point numbers on several architectures:

SPARC 32 and 64 bit:

    0.000000e+00 -> 00 00 00 00  00 00 00 00
    1.000000e+00 -> 3f f0 00 00  00 00 00 00
   -1.000000e+00 -> bf f0 00 00  00 00 00 00
             NaN -> 7f ff ff ff  ff ff ff ff
             Inf -> 7f f0 00 00  00 00 00 00
            -Inf -> ff f0 00 00  00 00 00 00
    2.000000e+00 -> 40 00 00 00  00 00 00 00
    4.000000e+00 -> 40 10 00 00  00 00 00 00
    8.000000e+00 -> 40 20 00 00  00 00 00 00
    1.600000e+01 -> 40 30 00 00  00 00 00 00
   8.642135e+130 -> 5b 1f 2b 43  c7 c0 25 2f

PPC / POWER5 (32 and 64bit)

    0.000000e+00 -> 00 00 00 00  00 00 00 00
    1.000000e+00 -> 3f f0 00 00  00 00 00 00
   -1.000000e+00 -> bf f0 00 00  00 00 00 00
             nan -> 7f f8 00 00  00 00 00 00
             inf -> 7f f0 00 00  00 00 00 00
            -inf -> ff f0 00 00  00 00 00 00
    2.000000e+00 -> 40 00 00 00  00 00 00 00
    4.000000e+00 -> 40 10 00 00  00 00 00 00
    8.000000e+00 -> 40 20 00 00  00 00 00 00
    1.600000e+01 -> 40 30 00 00  00 00 00 00
   8.642135e+130 -> 5b 1f 2b 43  c7 c0 25 2f

PA-RISC64

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  3f f0 00 00 00 00 00 00
   -1.000000e+00 ->  bf f0 00 00 00 00 00 00
             nan ->  7f f4 00 00 00 00 00 00
             inf ->  7f f0 00 00 00 00 00 00
            -inf ->  ff f0 00 00 00 00 00 00
    2.000000e+00 ->  40 00 00 00 00 00 00 00
    4.000000e+00 ->  40 10 00 00 00 00 00 00
    8.000000e+00 ->  40 20 00 00 00 00 00 00
    1.600000e+01 ->  40 30 00 00 00 00 00 00
   8.642135e+130 ->  5b 1f 2b 43 c7 c0 25 2f

MIPS

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  3f f0 00 00 00 00 00 00
   -1.000000e+00 ->  bf f0 00 00 00 00 00 00
             nan ->  7f ff ff ff ff ff ff ff
             inf ->  7f f0 00 00 00 00 00 00
            -inf ->  ff f0 00 00 00 00 00 00
    2.000000e+00 ->  40 00 00 00 00 00 00 00
    4.000000e+00 ->  40 10 00 00 00 00 00 00
    8.000000e+00 ->  40 20 00 00 00 00 00 00
    1.600000e+01 ->  40 30 00 00 00 00 00 00
   8.642135e+130 ->  5b 1f 2b 43 c7 c0 25 2f

x86 32 and 64 bit

    0.000000e+00 -> 00 00 00 00  00 00 00 00
    1.000000e+00 -> 00 00 00 00  00 00 f0 3f
   -1.000000e+00 -> 00 00 00 00  00 00 f0 bf
             nan -> 00 00 00 00  00 00 f8 7f
             inf -> 00 00 00 00  00 00 f0 7f
            -inf -> 00 00 00 00  00 00 f0 ff
    2.000000e+00 -> 00 00 00 00  00 00 00 40
    4.000000e+00 -> 00 00 00 00  00 00 10 40
    8.000000e+00 -> 00 00 00 00  00 00 20 40
    1.600000e+01 -> 00 00 00 00  00 00 30 40
   8.642135e+130 -> 2f 25 c0 c7  43 2b 1f 5b

IA64 (Itanium)

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  00 00 00 00 00 00 f0 3f
   -1.000000e+00 ->  00 00 00 00 00 00 f0 bf
             nan ->  00 00 00 00 00 00 f8 ff
             inf ->  00 00 00 00 00 00 f0 7f
            -inf ->  00 00 00 00 00 00 f0 ff
    2.000000e+00 ->  00 00 00 00 00 00 00 40
    4.000000e+00 ->  00 00 00 00 00 00 10 40
    8.000000e+00 ->  00 00 00 00 00 00 20 40
    1.600000e+01 ->  00 00 00 00 00 00 30 40
   8.642135e+130 ->  2f 25 c0 c7 43 2b 1f 5b

ARMv4l

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  00 00 f0 3f 00 00 00 00
   -1.000000e+00 ->  00 00 f0 bf 00 00 00 00
             nan ->  ff ff ff ff ff ff ff ff
             inf ->  00 00 f0 7f 00 00 00 00
            -inf ->  00 00 f0 ff 00 00 00 00
    2.000000e+00 ->  00 00 00 40 00 00 00 00
    4.000000e+00 ->  00 00 10 40 00 00 00 00
    8.000000e+00 ->  00 00 20 40 00 00 00 00
    1.600000e+01 ->  00 00 30 40 00 00 00 00
   8.642135e+130 ->  43 2b 1f 5b 2f 25 c0 c7

ARMv5b

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  3f f0 00 00 00 00 00 00
   -1.000000e+00 ->  bf f0 00 00 00 00 00 00
             nan ->  7f f8 00 00 00 00 00 00
             inf ->  7f f0 00 00 00 00 00 00
            -inf ->  ff f0 00 00 00 00 00 00
    2.000000e+00 ->  40 00 00 00 00 00 00 00
    4.000000e+00 ->  40 10 00 00 00 00 00 00
    8.000000e+00 ->  40 20 00 00 00 00 00 00
    1.600000e+01 ->  40 30 00 00 00 00 00 00
   8.642135e+130 ->  5b 1f 2b 43 c7 c0 25 2f

Alpha (cc -ieee)

    0.000000e+00 ->  00 00 00 00 00 00 00 00
    1.000000e+00 ->  00 00 00 00 00 00 f0 3f
   -1.000000e+00 ->  00 00 00 00 00 00 f0 bf
            NaNQ ->  00 00 00 00 00 00 f8 ff
             INF ->  00 00 00 00 00 00 f0 7f
            -INF ->  00 00 00 00 00 00 f0 ff
    2.000000e+00 ->  00 00 00 00 00 00 00 40
    4.000000e+00 ->  00 00 00 00 00 00 10 40
    8.000000e+00 ->  00 00 00 00 00 00 20 40
    1.600000e+01 ->  00 00 00 00 00 00 30 40
   8.642135e+130 ->  2f 25 c0 c7 43 2b 1f 5b

As you can see, there is not all that much difference between the architectures (it is all IEEE 754 after all). For one, there is the endianness difference, and then there are some differing ideas regarding NaNs. In any event, a converter between these formats is only a few defines away.

#define endianflip(A) ((((uint64_t)(A) & 0xff00000000000000LL) >> 56) | \
                       (((uint64_t)(A) & 0x00ff000000000000LL) >> 40) | \
                       (((uint64_t)(A) & 0x0000ff0000000000LL) >> 24) | \
                       (((uint64_t)(A) & 0x000000ff00000000LL) >> 8)  | \
                       (((uint64_t)(A) & 0x00000000ff000000LL) << 8)  | \
                       (((uint64_t)(A) & 0x0000000000ff0000LL) << 24) | \
                       (((uint64_t)(A) & 0x000000000000ff00LL) << 40) | \
                       (((uint64_t)(A) & 0x00000000000000ffLL) << 56))

#define intswap(A)    ((((uint64_t)(A) & 0xffffffff00000000LL) >> 32) | \
                       (((uint64_t)(A) & 0x00000000ffffffffLL) << 32))

#define sparc2x86(A)   ((uint64_t)(A) == 0x7fffffffffffffffLL \
                                       ? 0x000000000000f87fLL \
                                       : endianflip(A))

#define x862sparc(A)   ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0x7fffffffffffffffLL \
                                       : endianflip(A))

#define mips2x86(A)   ((uint64_t)(A) == 0x7fffffffffffffffLL \
                                       ? 0x000000000000f87fLL \
                                       : endianflip(A))

#define x862mips(A)   ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0x7fffffffffffffffLL \
                                       : endianflip(A))


#define parisc2x86(A)  ((uint64_t)(A) == 0x7ff4000000000000LL \
                                       ? 0x000000000000f87fLL \
                                       : endianflip(A))

#define x862parisc(A)  ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0x7ff4000000000000LL \
                                       : endianflip(A))

#define ppc2x86(A)     endianflip(A)

#define x862ppc(A)     endianflip(A)

#define armv5b2x86(A)  endianflip(A)

#define x862armv5b(A)  endianflip(A)

#define armv4l2x86(A) ((uint64_t)(A) == 0xffffffffffffffffLL \
                                       ? 0x000000000000f87fLL \
                                       : intswap(A))

#define x862armv4l(A)  ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0xffffffffffffffffLL \
                                       : intswap(A))

#define itanium2x86(A) ((uint64_t)(A) == 0x000000000000f8ffLL \
                                       ? 0x000000000000f87fLL \
                                       : (uint64_t)(A))

#define x862itanium(A) ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0x000000000000f8ffLL \
                                       : (uint64_t)(A))

#define alpha2x86(A)   ((uint64_t)(A) == 0x000000000000f8ffLL \
                                       ? 0x000000000000f87fLL \
                                       : (uint64_t)(A))

#define x862alpha(A)   ((uint64_t)(A) == 0x000000000000f87fLL \
                                       ? 0x000000000000f8ffLL \
                                       : (uint64_t)(A))
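
As an alternative to one define per pair of architectures, each value could be written in a single agreed byte order. The following is a minimal sketch under assumptions, not part of RRDtool: the names `double_to_portable`, `portable_to_double` and `PORTABLE_NAN` are hypothetical, big-endian is picked arbitrarily as the file order, and the code assumes the host stores doubles and 64-bit integers in the same byte order (which the ARMv4l dump above shows is not always true). Every NaN is folded into one pattern, since the dumps show five different ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One agreed quiet NaN pattern for the file (the one PPC, ARMv5b and x86 print). */
#define PORTABLE_NAN UINT64_C(0x7ff8000000000000)

/* An IEEE 754 double is a NaN when all exponent bits are set and the
   mantissa is non-zero; the sign bit does not matter. */
int bits_are_nan(uint64_t bits)
{
    return (bits & UINT64_C(0x7ff0000000000000)) == UINT64_C(0x7ff0000000000000)
        && (bits & UINT64_C(0x000fffffffffffff)) != 0;
}

/* Store a double as 8 big-endian bytes, folding every NaN into PORTABLE_NAN. */
void double_to_portable(double value, uint8_t out[8])
{
    uint64_t bits;
    int i;

    assert(out != NULL);
    /* Assumption: double and uint64_t share the host byte order. */
    memcpy(&bits, &value, sizeof bits);
    if (bits_are_nan(bits))
        bits = PORTABLE_NAN;
    for (i = 0; i < 8; i++)
        out[i] = (uint8_t)(bits >> (56 - 8 * i));
}

/* Read 8 big-endian bytes back into a native double. */
double portable_to_double(const uint8_t in[8])
{
    uint64_t bits = 0;
    double value;
    int i;

    assert(in != NULL);
    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the shifts work on the integer value and not on memory, the same source produces the same file bytes on big-endian and little-endian hosts without any per-architecture define.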


I have done some quick performance tests on a Pentium 4 3.00GHz and found that the x86->sparc conversion runs at 3e8 (300 million) updates per second when my test is compiled with -O0. The same test with -O2 was 5 times faster, but maybe the optimizer found that I was not actually using the data and optimized the work away. In any event, there will be no measurable performance impact from such a change, since the rrd_update process runs at 20k updates per second when all data is kept in memory.
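
To make it harder for the optimizer to discard the work, a test can fold every converted value into a result that is actually used. A rough sketch, with hypothetical function names and a shift-and-mask byte reversal that produces the same result as the endianflip define above; it is not the test program used for the numbers quoted here:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Reverse the 8 bytes of v: swap bytes, then 16-bit pairs, then 32-bit halves. */
uint64_t flip64(uint64_t v)
{
    v = ((v & UINT64_C(0x00ff00ff00ff00ff)) << 8)
      | ((v >> 8) & UINT64_C(0x00ff00ff00ff00ff));
    v = ((v & UINT64_C(0x0000ffff0000ffff)) << 16)
      | ((v >> 16) & UINT64_C(0x0000ffff0000ffff));
    return (v << 32) | (v >> 32);
}

/* Flip the values 0..n-1 and add them up, so every flip feeds the result. */
uint64_t flip_checksum(uint64_t n)
{
    uint64_t sum = 0;
    uint64_t i;

    for (i = 0; i < n; i++)
        sum += flip64(i);
    return sum;
}

/* Time n flips with clock() and print the rate. */
void report_flip_rate(uint64_t n)
{
    clock_t start = clock();
    volatile uint64_t sink = flip_checksum(n);  /* volatile store keeps the sum alive */
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    (void)sink;
    if (secs > 0.0)
        printf("%.3g flips per second\n", (double)n / secs);
    else
        printf("run too short to time\n");
}
```

Calling, say, `report_flip_rate(100000000)` at both -O0 and -O2 gives a rate where the conversions cannot simply be dropped, although the compiler is still free to vectorize or otherwise restructure the loop.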

Data Alignment

Most of today's workstations run either 32 or 64 bit. This also has an influence on the data layout. The classic RRDtool data format heavily relies on doubles and longs, bundled into structs. This is a challenging mix for portability:

  • longs are 4 bytes long in 32bit OSes. In 64 bit OSes, they are often represented as 8 byte integers.
  • struct members are aligned to either 32 bit or 64 bit boundaries.

The following little program tests how structs get aligned by your compiler/os/hw.

#include <stdio.h>
#include <string.h>
#include <inttypes.h>

typedef enum IDEAS {
    ID_ONE = 0, ID_TWO, ID_THREE, ID_FOUR
} IDEAS;


typedef struct FORMATa {
    char      c1[5];
    uint32_t  i1;
    double    f1;
    uint32_t  i2;
    double    f2;
    uint64_t  l1;
    IDEAS     e1;
} FORMATa;

typedef struct FORMATb { 
    char      c1[8];   
    uint32_t  i1;
    uint32_t  i2;
    double    f1;  
    double    f2;
    uint64_t  l1;
    IDEAS     e1;
} FORMATb;

typedef union OVERLAYa { 
    uint8_t   c[sizeof(FORMATa)];
    FORMATa   f;
} OVERLAYa;

typedef union OVERLAYb { 
    uint8_t   c[sizeof(FORMATb)];
    FORMATb   f;
} OVERLAYb;

int main(
    int argc,
    char *argv[])
{
    uint32_t  i;

    OVERLAYa  sta;
    OVERLAYb  stb;
 
    for (i = 0; i < sizeof(OVERLAYa); i++) {
        sta.c[i] = 0x88;
    };
    strcpy(sta.f.c1, "012");
    sta.f.i1 = 123456789;
    sta.f.f1 = 123456789e123;
    sta.f.f2 = 123456789e123;
    sta.f.i2 = 123456789;
    sta.f.e1 = ID_TWO;
    sta.f.l1 = 123456789;

    printf("struct test non-aligned:\n");
    for (i = 0; i < sizeof(OVERLAYa); i++) {
        printf(" %02x", sta.c[i]);
        if (i % 4 == 3)
            printf("  ");
        if (i % 8 == 7)
            printf("\n");
    };
    printf("\n\n");

    for (i = 0; i < sizeof(OVERLAYb); i++) {
        stb.c[i] = 0x88;
    };
    strcpy(stb.f.c1, "012");
    stb.f.i1 = 123456789;
    stb.f.f1 = 123456789e123;
    stb.f.f2 = 123456789e123;
    stb.f.i2 = 123456789;
    stb.f.e1 = ID_TWO;
    stb.f.l1 = 123456789;

    printf("struct test aligned:\n");
    for (i = 0; i < sizeof(OVERLAYb); i++) {
        printf(" %02x", stb.c[i]);
        if (i % 4 == 3)
            printf("  ");
        if (i % 8 == 7)
            printf("\n");
    };
    printf("\n\n");

    return 0;
}

I have run this on a number of different architectures. The memory is initialized to 0x88, so wherever you see 88 the memory has not been touched. In the first line this is to be expected: the char array occupies 8 bytes (in FORMATa, 5 bytes plus padding), but I copy only a 3 character null-terminated string into it. If you can add the output from your platform, this would be great.

On Linux x86/32bit gcc-4.1 everything is tightly packed.

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   81 ee 46 95  
 5e 43 26 5b   15 cd 5b 07  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00  

struct test aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   15 cd 5b 07  
 81 ee 46 95   5e 43 26 5b  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00  

On Linux x86/64bit gcc-4.1 things are aligned to 64bit boundaries.

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   88 88 88 88  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   88 88 88 88  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   15 cd 5b 07  
 81 ee 46 95   5e 43 26 5b  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00   88 88 88 88  

solaris sparc 32bit as well as 64bit gcc-3.1. Interestingly enough, the Sun people align to 64bit boundaries in 32bit as well as 64bit mode.

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

struct test aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

irix mips, gcc and cc

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

hpux itanium 64bit cc as well as gcc

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

hpux parisc 64bit gcc

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 07 5b cd 15   88 88 88 88  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

aix, ppc/power5 gcc 64bit (interesting that 64bit floats do not get moved to 64bit boundaries)

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   5b 26 43 5e  
 95 46 ee 81   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 07 5b cd 15   07 5b cd 15  
 5b 26 43 5e   95 46 ee 81  
 5b 26 43 5e   95 46 ee 81  
 00 00 00 00   07 5b cd 15  
 00 00 00 01   88 88 88 88  

osx gcc ppc64

struct test non-aligned:
30 31 32 00   88 88 88 88
07 5b cd 15   88 88 88 88
5b 26 43 5e   95 46 ee 81
07 5b cd 15   88 88 88 88
5b 26 43 5e   95 46 ee 81
00 00 00 00   07 5b cd 15
00 00 00 01   88 88 88 88


struct test aligned:
30 31 32 00   88 88 88 88
07 5b cd 15   07 5b cd 15
5b 26 43 5e   95 46 ee 81
5b 26 43 5e   95 46 ee 81
00 00 00 00   07 5b cd 15
00 00 00 01   88 88 88 88

osx gcc ppc32

struct test non-aligned:
30 31 32 00   88 88 88 88
07 5b cd 15   5b 26 43 5e
95 46 ee 81   07 5b cd 15
5b 26 43 5e   95 46 ee 81
00 00 00 00   07 5b cd 15
00 00 00 01

struct test aligned:
30 31 32 00   88 88 88 88
07 5b cd 15   07 5b cd 15
5b 26 43 5e   95 46 ee 81
5b 26 43 5e   95 46 ee 81
00 00 00 00   07 5b cd 15
00 00 00 01

tru64 alpha

struct test non-aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   88 88 88 88  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   88 88 88 88  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00   88 88 88 88  


struct test aligned:
 30 31 32 00   88 88 88 88  
 15 cd 5b 07   15 cd 5b 07  
 81 ee 46 95   5e 43 26 5b  
 81 ee 46 95   5e 43 26 5b  
 15 cd 5b 07   00 00 00 00  
 01 00 00 00   88 88 88 88  

This means that if the struct members are laid out properly, they end up at the same offsets on every platform tested.
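
The dumps also show the aligned struct ending after 44 bytes on the 32-bit x86 and PPC builds and after 48 bytes elsewhere, so only the trailing padding differs. A layout like this can be pinned down at compile time. The following is a sketch, assuming a C11 compiler; the struct name `portable_record` and the explicit `pad` member are hypothetical, and the enum of FORMATb is replaced by a `uint32_t` because the size of an enum is up to the compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Same member order as FORMATb: every member starts at a multiple of its own size. */
typedef struct portable_record {
    char      c1[8];
    uint32_t  i1;
    uint32_t  i2;
    double    f1;
    double    f2;
    uint64_t  l1;
    uint32_t  e1;    /* fixed-width stand-in for the enum */
    uint32_t  pad;   /* spells out the tail padding some compilers add */
} portable_record;

/* The build fails if a compiler lays the struct out differently. */
static_assert(offsetof(portable_record, i1) == 8,  "i1 offset");
static_assert(offsetof(portable_record, i2) == 12, "i2 offset");
static_assert(offsetof(portable_record, f1) == 16, "f1 offset");
static_assert(offsetof(portable_record, f2) == 24, "f2 offset");
static_assert(offsetof(portable_record, l1) == 32, "l1 offset");
static_assert(offsetof(portable_record, e1) == 40, "e1 offset");
static_assert(sizeof(portable_record) == 48,       "total size");
```

The offsets are the ones visible in the "aligned" dumps; the checks only document and enforce them, they do not change what the compiler does.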

Longs and Integers

On 32 bit architectures, integers and longs are 32 bit wide, while longs are normally 64 bit wide on 64 bit architectures. Since the RRDtool data format uses a lot of longs, this also has to be addressed in a portable format. Fortunately, there are datatypes that are less architecture dependent, defined in the inttypes.h header.

#include <inttypes.h>
uint8_t  unsigned_8_bit_integer;
uint32_t unsigned_32_bit_integer;
int64_t  signed_64_bit_integer;
/* and so on */
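
A short sketch of how these types behave in practice, assuming a C11 compiler. The function names are made up for illustration; `PRIu64` is the inttypes.h macro that expands to the right printf conversion for `uint64_t` on the platform at hand.

```c
#include <assert.h>    /* static_assert (C11) */
#include <inttypes.h>  /* fixed-width types and PRIu64 */
#include <limits.h>    /* CHAR_BIT */
#include <stdio.h>

/* These hold wherever the exact-width types exist, unlike sizeof(long). */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/* Format a 64-bit counter without guessing between %lu and %llu. */
int format_counter(char *buf, size_t len, uint64_t counter)
{
    return snprintf(buf, len, "%" PRIu64, counter);
}

/* Show how the width of long compares with int64_t on this machine. */
void print_integer_sizes(void)
{
    printf("long: %zu bytes, int64_t: %zu bytes\n",
           sizeof(long), sizeof(int64_t));
}
```

On the platforms listed above, `print_integer_sizes()` would be expected to report 4 or 8 bytes for long depending on the build, while int64_t stays at 8.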

Conclusion

Based on this information, a portable RRDtool data format that works at least on PPC, x86, Itanium, PA-RISC, ARMv4l, ARMv5b, Alpha, MIPS and SPARC will not be all that difficult to design.

Information on other architectures is welcome.

Workaround

The TIRD project provides statically linked binaries and a qemu wrapper that let you dump RRD files from other architectures. We use this to dump RRD files downloaded from small embedded boxes whose CPU is too weak to do the dump themselves and for which the size overhead of the XML output is too large.

Last modified on Nov 30, 2011, 3:47:45 PM
