Revision 76a92d4...
Digest for 22nd December 2013
Optimization in Educational
Save 16 bytes per sky object.
In practice (for example on x86-64 Linux), the `long double` type has a
16-byte size and alignment.
We can inspect the memory layout of some class inheriting from SkyPoint
using clang [1]:
*** Dumping AST Record Layout
0 | class StarObject
0 | class SkyObject (primary base)
0 | class SkyPoint (primary base)
0 | (SkyPoint vtable pointer)
0 | (SkyPoint vftable pointer)
16 | long double lastPrecessJD
32 | class dms RA0
32 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
...(snipped)...
184 | float B
188 | float V
| [sizeof=192, dsize=192, align=16
| nvsize=192, nvalign=16]
The vtable pointer takes up only 8 bytes (on 64-bit), but we waste 8 bytes
on padding so that lastPrecessJD starts on a 16-byte boundary. Moreover, we
then take up 16 bytes to store lastPrecessJD itself.
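The same padding effect can be reproduced outside KStars. The following is a minimal sketch, not code from this commit: `WithLongDouble` and `WithDouble` are hypothetical stand-ins for SkyPoint, and `padding_bytes` assumes the vtable pointer has the size of `void*`, which holds on common ABIs but is not guaranteed by the C++ standard.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-ins for SkyPoint: one virtual function (so each object
// carries a vtable pointer) plus a single timestamp member.
struct WithLongDouble {
    virtual ~WithLongDouble() = default;
    long double lastPrecessJD = 0.0L;
};

struct WithDouble {
    virtual ~WithDouble() = default;
    double lastPrecessJD = 0.0;
};

// Bytes of T not accounted for by the vtable pointer and the member itself.
// Assumption: the vtable pointer is exactly sizeof(void*) bytes.
template <typename T, typename Member>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - sizeof(void*) - sizeof(Member);
}

// Switching to double should never make the object larger.
static_assert(sizeof(WithDouble) <= sizeof(WithLongDouble),
              "double variant must not be larger than long double variant");

// Print size, alignment and padding for both variants.
void print_layouts() {
    std::printf("long double:    sizeof=%zu alignof=%zu\n",
                sizeof(long double), alignof(long double));
    std::printf("WithLongDouble: sizeof=%zu padding=%zu\n",
                sizeof(WithLongDouble),
                padding_bytes<WithLongDouble, long double>());
    std::printf("WithDouble:     sizeof=%zu padding=%zu\n",
                sizeof(WithDouble),
                padding_bytes<WithDouble, double>());
}
```

Calling `print_layouts()` from `main` on a typical x86-64 Linux build should report 8 bytes of padding for `WithLongDouble` and none for `WithDouble`; other platforms may differ.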
Using a program like the following:
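#include <stdio.h>
#include <math.h>
int main()
{
    /* Julian date of the J2000 epoch, in days */
    double jd2000 = 2451545.0;
    /* Gap to the next representable double above jd2000, in days */
    double delta = nextafter(jd2000, jd2000 + 1) - jd2000;
    printf("delta: %.30f\n", delta);
    return 0;
}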
we can compute that at J2000, the minimum time step at double precision
is about 4.66e-10 days, or approximately 40 microseconds, so it's not
clear that we gain anything by using 80-bit long doubles instead of
64-bit doubles.
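To compare the two types directly, the step can be converted from days to microseconds for each type. This is a sketch under assumptions: `step_microseconds`, `print_steps` and `kMicrosecondsPerDay` are names introduced here for illustration and are not part of KStars.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Microseconds per day, used to convert a step in Julian days to microseconds.
constexpr double kMicrosecondsPerDay = 86400.0 * 1e6;

// Gap between `jd` and the next representable value of type T above it,
// converted from days to microseconds.
template <typename T>
double step_microseconds(T jd) {
    assert(std::isfinite(jd));
    const T step_days = std::nextafter(jd, jd + T(1)) - jd;
    return static_cast<double>(step_days) * kMicrosecondsPerDay;
}

// Print the time resolution at J2000 for double and long double.
void print_steps() {
    std::printf("double:      %.6f us\n", step_microseconds(2451545.0));
    std::printf("long double: %.6f us\n", step_microseconds(2451545.0L));
}
```

For `double` this gives roughly 40 microseconds, matching the figure above. The `long double` result depends on the platform: with the 80-bit x87 format it should be around 20 nanoseconds, while on platforms where `long double` is the same as `double` the two values are identical.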
Changing the `long double` to `double` (and placing it last) results in
memory layout like so:
*** Dumping AST Record Layout
0 | class SkyPoint
0 | (SkyPoint vtable pointer)
0 | (SkyPoint vftable pointer)
8 | class dms RA0
8 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
16 | class dms Dec0
16 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
24 | class dms RA
24 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
32 | class dms Dec
32 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
40 | class dms Alt
40 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
48 | class dms Az
48 | double D
| [sizeof=8, dsize=8, align=8
| nvsize=8, nvalign=8]
56 | double lastPrecessJD
| [sizeof=64, dsize=64, align=8
| nvsize=64, nvalign=8]
This also has the benefit that the SkyPoint data fits in a single cache
line, though I don't think this really makes a difference given the
inefficiencies in the rest of the code. A before/after test showed a
drop in memory usage of about 6%.
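A size regression of this kind can be caught at compile time. The sketch below is an illustration only: `DmsSketch` and `SkyPointSketch` are hypothetical types that mirror the member list in the dump above, not the real KStars classes, and the 64-byte cache line is an assumption that holds on most current x86-64 CPUs.

```cpp
#include <cstddef>

// Assumed cache-line size in bytes (typical for x86-64, not universal).
constexpr std::size_t kAssumedCacheLineBytes = 64;

// Hypothetical mirror of dms: a thin wrapper around one double.
struct DmsSketch {
    double D = 0.0;
};

// Hypothetical mirror of the new SkyPoint layout: vtable pointer,
// six angle members, and the double timestamp placed last.
struct SkyPointSketch {
    virtual ~SkyPointSketch() = default;
    DmsSketch RA0, Dec0, RA, Dec, Alt, Az;
    double lastPrecessJD = 0.0;
};

// Fails to compile if a later change grows the class past one cache line.
static_assert(sizeof(SkyPointSketch) <= kAssumedCacheLineBytes,
              "SkyPointSketch no longer fits in one assumed cache line");
```

The check only covers the object's size. Whether a given instance actually sits within a single line also depends on where it is allocated, since the class is only 8-byte aligned.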
[1]: http://eli.thegreenplace.net/2012/12/17/dumping-a-c-objects-memory-layout-with-clang/
File Changes
- kstars/skyobjects/skypoint.h