performance - Primitive types slower than user types in C++?
I was curious and did a little benchmark to determine the performance delta between primitive types, such as int or float, and user types.
I created a template class Var and gave it some inline arithmetic operators. The test consisted of running this loop for both primitive and Var vectors:
    for (unsigned i = 0; i < 1000; ++i) {
        in1[i] = i;
        in2[i] = -i;
        out[i] = (i % 2) ? in1[i] + in2[i] : in2[i] - in1[i];
    }
I was quite surprised by the results. It turns out the Var class is faster most of the time: for int, the loop took on average 5700 nsec less with the class. Out of 3000 runs, int was faster 11 times vs. Var being faster 2989 times. The results for float are similar: Var was 15100 nsec faster than float in 2991 of the runs.
Shouldn't primitive types be faster?
EDIT: The compiler is a rather ancient MinGW 4.4.0, and the build options are the defaults of Qt Creator, no optimizations:
qmake call: qmake.exe C:\...\untitled15.pro -r -spec win32-g++ "CONFIG+=release"
OK, posting the full source. The platform is 64-bit Win7, 4 GB DDR2-800, Core2Duo @ 3 GHz.
    #include <QTextStream>
    #include <QVector>
    #include <QElapsedTimer>

    // Thin wrapper around a single value of type T.
    template<typename T>
    class Var {
    public:
        Var() {}
        Var(T val) : var(val) {}
        inline T operator+(Var& other) { return var + other.value(); }
        inline T operator-(Var& other) { return var - other.value(); }
        inline T operator+(T& other) { return var + other; }
        inline T operator-(T& other) { return var - other; }
        inline void operator=(T& other) { var = other; }
        inline T& value() { return var; }
    private:
        T var;
    };

    int main() {
        QTextStream cout(stdout);
        QElapsedTimer timer;
        unsigned count = 1000000;
        QVector<double> pin1(count), pin2(count), pout(count);
        QVector<Var<double> > vin1(count), vin2(count), vout(count);
        unsigned t1, t2, pacc = 0, vacc = 0, repeat = 10, pcount = 0, vcount = 0, ecount = 0;
        for (int cc = 0; cc < 5; ++cc) {
            for (unsigned c = 0; c < repeat; ++c) {
                // Time the loop over the primitive vectors.
                timer.restart();
                for (unsigned i = 0; i < count; ++i) {
                    pin1[i] = i;
                    pin2[i] = -i;
                    pout[i] = (i % 2) ? pin1[i] + pin2[i] : pin2[i] - pin1[i];
                }
                t1 = timer.nsecsElapsed();
                cout << t1 << endl;
                // Time the same loop over the Var vectors.
                timer.restart();
                for (unsigned i = 0; i < count; ++i) {
                    vin1[i] = i;
                    vin2[i] = -i;
                    vout[i] = (i % 2) ? vin1[i] + vin2[i] : vin2[i] - vin1[i];
                }
                t2 = timer.nsecsElapsed();
                cout << t2 << endl;
                pacc += t1;
                vacc += t2;
            }
            pacc /= repeat;
            vacc /= repeat;
            if (pacc < vacc) { cout << "primitive faster" << endl; pcount++; }
            else if (pacc > vacc) { cout << "var faster" << endl; vcount++; }
            else { cout << "amazingly, both equally fast" << endl; ecount++; }
            cout << "average primitive type " << pacc << ", average var " << vacc << endl;
        }
        cout << "int faster " << pcount << " times, var faster " << vcount
             << " times, equal " << ecount << " times, "
             << pcount + vcount + ecount << " times ran total" << endl;
    }
In relative terms, with floats the Var class is 6-7% faster than plain floats, and with ints about 3%.
I also ran the test with a vector length of 10 000 000 instead of the original 1000, and the results are still consistent and in favor of the class.
With QVector replaced by std::vector, at the -O2 optimization level, the code generated by GCC for the two types is the same, instruction for instruction.
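For anyone who wants to repeat that comparison, here is a minimal sketch of what the std::vector variant of the measured loop might look like. It assumes C++11 and uses std::chrono in place of QElapsedTimer. The Var class is a trimmed, const-correct stand-in for the one in the question, timeLoop is a hypothetical helper name, and the index is converted to double before negation (the original negates an unsigned), so treat it as an approximation of the test rather than the exact code that was measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Stand-in for the Var class from the question (assumption: simplified and
// const-correct, so it is not byte-for-byte the original).
template <typename T>
class Var {
public:
    Var() : var() {}
    Var(T val) : var(val) {}
    T operator+(const Var& other) const { return var + other.var; }
    T operator-(const Var& other) const { return var - other.var; }
    T value() const { return var; }
private:
    T var;
};

// Hypothetical helper: runs the loop from the question once over three
// equally sized vectors and returns the elapsed time in nanoseconds.
// Elem is expected to be double or Var<double>.
template <typename Elem>
long long timeLoop(std::vector<Elem>& in1,
                   std::vector<Elem>& in2,
                   std::vector<Elem>& out) {
    assert(in1.size() == out.size() && in2.size() == out.size());
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < out.size(); ++i) {
        const double x = static_cast<double>(i);
        in1[i] = x;
        in2[i] = -x;
        out[i] = (i % 2) ? in1[i] + in2[i] : in2[i] - in1[i];
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling timeLoop once with three std::vector<double> and once with three std::vector<Var<double> > of the same length gives the two timings to compare. Compiling with g++ -O2 -S and reading the two instantiations side by side is the more direct way to check whether the generated code really matches.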
Without the replacement, the generated code is different, but that's hardly surprising, considering that QVector is implemented differently for primitive and non-primitive types (look for QTypeInfo<T>::isComplex in qvector.h).
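To illustrate the kind of distinction a container can draw between the two element types, here is a rough sketch using the standard C++11 type traits. This is an analogy only, not Qt's QTypeInfo mechanism. The Var class is again a trimmed stand-in for the one in the question, and sameSizeAsWrapped is a hypothetical helper name.

```cpp
#include <type_traits>

// Trimmed stand-in for the Var class from the question (assumption).
template <typename T>
class Var {
public:
    Var() {}                    // user-provided default constructor, as in the question
    Var(T val) : var(val) {}
    T value() const { return var; }
private:
    T var;
};

// A primitive needs no code at all to default-construct.
static_assert(std::is_trivially_default_constructible<double>::value,
              "double is trivially default constructible");

// The wrapper can still be copied bytewise, like a primitive...
static_assert(std::is_trivially_copyable<Var<double> >::value,
              "Var<double> is trivially copyable");

// ...but its user-provided default constructor makes construction non-trivial,
// which is the sort of property a container may branch on.
static_assert(!std::is_trivially_default_constructible<Var<double> >::value,
              "Var<double> has a user-provided default constructor");

// Hypothetical helper: true when the wrapper adds no storage overhead.
// This is expected on mainstream compilers, but it is an assumption about
// the ABI rather than a guarantee stated in the original post.
template <typename T>
constexpr bool sameSizeAsWrapped() {
    return sizeof(Var<T>) == sizeof(T);
}
```

A container that branches on traits like these picks different construction and copy paths for double and Var<double>, even though both have the same size and layout in memory.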
Update: It looks like isComplex does not affect the inner loop, i.e. the measured part. The loop code still differs between the two types, albeit slightly. It looks like the difference is due to GCC.