Numeric / Math / Linear Algebra

Numerically Robust Variance and weighted sum

C/C++ Libraries

IEEE-754 Float Numbers

    • existence of denormalized numbers (aka subnormal numbers) should be known
    • existence of signed zeros (+0 and -0) might be interesting
    • existence of the following special values should be known
      • quiet NaN (qNaN), a Not-a-Number that propagates silently
      • signaling NaN (sNaN), a Not-a-Number that raises the invalid exception when used
      • positive and negative infinity (+/- inf)
    • rounding modes can be controlled
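      • nearest (the default), towards 0, towards + or -inf
      • be aware that repeatedly rotating a 2D point or complex coordinate in a loop accumulates rounding errors: with rounding mode 'round towards zero' the magnitude shrinks towards zero, and even with the default mode 'round to nearest' it can drift slowly!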
    • exceptions can be handled by setting up float traps / a signal handler
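
The special values listed above can be told apart with the classification macros from the C99 header `<math.h>`. The following is a minimal sketch; the function names `fp_class_name` and `check_comparison_rules` are made up for this page and do not belong to any library.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: name the IEEE-754 class of a double. */
const char *fp_class_name(double x)
{
    switch (fpclassify(x)) {
    case FP_NAN:       return "nan";
    case FP_INFINITE:  return signbit(x) ? "-inf" : "+inf";
    case FP_ZERO:      return signbit(x) ? "-0" : "+0";
    case FP_SUBNORMAL: return "subnormal";
    default:           return "normal";
    }
}

/* Comparison rules worth knowing; each assert documents one of them.
 * volatile keeps the compiler from folding the expressions away. */
void check_comparison_rules(void)
{
    volatile double nan = NAN, pz = 0.0, nz = -0.0;
    assert(nan != nan);       /* NaN is unequal to everything, itself included */
    assert(pz == nz);         /* signed zeros compare equal ...                */
    assert(1.0 / pz > 0.0);   /* ... but dividing by them gives +inf           */
    assert(1.0 / nz < 0.0);   /* ... and -inf respectively                     */
}
```

For example, `fp_class_name(1e-310)` reports a subnormal, because 1e-310 lies below the smallest normal double (about 2.2e-308). Note that `fpclassify` cannot distinguish quiet from signaling NaNs.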

Performance Issues with Denormals and NaNs

Calculations with denormals or non-numbers can be considerably slower than with normal numbers, even when no exception is signalled.
It might be interesting to abort a calculation, e.g. a matrix/vector multiplication, at the first occurrence of a NaN or of one of the other conditions. Unfortunately, SIMD instruction sets have issues with exception trapping and NaN propagation. See Agner Fog's article at https://www.agner.org/optimize/#nan_propagation
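
One way to abort early without trapping is to test the accumulator for NaN once per block of elements. This is only a sketch under that assumption; the name `dot_abort_on_nan` and the block-size parameter are hypothetical, and this is not the approach from the linked article.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: dot product of a and b (n elements each) that checks
 * the accumulator for NaN once per `block` elements and stops early.
 * Returns the number of elements processed. *sum receives the accumulator,
 * so the caller tests isnan(*sum) to see whether a NaN turned up. */
size_t dot_abort_on_nan(const float *a, const float *b, size_t n,
                        size_t block, float *sum)
{
    float acc = 0.0f;
    size_t i = 0;
    assert(block > 0);
    while (i < n) {
        size_t end = (n - i < block) ? n : i + block;
        for (; i < end; ++i)
            acc += a[i] * b[i];   /* inner loop stays free of checks */
        if (isnan(acc))           /* NaN is sticky: once in, it stays in acc */
            break;
    }
    *sum = acc;
    return i;
}
```

Checking per block rather than per element keeps the inner loop simple enough for the compiler to vectorize, at the cost of processing up to `block - 1` elements after the NaN appeared.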

Special options like DAZ (Denormals-Are-Zero) and FTZ (Flush-To-Zero) can be used if the application does not care about very small denormalized numbers: DAZ treats denormal inputs as zero, FTZ replaces denormal results with zero. See https://en.wikipedia.org/wiki/Subnormal_number
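
On x86 with SSE, both options are bits in the MXCSR control register and can be set per thread. The sketch below assumes an x86-64 style target where scalar float math uses SSE; the helper names `enable_ftz_daz` and `half_of_flt_min` are made up for illustration, and on other targets the first one simply reports that nothing was changed.

```c
#include <assert.h>
#include <float.h>

#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>            /* _mm_getcsr, _mm_setcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

/* Hypothetical helper: turn on FTZ and DAZ for the calling thread.
 * Returns 1 if the mode was set, 0 on targets without MXCSR.
 * Assumption: the CPU supports DAZ (true for all x86-64 CPUs). */
int enable_ftz_daz(void)
{
#if HAVE_MXCSR
    /* MXCSR bit 15 = FTZ (0x8000), bit 6 = DAZ (0x0040) */
    _mm_setcsr(_mm_getcsr() | 0x8000u | 0x0040u);
    assert((_mm_getcsr() & 0x8040u) == 0x8040u);
    return 1;
#else
    return 0;
#endif
}

/* Halve the smallest normal float; the exact result is subnormal. */
float half_of_flt_min(void)
{
    volatile float x = FLT_MIN;   /* volatile blocks compile-time folding */
    volatile float y = x * 0.5f;
    return y;
}
```

After a successful `enable_ftz_daz()`, `half_of_flt_min()` returns exactly zero instead of a subnormal. The setting applies only to the current thread, so worker threads need to set it themselves.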

The IPP library also provides helper functions:

Fast cache-efficient matrix transposition

The topic is explained on Wikipedia, with a specialized article on the in-place operation. Matrix transposition is also used in image processing. (De-)interleaving, e.g. of multi-channel audio data, is another name for the same operation. See https://stackoverflow.com/questions/7780279/de-interleave-an-array-in-place
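
To make the equivalence concrete: de-interleaving audio transposes a frames x channels matrix into a channels x frames one. A minimal out-of-place sketch follows (the function name `deinterleave` is chosen for this page only; the in-place variant discussed in the linked question is considerably harder):

```c
#include <assert.h>
#include <stddef.h>

/* Split `frames` interleaved frames of `channels` samples each
 * (L R L R ... for stereo) into one contiguous run per channel
 * (L L ... R R ...). Viewed as matrices, this transposes a
 * frames x channels matrix into a channels x frames one. */
void deinterleave(const float *in, float *out, size_t frames, size_t channels)
{
    assert(in != out);   /* out-of-place only */
    for (size_t f = 0; f < frames; ++f)
        for (size_t c = 0; c < channels; ++c)
            out[c * frames + f] = in[f * channels + c];
}
```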

Transposing a matrix the simple way produces many cache misses, because either the reads or the writes stride through memory column-wise. That is why special algorithms, such as cache-oblivious ones, are beneficial. There are numerous scientific papers on this topic, e.g. Cache-efficient matrix transposition. The problem is also discussed on Stack Overflow, fortunately with some code snippets.
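
A cache-oblivious transpose can be sketched as a recursion that keeps halving the longer dimension until the block is small, so that at some depth the blocks fit into each cache level without knowing its size. The code below is an illustrative out-of-place version, not taken from any of the papers or libraries mentioned here; the name `transpose_rec` and the cut-off `LEAF` are assumptions.

```c
#include <assert.h>
#include <stddef.h>

enum { LEAF = 16 };   /* assumed cut-off for the recursion; tune per machine */

/* dst (cols x rows, row stride ldd) = transpose of
 * src (rows x cols, row stride lds). Strides are in elements. */
void transpose_rec(const float *src, size_t lds, float *dst, size_t ldd,
                   size_t rows, size_t cols)
{
    assert(lds >= cols && ldd >= rows);
    if (rows <= LEAF && cols <= LEAF) {
        /* small block: plain double loop */
        for (size_t r = 0; r < rows; ++r)
            for (size_t c = 0; c < cols; ++c)
                dst[c * ldd + r] = src[r * lds + c];
    } else if (rows >= cols) {
        size_t h = rows / 2;   /* split the source into top and bottom halves */
        transpose_rec(src, lds, dst, ldd, h, cols);
        transpose_rec(src + h * lds, lds, dst + h, ldd, rows - h, cols);
    } else {
        size_t h = cols / 2;   /* split the source into left and right halves */
        transpose_rec(src, lds, dst, ldd, rows, h);
        transpose_rec(src + h, lds, dst + h * ldd, ldd, rows, cols - h);
    }
}
```

For a dense `rows x cols` matrix the call is `transpose_rec(src, cols, dst, rows, rows, cols)`. Because the strides are separate parameters, the same function also works on a rectangular region inside a bigger matrix or image.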

In general, one should consider the following aspects:

  • rectangular or square matrix
  • transpose only from/to rectangular regions of bigger matrices or images
  • in-place or out-of-place operation
  • combination with conjugation
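
Several of these aspects can be combined in one routine. As an illustration only, here is an in-place conjugate transpose of a square region inside a larger row-major matrix. The `cplx` type and the function name `conj_transpose_inplace` are hypothetical, and the plain double loop is not cache-optimised.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float re, im; } cplx;   /* hypothetical interleaved complex type */

/* In-place conjugate transpose of the n x n region starting at `a`
 * inside a larger row-major matrix with row stride `ld` (in elements). */
void conj_transpose_inplace(cplx *a, size_t ld, size_t n)
{
    assert(ld >= n);
    for (size_t i = 0; i < n; ++i) {
        a[i * ld + i].im = -a[i * ld + i].im;   /* diagonal: conjugate only */
        for (size_t j = i + 1; j < n; ++j) {
            cplx upper = a[i * ld + j];
            cplx lower = a[j * ld + i];
            /* swap the mirrored pair and negate both imaginary parts */
            a[i * ld + j].re = lower.re;
            a[i * ld + j].im = -lower.im;
            a[j * ld + i].re = upper.re;
            a[j * ld + i].im = -upper.im;
        }
    }
}
```

In-place transposition is this simple only for square regions; for rectangular matrices the elements move along permutation cycles, which is why an out-of-place operation is often preferred there.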

Here are some libraries providing transpose functions, which should be quite efficient:

Searching github.com for “transpose” also produces many results.

Pavel Zemtsov wrote a number of related articles at Experiments in program optimisation, backed by sources at https://github.com/pzemtsov/article-e1-cache and https://github.com/pzemtsov/article-E1-demux-C:

Other links:

There is also a new library: https://github.com/hayguen/libtranspose

development/numeric_math.txt · Last modified: 2023/08/14 by hayati