В начало

Rationalization of vector of double

Sergei Khashin


rationalize.h     - header
rationalize.cpp - realization.

For the rational approximation of real numbers one can use "continued", or "chain" fraction, see,for example, here.

Consider, for example, a rational approximation for e and π:

π ≈
e ≈

Let us write the resulting fractions and their errors (deviations):

n value deviation value deviation
π e
1 3 0.14 2 0.71
2 22/7 0.0012 3 0.28
3 333/106 0.000083 8/3 0.051
4 355/113 0.26e-6 11/4 0.031
5 103993/33102 0.57e-9 19/7 0.004
6 104348/33215 0.33e-9 87/32 0.0004
7 208341/66317 0.12e-9 106/39 0.00033
8 312689/99532 0.29e-10 193/71 0.000028
9 833719/265381 0.87e-11 1264/465 0.22e-5
10 1146408/364913 0.16e-11 1457/536 0.17e-5
11 4272943/1360120 0.40e-12 2721/1001 0.11e-6
12 5419351/1725033 0.22e-13 23225/8544 0.67e-8
13 80143857/25510582 0.58e-15 25946/9545 0.55e-8
14 49171/18089 0.28e-9
15 517656/190435 0.13e-10
16 566827/208524 0.11e-10
17 1084483/398959 0.48e-12
18 13580623/4996032 0.20e-13
19 14665106/5394991 0.17e-13
20 28245729/10391023 0.61e-15

Implementation of this algorithm is given in the C ++ function chainFract, which using an auxiliary function chainFractStep:

bool chainFractStep( double x, int &p, int &q, int maxQ) // one step of chainFract
{
    int maxN1;
    if( q<=0 ) { q=1; p=round(x); return true; }
    maxN1 = maxQ/q;
    double err=q*x-p;
    if( fabs(err)*maxQ <1 ) return false;
    int n1 = round(1/err);
    if( abs(n1)> maxN1 ) return false;
    p = p*n1+1;
    q *=n1;
    if( n1<0 ) {q=-q; p=-p; }
    return true;
}
double chainFract( double x, int &p, int &q, int maxQ) // rat.approx p/q of x, return tolerance
{
    double err=1;
    for( q=1; chainFractStep(x,p,q,maxQ); ) ;
    return fabs(x-(double)p/q);
}

How can that be, if we must at the same time bring some rational numbers? Of course, you can zoom each of them separately, but the denominators to get, most likely different. So when casting to a common denominator, the numbers will be too great. For example, bringing thus the pair (π, e) with the accuracy up to 1e-10 will receive:

(π,   e) ≈ (833719 / 265381,   1084483 / 398959 ) = ( 332619698521, 287801183023)  / 105876138379.

This, of course, not a very good approximation.

Here is a function in C ++ that allows to solve this problem.
rationalize.h     - header
rationalize.cpp - realization.

In the unit there is, in fact, the one and only function:

double rationalize(int n, double *v, double tol); // calculate R: v[i]*R - (almost) integer with deviation tol. Return R

In fact, there are several auxiliary, very simple functions, implementing some basic operations with vectors of double numbers:

double vdbl_len   (int N, double *a);                   // |a|
double vdbl_scpod (int N, double *a, double *b);        // (a,b)
void   vdbl_add   (int N, double *a, double *b,double R);// a += R*b
double vdbl_minAbs(int N, double *a, int &i);           // minimal abs.value of non-zero elemtnts = a[i]
void   vdbl_zero  (int N, double *a, double eps);       // less 0
double deviationI (double R);                           // deviation R from integer
double vdbl_devi  (int n, double *a, double co);        // deviation a*co from integer

The rationalize for the given vector v of length n finds the actual number R such that the vector R*v has an almost integer coordinates. Denote by w vector R*v with coordinates rounded to a whole number. Then the vector w/R will be different from the original vector v not more than tol.

Consider a few examples.

Let vector v is a:

    v = (1, π).

Apply function rationalize with value of tol equals to 0.1, 0.01, ....

tol w[0] w[1] deviation:
abs(π - w[1]/w[0])
1e-1 1 3 0.14
1e-2 7 22 0.0012
1e-3 7 22 0.0012
1e-4 113 355 0.26e-6
1e-5 113 355 0.26e-6
1e-6 113 355 0.26e-6
1e-7 113 355 0.26e-6
1e-8 33215 104348 0.33e-9
1e-9 33215 104348 0.33e-9
1e-10 33215 104348 0.33e-9
1e-11 99532 312689 0.3e-10
1e-12 364913 1146408 0.2e-11
1e-13 1725033 5419351 0.2e-13
1e-14 1725033 5419351 0.2e-13
1e-15 25510582 80143857 0.6e-15

That is, the obtained results are consistent with an approximation of π, as described above.

Let vector v is:

    v = (1, π, e) ≈ (1,  3.1415926535897932385, 2.7182818284590452354).

Apply function rationalize with value of tol equals to 0.1, 0.01, ....

tol w[0] w[1] w[2]
1e-1 1 3 3
1e-2 7 22 19
1e-3 7 22 19
1e-4 685 2152 1862
1e-5 685 2152 1862
1e-6 20446 64233 55578
1e-7 123368 387572 335349
1e-8 1131457 3554578 3075619
1e-9 1131457 3554578 3075619
1e-10 13968034 43881874 37969053
1e-11 150377371 472424444 408768075
1e-12 3738677499 11745401765 10162779108
1e-13 3738677499 11745401765 10162779108
1e-14 37084888791 116505614185 100807179311
1e-15 37084888791 116505614185 100807179311

Consider another example. Suppose we have a second-order equation of the curve:

  f(x,y) = a[0]*x^2 + a[1]*x*y + a[2]*y^2 + a[3]*x + a[4]*y + a[5],
where
    a[0] = 0.051995332184145352452
    a[1] =-0.069327109578860469936
    a[2] = 0.086658886973575587420
    a[3] = 0.12132244176300582239
    a[4] =-0.051995332184145352452
    a[5] =-0.19064955134186629232

As a result of execution:

    double v[] = {  0.051995332184145352452, -0.069327109578860469936,  0.086658886973575587420,  
                    0.12132244176300582239 , -0.051995332184145352452, -0.19064955134186629232   };
    int n = sizeof(v) / sizeof(double);
    double R = rationalize(n, v, 1e-10);
we obtain R=57.697, hence f(x,y) proportionally
  3*x^2 - 4*x*y + 5*y^2 + 7*x - 3*y - 11.

This algorithm can be demonstrated and directly into JavaScript, the module rationalize.js

tol=
Vector v:

:



free counters