help-glpk
[Top][All Lists]

Re: [Help-glpk] Non-sparse problems and other questions

 From: Andrew Makhorin Subject: Re: [Help-glpk] Non-sparse problems and other questions Date: Fri, 31 May 2002 12:41:49 +0400

```>I have written a program which uses GLPK to solve a particular class of
>problems (FIR filter design). On large filters, the problem is specified by
>as many as N columns and 3*N rows where N can be as much as 1000. The
problem
>is also completely non-sparse as every row and every column has an entry.
>On problems of this size I often find I run out of memory on a machine with
>768Meg of RAM.
>
>A rough back-of-the-envelope calculation suggests that if these LP
>matrices were stored as arrays of doubles, the memory used would be
>
> 1000 * 3000 * sizeof (double) ~= 24 Meg
>
>Since I'm running out of memory, my guess (without looking at the source
>code) is that you are using some sort of sparse matrix data storage.

1000 rows and 3000 columns with dense matrix this is cool :+)

In glpk the constraint matrix is stored using the sparse format, which
includes not only numerical values of constraint coefficients, but some
auxiliary information. In particular, each non-zero coefficients in LPI
problem object takes at least 24 bytes (it depends on a particular
platform; for example, if 8-byte alignment is used for structure
members, each coefficient takes 40 bytes). However, if you use old api
routines, before solving LPI object is converted to LPX object, in which
yet another copy of the constraint matrix is created. In the latter case
the matrix is stored in both row-wise and column-wise formats, so in
total each non-zero takes 2 * (8 + 4) = 24 bytes. You also should
remember that there is a factorization of basis matrix, which in your
case has 1,000,000 non-zeros, where each non-zero takes 24 bytes, plus
an extra memory (about 30%) reserved for future refactorizations.
Besides, glpk routines allocate and free memory dynamically, and in case
of huge memory blocks (the sparse vector area that contains all non-zero
elements of the sparse matrix is allocated as one memory block) the
memory may be badly fragmented. Besides, there is a lot of information
related to rows and columns, which, however, linearly depends on the
problem size. I should say that glpk successfully solved some very large
scale LPs with a million of non-zero constraint coefficients (for
example, some LP with 200,000 rows, 380,000 columns, and 1,500,000
non-zeros was successfully solved on a PC with 256 Mb dram), however,
they were sparse instances, and I think currently this is a limit of
what glpk lp solver can do.

>On these large problems (using glp_call_rsm1) I find that even finding
>an  initial feasible solution can take a lot of number crunching and
>considerable time. I do however have a very computationally cheap method
>of obtaining a near-feasible solution.

I would suggest you to use new api routines introduced in glpk 3.1.
Firstly, this would allow you saving about a half of memory, because in
this case no copy of the original problem instance is created. And
secondly, the new api routines are a bit more intelligent than the old
ones. Once you have created your LP problem instance, you can specify
a desired initial basis using the new api routines lpx_set_row_stat and
lpx_set_col_stat (for details see the reference manual included in the
distribution).

>    gm_scaling: max / min = 1.476e+18
>    gm_scaling: max / min = 1.476e+18
>    glp_simplex2: k = 1372; invalid basis information

The last message means that you incorrectly set statuses of rows and
columns. Using the new api routines allows avoiding intricate rules for
specifying an initial basis. Besides, using the routine lpx_warm_up you
can compute the basic solution at the initial vertex you specified and
see if this initial point is that you would like the simplex to start.

I'd like to draw your attention on the first two lines. They mean that
max|a[i,j]|/min|a[i,j]| = 1.5e+18, where a[i,j] is non-zero constraint
coefficients in your problem, i.e. the constraint matrix is extremely
badly scaled, due to that the lp solver routines are unable to provide
any appropriate accuracy during computations. That's why you see the
message "Numerical instablity".

I'm not an expert in FIR filters design, but I think there is something
wrong in the problem formulation. I believe there should be another
approach to formulate the problem in some other way. Or the methoda you
use and that working well for small sized problems can't be directly
generalized for problems of such high sizes. (So, it is doubtful that
using a dense version of lp solver would help in this case.)

```