l4-hurd
[Top][All Lists]

## Mechanism to request physical memory with certain properties

 From: Matthieu Lemerre Subject: Mechanism to request physical memory with certain properties Date: Tue, 17 May 2005 02:18:26 +0200 User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

```Hi,

Neal asked me to find a mechanism to request physical memory with
constraints on location, so here's my proposal.

things.  Maybe some of my explanations are also unclear.  Maybe
there's faster ways to achieve this. Please tell me!

Mechanism to request physical memory with certain properties
============================================================

Description:
------------

We need a mechanism to request physical memory with constraints on location.
Constraints are requirements on the bits on the address:

01X0X10X11X00X means that the bytes you require on the allocation must have 1
and 0 where specified, and can be either 1 or 0 when there is a 'X'. To do this,
we will pass a pair of two words like this:

*bits in the first word will indicate bits which have to be on in
the address of the physical memory eventually choosen

*bits on in the second word will indicate bits which have to be off
in the address of the physical memory eventually choosen

So with the example above, 01X0X10X11X00X is represented by
(01000100110000, 10010010000110).

The allocation unit is the frame, wich is 2^p bytes (p=12 on IA32)

Possible solution:
------------------

Here's my proposal to solve this problem.

Memory is represented by a table of bits, in which a bit is set to 1
if the corresponding frame is free, and 0 if not (or not existing).

The table is the following:

|2^(N-z)-1 | 2^(N-z)-2 | ... | 1 | 0 |
--------|------------------------------------|
0       |    0           1             0   0 |
1       |    1           1             0   1 |
2       |    0           1 (A)         0   1 |
...     |                                    |
(2^z)-1 |    1           0             1   1 |

Frames are numbered from 0 to (2^N)-1. The number of a frame in the table is:
col_number * 2^z + row_number.

For instance, frame (A) is free (its bit is set to 1), and its frame number is
(2^(N-z)-2) * 2^z + 2.

So, any frame number can be decomposed like this: 01001 101001010010010001001001
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
(N-z) bits          z bits

Similarly, we decompose the constraint word like this:

01XX1 10X10XX11X00X1 XXXXXXXXXXXX
^^^^^ ^^^^^^^^^^^^^^ ^^^^^^^^^^^^
(N-z)bits  z bits    p undefined bits

(since the allocation unit is the frame, you can't impose constraints on the 12
last bits)

The general idea is:
1/To generate a mask of possible cols (made with the N-z bits of the constraint
word)
2/To iterate over the possible rows (with respect of the z bits of the
constraint words)
3/In each iteration, we apply a logical AND on the mask and the row.  This give
which frames were allocated. Then we clear the corresponding bytes.

It's simpler for the algorithm if 2^(N-z) is the size of a word ( N-z=5 on
IA32).

Iteration over the possible rows:
---------------------------------

Here's the algorithm to iterate over the possible rows: (Pseudo-C code)

void
iterate_rows (word_t constraint1, word_t constraint0, function_t function)
{
iterate_rows_rec (constraint1 >> p, constraint0 >> p,
2^z, 0, function);
}

void
iterate_rows_rec (word_t constraint1, word_t constraint0,
{
function(beginning_of_word);
else
{
beginning_of_word, function);

}
}

Note: the problem with this algorithm is that allocation is not distributed,
so future allocations may take more time.  But there exist workarounds, like
the following modification to the algorithm:

void
iterate_rows_rec (word_t constraint1, word_t constraint0,
word_t pseudo_random_word, function)
{
function(beginning_of_word);
else
{
{
beginning_of_word, function);

}
else
{

beginning_of_word, function);
}
}
}

Pseudo-random word could be set so that we can indicate that some memory
is more desirable than other. (Like memory for DMA transfers should be allocated
less easily).

-----------------------------

We have to generate a column mask which will make us able to check one row at
one time.
If there is no constraints on the upper N-z cols, we can check 2^(N-z) frame at
one time!

If N-z = 5:

0b00110011001100110011001100110011,
0b00001111000011110000111100001111,
0b00000000111111110000000011111111,
0b00000000000000001111111111111111};

0b11001100110011001100110011001100,
0b11110000111100001111000011110000,
0b11111111000000001111111100000000,
0b11111111111111110000000000000000};

/* Only the first N-z bits are interresting.  */
constraint1 = constraint1>>(p+z);
constraint0 = constraint0>>(p+z);

for(i=4; i>=0; i--)
{
}

Test of the row and allocation
------------------------------

Returns the number of allocated frames in the row.

int
allocate_frames(word_t row_number, word_t col_mask, allocation_t result)
{
word_t allocated_frames = memory[row_number] & col_mask;

/* Deallocates the allocated frames*/

result.row_number = row_number;
result.allocated_frames = allocated_frames;

return NUMBER_OF_BITS (allocated_frames);
}

An allocation is a list of (row_numbers X allocated_frames):

struct allocation
{
word_t row_number;
word_t allocated_frames;
struct allocation_unit *next;
};

Deallocation of frames
----------------------

Thus, deallocation is trivial: We just have to re-set the right bits
at the right place.

void
deallocate (allocation_t *allocation)
{
allocation_t unit = allocation;
do
{
word_t row_number = unit.row_number;
memory[row_number] |= unit.allocated_frames;
} while(unit = unit.next);
}

Conclusion
==========

Allocating memory should not be too long using this algorithm, provided that
memory
is distributed over the different rows.  Deallocation is very fast.

One possible variant would be not to have an array of bits, but to
have an array (of size 2^z) of lists. But that would consume more
memory and would be slowest, I think.

Thanks,

Matthieu

```