
[Gnue-dev] Cache in Common


From: Jan Ischebeck
Subject: [Gnue-dev] Cache in Common
Date: Fri, 20 Dec 2002 23:40:09 +0100
User-agent: KMail/1.4.3

Hi,

after a talk with jamest about the current cache implementation in common, I 
thought that we should discuss how to improve it.

Status now:
   When a query is executed, the cache is cleared. Then "cachecount"[1] rows of 
   the returned resultset are read into the cache. If a row is accessed that 
   is not in the cache, the next "cachecount" rows are read into the cache.
   For example, with "cachecount" = 5: after reading the first row, the cache 
   holds 5 rows; after reading the 6th row, it holds 10 rows. If you jump 
   directly to record 2001, the cache holds 2005 rows, because every block 
   up to that record is loaded and nothing is ever dropped.

[1] A number that defines how many rows are loaded at once. It is set in the 
<datasource> tag, e.g. <datasource name="???" cache="5">
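
To make the numbers above concrete, here is a minimal sketch in Python. It is
not the code in common; the class GrowingCache and its method names are made
up for illustration, and it only counts rows instead of holding real records.

```python
class GrowingCache:
    """Hypothetical model of the behaviour described above: the cache only grows."""

    def __init__(self, cachecount, total_rows):
        self.cachecount = cachecount  # rows read per step (the "cache" attribute)
        self.total_rows = total_rows  # size of the resultset
        self.loaded = 0               # rows 1..loaded are held in the cache

    def access(self, row):
        # Read further blocks of `cachecount` rows until `row` is cached
        # or the resultset is exhausted. Nothing is ever evicted.
        while row > self.loaded and self.loaded < self.total_rows:
            self.loaded = min(self.loaded + self.cachecount, self.total_rows)
        return self.loaded


cache = GrowingCache(cachecount=5, total_rows=10000)
sizes = [cache.access(1), cache.access(6), cache.access(2001)]
print(sizes)  # [5, 10, 2005]
```

The cache size depends only on the highest record visited so far, which is
why a large resultset can end up completely in memory.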


Future:
The cache should have a maximum size, e.g. if you go to record 200 the cache 
should hold only the last 100 entries. (jamest said that it should work this 
way.)

But there's a problem: if you have changed the first record, you still have to 
keep the modification in the cache. There are two possibilities: either you 
don't allow the user to advance further than the maximum cache size allows, 
or you define that the cache can hold a maximum number of normal records and 
a maximum number of modified (dirty) records. 
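
A rough Python sketch of the first possibility, to show where the user gets
stuck. All names (BoundedCache, CacheFull, load, modify, commit) are
hypothetical and not taken from common. The assumption is that the oldest
clean record is dropped first and that dirty records are never dropped.

```python
from collections import OrderedDict


class CacheFull(Exception):
    """Raised when only modified records are left and none can be dropped."""


class BoundedCache:
    def __init__(self, max_size):
        self.max_size = max_size
        self.records = OrderedDict()  # row number -> record data, oldest first
        self.dirty = set()            # row numbers with uncommitted changes

    def load(self, row, data):
        if row in self.records:
            return
        if len(self.records) >= self.max_size:
            self._evict_oldest_clean()
        self.records[row] = data

    def modify(self, row, data):
        if row not in self.records:
            raise KeyError(row)
        self.records[row] = data
        self.dirty.add(row)

    def commit(self):
        # After a commit every cached record counts as clean again.
        self.dirty.clear()

    def _evict_oldest_clean(self):
        for row in self.records:
            if row not in self.dirty:
                del self.records[row]
                return
        raise CacheFull("cache holds only modified records; commit first")


cache = BoundedCache(max_size=3)
for row in (1, 2, 3):
    cache.load(row, "data")
cache.modify(1, "changed")
cache.load(4, "data")          # drops record 2, keeps the modified record 1
print(list(cache.records))     # [1, 3, 4]
```

Once every cached record is dirty, load() raises CacheFull, which corresponds
to not letting the user advance until there is a commit.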

In many situations, such as somebody editing an address list, you would set 
records_to_cache_per_step = 5 and maximum_number_of_records_in_cache = 5, and 
you would not allow modified records to be stored separately. The user would 
then have to commit after every 5 newly added records, and could not browse 
to records more than 4 steps away from a modified record.
This could make sense for a simple address list. In this case all settings 
would have the same value.

In another situation a manager has to change the wages of selected employees, 
but has to do everything in one transaction. In this case the user needs to 
be able to advance to the end of the recordset without a single commit in 
between.


To be prepared for cases like those described above, and to have a flexible 
scheme, I would recommend that the caching system use 4 settings.

1. a number that defines how many records are loaded in one step (like the 
current "cache" directive)

2. a maximum number of clean records in the cache
if set to 0, the maximum number of clean records in the cache depends only on 
the maximum cache size.

3. a maximum number of dirty records in the cache
if set to 0, the maximum number of dirty records in the cache depends only on 
the maximum cache size.

4. a maximum cache size (dirty and clean records)
if set to 0, the cache has no size limitation.
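
The four settings and the meaning of 0 could be written down like this. It
is only a sketch: the names CacheSettings, rows_per_step, max_clean,
max_dirty and max_total are invented here, and it assumes that a per-type
limit larger than the maximum cache size is capped by the maximum cache size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheSettings:
    rows_per_step: int = 5  # setting 1: records loaded in one step
    max_clean: int = 0      # setting 2: 0 = limited only by max_total
    max_dirty: int = 0      # setting 3: 0 = limited only by max_total
    max_total: int = 0      # setting 4: 0 = no size limitation

    @staticmethod
    def _limit(specific: int, total: int) -> Optional[int]:
        # Ignore settings that are 0; None means "unlimited".
        candidates = [n for n in (specific, total) if n > 0]
        return min(candidates) if candidates else None

    def clean_limit(self) -> Optional[int]:
        return self._limit(self.max_clean, self.max_total)

    def dirty_limit(self) -> Optional[int]:
        return self._limit(self.max_dirty, self.max_total)


# Address list: all settings have the same value.
address_list = CacheSettings(rows_per_step=5, max_clean=5, max_dirty=5, max_total=5)

# Wage changes in one transaction: bounded clean records, unlimited dirty ones.
payroll = CacheSettings(rows_per_step=5, max_clean=100, max_dirty=0, max_total=0)

print(address_list.clean_limit(), address_list.dirty_limit())  # 5 5
print(payroll.clean_limit(), payroll.dirty_limit())            # 100 None
```

With the payroll values the user can advance to the end of the recordset
without a commit, while the number of clean records in memory stays bounded.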

These settings don't have to be directly changeable by the user. It could, 
for example, be possible to define just 3 types of caches plus a cachesize 
setting, and calculate the 4 internal settings from these. 

The only critical point I see with these 4 settings is that they add 
complexity and could slow down common. You could reduce the 4 settings to 3 
and get comparable flexibility with a similar slowdown. You could even keep 
caching as it is now and have no extra slowdown, but then you will get 
memory problems when a resultset has too many records.

So what do you think?
Please comment.

Jan

------------------------
Jan Ischebeck e-Services
address@hidden




