Re: [Chicken-users] using mmap files as strings?

From: Alan Post
Subject: Re: [Chicken-users] using mmap files as strings?
Date: Wed, 27 Oct 2010 09:23:18 -0600

On Wed, Oct 27, 2010 at 02:02:15PM +0200, Jörg F. Wittenberger wrote:
> On Thursday, 21.10.2010, at 15:01 -0600, Alan Post wrote:
> > So far so good, that is what I would expect.  I'd like to work with
> > an mmap buffer like a string.  Is it possible to create an object
> > that will treat the mmap area as a string that I can run regular
> > string operations on without copying the mmap buffer?  I'm
> > specifically interested in running regular expressions across the
> > mmap space.
> Among other reasons this is one why I've been contemplating how one
> could intercept chicken's string handling.
> Another application would be shared substrings.  Or the combination of
> both.  Example: feed a file content to a port, formatted as HTTP chunked
> encoding.  A shared substring pointing right into the mmaped file could
> save all copying.  The expense would be one object allocation holding
> #{pointer, start, end}.
> However this would somehow have to overwrite the basic string handling.
> I have not yet tried that, but at least the utf8 egg hints that it must
> be possible to do so.

In a Lisp system I worked on a few years ago, I had both a string
type and a "static" type.  Both acted like strings, but the static
type held a pointer+len to memory outside the memory pool, rather
than allocating the string data with the object.  I believe the
static object was also read-only, for my own simplicity.
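
A minimal C sketch of that pointer+len idea, assuming nothing about
Chicken's actual object layout.  Every name here (static_str,
static_str_wrap, and so on) is hypothetical and only illustrates a
read-only view over memory the allocator does not own, such as an
mmap region:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "static" string: a read-only view of external memory.
 * The bytes are not owned by this object and are never copied. */
typedef struct {
    const char *ptr;  /* start of the external bytes (e.g. an mmap region) */
    size_t len;       /* number of bytes visible through this view */
} static_str;

/* Wrap an existing buffer without copying it. */
static static_str static_str_wrap(const void *base, size_t len)
{
    static_str s = { (const char *)base, len };
    return s;
}

/* Bounds-checked read of one byte; there is deliberately no setter. */
static int static_str_ref(static_str s, size_t i)
{
    assert(i < s.len);
    return (unsigned char)s.ptr[i];
}

/* Compare the view against a NUL-terminated literal. */
static int static_str_equal_cstr(static_str s, const char *lit)
{
    size_t n = strlen(lit);
    return s.len == n && memcmp(s.ptr, lit, n) == 0;
}

/* Index of the first occurrence of c inside the view, or -1. */
static ptrdiff_t static_str_find(static_str s, char c)
{
    const char *p = memchr(s.ptr, c, s.len);
    return p ? p - s.ptr : -1;
}
```

The view never assumes a terminating NUL, so it can cover any slice
of a mapped file; the cost is that plain C string functions cannot
be called on it directly.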

I had also wanted to implement substrings, but the problem I ran
into was that I might have a pointer *only* to a substring when the
GC was called, and I had no way to reach the full string and make
sure it was properly copied.  I wasn't using your suggestion above
of having a pointer, start, and end, though I'm not sure why.
Probably because I needed things to be just so.  :-p

I had been thinking about this feature for the core system, mostly
because I'd like to try my hand at working on the C code in Chicken.
That isn't a promise of anything; I'm still working on my first egg
here!

In that egg, I do need to store substrings, and I am doing so with a
pointer to the string and an index.  I rarely need to know the full
length of the string, so I compute it from the string object when I
need it.  This allowed me to avoid a whole bunch of string copying
with little or no overhead.  It is conceptually cleaner to work with
substrings, but the effect and performance should be about the same.
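
Roughly, the string-plus-index arrangement looks like the C sketch
below.  It is an illustration under assumptions, not the egg's code:
str_obj stands in for a string object that knows its own length, and
str_tail and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object that records its length. */
struct str_obj {
    size_t len;
    const char *data;
};

/* A substring reference: the whole parent string plus a start index.
 * No bytes are copied and no length is stored. */
struct str_tail {
    const struct str_obj *parent;
    size_t start;
};

/* Refer to the part of s beginning at index start. */
static struct str_tail tail_at(const struct str_obj *s, size_t start)
{
    struct str_tail t = { s, start };
    assert(start <= s->len);
    return t;
}

/* The remaining length is derived from the parent only when asked for. */
static size_t tail_len(struct str_tail t)
{
    return t.parent->len - t.start;
}

/* Pointer to the first byte of the referenced part. */
static const char *tail_data(struct str_tail t)
{
    return t.parent->data + t.start;
}

/* Example operation that works directly on the shared bytes. */
static int tail_starts_with(struct str_tail t, const char *prefix)
{
    size_t n = strlen(prefix);
    return n <= tail_len(t) && memcmp(tail_data(t), prefix, n) == 0;
}
```

Because the reference keeps the parent object itself rather than a
bare pointer into its interior, a collector that moves the parent
would, in principle, still be able to find and update it.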

.i ko djuno fi le do sevzi
