Re: [PATCH v6] 9pfs: use GHashTable for fid table


From: Christian Schoenebeck
Subject: Re: [PATCH v6] 9pfs: use GHashTable for fid table
Date: Thu, 06 Oct 2022 18:12:38 +0200

On Wednesday, 5 October 2022 11:38:39 CEST Christian Schoenebeck wrote:
> On Tuesday, 4 October 2022 14:54:16 CEST Christian Schoenebeck wrote:
> > On Tuesday, 4 October 2022 12:41:21 CEST Linus Heckemann wrote:
> > > The previous implementation would iterate over the fid table for
> > > lookup operations, resulting in O(n) complexity in the number of
> > > open files and poor cache locality -- for every open, stat, read,
> > > write, etc. operation.
> > > 
> > > This change uses a hashtable for this instead, significantly improving
> > > the performance of the 9p filesystem. The runtime of NixOS's simple
> > > installer test, which copies ~122k files totalling ~1.8GiB from 9p,
> > > decreased by a factor of about 10.
> > > 
> > > Signed-off-by: Linus Heckemann <git@sphalerite.org>
> > > Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > Reviewed-by: Greg Kurz <groug@kaod.org>
> > > Message-Id: <20220908112353.289267-1-git@sphalerite.org>
> > > [CS: - Retain BUG_ON(f->clunked) in get_fid().
> > >      - Add TODO comment in clunk_fid(). ]
> > > 
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
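
For readers skimming the thread: the essence of the change described above is
to replace the per-request list walk with a hash table lookup. One natural
shape for that, sketched standalone below, is a GHashTable keyed directly on
the 32-bit fid; the identifiers here are illustrative, not necessarily the
ones used in the actual patch:

  /*
   * Standalone sketch, not the actual QEMU code: a GHashTable keyed
   * directly on the integer fid, so lookups are O(1) on average.
   * Build: gcc fid_sketch.c $(pkg-config --cflags --libs glib-2.0)
   */
  #include <glib.h>
  #include <stdint.h>
  #include <stdio.h>

  typedef struct Fid {
      int32_t fid;
      /* ... per-fid state (path, open flags, refcount, ...) ... */
  } Fid;

  /* NULL hash/equal functions make GHashTable hash the pointer value
   * itself, so a small integer key fits via GINT_TO_POINTER and needs
   * no separately allocated key storage. */
  static GHashTable *fid_table;

  static Fid *fid_lookup(int32_t fid)
  {
      return g_hash_table_lookup(fid_table, GINT_TO_POINTER(fid));
  }

  static void fid_insert(Fid *f)
  {
      g_hash_table_insert(fid_table, GINT_TO_POINTER(f->fid), f);
  }

  static void fid_clunk(int32_t fid)
  {
      /* removing the entry also runs the value destroy func (g_free) */
      g_hash_table_remove(fid_table, GINT_TO_POINTER(fid));
  }

  int main(void)
  {
      fid_table = g_hash_table_new_full(NULL, NULL, NULL, g_free);

      Fid *f = g_new0(Fid, 1);
      f->fid = 42;
      fid_insert(f);

      printf("lookup(42): %p\n", (void *)fid_lookup(42));
      fid_clunk(42);
      printf("after clunk: %p\n", (void *)fid_lookup(42));

      g_hash_table_destroy(fid_table);
      return 0;
  }

The real code of course layers the usual fid bookkeeping (refcounting, clunk
state, reclaim handling) on top of this; the point is just that the lookup
path becomes a single hash probe instead of a scan.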
> > 
> > In general: LGTM now, but I will definitely go for some longer test runs
> > before queuing this patch. Some minor side notes below ...
> 
> So I was running a compilation marathon on 9p as root fs last night. The
> first couple of hours went smoothly, but after about 12 hours 9p became
> unusable with the error:
> 
>   Too many open files
> 
> The question is: is this a new issue introduced by this patch, i.e. does
> it break the FD reclaim code? Or is it unrelated to this patch and a
> problem we already had?
> 
> Linus, could you look at this? It would probably make sense to force
> getting into this situation much earlier, e.g. like this:
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index aebadeaa03..0c104b81e1 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -4330,6 +4330,6 @@ static void __attribute__((__constructor__)) v9fs_set_fd_limit(void)
>          error_report("Failed to get the resource limit");
>          exit(1);
>      }
> -    open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur / 3);
> +    open_fd_hw = rlim.rlim_cur - MIN(50, rlim.rlim_cur / 3);
>      open_fd_rc = rlim.rlim_cur / 2;
>  }
> 
> I can't remember us having this issue before, so there might still be
> something wrong with this GHashTable patch here.

A much easier reproducer, requiring no source changes whatsoever:

  prlimit --nofile=140 -- qemu-system-x86_64 ...
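
For reference, with --nofile=140 the unmodified formula quoted above yields
roughly:

  open_fd_hw = 140 - MIN(400, 140 / 3) = 140 - 46 = 94
  open_fd_rc = 140 / 2                 = 70

so the reclaim thresholds are reached after fewer than a hundred host FDs,
which a guest workload hits almost immediately.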

And I actually get this error without this patch as well, which suggests
that we already had a bug in the FD reclaim code before? :/

Anyway, since this bug does not seem to have been introduced by this
particular patch, and with the unnecessary `goto` and `out:` label removed:

Queued on 9p.next:
https://github.com/cschoenebeck/qemu/commits/9p.next

Best regards,
Christian Schoenebeck
