
Re: [PATCH] 9pfs: Fix potential deadlock of QEMU mainloop


From: Christian Schoenebeck
Subject: Re: [PATCH] 9pfs: Fix potential deadlock of QEMU mainloop
Date: Thu, 07 May 2020 13:37:30 +0200

On Wednesday, 6 May 2020 19:49:10 CEST Greg Kurz wrote:
> > Ok, but why not both? Moving locks to worker thread and QemuMutex ->
> > CoMutex?
> Using CoMutex would be mandatory if we leave the locking where it sits
> today, so that the main thread can switch to other coroutines instead
> of blocking. We don't have the same requirement with the worker thread:
> it just needs to do the actual readdir() and then it goes back to the
> thread pool, waiting to be summoned again for some other work. 

Yes, I know.

> So I'd
> rather use standard mutexes to keep things simple... why would you
> want to use a CoMutex here ?

Like you said, it would not be mandatory, nor a big deal; the idea was just
that if a lock is held longer than expected, the worker thread could already
continue with another task. After all, the number of worker threads is
limited; they don't grow on demand, do they?

I also haven't reviewed QEMU's lock implementations in much detail, but IIRC
CoMutexes are handled entirely in user space, while QemuMutex uses regular
OS mutexes and hence might cost context switches.
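
To illustrate what I mean (a minimal sketch, not the actual 9pfs code; the
function names are made up): a contended CoMutex merely parks the calling
coroutine so the main loop can run other coroutines, while a contended
QemuMutex puts the calling OS thread to sleep -- tolerable for a pool
worker, but not for the main loop:

    /* Lock initialization via qemu_co_mutex_init()/qemu_mutex_init()
     * omitted for brevity. */
    #include "qemu/osdep.h"
    #include "qemu/coroutine.h"
    #include "qemu/thread.h"

    static CoMutex co_lock;    /* contention parks only the coroutine */
    static QemuMutex os_lock;  /* contention blocks the whole OS thread */

    static void coroutine_fn mainloop_coroutine_path(void)
    {
        qemu_co_mutex_lock(&co_lock);   /* main loop keeps running others */
        /* ... critical section ... */
        qemu_co_mutex_unlock(&co_lock);
    }

    static void worker_thread_path(void)
    {
        qemu_mutex_lock(&os_lock);      /* only this pool thread sleeps */
        /* ... blocking calls like readdir() are fine here ... */
        qemu_mutex_unlock(&os_lock);
    }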

> > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > > index 9e046f7acb51..ac84ae804496 100644
> > > --- a/hw/9pfs/9p.c
> > > +++ b/hw/9pfs/9p.c
> > > @@ -2170,7 +2170,7 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu,
> > >      int32_t count = 0;
> > >      struct stat stbuf;
> > >      off_t saved_dir_pos;
> > > -    struct dirent *dent;
> > > +    struct dirent dent;
> > > 
> > >      /* save the directory position */
> > >      saved_dir_pos = v9fs_co_telldir(pdu, fidp);
> > > 
> > > @@ -2181,13 +2181,11 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu,
> > >      while (1) {
> > >          v9fs_path_init(&path);
> > > 
> > > -        v9fs_readdir_lock(&fidp->fs.dir);
> > > -
> > 
> > That's the deadlock fix, but ...
> > 
> > >          err = v9fs_co_readdir(pdu, fidp, &dent);
> > > -        if (err || !dent) {
> > > +        if (err <= 0) {
> > >              break;
> > >          }
> > 
> > ... even though this code simplification might make sense, I don't think
> > it
> > should be mixed with the deadlock fix together in one patch. They are not
> 
> I could possibly split this in two patches, one for returning a copy
> and one for moving the locking around, but...
> 
> > related with each other, nor is the code simplification you are aiming
> > trivial
> ... this assertion is somewhat wrong: moving the locking to
> v9fs_co_readdir() really requires it returns a copy.

Yeah, I am also not sure whether a split would simplify things enough in
this case to be worth the hassle. If you find an acceptable way to split it,
good; if not, then leave it as one patch.
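
(To spell out the dependency: readdir() returns a pointer into storage owned
by the directory stream, which is only protected while the lock is held, so
once the lock moves into v9fs_co_readdir() the only thing that can safely
cross the unlock boundary is a copy. A self-contained POSIX sketch of that
pattern, with a plain pthread mutex standing in for v9fs_readdir_lock() and
illustrative names throughout:)

    #include <dirent.h>
    #include <pthread.h>

    static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Copy the next entry of 'dir' into 'out'.
     * Returns 1 on success, 0 at end of directory. */
    static int read_next_entry(DIR *dir, struct dirent *out)
    {
        struct dirent *entry;
        int ret = 0;

        pthread_mutex_lock(&dir_lock);
        entry = readdir(dir);  /* points into DIR-owned storage */
        if (entry) {
            *out = *entry;     /* copy while the lock still protects it */
            ret = 1;
        }
        pthread_mutex_unlock(&dir_lock);
        /* After unlock, a concurrent readdir() may overwrite *entry at
         * any time; only the copy in *out stays valid for the caller. */
        return ret;
    }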

> > enough to justify squashing. The deadlock fix should make it through the
> > stable branches, while the code simplification should not. So that's
> > better
> > off as a separate cleanup patch.
> 
> The issue has been there for such a long time without causing any
> trouble. It's not worth adding churn in stable for a bug that is
> impossible to hit with a regular Linux guest.

Who knows. There are also other clients out there. A potential deadlock is 
still a serious issue after all.

> > > @@ -32,13 +32,20 @@ int coroutine_fn v9fs_co_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
> > >              struct dirent *entry;
> > > 
> > >              errno = 0;
> > > +
> > > +            v9fs_readdir_lock(&fidp->fs.dir);
> > > +
> > >              entry = s->ops->readdir(&s->ctx, &fidp->fs);
> > >              if (!entry && errno) {
> > >                  err = -errno;
> > > +            } else if (entry) {
> > > +                memcpy(dent, entry, sizeof(*dent));
> > > +                err = 1;
> > 
> > I find using sizeof(*dent) a bit dangerous considering potential type
> > changes in the future. I would rather use sizeof(struct dirent). It is
> > also more human-friendly to read, IMO.
> 
> Hmm... I believe it's the opposite actually: with sizeof(*dent), memcpy
> will always copy the number of bytes that are expected to fit in *dent,
> no matter the type.

Yes, but what you intend is to flat-copy a structure, not pointers. So no
matter how the type changes, what you always semantically wanted here is

        copy(sizeof(struct dirent), nelements)

Right now nelements=1; in the future it might also be nelements>1. But what
you certainly never want here is

        copy(sizeof(void*), nelements)
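
A contrived example of the failure mode I mean, assuming a hypothetical
future change of the destination's type:

    #include <dirent.h>
    #include <string.h>

    /* Today: sizeof(*dent) == sizeof(struct dirent), both spellings agree. */
    void copy_entry(struct dirent *dent, const struct dirent *entry)
    {
        memcpy(dent, entry, sizeof(*dent));
    }

    /* Hypothetical change: the destination becomes a pointer-to-pointer. */
    void copy_entry_changed(struct dirent **dent, const struct dirent *entry)
    {
        /* BUG: sizeof(*dent) is now sizeof(struct dirent *), so only
         * pointer-sized bytes get copied -- and no compiler warning. */
        memcpy(*dent, entry, sizeof(*dent));

        /* Still copies what was semantically meant: */
        memcpy(*dent, entry, sizeof(struct dirent));
    }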

> But yes, since memcpy() doesn't do any type checking for us, I think
> I'll just turn this into:
> 
>                 *dent = *entry;

Ok
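
That is indeed the safest spelling: plain struct assignment copies
sizeof(struct dirent) bytes and, unlike memcpy(), the compiler rejects it
outright if the two sides ever stop being compatible types. A minimal
illustration:

    #include <dirent.h>

    void copy_entry(struct dirent *dent, const struct dirent *entry)
    {
        /* Compiler-checked: both sides must be struct dirent. */
        *dent = *entry;
    }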

Best regards,
Christian Schoenebeck




