Re: [Qemu-devel] [PATCH RFC v3 4/7] migration: fix the compression code
From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH RFC v3 4/7] migration: fix the compression code
Date: Thu, 20 Sep 2018 13:33:09 +0800
User-agent: Mutt/1.10.1 (2018-07-13)
On Thu, Sep 20, 2018 at 01:06:21PM +0800, Fei Li wrote:
>
>
> On 09/20/2018 12:31 PM, Peter Xu wrote:
> > On Wed, Sep 19, 2018 at 09:35:20PM +0800, Fei Li wrote:
> > > Add a check in compress_threads_save_cleanup() to verify whether the
> > > static CompressParam *comp_param has been allocated. If not, just
> > > return; otherwise a segmentation fault will occur when
> > > terminate_compression_threads() dereferences the NULL comp_param.
> > > One test case that can reproduce this error is: set compression on
> > > and migrate to a wrong, nonexistent host IP address.
> > >
> > > Add a check before touching comp_param[idx]'s quit and cond in
> > > terminate_compression_threads(), in case they are not initialized.
> > > Otherwise "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
> > > will occur. One test case that can reproduce this error is: set
> > > compression on and fail to fully set up the eight compression threads
> > > in compress_threads_save_setup().
> > >
> > > Signed-off-by: Fei Li <address@hidden>
> > > ---
> > > migration/ram.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 79c89425a3..522a5550b4 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -427,6 +427,9 @@ static inline void terminate_compression_threads(void)
> > >      thread_count = migrate_compress_threads();
> > >      for (idx = 0; idx < thread_count; idx++) {
> > > +        if (!comp_param[idx].mutex.initialized) {
> > > +            break;
> > > +        }
> > Instead of accessing the mutex internal variable "initialized", how
> > about we just squash the terminate_compression_threads() into existing
> > compress_threads_save_cleanup()? Note that we have this in
> > compress_threads_save_cleanup() already:
> >
> >     for (i = 0; i < thread_count; i++) {
> >         /*
> >          * we use it as an indicator which shows if the thread is
> >          * properly init'd or not
> >          */
> >         if (!comp_param[i].file) {
> >             break;
> >         }
> >         qemu_thread_join(compress_threads + i);
> >         qemu_mutex_destroy(&comp_param[i].mutex);
> >         qemu_cond_destroy(&comp_param[i].cond);
> >         deflateEnd(&comp_param[i].stream);
> >         g_free(comp_param[i].originbuf);
> >         qemu_fclose(comp_param[i].file);
> >         comp_param[i].file = NULL;
> >     }
> >
> > So we already try to detect this using the comp_param[i].file, which
> > IMO is better (the exposure of the mutex.init var is just "an
> > accident" - logically we could hide that from mutex impl).
> Actually, I first used comp_param[i].file as the judging condition, but
> I was not sure whether comp_param[i].file also needs to be protected
> by the mutex lock..
> And I have no objection to the squash.
It should be fine to just check that it is non-NULL. This is slightly
tricky, so we have a comment there that clarifies it a bit.
> >
> > >          qemu_mutex_lock(&comp_param[idx].mutex);
> > >          comp_param[idx].quit = true;
> > >          qemu_cond_signal(&comp_param[idx].cond);
> > > @@ -438,7 +441,7 @@ static void compress_threads_save_cleanup(void)
> > >  {
> > >      int i, thread_count;
> > >
> > > -    if (!migrate_use_compression()) {
> > > +    if (!migrate_use_compression() || !comp_param) {
> > This should be a valid fix for a crash (since we might call the
> > cleanup code even without setup when connect() never worked, so we'd
> > better handle it well).
> >
> > Considering that this seems to fix a migration crash on source, I'm
> > CCing Dave and Juan in case this (or a new version) could even be a
> > good candidate as part of a migration pull.
> >
> > Thanks,
> Thanks for helping cc. ;)
No problem. If you're going to post another version, feel free to CC
Dave and Juan on this patch, and you can post it as a standalone one
since it is not directly related to the series and it fixes an even
more severe issue on the migration side (the source VM will lose data
if this is triggered; and it triggers easily).
Regards,
--
Peter Xu