[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmenta
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault |
Date: |
Thu, 10 Jan 2019 17:06:16 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) |
Fei Li <address@hidden> writes:
> 在 2019/1/10 下午5:20, Markus Armbruster 写道:
>> fei <address@hidden> writes:
>>
>>>> 在 2019年1月9日,23:24,Markus Armbruster <address@hidden> 写道:
>>>>
>>>> Fei Li <address@hidden> writes:
>>>>
>>>>>> 在 2019/1/9 上午1:29, Markus Armbruster 写道:
>>>>>> fei <address@hidden> writes:
>>>>>>
>>>>>>>> 在 2019年1月8日,01:55,Markus Armbruster <address@hidden> 写道:
>>>>>>>>
>>>>>>>> Fei Li <address@hidden> writes:
>>>>>>>>
>>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>>
>>>>>>>>> Cc: Stefan Weil <address@hidden>
>>>>>>>>> Signed-off-by: Fei Li <address@hidden>
>>>>>>>>> Reviewed-by: Fam Zheng <address@hidden>
>>>>>>>>> ---
>>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>> int err;
>>>>>>>>> void *ret;
>>>>>>>>>
>>>>>>>>> + if (!thread->thread) {
>>>>>>>>> + return NULL;
>>>>>>>>> + }
>>>>>>>> How can this happen?
>>>>>>> I think I have answered this earlier, please check the following link
>>>>>>> to see whether it helps:
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>>> Thanks for the pointer. Unfortunately, I don't understand your
>>>>>> explanation. You also wrote there "I will remove this patch in next
>>>>>> version"; looks like you've since changed your mind.
>>>>> Emm, issues left over from history.. The background is I was hurry to
>>>>> make those five
>>>>> Reviewed-by patches be merged, including this v9 16/16 patch but not
>>>>> the real
>>>>> qemu_thread_create() modification. But actually this patch is to fix
>>>>> the segmentation
>>>>> fault after we modified qemu_thread_create() related functions
>>>>> although it has got a
>>>>> Reviewed-by earlier. :) Thus to not make troube, I wrote the
>>>>> "remove..." sentence
>>>>> to separate it from those 5 Reviewed-by patches, and were plan to send
>>>>> only four patches.
>>>>> But later I got a message that these five patches are not that urgent
>>>>> to catch qemu v3.1,
>>>>> thus I joined the earlier 5 R-b patches into the later v8 & v9 to have
>>>>> a better review.
>>>>>
>>>>> Sorry for the trouble, I need to explain it without involving too much
>>>>> background..
>>>>>
>>>>> Back at the farm: in our current qemu code, some cleanups use a loop
>>>>> to join()
>>>>> the total number of threads if caller fails. This is not a problem
>>>>> until applying the
>>>>> qemu_thread_create() modification. E.g. when compress_threads_save_setup()
>>>>> fails while trying to create the last do_data_compress thread,
>>>>> segmentation fault
>>>>> will occur when join() is called (sadly there's not enough condition
>>>>> to filter this
>>>>> unsuccessful created thread) as this thread is actually not be created.
>>>>>
>>>>> Hope the above makes it clear. :)
>>>> Alright, let's have a look at compress_threads_save_setup():
>>>>
>>>> static int compress_threads_save_setup(void)
>>>> {
>>>> int i, thread_count;
>>>>
>>>> if (!migrate_use_compression()) {
>>>> return 0;
>>>> }
>>>> thread_count = migrate_compress_threads();
>>>> compress_threads = g_new0(QemuThread, thread_count);
>>>> comp_param = g_new0(CompressParam, thread_count);
>>>> qemu_cond_init(&comp_done_cond);
>>>> qemu_mutex_init(&comp_done_lock);
>>>> for (i = 0; i < thread_count; i++) {
>>>> comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>> if (!comp_param[i].originbuf) {
>>>> goto exit;
>>>> }
>>>>
>>>> if (deflateInit(&comp_param[i].stream,
>>>> migrate_compress_level()) != Z_OK) {
>>>> g_free(comp_param[i].originbuf);
>>>> goto exit;
>>>> }
>>>>
>>>> /* comp_param[i].file is just used as a dummy buffer to save
>>>> data,
>>>> * set its ops to empty.
>>>> */
>>>> comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>> comp_param[i].done = true;
>>>> comp_param[i].quit = false;
>>>> qemu_mutex_init(&comp_param[i].mutex);
>>>> qemu_cond_init(&comp_param[i].cond);
>>>> qemu_thread_create(compress_threads + i, "compress",
>>>> do_data_compress, comp_param + i,
>>>> QEMU_THREAD_JOINABLE);
>>>> }
>>>> return 0;
>>>>
>>>> exit:
>>>> compress_threads_save_cleanup();
>>>> return -1;
>>>> }
>>>>
>>>> At label exit, we have @i threads, all fully initialized. That's an
>>>> invariant.
>>>>
>>>> compress_threads_save_cleanup() finds the threads to clean up by
>>>> checking comp_param[i].file:
>>>>
>>>> static void compress_threads_save_cleanup(void)
>>>> {
>>>> int i, thread_count;
>>>>
>>>> if (!migrate_use_compression() || !comp_param) {
>>>> return;
>>>> }
>>>>
>>>> thread_count = migrate_compress_threads();
>>>> for (i = 0; i < thread_count; i++) {
>>>> /*
>>>> * we use it as a indicator which shows if the thread is
>>>> * properly init'd or not
>>>> */
>>>> ---> if (!comp_param[i].file) {
>>>> ---> break;
>>>> ---> }
>>>>
>>>> qemu_mutex_lock(&comp_param[i].mutex);
>>>> comp_param[i].quit = true;
>>>> qemu_cond_signal(&comp_param[i].cond);
>>>> qemu_mutex_unlock(&comp_param[i].mutex);
>>>>
>>>> qemu_thread_join(compress_threads + i);
>>>> qemu_mutex_destroy(&comp_param[i].mutex);
>>>> qemu_cond_destroy(&comp_param[i].cond);
>>>> deflateEnd(&comp_param[i].stream);
>>>> g_free(comp_param[i].originbuf);
>>>> qemu_fclose(comp_param[i].file);
>>>> comp_param[i].file = NULL;
>>>> }
>>>> qemu_mutex_destroy(&comp_done_lock);
>>>> qemu_cond_destroy(&comp_done_cond);
>>>> g_free(compress_threads);
>>>> g_free(comp_param);
>>>> compress_threads = NULL;
>>>> comp_param = NULL;
>>>> }
>>>>
>>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>>> *any* cleanup.
>>>>
>>>> To maintain the invariant, compress_threads_save_setup() carefully
>>>> cleans up any partial initializations itself before a goto exit. Since
>>>> the code is arranged smartly, the only such cleanup is the
>>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>>
>>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>>> initializations. Breaks the invariant.
>>>>
>>>> I see two sane solutions:
>>>>
>>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>> initializations itself. compress_threads_save_cleanup() copes only
>>>> with fully initialized comp_param[i]. This is how things work before
>>>> your series.
>>>>
>>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>> comp_param[i], i.e. does the right thing for each goto exit in
>>>> compress_threads_save_setup(). compress_threads_save_setup() doesn't
>>>> clean up partial initializations.
>>>>
>>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>>
>>>> 3. A confusing mix of the two.
>>>>
>>>> Don't.
>>> Thanks for the detail analysis! :)
>>> Emm.. Actually I have thought to do the cleanup in the setup() function for
>>> the third ‘goto exit’ [1], which is a partial initialization.
>>> But due to the below [1] is too long and seems not neat (I notice that most
>>> cleanups for each thread are in the xxx_cleanup() function), I turned to
>>> modify the join() function..
>>> Is the long [1] acceptable when the third ‘goto exit’ is called, or is
>>> there any other better way to do the cleanup?
>>>
>>> [1]
>>> qemu_mutex_lock(&comp_param[i].mutex);
>>> comp_param[i].quit = true;
>>> qemu_cond_signal(&comp_param[i].cond);
>>> qemu_mutex_unlock(&comp_param[i].mutex);
>>>
>>> qemu_mutex_destroy(&comp_param[i].mutex);
>>> qemu_cond_destroy(&comp_param[i].cond);
>>> deflateEnd(&comp_param[i].stream);
>>> g_free(comp_param[i].originbuf);
>>> qemu_fclose(comp_param[i].file);
>>> comp_param[i].file = NULL;
>> Have you considered creating the thread earlier, e.g. right after
>> initializing compression with deflateInit()?
> I am afraid we can not do this, as the members of comp_param[i], like
> file/done/quit/mutex/cond
> will be used later in the new created thread: do_data_[de]compress via
> qemu_thread_create().
You're right.
> Thus it seems we have to accept the above long [1] if we do want to
> clean up partial initialization
> in xxx_setup(). :(
>
> BTW, there is no other argument can be used except the
> "(compress_threads+i)->thread" to
> differentiate whether should we join() the thread, just in case we
> want to change the
> xxx_cleanup() function.
We can try to make compress_threads_save_cleanup() cope with partially
initialized comp_param[i]. Let's have a look at its members:
bool done; // no cleanup
bool quit; // see [2]
bool zero_page; // no cleanup
QEMUFile *file; // qemu_fclose() if non-null
QemuMutex mutex; // see [1]
QemuCond cond; // see [1]
RAMBlock *block; // no cleanup (must be null)
ram_addr_t offset; // no cleanup
/* internally used fields */
z_stream stream; // see [3]
uint8_t *originbuf; // unconditional g_free()
[1]: we could do something like
if (comp_param[i].mutex.initialized) {
qemu_mutex_destroy(&comp_param[i].mutex);
}
if (comp_param[i].cond.initialized) {
qemu_cond_destroy(&comp_param[i].cond);
}
but that would be unclean. Instead, I'd initialize these guys first, so
we can clean them up unconditionally.
[2] This is used to make the thread terminate. Must be done before we
call qemu_thread_join(). I think it can safely be done always, as long
as long as .mutex and .cond are initialized. Trivial if we initialize
them first.
[3]: I can't see a squeaky clean way to detect whether .stream has been
initialized with deflateInit(). Here's a slightly unclean way:
deflateInit() sets .stream.msg to null on success, and to non-null on
failure. We can make it non-null until we're ready to call
deflateInit(), then have compress_threads_save_cleanup() clean up
.stream when it's null. If that's too unclean for your or your
reviewers' taste, add a boolean @stream_initialized flag.
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Markus Armbruster, 2019/01/07
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, fei, 2019/01/08
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Markus Armbruster, 2019/01/08
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Fei Li, 2019/01/09
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Markus Armbruster, 2019/01/09
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, fei, 2019/01/09
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Markus Armbruster, 2019/01/10
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Fei Li, 2019/01/10
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault,
Markus Armbruster <=
- Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault, Fei Li, 2019/01/11