qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] migration/multifd: close TLS channel before socket finali


From: Zheng Chuan
Subject: Re: [PATCH v2] migration/multifd: close TLS channel before socket finalize
Date: Tue, 10 Nov 2020 19:56:40 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0


On 2020/11/10 19:01, Daniel P. Berrangé wrote:
> On Tue, Nov 10, 2020 at 06:45:45PM +0800, Zheng Chuan wrote:
>>
>>
>> On 2020/11/10 18:12, Daniel P. Berrangé wrote:
>>> On Fri, Nov 06, 2020 at 06:54:54PM +0800, Chuan Zheng wrote:
>>>> Since we now support tls multifd, when we cancel migration, the TLS
>>>> sockets will be left as CLOSE-WAIT On Src which results in socket
>>>> leak.
>>>> Fix it by closing TLS channel before socket finalize.
>>>>
>>>> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
>>>> ---
>>>>  migration/multifd.c | 14 ++++++++++++++
>>>>  1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/migration/multifd.c b/migration/multifd.c
>>>> index 68b171f..a6838dc 100644
>>>> --- a/migration/multifd.c
>>>> +++ b/migration/multifd.c
>>>> @@ -523,6 +523,19 @@ static void multifd_send_terminate_threads(Error *err)
>>>>      }
>>>>  }
>>>>  
>>>> +static void multifd_tls_socket_close(QIOChannel *ioc, Error *err)
>>>> +{
>>>> +    if (ioc &&
>>>> +        object_dynamic_cast(OBJECT(ioc),
>>>> +                            TYPE_QIO_CHANNEL_TLS)) {
>>>> +        /*
>>>> +         * TLS channel is special, we need close it before
>>>> +         * socket finalize.
>>>> +         */
>>>> +        qio_channel_close(ioc, &err);
>>>> +    }
>>>> +}
>>>
>>> This doesn't feel quite right to me.  Calling qio_channel_close will close
>>> both the TLS layer, and the underlying QIOChannelSocket. If the latter
>>> is safe to do, then we don't need the object_dynamic_cast() check, we can
>>> do it unconditionally whether we're using TLS or not.
>>>
>>> Having said that, I'm not sure if we actually want to be using
>>> qio_channel_close or not ?
>>>
>>> I would have expected that there is already code somewhere else in the
>>> migration layer that is closing these multifd channels, but I can't
>>> actually find where that happens right now.  Assuming that code does
>>> exist though, qio_channel_shutdown(ioc, BOTH) feels like the right
>>> answer to unblock waiting I/O ops.
>>>
>> Hi, Daniel.
>> Actually, I have tried to use qio_channel_shutdown at the same place,
>> but it seems not work right.
>> the socket connection is closed by observing through 'ss' command but
>> the socket fds in /proc/$(qemu pid)/fd are still residual.
>>
>> The underlying QIOChannelSocket will be closed by
>> qio_channel_socket_finalize() through object_unref(QIOChannel) later
>> in socket_send_channel_destroy(),
>> does that means it is safe to close both of TLS and tcp socket?
> 
> Hmm, that makes me even more confused, because the object_unref
> should be calling qio_channel_close() already.
> 
> eg with your patch we have:
> 
>        multifd_tls_socket_close(p->c, NULL);
>            -> qio_channel_close(p->c)
>               -> qio_channel_tls_close(p->c)
>                      -> qio_channel_close(master)
> 
>        socket_send_channel_destroy(p->c)
>            -> object_unref(p->c)
>                -> qio_channel_tls_socket_finalize(p->c)
>                     -> object_unref(master)
>                             -> qio_channel_close(master)
> 
> so the multifd_tls_socket_close should not be doing anything
> at all *assuming* we releasing the last reference in our
> object_unref call.
> 
> Given what you describe, I think we are *not* releasing the
> last reference. There is an active reference being held
> somewhere else, and that is preventing the QIOChannelSocket
> from being freed and thus the socket remains open.
> 
> If that extra active reference is a bug, then we have a memory
> leak of the QIOChannelSocket object, that needs fixing somewhere.
> 
> If that extra active reference is intentional, then we do indeed
> need to explicitly close the socket. That is possibly better
> handled by putting a qio_channel_close() call into the
> socket_send_channel_destroy() method.
> 
> I wonder if we're leaking a reference to the underlying QIOChannelSocket,
> when we create the QIOChannelTLS wrapper ? That could explain a problem
> that only happens when using TLS.
> 
Aha, you are right!
The QIOChannelSocket is added by an extra reference.

Thread 1 "qemu-system-aar" hit Breakpoint 1, socket_send_channel_destroy (
    send=0xaaaabea527f0) at migration/socket.c:44
44      migration/socket.c: No such file or directory.
(gdb) p *((QIOChannelTLS)*send)->master
$5 = {parent = {class = 0xaaaabc690c90, free = 0xffff9bd16c40 <g_free>,
    properties = 0xffff61a04de0, ref = 2, parent = 0x0}, features = 2, name = 
0x0,
  ctx = 0x0, read_coroutine = 0x0, write_coroutine = 0x0}
(gdb) p (QIOChannelTLS)*send
$6 = {parent = {parent = {class = 0xaaaabc6430c0, free = 0xffff9bd16c40 
<g_free>,
      properties = 0xffff61a04f00, ref = 1, parent = 0x0}, features = 2,
    name = 0xaaaabdd81290 "multifd-tls-outgoing", ctx = 0x0, read_coroutine = 
0x0,
    write_coroutine = 0x0}, master = 0xaaaabe350e90, session = 0xaaaabcf1ead0, 
shutdown = 0}
(gdb) p *((QIOChannelTLS)*send)->master
$7 = {parent = {class = 0xaaaabc690c90, free = 0xffff9bd16c40 <g_free>,
    properties = 0xffff61a04de0, ref = 2, parent = 0x0}, features = 2, name = 
0x0,
  ctx = 0x0, read_coroutine = 0x0, write_coroutine = 0x0}

I'll make it further to see where we do this thing...

> Regards,
> Daniel
> 

-- 
Regards.
Chuan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]