[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Verita

From: Ketan Nilangekar
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
Date: Fri, 18 Nov 2016 10:34:54 +0000
User-agent: Microsoft-MacOutlook/

On 11/18/16, 12:56 PM, "Jeff Cody" <address@hidden> wrote:

>On Wed, Nov 16, 2016 at 08:12:41AM +0000, Stefan Hajnoczi wrote:
>> On Tue, Nov 15, 2016 at 10:38 PM, ashish mittal <address@hidden> wrote:
>> > On Wed, Sep 28, 2016 at 2:45 PM, Stefan Hajnoczi <address@hidden> wrote:
>> >> On Tue, Sep 27, 2016 at 09:09:49PM -0700, Ashish Mittal wrote:
>> >> 5.
>> >> I don't see any endianness handling or portable alignment of struct
>> >> fields in the network protocol code.  Binary network protocols need to
>> >> take care of these issue for portability.  This means libqnio compiled
>> >> for different architectures will not work.  Do you plan to support any
>> >> other architectures besides x86?
>> >>
>> >
>> > No, we support only x86 and do not plan to support any other arch.
>> > Please let me know if this necessitates any changes to the configure
>> > script.
>> I think no change to ./configure is necessary.  The library will only
>> ship on x86 so other platforms will never attempt to compile the code.
>> >> 6.
>> >> The networking code doesn't look robust: kvset uses assert() on input
>> >> from the network so the other side of the connection could cause SIGABRT
>> >> (coredump), the client uses the msg pointer as the cookie for the
>> >> response packet so the server can easily crash the client by sending a
>> >> bogus cookie value, etc.  Even on the client side these things are
>> >> troublesome but on a server they are guaranteed security issues.  I
>> >> didn't look into it deeply.  Please audit the code.
>> >>
>> >
>> > By design, our solution on OpenStack platform uses a closed set of
>> > nodes communicating on dedicated networks. VxHS servers on all the
>> > nodes are on a dedicated network. Clients (qemu) connects to these
>> > only after reading the server IP from the XML (read by libvirt). The
>> > XML cannot be modified without proper access. Therefore, IMO this
>> > problem would be  relevant only if someone were to use qnio as a
>> > generic mode of communication/data transfer, but for our use-case, we
>> > will not run into this problem. Is this explanation acceptable?
>> No.  The trust model is that the guest is untrusted and in the worst
>> case may gain code execution in QEMU due to security bugs.
>> You are assuming block/vxhs.c and libqnio are trusted but that
>> assumption violates the trust model.
>> In other words:
>> 1. Guest exploits a security hole inside QEMU and gains code execution
>> on the host.
>> 2. Guest uses VxHS client file descriptor on host to send a malicious
>> packet to VxHS server.
>> 3. VxHS server is compromised by guest.
>> 4. Compromised VxHS server sends malicious packets to all other
>> connected clients.
>> 5. All clients have been compromised.
>> This means both the VxHS client and server must be robust.  They have
>> to validate inputs to avoid buffer overflows, assertion failures,
>> infinite loops, etc.
>> Stefan
>The libqnio code is important with respect to the VxHS driver.  It is a bit
>different than other existing external protocol drivers, in that the current
>user and developer base is small, and the code itself is pretty new.  So I
>think for the VxHS driver here upstream, we really do need to get some of
>the libqnio issues squared away.  I don't know if we've ever explicitly
>address the extent to which libqnio issues affect the driver
>merging, so I figure it is probably worth discussing here.
>To try and consolidate libqnio discussion, here is what I think I've read /
>seen from others as the major issues that should be addressed in libqnio:
>* Code auditing, static analysis, and general code cleanup.  Things like
>  memory leaks shouldn't be happening, and some prior libqnio compiler
>  warnings imply that there is more code analysis that should be done with
>  libqnio.
>  (With regards to memory leaks:  Valgrind may be useful to track these down:
>    # valgrind  ./qemu-io -c 'write -pP 0xae 66000 128k' \
>            vxhs://localhost/test.raw
>    ==30369== LEAK SUMMARY:
>    ==30369==    definitely lost: 4,168 bytes in 2 blocks
>    ==30369==    indirectly lost: 1,207,720 bytes in 58,085 blocks) 

We have done and are doing exhaustive memory leak tests using valgrind. Memory 
leaks within qnio have been addressed to some extent. We will post detailed 
valgrind results to this thread.

>* Potential security issues such as buffer overruns, input validation, etc., 
>  need to be audited.

We have known a few such issues from previous comments and have addressed some 
of those. If there are any important outstanding ones, please let us know and 
we will fix them on priority.

>* Async operations need to be truly asynchronous, without blocking calls.

There is only one blocking call in libqnio reconnect which we has been pointed 
out. We will fix this soon.

>* Daniel pointed out that there is no authentication method for taking to a
>  remote server.  This seems a bit scary.  Maybe all that is needed here is
>  some clarification of the security scheme for authentication?  My
>  impression from above is that you are relying on the networks being
>  private to provide some sort of implicit authentication, though, and this
>  seems fragile (and doesn't protect against a compromised guest or other
>  process on the server, for one).

Our auth scheme is based on network isolation at L2/L3 level. If there is a 
simplified authentication mechanism which we can implement without imposing 
significant penalties on IO performance, please let us know and we will 
implement that if feasible.

>(if I've missed anything, please add it here!)

reply via email to

[Prev in Thread] Current Thread [Next in Thread]