Basic Info:
1. Issue: I got a "double free or corruption (out)" error. See the attached debug.log for details; it contains the backtrace of one of the virtual machines.
2. Reproduce: I currently can't describe how to reproduce this bug, because it occurred in my production environment, which includes some special components.
3. qemu version: I'm using qemu-6.0.1.
4. qemu command line in short (check the virtual machine log for the full details):
qemu
...
-device virtio-scsi-pci,iothread=iothread1,bus=pci.0,addr=0x5,id=xxx,num_queues=4 \
-device scsi-hd,drive=xxxx,lun=0,bootindex=1,serial=41fd1ad0-cdbd-4ab2-b603-3727c4b02c32 \
-drive file=iscsi://xxxxxxxxxxxxxxxxxxxc14/0,if=none,format=raw,id=xxx,file.watch_action=exit,file.initiator-name=xxxxxxxxx
The BackTrace log:
What I currently know from debug.log:
Frame 19 is where the assert is triggered; my guess is that the qiov is invalid at this point:
#19 0x0000558534b577cd in blk_aio_write_entry (opaque=0x7fad1401f880)
at ../qemu-6.0.1/block/block-backend.c:1476
1469 static void blk_aio_write_entry(void *opaque)
1470 {
1471 BlkAioEmAIOCB *acb = opaque;
1472 BlkRwCo *rwco = &acb->rwco;
1473 QEMUIOVector *qiov = rwco->iobuf;
1474
1475 assert(!qiov || qiov->size == acb->bytes);
1476 rwco->ret = blk_do_pwritev_part(rwco->blk, rwco->offset, acb->bytes,
1477 qiov, 0, rwco->flags);
Frame 6 is where it finally results in the "double free or corruption" error:
#6 0x00007fad27876cfc in malloc_printerr (str=str@entry=0x7fad279b47b0 "double free or corruption (out)")
at ./malloc/malloc.c:5664
So I think the bug is triggered by line iscsi.c:667:
#9 0x0000558534bc53d8 in iscsi_co_writev
(bs=<optimized out>, sector_num=<optimized out>, nb_sectors=<optimized out>, iov=0x7fad14046610,
flags=<optimized out>) at ../qemu-6.0.1/block/iscsi.c:667
666 if (iTask.task != NULL) {
667 scsi_free_scsi_task(iTask.task); <---- maybe a wild pointer enters this function and gets freed a second time?
668 iTask.task = NULL;
669 }
VirtualMachine Logs: The error messages in the virtual machine logs (there are two virtual machine logs)
The log of the first virtual machine, 1ade583e-4807-448b-9b3d-e3276f797c14.log, has the messages below:
2024-01-16T02:03:11.897044Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 12 ms): BUSY
2024-01-16T02:03:11.899431Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 11 ms): BUSY
2024-01-16T02:03:13.903009Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #2 in 70 ms): BUSY
qemu-system-x86_64: ../qemu-6.0.1/block/block-backend.c:1475: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
2024-01-16 02:03:15.078+0000: shutting down, reason=crashed
The log of the other virtual machine, ee31a1f1-cd04-42cb-882a-232d045bcba9.log, has:
2024-01-16T02:03:12.590381Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 2 ms): BUSY
2024-01-16T02:03:14.248267Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 3 ms): BUSY
2024-01-16T02:03:14.596367Z qemu-system-x86_64: iSCSI Busy/TaskSetFull/TimeOut (retry #2 in 8 ms): BUSY
double free or corruption (out)
2024-01-16 02:03:15.226+0000: shutting down, reason=crashed
I'm Trying to Test:
I checked block/iscsi.c and found that there are many calls to "scsi_free_scsi_task(task);".
Some of them manually set the task pointer to NULL afterwards and some do not. Why is that? Is it a leak?
Is it possible for the task to become a wild pointer when it is not manually set to NULL (task = NULL;) after being freed?
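To make the question concrete, here is a minimal sketch of the free-then-NULL pattern I mean. It is not QEMU or libiscsi code: demo_task, demo_request and demo_request_release_task are hypothetical names made up for illustration, and plain free() stands in for scsi_free_scsi_task().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scsi_task; NOT the libiscsi type. */
struct demo_task {
    int status;
};

/* Hypothetical stand-in for the per-request state that owns the task. */
struct demo_request {
    struct demo_task *task;
    int free_count; /* how many times the task was actually released */
};

/* Allocate a zeroed task; returns NULL on allocation failure. */
static struct demo_task *demo_task_new(void)
{
    return calloc(1, sizeof(struct demo_task));
}

/*
 * Release the task at most once and clear the owning pointer.
 * A second call sees task == NULL and does nothing, so the same
 * variable can never be passed to free() twice.
 */
static void demo_request_release_task(struct demo_request *req)
{
    assert(req != NULL);
    if (req->task != NULL) {
        free(req->task);   /* stand-in for scsi_free_scsi_task() */
        req->task = NULL;  /* without this line a second call would double free */
        req->free_count++;
    }
}
```

As far as I understand, clearing the pointer only protects against a second free through that same variable. If another copy of the pointer is kept somewhere else, that copy still dangles after the free, so this pattern alone would not rule out the corruption I'm seeing.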
I'm trying to test that out. Any suggestions for quickly fixing this bug are welcome. Thanks in advance.