qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hw/ide: Remove status register read side effect


From: jasper.lowell
Subject: Re: [PATCH] hw/ide: Remove status register read side effect
Date: Tue, 25 Feb 2020 03:55:48 +0000

> > ide_exec_cmd 0.461 pid=147030 bus=0x55b77f922d10
> > state=0x55b77f922d98 cmd=0xef

> The command is run here if I'm not mistaken Does this set the irq
> right 
> away on QEMU where on real hadware this may take some time? Not sure
> if 
> that's a problem but trying to understand what's happening.

Yes. QEMU raises an IRQ on the completion of the command.

complete = ide_cmd_table[val].handler(s, val);
if (complete) {
        s->status &= ~BUSY_STAT;
        assert(!!s->error == !!(s->status & ERR_STAT));

        if ((ide_cmd_table[val].flags & SET_DSC) && !s->error) {
                s->status |= SEEK_STAT;
        }

        ide_cmd_done(s);
        ide_set_irq(s->bus);
}

This code from /qemu/hw/ide/core.c is executed when the SET_FEATURE
command request is made. I have tested that if this interrupt is not
made, Solaris 10 will complain accordingly with a unique error message.

I don't believe the quick interrupt here is the problem. Solaris 10
will spin for a short time while waiting for the interrupt bit to be
set before continuing with its routine. If it doesn't see the interrupt
bit is set before some timeout, it will print an error about the
missing interrupt and give up loading the driver.

> > pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > ide_ioport_read 35.577 pid=147030 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55b77f922d10 s=0x55b77f922d98
> > ide_ioport_read 29.095 pid=147030 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55b77f922d10 s=0x55b77f922d98
> 
> So these ide_ioport_read calls clear the irq bit...

That's right. The line that I proposed removing in the patch clears
CFG_INTR_CH0 on ide_ioport_read.

> > ide_ioport_write 19.146 pid=147030 addr=0x6 reg=b'Device/Head'
> > val=0xe0 bus=0x55b77f922d10 s=0x55b77f922d98
> > pci_cfg_read 9.468 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x0
> > pci_cfg_read 127.712 pid=147030 dev=b'cmd646-ide' devid=0x3
> > fnid=0x0 offs=0x50 val=0x0
> > pci_cfg_read 101.942 pid=147030 dev=b'cmd646-ide' devid=0x3
> > fnid=0x0 offs=0x50 val=0x0
> 
> ...that would be checked here?

That's right.

Solaris is performing pci_cfg_read on offs=0x50 until it either sees
the interrupt bit set or times out. If it times out, you get a fatal
error for the driver. The behaviour is not expected and aggressively
checked against by the Solaris kernel. From what I can tell, Linux and
OpenBSD don't check if the bit is set before clearing it.

> What I don't get is why ide_ioport_read is called at all and from
> where if 
> that's meant to emulate legacy ide ISA ioport reads and we have a
> PCI 
> device accessed via PCI regs?

Taken from the ATA specification:
All commands that do not include read- or write-data transfers generate
a single interrupt when the command completes. Resets do not generate
an interrupt.

There will be an interrupt whether the command is successful or not. If
the host wants to know if an error occured it needs to inspect the
status register. Solaris might be doing this. As the trace shows, there
is no error and nothing is out of the ordinary.

There are two devices. The PCI/IDE controller (CMD646) and the ATA
compliant drive. The command, feature, and status registers belong to
the drive. If you want to configure the drive in some way or interact
with it you will use the ioport_read/write interface. CFR_INTR_CH0 and
ARTTIM23_INTR_CH1 are PCI registers in the PCI configuration space that
belongs to the PCI/IDE controller (CMD646). It makes sense to me that
both are used.

> There's a possibility that software may want to clear bits without
> reading 
> the current value so having a way to do that can be explained.

I agree that this might be a possibility. I also think its very normal
for kernel drivers to drop the return value from operations when they
are only interested in the side-effect.

> I'm afraid I don't understand the problem enough either to be able
> to 
> help. Maybe you could try to find out where is ide_ioport_read called
> in 
> the above and if that's correct to call it there. Also the CMD646U
> docs 
> mention irq in a lot of regs (all say write to clear) but I don't 
> understand their relation to each other and irq raised by the drive.

I agree and I think that's part of the problem. The documentation does
not explicitly mention their relation to each other. I can't see
anything that suggests that reading the status register on the drive
will unset bits in the pci configuration space of the controller. They
are seperate devices.

> So maybe in DMA mode the BM* regs should be used and in legacy mode
> these 
> interrupts would go to ISA IRQ14 and 15 and cleared on read as per
> the IDE 
> spec while in native mode PCI INTA is raised and not cleared but the
> chip 
> docs don't say anything about this so it's only guessing.

This might be true but I'm suspicious. In native mode the host should
be checking the PCI registers to identify what device was responsible
for the interrupt because it's multiplexed. This should be done before
inspecting the status register of the device that caused the interrupt.

> Does anyone
> have 
> more info on this?

Thanks,
Jasper Lowell.

On Sun, 2020-02-23 at 16:16 +0100, BALATON Zoltan wrote:
> On Sun, 23 Feb 2020, address@hidden wrote:
> > ide_exec_cmd 0.461 pid=147030 bus=0x55b77f922d10
> > state=0x55b77f922d98 cmd=0xef
> 
> The command is run here if I'm not mistaken Does this set the irq
> right 
> away on QEMU where on real hadware this may take some time? Not sure
> if 
> that's a problem but trying to understand what's happening.
> 
> > pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > ide_ioport_read 35.577 pid=147030 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55b77f922d10 s=0x55b77f922d98
> > ide_ioport_read 29.095 pid=147030 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55b77f922d10 s=0x55b77f922d98
> 
> So these ide_ioport_read calls clear the irq bit...
> 
> > ide_ioport_write 19.146 pid=147030 addr=0x6 reg=b'Device/Head'
> > val=0xe0 bus=0x55b77f922d10 s=0x55b77f922d98
> > pci_cfg_read 9.468 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x0
> > pci_cfg_read 127.712 pid=147030 dev=b'cmd646-ide' devid=0x3
> > fnid=0x0 offs=0x50 val=0x0
> > pci_cfg_read 101.942 pid=147030 dev=b'cmd646-ide' devid=0x3
> > fnid=0x0 offs=0x50 val=0x0
> 
> ...that would be checked here?
> 
> > It looks like I've made mistakes in previous comments about the
> > error
> > and what the problem might be. Excuse my inexperience. Rather than
> > spinning on ARTTIM23_INTR_CH1 it might be the case that Solaris 10
> > is
> > spinning on CFR_INTR_CH0. I think this because of the following
> > trace:
> > 
> > pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > 
> > The two reads on the io status register show that DRDY (drive ready
> > indicator bit) and DSC (drive seek complete bit). This doesn't look
> > unusual to me. The error bit is also not set which is reassuring.
> 
> What I don't get is why ide_ioport_read is called at all and from
> where if 
> that's meant to emulate legacy ide ISA ioport reads and we have a
> PCI 
> device accessed via PCI regs? Should the device behave differently
> in 
> legacy and native mode with respect of clearing irq bit on register
> reads?
> 
> > I read through some of
> > ftp://ftp.seagate.com/pub/acrobat/reference/111-1c.pdf and I'm
> > confused
> > by the discussion regarding interrupts and the status register.
> > 
> > > INTRQ is cleared when the host reads the status register.
> > 
> > My understand is that INTRQ is the signal from pin 31 on the drive
> > and
> > that the status register is on the drive. I understand the quoted
> > statement as when the host (CMD646) reads the status register of
> > the
> > drive, the drive will lower the interrupt on this pin.
> > 
> > The CMD646 has CFR_INTR_CH0 and ARTTIM23_INTR_CH1 in it's PCI
> > configuration space. This is necessary to determine the source of
> > an
> > interrupt when the CMD646 ports are in PCI IDE Native Mode. Are we
> > saying that when the drive lowers the interrupt, the CMD646 sees
> > this
> > and then clears CFR_INTR_CH0 and ARTTIM23_INTR_CH1 automatically?
> > If
> > this were the case then I don't know why there is an interface to
> > clear
> > them by writing to them.
> 
> There's a possibility that software may want to clear bits without
> reading 
> the current value so having a way to do that can be explained.
> 
> > Also, if reading the ioport status register was enough to clear
> > CFR_INTR_CH0 and ARTTIM23_INTR_CH1 (specific to CMD646) I can't
> > reason
> > why Linux, Solaris, and OpenBSD would have specific routines to
> > clear
> > them (following the CMD646 documentation) rather than just reading
> > the
> > ioport status register.
> 
> But the doc not mentioning irq bits should be cleared on read and
> drivers 
> do it by writing after reading is sufficient evidence that CMD646
> likely 
> does not clear bits on read.
> 
> > With the patch, the tracing output changes to this:
> > ide_ioport_read 128.512 pid=162907 addr=0x7 reg=b'Status' val=0x0
> > bus=0x55909512bd10 s=0x55909512c168
> > ide_ioport_write 22.622 pid=162907 addr=0x6 reg=b'Device/Head'
> > val=0xe0 bus=0x55909512bd10 s=0x55909512c168
> > ide_ioport_write 21.330 pid=162907 addr=0x1 reg=b'Features' val=0x3
> > bus=0x55909512bd10 s=0x55909512bd98
> > ide_ioport_write 13.926 pid=162907 addr=0x2 reg=b'Sector Count'
> > val=0x42 bus=0x55909512bd10 s=0x55909512bd98
> > ide_ioport_write 9.278 pid=162907 addr=0x7 reg=b'Command' val=0xef
> > bus=0x55909512bd10 s=0x55909512bd98
> > ide_exec_cmd 0.921 pid=162907 bus=0x55909512bd10
> > state=0x55909512bd98 cmd=0xef
> > pci_cfg_read 40.647 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > ide_ioport_read 40.445 pid=162907 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55909512bd10 s=0x55909512bd98
> > ide_ioport_read 31.580 pid=162907 addr=0x7 reg=b'Status' val=0x50
> > bus=0x55909512bd10 s=0x55909512bd98
> > ide_ioport_write 17.923 pid=162907 addr=0x6 reg=b'Device/Head'
> > val=0xe0 bus=0x55909512bd10 s=0x55909512bd98
> > pci_cfg_read 10.931 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > pci_cfg_read 19.136 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0
> > offs=0x50 val=0x4
> > pci_cfg_write 26.650 pid=162907 dev=b'cmd646-ide' devid=0x3
> > fnid=0x0 offs=0x50 val=0x4
> 
> The difference here is that status bits still there after
> ide_ioport_read 
> when it gets it via pci_cfg_read than writes to that reg to clear it.
> 
> > Now there is no fatal error and Solaris does the expected by
> > setting
> > CFR_INTR_CH0 to clear it. QEMU then updates the interrupt status in
> > cmd646_pci_config_write.
> > 
> > I'm still unconvinced that we should be lowering interrupts when
> > the
> > ide ioport status register is read. I might be misunderstanding the
> > documentation though (definitely possible). If I am
> > misunderstanding
> > the documentation, could you or Mark clarify what I'm getting wrong
> > here.
> 
> I'm afraid I don't understand the problem enough either to be able
> to 
> help. Maybe you could try to find out where is ide_ioport_read called
> in 
> the above and if that's correct to call it there. Also the CMD646U
> docs 
> mention irq in a lot of regs (all say write to clear) but I don't 
> understand their relation to each other and irq raised by the drive.
> I've 
> found these:
> 
> 0x3c INTLINE R/W Interrupt line default=14
> 
> 0x50 CFR R Configuration register
>       bit2 1=interrupt pending, write 1 will clear this bit (in read
> only reg?)
> 
> 0x57 ARTTIM23 R/W Drive2/3 Control/Status Register
>       bit4 IDE drive 2/3 interrupt status (write 1 to clear)
> 
> 0x71 or BAR4+1 MRDMODE DMA Master Read Mode Select
>       bit2 Primary channel IDE interrupt (write 1 to clear)
>       bit3 Secondary channel IDE interrupt (write 1 to clear)
>       bit4 Primary channel interrupt enable
>            0 = propagate primary channel IDE interrupts to IRQ14 or
> INTA# pin (default)
>            1 = block primary channel IDE interrupts (NOTE: this does
> not affect the function of bit 2)
>       bit5 same as bit4 for but for Secondary channel and IRQ15
> 
> 0x72 or BAR4+2 R/W BMIDESR0 Bus Master IDE Status Reg 0 for primary
> channel
>       bit2 interrupt generated on IDE bus for DMA transfer (write 1
> to clear)
> 
> 0x7a or BAR4+0xa R/W BMIDESR1 Bus Master IDE Status Reg 0 for
> secondary channel
>       bit2 like BMIDESR0 but for channel 1 I think
> 
> So maybe in DMA mode the BM* regs should be used and in legacy mode
> these 
> interrupts would go to ISA IRQ14 and 15 and cleared on read as per
> the IDE 
> spec while in native mode PCI INTA is raised and not cleared but the
> chip 
> docs don't say anything about this so it's only guessing. Does anyone
> have 
> more info on this?
> 
> Regards,
> BALATON Zoltan

reply via email to

[Prev in Thread] Current Thread [Next in Thread]