Re: [Qemu-devel] Sparc-softmmu -- Debugging Results and Suggestions Request
Sat, 19 May 2012 09:03:03 +0000
On Thu, May 17, 2012 at 11:07 PM, Paul Wilhelm <address@hidden> wrote:
> I've been trying to debug a problem with Solaris 8 running on sparc-softmmu.
> The syslog daemon is very unreliable (about 7 of 8 starts of the syslog
> daemon end in a daemon hang - the daemon can be "killed" and restarted
> Background: I looked at the syslogd.c code on the Oracle web site to see
> what syslogd is doing. As part of initialization, syslogd tries to parse
> syslog.conf. To read the syslog.conf file, syslogd creates a pipe for output
> from m4. m4 is used to parse the syslog.conf file. Output from m4 is put
> into the pipe that is then read by the parent process.
> Here is what I've done so far / learned:
> After boot, I log in and stop the syslogd.
> I then truss (using -a -f and -sall flags) syslogd with the "-d" flag.
> On Qemu, the syslog daemon stops after the child process exits. No
> information generated by the child process and put into the pipe gets to the
> parent process before the hang. When I send SIGINT twice (hit ctrl-c twice
> -- one ctrl-c does not unblock pipe), the parent job sees the data in the
> pipe and completes reading data in the pipe before the syslog daemon exits
> due to the SIGINT.
> I thought the issue might be related to something that m4 was doing, so I
> replaced it with a shell script that output text like that actually output
> by m4 (I manually parsed the syslog.conf file). I saw the same behaviour -
> the syslogd parent process hung about the time the child process exited.
> I tried this on real Sparc hardware with the same OS. On real Sparc hardware
> the data appears in the pipe for use by the parent process about the time
> the child process exits.
> I thought the Qemu parent process might not be getting a SIGCLD or the
> SIGCLD might not be sent by the child when it exits. So I have tried sending
> SIGCLD manually using "kill". If I send SIGCLD twice (once does not unblock
> the pipe, but I do see system activity from truss with this first signal),
> the pipe is unblocked. The results are not consistent after the pipe is
> unblocked. Syslogd may post messages to the log file, it may take injecting
> new messages using "logger" to cause the backlog of messages to get to the
> log file, or I may need to restart syslog daemon (and SIGCLD again) to get
> messages to the log file.
> What to do next?
> I am not sure what to do next to help isolate what is going on (or not going
> on that needs to be). It looks like something with signals is not working
> correctly. I could try putting together a simple program to create a pipe
> much as is done in syslogd to try to replicate the issue with pipes in a
> simple way. But, I'm not sure how to dig deeper even if I were able to
> replicate the issue with a small program.
> Another thought is to create a Linux / Sparc32 machine to see if this issue
> is apparent there, as well. Having a simple program as noted above might
> help with this.
The signal or pipe handling code in the OS probably hits a corner case
of some instruction that is emulated incorrectly. Alternatively, the
no-fault mode in the MMU is a usual suspect.
In the former case, you could try enabling -d in_asm and check the log
for anything unusual happening near the signal. The log is going to be
For the no-fault mode, you could try changing MMU_NF handling code in
Alternatively, if you have access to x86 Solaris, you could try to
make a solaris-user emulator so that only syslogd process would be
emulated. I've sent rough initial patches for that earlier.
> Other Info:
> SunOS Release 5.8 Version Generic_108528-11 32-bit