Hi,
quick follow-up. *As always* you find a problem right after asking for
help :). The updated patch does not cause BSOD on Windows guests, but
neither does it fix the actual problem (of the program seg-faulting)
I would really appreciate any feedback on the proposed patch below - the
difference to the previous patch is that we clear out undefined flags
and only keep the Z-flag (and update the C-flag)
Thanks guys!
-Clemens
On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch
<address@hidden <mailto:address@hidden>> wrote:
Hi guys,
I have to revive a rather old thread [1,2]. A quick summary of the
issue:
TCG emulates the BT (bit-test) instruction using a SAR to re-compute
eflags. While SAR affects all flags, BT only affects the C-flag and
leaves the Z-flag unaffected/unchanged [3].
According to the x86 manual, the current behavior is correct
(although not perfect performance-wise), but NOT correct according
to newer Intel CPU instruction sets (such as x86-64). What has
caused some confusion in the past it seems is that AMD's manual
mentions that all flags other than the C-flag are "undefined" (just
like in Intel's old x86 manual).
A patch has been suggested (although not applied) before [2] and I
was trying to update and try the patch to the QEMU 2.0 RC2, but it
seems not to be working (causes a BSOD on Windows guests).
BTW: I have a program that seg-faults on Windows clients (32 as well
as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous
versions), so this is not just of theoretical importance :)
Can somebody say if there is something that I'm doing obviously wrong?
*NON-FUNCTIONAL PATCH!*
--- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08
12:38:58.000000000 -0700
+++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700
@@ -6710,8 +6710,14 @@
tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1);
switch(op) {
case 0:
- tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
- tcg_gen_movi_tl(cpu_cc_dst, 0);
+ /* whatever the last CC-op is - recompute now so we can
OR-in
+ * updated results */
+ gen_update_cc_op(s);
+ gen_compute_eflags(s);
+ tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
+ tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C);
+ tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4);
+ set_cc_op(s, CC_OP_EFLAGS);
break;
case 1:
tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
@@ -6734,8 +6740,8 @@
tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
break;
}
- set_cc_op(s, CC_OP_SARB + ot);
if (op != 0) {
+ set_cc_op(s, CC_OP_SARB + ot);
if (mod != 3) {
gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
} else {
Thanks, I'd really appreciate any input you can provide.
-Clemens
[1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html
[2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html
[3] page
148
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf