Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixma

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixma

From:	BALATON Zoltan
Subject:	Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible
Date:	Tue, 26 May 2020 15:35:21 +0200 (CEST)
User-agent:	Alpine 2.22 (BSF 395 2020-01-19)

On Tue, 26 May 2020, Gerd Hoffmann wrote:

On Thu, May 21, 2020 at 09:39:44PM +0200, BALATON Zoltan wrote:

Besides being faster this should also prevent malicious guests to
abuse 2D engine to overwrite data or cause a crash.

         uint32_t src_base = s->twoD_source_base & 0x03FFFFFF;
-        uint8_t *src = s->local_mem + src_base;

-                    val = *(_pixel_type *)&src[index_s];                      \


Well, the advantage of *not* using pixman is that you can easily switch
the code to use offsets instead of pointers, then apply the mask to the
*final* offset to avoid oob data access:

The mask applied to src_base is not to prevent overflow but to implementregister limits. Only these bits are valid if I remember correctly, soeven if I use offsets I need to check for overflow. This patch basicallydoes that by changing parameters to unsigned to prevent them beingnegative, checking values we multiply by to prevent them to be zero andthen calculating first and last offset and check if they are within vram.(Unless of course I've made a mistake somewhere.) This should preventoverflow with one check and does not need to apply a mask at every step.The vram size can also be different so it's not a fixed mask anyway.

If not using pixman then I'd need to reimplement optimised 2D ops thatwill likely never be as good as pixman and no point in doing it severaltimes for every device model so I'd rather try to use pixman wherepossible unless a better library is available.

   val = *(_pixel_type*)(&s->local_mem[(s->twoD_source_base + index_s) & 
0x03FFFFFF]);

+        if ((rop_mode && rop == 0x5) || (!rop_mode && rop == 0x55)) {
+            /* Invert dest, is there a way to do this with pixman? */


PIXMAN_OP_XOR maybe?


Maybe, but looking at the pixman source I couldn't decide if

UN8x4_MUL_UN8_ADD_UN8x4_MUL_UN8 (s, dest_ia, d, src_ia);

seen here:
https://cgit.freedesktop.org/pixman/tree/pixman/pixman-combine32.c#n396
is really the same as s ^ d.

+            if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
+                /* regions may overlap: copy via temporary */


The usual way for a hardware blitter is to have a direction bit, i.e.
the guest os can ask to blit in top->bottom or bottom->top scanline
ordering.  The guest can use that to make sure the blit does not

Yes, this is the rtl above (right to left) and AmigaOS sets this most ofthe time so only relying on that to detect overlaps is not efficient.

overwrite things.  But note the guest can also intentionally use
overlapping regions, i.e. memset(0) the first scanline, then use a blit
with overlap to clear the whole screen.  The later will surely break if
you blit via temporary image ...

Fortunately no guest code seems to do that so unless we find one needingit I don't worry much about such rare cases. It would be best if pixmansupported this but while I've found patches were submitted they did notget merged so far so using a temporary seems to be the simplest way thatworks well enough for now.

+                pixman_blt((uint32_t *)&s->local_mem[src_base],
+                           (uint32_t *)&s->local_mem[dst_base],
+                           src_pitch * (1 << format) / sizeof(uint32_t),
+                           dst_pitch * (1 << format) / sizeof(uint32_t),
+                           8 * (1 << format), 8 * (1 << format),
+                           src_x, src_y, dst_x, dst_y, width, height);


See above, i'm not convinced pixman is the best way here.
When using pixman I'd suggest:

 (1) src = pixman_image_create_bits_no_clear(...);
 (2) dst = pixman_image_create_bits_no_clear(...);
 (3) pixman_image_composite(PIXMAN_OP_SRC, src, NULL, dst, ...);
 (4) pixman_image_unref(src);
 (5) pixman_image_unref(dst);

pixman_blt() is probably doing basically the same.

Actually not the same, pixman_blt is faster operating directly on pointerswhile we need all the pixman_image overhead to use pixman_image_composite.Blitter is used for a lot of small ops (I've seen AmigaOS even call itwith 1 pixel regions) so going through pixman_image every time does notseem to be efficient. To implement more complex ops this may be needed soI may try to figure that out later but I'd need some test cases to test ifthe results are correct. The current patches do the same as before (exceptfor some rare overlapping cases as you noted above but we haven't observedany yet) and fix the overflows so this was the best I could do in the timeI had. Maybe I try to improve this later but don't plan to rewrite it now.

 The advantage of not
using pixman_blt() is that

 (a) you can also use pixman ops other than PIXMAN_OP_SRC, and
 (b) you can have a helper function for (1)+(2) which very carefully
     applies sanity checks to make sure the pixman image created stays
     completely inside s->local_mem.
 (c) you have the option to completely rearrange the code flow, for
     example update the src pixman image whenever the guest touches
     src_base or src_pitch or format instead of having a
     create/op/unref cycle on every blitter op.

From traces I think most guest would write bltter related regs on every op

so probably not worth the hassle to try to update regions on registeraccess and we could do it on every op, possibly optimising 1 pixel blitsand small regions via some special cases but even then simple copy imageis probably the most common op that might worth doing via pixman_blt asit's expected to be frequently used so the less overhead is the better.Therefore I'd only use image_composite for more complex ops but that's toomuch effort for a relatively unused device model. Maybe for ati-vga I'lltry to make it better but first should fix microengine for that so driverscan talk to it. I'd rather spend my limited free time on that than furtherimproving sm501 unless some bugs show up.

+        pixman_fill((uint32_t *)&s->local_mem[dst_base],
+                    dst_pitch * (1 << format) / sizeof(uint32_t),
+                    8 * (1 << format), dst_x, dst_y, width, height, color);


 (1) src = pixman_image_create_solid(...), otherwise same as above ;)

Same argument as composite_image and for fill we don't even have anyadvantage so while for composite implementing other ops is a reason to notuse pixman_blt I see no reason to not go the fastest way for fill.


Regards,
BALATON Zoltan

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH v2 0/7] Misc display/sm501 clean ups and fixes, BALATON Zoltan, 2020/05/21
- [PATCH v2 3/7] sm501: Use BIT(x) macro to shorten constant, BALATON Zoltan, 2020/05/21
- [PATCH v2 7/7] sm501: Remove obsolete changelog and todo comment, BALATON Zoltan, 2020/05/21
- [PATCH v2 2/7] sm501: Shorten long variable names in sm501_2d_operation, BALATON Zoltan, 2020/05/21
- [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible, BALATON Zoltan, 2020/05/21
  - Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible, Gerd Hoffmann, 2020/05/26
    - Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible, BALATON Zoltan <=
    - Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible, Gerd Hoffmann, 2020/05/27
    - Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible, BALATON Zoltan, 2020/05/27
- [PATCH v2 4/7] sm501: Clean up local variables in sm501_2d_operation, BALATON Zoltan, 2020/05/21
- [PATCH v2 6/7] sm501: Optimize small overlapping blits, BALATON Zoltan, 2020/05/21
- [PATCH v2 1/7] sm501: Convert printf + abort to qemu_log_mask, BALATON Zoltan, 2020/05/21
- Re: [PATCH v2 0/7] Misc display/sm501 clean ups and fixes, no-reply, 2020/05/21
- Re: [PATCH v2 0/7] Misc display/sm501 clean ups and fixes, Gerd Hoffmann, 2020/05/28

Prev by Date: [PULL 13/14] hw/mips/mips_int: De-duplicate KVM interrupt delivery
Next by Date: Re: [PATCH 00/14] hw/mips: patch queue for 2020-05-26
Previous by thread: Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible
Next by thread: Re: [PATCH v2 5/7] sm501: Replace hand written implementation with pixman where possible
Index(es):
- Date
- Thread