emacs-devel

Performance bottleneck in ns_draw_fringe_bitmap


From: Ben Simms
Subject: Performance bottleneck in ns_draw_fringe_bitmap
Date: Wed, 26 Jun 2024 13:56:43 +0200

Hi all, I recently started using Emacs (ns) HEAD on an ARM macOS Sonoma system.

I've noticed that ns_draw_fringe_bitmap is a major performance sink when pixel scrolling: up to 99% of CPU time is spent inside this function, and Emacs redraws at approximately 5 Hz. The slowness is less obvious without pixel scrolling, presumably because Emacs never tries to redraw at 60+ Hz otherwise.

I have done some profiling and found that, in the worst case I observed, of the 99% of CPU time spent in ns_draw_fringe_bitmap, approximately 50% is spent in [NSBezierPath copy] and approximately 30% in [NSBezierPath fill].
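For context on why the copy and fill might be costly: the existing ns_define_fringe_bitmap (the lines removed by the patch at the end of this message) appends one 1x1 rectangle to the path for every set pixel, and ns_draw_fringe_bitmap copies and translates that path on each draw. The plain C sketch below counts those rectangles using the same bit test as that loop. The names `count_path_rects` and `example_triangle` are hypothetical, and the bitmap data is made up for illustration; it is not one of Emacs's built-in fringe bitmaps.

```c
#include <assert.h>
#include <stddef.h>

/* Count the set pixels in a fringe bitmap of height H and width W,
   using the same test as the loop in ns_define_fringe_bitmap:
   pixel X of row Y is bit (W - X - 1) of BITS[Y].  Each set pixel
   corresponds to one 1x1 rectangle appended to the NSBezierPath.  */
static size_t
count_path_rects (const unsigned short *bits, int h, int w)
{
  assert (bits != NULL && h >= 0 && w > 0 && w <= 16);

  size_t n = 0;
  for (int y = 0; y < h; y++)
    for (int x = 0; x < w; x++)
      if (bits[y] & (1u << (w - x - 1)))
        n++;
  return n;
}

/* Hypothetical 8x8 triangle, for illustration only.  */
static const unsigned short example_triangle[8] = {
  0x80, 0xc0, 0xe0, 0xf0, 0xf0, 0xe0, 0xc0, 0x80
};
```

Calling `count_path_rects (example_triangle, 8, 8)` gives the number of rectangles such a path would hold; every one of them is duplicated by each [NSBezierPath copy], which is consistent with (though not proof of) the profile above.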

I have used the following benchmark with emacs -Q to try to reproduce the issue. It does not reproduce the extreme case I see with my own configuration, but I have used it to validate a patch that fixes the slowdown I'm experiencing:

(defun scroll-up-benchmark ()
  (interactive)
  (let ((oldgc gcs-done)
        (oldtime (float-time)))
    (dotimes (_ 10)
      (pixel--whistlestop-pixel-up (* 5 (pixel-line-height)))
      (pixel-scroll-pixel-down (* 5 (pixel-line-height))))
    (princ (format "GCs: %d Elapsed time: %f seconds\n"
                   (- gcs-done oldgc) (- (float-time) oldtime))
           #'external-debugging-output)))

(defun add-fringes ()
  (interactive)
  (dotimes (_ 20)
    (newline-and-indent)
    (insert "A")
    (goto-char (line-beginning-position))
    (let ((s "x")
          (fringe-overlay (make-overlay (point) (1+ (point)))))
      (put-text-property 0 1 'display (list 'left-fringe 'left-triangle) s)
      (overlay-put fringe-overlay 'before-string s))
    (goto-char (line-end-position))))

(scroll-bar-mode -1)
(menu-bar-mode -1)
(pixel-scroll-mode)
(pixel-scroll-precision-mode)
(setq pixel-scroll-precision-use-momentum t)

(dotimes (_ 20)
  (add-fringes))
(dotimes (_ 5)
  (end-of-buffer)
  (condition-case nil (while t (scroll-down))
    (error nil))
  (scroll-up-benchmark))


On Emacs (e4e1d0cd0) this reports the following:

GCs: 14 Elapsed time: 5.449032 seconds
GCs: 14 Elapsed time: 5.209006 seconds
GCs: 13 Elapsed time: 5.187779 seconds
GCs: 13 Elapsed time: 5.178472 seconds
GCs: 13 Elapsed time: 5.184741 seconds

The profiler output for this is the following:

Weight                 Self Weight                                 Symbol Name
11.30 Gc  100.0%                 1.00 Mc                                 Fredisplay
11.30 Gc   99.9%                 -                                 redisplay_preserve_echo_area
11.12 Gc   98.3%                 1.00 Mc                                  redisplay_internal
6.97 Gc   61.6%                 -                                   internal_condition_case_1
6.97 Gc   61.6%                 -                                    redisplay_window_1
6.97 Gc   61.6%                 -                                     redisplay_window
5.60 Gc   49.5%                 -                                      try_window
1.25 Gc   11.0%                 1.00 Mc                                      display_mode_lines
111.68 Mc    0.9%                 -                                      update_frame_tool_bar
6.00 Mc    0.0%                 -                                      gui_consider_frame_title
2.00 Mc    0.0%                 1.00 Mc                                      update_window_fringes
1.00 Mc    0.0%                 1.00 Mc                                      reconsider_clip_changes
1.00 Mc    0.0%                 -                                      cursor_row_fully_visible_p
1.00 Mc    0.0%                 1.00 Mc                                      ___chkstk_darwin
1.00 Mc    0.0%                 1.00 Mc                                      redisplay_tab_bar
1.00 Mc    0.0%                 1.00 Mc                                      try_window_id
3.64 Gc   32.2%                 -                                   update_frame
3.47 Gc   30.6%                 -                                    update_window_tree
3.47 Gc   30.6%                 1.00 Mc                                     update_window
2.15 Gc   19.0%                 -                                      gui_update_window_end
2.11 Gc   18.6%                 -                                       draw_window_fringes
2.09 Gc   18.4%                 1.00 Mc                                        draw_row_fringe_bitmaps
20.00 Mc    0.1%                 -                                        set_buffer_internal_1
40.00 Mc    0.3%                 -                                       display_and_set_cursor
1.00 Mc    0.0%                 1.00 Mc                                       gui_draw_vertical_border
586.18 Kc    0.0%                 -                                       unblock_input
1.28 Gc   11.3%                 3.00 Mc                                      update_window_line
25.00 Mc    0.2%                 3.00 Mc                                      scrolling_window
9.39 Mc    0.0%                 3.00 Mc                                      update_window_fringes
1.00 Mc    0.0%                 1.00 Mc                                      redraw_overlapped_rows
1.00 Mc    0.0%                 1.00 Mc                                      redraw_overlapping_rows
152.53 Mc    1.3%                 -                                    update_begin
18.72 Mc    0.1%                 -                                    update_end
397.05 Mc    3.5%                 -                                   prepare_menu_bars
91.41 Mc    0.8%                 -                                   echo_area_display
5.00 Mc    0.0%                 -                                   ns_frame_up_to_date
4.01 Mc    0.0%                 -                                   start_polling
4.00 Mc    0.0%                 1.00 Mc                                   unbind_to
3.00 Mc    0.0%                 2.00 Mc                                   run_window_change_functions
2.00 Mc    0.0%                 -                                   hscroll_windows
1.00 Mc    0.0%                 1.00 Mc                                   XCONS
1.00 Mc    0.0%                 -                                   Fgethash
1.00 Mc    0.0%                 -                                   clear_desired_matrices
1.00 Mc    0.0%                 1.00 Mc                                   mark_window_display_accurate_1
1.00 Mc    0.0%                 -                                   ns_set_doc_edited
1.00 Mc    0.0%                 1.00 Mc                                   clear_garbaged_frames
181.29 Mc    1.6%                 -                                  flush_frame
1.00 Mc    0.0%                 -                                  unbind_to


11.90 Gc  100.0%           -                     Fredisplay
11.90 Gc   99.9%           -                     redisplay_preserve_echo_area
11.66 Gc   97.9%           2.00 Mc                      redisplay_internal
7.48 Gc   62.8%           -                       internal_condition_case_1
7.48 Gc   62.8%           -                        redisplay_window_1
7.48 Gc   62.8%           3.00 Mc                         redisplay_window
6.35 Gc   53.3%           -                          try_window
6.23 Gc   52.3%           37.03 Mc                           display_line
73.70 Mc    0.6%           -                           start_display
44.01 Mc    0.3%           -                           partial_line_height
1.00 Mc    0.0%           1.00 Mc                           append_space_for_newline
1.00 Mc    0.0%           1.00 Mc                           gui_produce_glyphs
977.48 Mc    8.2%           1.00 Mc                          display_mode_lines
125.80 Mc    1.0%           -                          update_frame_tool_bar
10.00 Mc    0.0%           -                          gui_consider_frame_title
7.00 Mc    0.0%           3.00 Mc                          update_window_fringes
1.00 Mc    0.0%           -                          cursor_row_fully_visible_p
1.00 Mc    0.0%           -                          unbind_to
1.00 Mc    0.0%           -                          WINDOWP
1.00 Mc    0.0%           -                          window_wants_mode_line
1.00 Mc    0.0%           1.00 Mc                          window_scroll_margin
3.24 Gc   27.2%           -                       update_frame
3.06 Gc   25.7%           -                        update_window_tree
3.06 Gc   25.7%           1.00 Mc                         update_window
1.58 Gc   13.2%           1.00 Mc                          gui_update_window_end
1.54 Gc   12.9%           -                           draw_window_fringes
1.50 Gc   12.6%           -                            draw_row_fringe_bitmaps
1.50 Gc   12.6%           2.00 Mc                             draw_fringe_bitmap
608.35 Kc    0.0%           608.35 Kc                             FRAME_RIGHT_FRINGE_WIDTH
34.01 Mc    0.2%           -                            set_buffer_internal_1
38.00 Mc    0.3%           1.00 Mc                           display_and_set_cursor
1.44 Gc   12.0%           -                          update_window_line
31.00 Mc    0.2%           2.00 Mc                          scrolling_window
12.00 Mc    0.1%           6.00 Mc                          update_window_fringes
1.00 Mc    0.0%           1.00 Mc                          window_wants_mode_line
1.00 Mc    0.0%           1.00 Mc                          window_text_bottom_y
1.00 Mc    0.0%           1.00 Mc                          redraw_overlapped_rows
214.49 Kc    0.0%           214.49 Kc                          clip_to_bounds
159.74 Mc    1.3%           -                        update_begin
20.00 Mc    0.1%           -                        update_end
20.00 Mc    0.1%           -                         ns_update_end
815.15 Mc    6.8%           -                       prepare_menu_bars
106.17 Mc    0.8%           -                       echo_area_display
10.00 Mc    0.0%           -                       ns_frame_up_to_date
3.00 Mc    0.0%           1.00 Mc                       run_window_change_functions
3.00 Mc    0.0%           -                       ns_set_doc_edited
1.00 Mc    0.0%           -                       specbind
1.00 Mc    0.0%           -                       start_polling
1.00 Mc    0.0%           -                       update_overlay_arrows
1.00 Mc    0.0%           -                       hscroll_windows
237.72 Mc    1.9%           -                      flush_frame
188.97 Kc    0.0%           -                      unbind_to
2.00 Mc    0.0%           -                     swallow_events

And with a patch I have developed that uses masked bitmaps instead of Bézier paths to draw fringes:

GCs: 14 Elapsed time: 5.091162 seconds
GCs: 14 Elapsed time: 4.825966 seconds
GCs: 13 Elapsed time: 4.793364 seconds
GCs: 13 Elapsed time: 4.785960 seconds
GCs: 13 Elapsed time: 4.782470 seconds
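As a rough summary of the two sets of timings above (five runs before the patch and five after), the C sketch below computes the mean elapsed time and the relative change. By this arithmetic the mean drops from about 5.24 s to about 4.86 s, roughly 7%, on this benchmark. The names `mean`, `relative_change`, `before_s`, and `after_s` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Elapsed times in seconds, copied from the benchmark output above.  */
static const double before_s[5] = {
  5.449032, 5.209006, 5.187779, 5.178472, 5.184741
};
static const double after_s[5] = {
  5.091162, 4.825966, 4.793364, 4.785960, 4.782470
};

/* Arithmetic mean of the N values in V.  */
static double
mean (const double *v, size_t n)
{
  assert (v != NULL && n > 0);

  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Fraction by which NEW_VALUE is smaller than OLD_VALUE.  */
static double
relative_change (double old_value, double new_value)
{
  assert (old_value > 0.0);
  return (old_value - new_value) / old_value;
}
```

Five runs per configuration is a small sample, so this is only an indication of the size of the improvement.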

With the following profiler output:

8.55 Gc  100.0% - Fredisplay
8.45 Gc   98.9% - redisplay_preserve_echo_area
8.28 Gc   96.9% 1.00 Mc  redisplay_internal
5.35 Gc   62.5% -   internal_condition_case_1
5.35 Gc   62.5% -    redisplay_window_1
5.35 Gc   62.5% 1.00 Mc     redisplay_window
4.61 Gc   53.8% 1.11 Mc      try_window
639.37 Mc    7.4% -      display_mode_lines
86.85 Mc    1.0% -      update_frame_tool_bar
8.00 Mc    0.0% 1.00 Mc      gui_consider_frame_title
2.00 Mc    0.0% 2.00 Mc      update_window_fringes
2.00 Mc    0.0% 2.00 Mc      try_window_id
1.00 Mc    0.0% -      window_wants_mode_line
1.00 Mc    0.0% 1.00 Mc    push_handler
2.37 Gc   27.7% -   update_frame
2.22 Gc   25.9% -    update_window_tree
2.22 Gc   25.9% 3.00 Mc     update_window
1.12 Gc   13.0% -      gui_update_window_end
1.10 Gc   12.8% 1.00 Mc       draw_window_fringes
1.08 Gc   12.6% -        draw_row_fringe_bitmaps
1.08 Gc   12.6% 1.00 Mc         draw_fringe_bitmap
1.08 Gc   12.5% 12.48 Mc          draw_fringe_bitmap_1
905.12 Mc   10.5% 1.03 Mc           ns_draw_fringe_bitmap
667.92 Mc    7.8% -            CGContextDrawImageWithOptions
81.91 Mc    0.9% 1.00 Mc            NSRectFill
45.65 Mc    0.5% -            NSColorSetWithFillAndStroke
29.07 Mc    0.3% 2.00 Mc            ns_row_rect
22.53 Mc    0.2% -            ns_focus
11.00 Mc    0.1% 2.00 Mc            +[NSColor(EmacsColor) colorWithUnsignedLong:]
8.00 Mc    0.0% 1.00 Mc            ns_unfocus
8.00 Mc    0.0% -            -[_NSTaggedPointerColor CGColor]
8.00 Mc    0.0% 2.00 Mc            -[__NSDictionaryM objectForKey:]
4.00 Mc    0.0% 4.00 Mc            NSUnionRect
3.00 Mc    0.0% 3.00 Mc            objc_msgSend
2.00 Mc    0.0% 2.00 Mc            gui_define_fringe_bitmap
2.00 Mc    0.0% 2.00 Mc            CGContextSetCompositeOperation
1.00 Mc    0.0% -            objc_autorelease
1.00 Mc    0.0% 1.00 Mc            CGContextClipToRect
1.00 Mc    0.0% -            CGGStateSetFillColor
1.00 Mc    0.0% 1.00 Mc            _objc_rootAutorelease
1.00 Mc    0.0% 1.00 Mc            NSIntersectionRect
1.00 Mc    0.0% 1.00 Mc            CGContextSaveGState
1.00 Mc    0.0% 1.00 Mc            NSMakeRect
1.00 Mc    0.0% 1.00 Mc            -[NSObject autorelease]
1.00 Mc    0.0% 1.00 Mc            CGDataProviderIsZombie
1.00 Mc    0.0% 1.00 Mc            DYLD-STUB$$NSRectFill
1.00 Mc    0.0% 1.00 Mc            objc_msgSend$colorWithUnsignedLong:
129.38 Mc    1.5% 11.34 Mc           lookup_named_face
8.00 Mc    0.0% -           window_box_right
4.31 Mc    0.0% 1.00 Mc           window_wants_tab_line
5.00 Mc    0.0% 1.00 Mc           window_wants_header_line
3.09 Mc    0.0% 3.00 Mc           builtin_lisp_symbol
2.00 Mc    0.0% 1.00 Mc           window_box_left
2.00 Mc    0.0% 1.00 Mc           FRAME_INTERNAL_BORDER_WIDTH
1.55 Mc    0.0% 1.55 Mc           FACE_FROM_ID_OR_NULL
1.00 Mc    0.0% -           WINDOWP
1.00 Mc    0.0% 1.00 Mc           EQ
260.83 Kc    0.0% 260.83 Kc           EQ
1.00 Mc    0.0% 1.00 Mc           get_fringe_bitmap_data
1.00 Mc    0.0% 1.00 Mc          ns_draw_fringe_bitmap
1.00 Mc    0.0% 1.00 Mc         FRAME_RIGHT_FRINGE_WIDTH
20.00 Mc    0.2% 1.00 Mc        set_buffer_internal_1
19.00 Mc    0.2% 1.00 Mc       display_and_set_cursor
1.07 Gc   12.4% 2.00 Mc      update_window_line
21.25 Mc    0.2% 6.00 Mc      scrolling_window
7.00 Mc    0.0% 2.00 Mc      update_window_fringes
646.38 Kc    0.0% -      window_text_bottom_y
139.87 Mc    1.6% -    update_begin
16.01 Mc    0.1% -    update_end
484.59 Mc    5.6% -   prepare_menu_bars
63.12 Mc    0.7% 365.24 Kc   echo_area_display
3.13 Mc    0.0% -   hscroll_windows
2.51 Mc    0.0% -   ns_set_doc_edited
2.00 Mc    0.0% 2.00 Mc   overlay_arrows_changed_p
2.00 Mc    0.0% 1.00 Mc   run_window_change_functions
1.00 Mc    0.0% 1.00 Mc   reset_outermost_restrictions
651.79 Kc    0.0% 651.79 Kc   ___chkstk_darwin
1.00 Mc    0.0% 1.00 Mc   mark_window_display_accurate_1
1.00 Mc    0.0% -   unbind_to
1.00 Mc    0.0% -   start_polling
169.35 Mc    1.9% -  flush_frame
1.00 Mc    0.0% -  unbind_to
51.01 Mc    0.5% - detect_input_pending_run_timers
41.13 Mc    0.4% - swallow_events


I've put together a patch that draws using a masked CGImage instead of an NSBezierPath. However, I'm not sure whether it's any good, as I'm completely unfamiliar with the Emacs codebase and the macOS/NS graphics APIs. I also believe the NS port of Emacs is meant to stay compatible with GNUstep, and I have no idea whether this patch remains compatible:

From d54c091002aff8f74f8165dd21d65075f028e728 Mon Sep 17 00:00:00 2001
From: Ben Simms <ben@bensimms.moe>
Date: Mon, 24 Jun 2024 23:35:29 +0200
Subject: [PATCH] Draw fringe using bitmaps, not huge beziers

---
 src/nsterm.m | 55 ++++++++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 30 deletions(-)

diff --git a/src/nsterm.m b/src/nsterm.m
index 794630de1c..9c781e3bd6 100644
--- a/src/nsterm.m
+++ b/src/nsterm.m
@@ -2903,22 +2903,24 @@ Hide the window (X11 semantics)
 static void
 ns_define_fringe_bitmap (int which, unsigned short *bits, int h, int w)
 {
-  NSBezierPath *p = [NSBezierPath bezierPath];
-
   if (!fringe_bmp)
     fringe_bmp = [[NSMutableDictionary alloc] initWithCapacity:25];
 
-  [p moveToPoint:NSMakePoint (0, 0)];
 
-  for (int y = 0 ; y < h ; y++)
-    for (int x = 0 ; x < w ; x++)
-      {
-        bool bit = bits[y] & (1 << (w - x - 1));
-        if (bit)
-          [p appendBezierPathWithRect:NSMakeRect (x, y, 1, 1)];
-      }
+  for (int i = 0; i < h; i++)
+    bits[i] = ~bits[i];
+
+  CGDataProviderRef provider = CGDataProviderCreateWithData (NULL, bits,
+					   sizeof (unsigned short) * h, NULL);
+  if (provider) {
+    id p = (id)CGImageMaskCreate (w, h, 1, 1,
+                 sizeof (unsigned short),
+                 provider, NULL, 0);
+    CGDataProviderRelease (provider);
+
+    [fringe_bmp setObject:p forKey:[NSNumber numberWithInt:which]];
+  }
 
-  [fringe_bmp setObject:p forKey:[NSNumber numberWithInt:which]];
 }
 
 
@@ -2981,37 +2983,30 @@ Hide the window (X11 semantics)
       NSRectFill (clearRect);
     }
 
-  NSBezierPath *bmp = [fringe_bmp objectForKey:[NSNumber numberWithInt:p->which]];
+  CGImageRef bmp = (CGImageRef)[fringe_bmp objectForKey:[NSNumber numberWithInt:p->which]];
 
   if (bmp == nil
       && p->which < max_used_fringe_bitmap)
     {
       gui_define_fringe_bitmap (f, p->which);
-      bmp = [fringe_bmp objectForKey: [NSNumber numberWithInt: p->which]];
+      bmp = (CGImageRef)[fringe_bmp objectForKey: [NSNumber numberWithInt: p->which]];
     }
 
   if (bmp)
     {
-      NSAffineTransform *transform = [NSAffineTransform transform];
-      NSColor *bm_color;
+      CGRect bounds = CGRectMake (p->x, p->y - p->dh,
+			   CGImageGetWidth (bmp), CGImageGetHeight (bmp));
 
-      /* Because the image is defined at (0, 0) we need to take a copy
-         and then transform that copy to the new origin.  */
-      bmp = [bmp copy];
-      [transform translateXBy:p->x yBy:p->y - p->dh];
-      [bmp transformUsingAffineTransform:transform];
+      NSGraphicsContext *ctx = [NSGraphicsContext currentContext];
+      CGContextRef context = [ctx CGContext];
 
-      if (!p->cursor_p)
-        bm_color = [NSColor colorWithUnsignedLong:face->foreground];
-      else if (p->overlay_p)
-        bm_color = [NSColor colorWithUnsignedLong:face->background];
-      else
-        bm_color = f->output_data.ns->cursor_color;
+      CGContextTranslateCTM (context,
+			     CGRectGetMinX (bounds), CGRectGetMaxY (bounds));
+      CGContextScaleCTM (context, 1, -1);
 
-      [bm_color set];
-      [bmp fill];
-
-      [bmp release];
+      CGContextSetFillColorWithColor (context, [[NSColor colorWithUnsignedLong:face->foreground] CGColor]);
+      bounds.origin = CGPointZero;
+      CGContextDrawImage (context, bounds, bmp);
     }
   ns_unfocus (f);
 }
-- 
2.45.1
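
One detail of the patch that may be worth spelling out is the `bits[i] = ~bits[i]` loop. As far as I understand the Core Graphics documentation, image mask samples act as an inverse alpha value: a sample of 0 is painted with the current fill color and a sample of 1 is left untouched, which is the opposite of the fringe bitmap convention where a set bit means "draw this pixel". The C sketch below performs the same inversion, but into a separate buffer instead of in place. Writing to a separate buffer is an assumption of this sketch, not what the patch does, and `invert_rows` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Write the bitwise complement of each of the H rows in BITS to OUT,
   leaving BITS unchanged.  A set bit in BITS ("draw this pixel")
   becomes a clear bit in OUT, matching the inverse-alpha convention
   described above.  */
static void
invert_rows (const unsigned short *bits, unsigned short *out, int h)
{
  assert (bits != NULL && out != NULL && h >= 0);

  for (int i = 0; i < h; i++)
    out[i] = (unsigned short) ~bits[i];
}
```

Keeping the source rows intact avoids changing data owned by the caller; whether that matters for Emacs's fringe bitmaps is something a reviewer familiar with the code would need to confirm.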


--
Ben Simms
