Skip to content
Snippets Groups Projects
user avatar
Claudiu Manoil authored
The change is significant since it affects the rx hot path.
Paul observed and documented the effects at asm level, see
below:

"It turns out that it does make a difference, since gfar_process_frame
gets inlined, and so the increment code gets moved out of line (I have
marked the if statment with * and the increment code within "-----"):

  ------------------------- as is currently ------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       40 9e 00 1c     bne-    cr7,4d40
<gfar_clean_rx_ring+0x130>
        ----------------------------
     4d28:       81 3c 01 f8     lwz     r9,504(r28)
     4d2c:       81 5c 01 fc     lwz     r10,508(r28)
     4d30:       31 4a 00 01     addic   r10,r10,1
     4d34:       7d 29 01 94     addze   r9,r9
     4d38:       91 3c 01 f8     stw     r9,504(r28)
     4d3c:       91 5c 01 fc     stw     r10,508(r28)
        ----------------------------
     4d40:       a0 1f 00 24     lhz     r0,36(r31)
     4d44:       81 3f 00 00     lwz     r9,0(r31)
     4d48:       7f a4 eb 78     mr      r4,r29
     4d4c:       7f e3 fb 78     mr      r3,r31

  -------------------------- unlikely ------------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       41 9e 03 94     beq-    cr7,50b8
<gfar_clean_rx_ring+0x4a8>
     4d28:       a0 1f 00 24     lhz     r0,36(r31)
     4d2c:       81 3f 00 00     lwz     r9,0(r31)
     4d30:       7f a4 eb 78     mr      r4,r29
     4d34:       7f e3 fb 78     mr      r3,r31
[...]
     50b8:       81 3c 01 f8     lwz     r9,504(r28)
     50bc:       81 5c 01 fc     lwz     r10,508(r28)
     50c0:       31 4a 00 01     addic   r10,r10,1
     50c4:       7d 29 01 94     addze   r9,r9
     50c8:       91 3c 01 f8     stw     r9,504(r28)
     50cc:       91 5c 01 fc     stw     r10,508(r28)
     50d0:       4b ff fc 58     b       4d28 <gfar_clean_rx_ring+0x118>

So, the increment does actually get moved ~1k away."

Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: default avatarClaudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
bd9e89f2
History
Name Last commit Last update