Topic: Setting a register to zero if it's < zero
admin
Administrator
Re: Setting a register to zero if it's < zero
« Reply #35 on: Oct 9th, 2011, 10:23pm » |
on Oct 9th, 2011, 3:27pm, David Williams wrote:| I'm thinking of calling it "DesaturateColour"! Isn't that more sensible? |
Yes!
If you're looking for ways to tidy up your code, please note that this:
Code:
sub ebp, 1
shl ebp, 2
add ebp, esi
can be replaced by this (just 4 bytes):
Code:
There's no significant speed impact, because it's not in a loop, but in terms of elegance there's no contest!
Richard.
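The replacement listing is missing from this copy of the thread. Given the stated size of 4 bytes and the lea article David links in his reply, it was presumably a single instruction along the lines of lea ebp, [esi + ebp*4 - 4]; that is an assumption, not a quote. A minimal sketch in C (function names are hypothetical) checks that such an instruction computes the same address as the three-instruction sequence:

```c
#include <stdint.h>

/* Three-instruction form quoted above:
   sub ebp, 1 / shl ebp, 2 / add ebp, esi */
uint32_t addr_three_step(uint32_t esi, uint32_t ebp)
{
    ebp -= 1;
    ebp <<= 2;
    ebp += esi;
    return ebp;
}

/* Assumed single-instruction form: lea ebp, [esi + ebp*4 - 4].
   Unsigned arithmetic wraps modulo 2^32, as the register does. */
uint32_t addr_lea(uint32_t esi, uint32_t ebp)
{
    return esi + ebp * 4u - 4u;
}
```

For example, with esi = 0x1000 and ebp = 10 both forms give 0x1024. One difference the C model cannot show: lea leaves the flags untouched, whereas sub/shl/add modify them.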
David Williams
Developer
Re: Setting a register to zero if it's < zero
« Reply #36 on: Oct 9th, 2011, 10:28pm » |
on Oct 9th, 2011, 10:23pm, Richard Russell wrote:| There's no significant speed impact, because it's not in a loop, but in terms of elegance there's no contest! |
There was no excuse for me to miss that one, really, especially since I had read this article not long ago:
http://bb4w.wikispaces.com/Using+the+lea+instruction
Thanks again.
admin
Administrator
Re: Setting a register to zero if it's < zero
« Reply #37 on: Oct 10th, 2011, 01:10am » |
on Oct 9th, 2011, 10:17pm, David Williams wrote:| For those who've been following the discussion, here's a quick demo of GFXLIB_ColourDesaturate (compiled EXE) |
Here's an MMX version of GFXLIB_ColourDrain:
Code:
; REM. SYS GFXLIB_ColourDrain%, pBitmap%, numPixels%, f%
;
; Parameters -- pBitmap%, numPixels%, f%
;
; pBitmap% - points to base address of 32-bpp ARGB bitmap
; numPixels% - number of pixels in bitmap
;
; f% (''colour-drain'' factor) is 12.20 fixed-point integer; range (0.0 to 1.0)*2^20 (Note 2^20 = &100000)
;
; f% is clamped (by this routine) to the range 0 to 2^20-1 (&FFFFF)
;
pushad
; ESP!36 = pBitmap%
; ESP!40 = numPixels%
; ESP!44 = f% (= f * 2^20)
mov esi, [esp + 36] ; esi = pBitmap%
mov ebp, [esp + 40] ; ebp = numPixels%
lea ebp, [esi + ebp*4] ; ebp = address just past the last pixel
mov edi, [esp + 44] ; edi = f%
;REM. if f% < 0 then f% = 0
cmp edi, 0 ; f% < 0 ?
jge _.fgtzero%
xor edi, edi ; f% = 0
._.fgtzero%
;REM. if f% >= 2^20 (&100000) then f% = 2^20-1
cmp edi, 2^20 ; f% >= 2^20 ?
jl _.flt2p20%
mov edi, 2^20-1 ; f% = 2^20-1
._.flt2p20%
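shr edi, 5 ; 20-bit fraction -> 15-bit fraction, so it fits a signed word
movd mm6, edi ; mm6 words (high to low) = 0 : 0 : 0 : f
pshufw mm6, mm6, %11000000 ; mm6 words = 0 : f : f : f (factor for alpha is 0)
movq mm7, [_.matrix%] ; mm7 = grey-level weights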
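._.loop%
punpcklbw mm0,[esi] ; pixel 0: B,G,R,A bytes into the high byte of each word
punpckhbw mm1,[esi] ; pixel 1: likewise, from the upper four bytes
psrlw mm0,8 ; shift down so each word holds one channel (0..255)
psrlw mm1,8
movq mm2,mm0 ; keep the original channels of pixel 0
movq mm3,mm1 ; and of pixel 1
pmaddwd mm0,mm7 ; dwords = B*wB + G*wG, R*wR
pmaddwd mm1,mm7
pshufw mm4,mm0,%01001110 ; swap the two dwords
pshufw mm5,mm1,%01001110
paddd mm4,mm0 ; both dwords = grey * 2^15
paddd mm5,mm1
pslld mm4,1 ; grey * 2^16, so grey sits in the high word
pslld mm5,1
pshufw mm4,mm4,%01010101 ; copy grey into all four words
pshufw mm5,mm5,%01010101
psubw mm4,mm2 ; grey - channel
psubw mm5,mm3
pmulhw mm4,mm6 ; ((grey - channel) * f) >> 16, f being the 15-bit factor
pmulhw mm5,mm6
psllw mm4,1 ; double it to make up for the 15-bit factor
psllw mm5,1
paddw mm4,mm2 ; channel + f*(grey - channel)
paddw mm5,mm3
packuswb mm4,mm5 ; pack both pixels to bytes with unsigned saturation
movq [esi],mm4 ; write the two pixels back
add esi, 8 ; advance two pixels (8 bytes), so numPixels% should be even
cmp esi, ebp
jb _.loop%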
popad
emms
ret 12
._.matrix%
dw 0.114 * 2^15 ; blue weight
dw 0.587 * 2^15 ; green weight
dw 0.299 * 2^15 ; red weight
dw 0 ; alpha contributes nothing to the grey level
I haven't compared its speed with yours, but I would expect it to be faster. As I'm by no means an MMX expert it may well be that it can be improved.
Richard.
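For anyone who wants to check the arithmetic without stepping through the MMX code, here is a scalar sketch in C of what the listing appears to do to one pixel. It is a model under assumptions, not part of GFXLIB: the function names are hypothetical, and the integer weights assume the assembler truncates 0.114, 0.587 and 0.299 times 2^15 to whole numbers.

```c
#include <stdint.h>

/* Assumed integer weights (0.114, 0.587, 0.299 scaled by 2^15, truncated). */
enum { W_B = 3735, W_G = 19234, W_R = 9797 };

/* Signed high-word multiply, as pmulhw does per word: floor(a*b / 65536). */
int mulhi16(int a, int b)
{
    int32_t p = (int32_t)a * b;
    return (p >= 0) ? p / 65536 : -((-p + 65535) / 65536);
}

/* Unsigned saturation to a byte, as packuswb does per word. */
uint32_t sat_u8(int v)
{
    return (uint32_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Desaturate one ARGB pixel by the 12.20 fixed-point factor f. */
uint32_t desaturate_pixel(uint32_t argb, int32_t f)
{
    /* Clamp as the routine does: 0 .. 2^20-1, then reduce to 15 bits. */
    if (f < 0) f = 0;
    if (f >= (1 << 20)) f = (1 << 20) - 1;
    int f15 = f >> 5;

    int b = (int)(argb & 0xFF);
    int g = (int)((argb >> 8) & 0xFF);
    int r = (int)((argb >> 16) & 0xFF);

    /* pmaddwd + paddd + pslld 1, then taking the high word. */
    int grey = (b * W_B + g * W_G + r * W_R) >> 15;

    /* psubw, pmulhw, psllw 1, paddw, packuswb for each colour channel. */
    uint32_t nb = sat_u8(b + 2 * mulhi16(grey - b, f15));
    uint32_t ng = sat_u8(g + 2 * mulhi16(grey - g, f15));
    uint32_t nr = sat_u8(r + 2 * mulhi16(grey - r, f15));

    /* The alpha word's factor is 0, so alpha passes through unchanged. */
    return (argb & 0xFF000000u) | (nr << 16) | (ng << 8) | nb;
}
```

With f = 0 the pixel comes back unchanged, and with f at or above 2^20 the three colour channels end up within a count or two of the grey level, the small differences being due to the truncating multiply.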
« Last Edit: Oct 10th, 2011, 08:33am by admin »
David Williams
Developer
Re: Setting a register to zero if it's < zero
« Reply #38 on: Oct 10th, 2011, 06:55am » |
on Oct 10th, 2011, 01:10am, Richard Russell wrote:| Here's a MMX version of GFXLIB_ColourDrain: |
Gratefully received. :)
Okay, I had to make one little correction because I discovered that only one (or two?) pixels were being processed in the image.
The MMX version (MMXDesaturateColour) is nearly twice as fast (on my Centrino Duo laptop) as the non-MMX (GR) version.
1000 full-image operations on a 640x480 ARGB32 bitmap took:
4.84 s. (MMX version) 9.22 s (GR version)
The test (compiled EXE) can be downloaded here:
www.bb4wgames.com/misc/mmxdesaturatecolour_vs_colourdrain.zip
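To put those timings in per-pixel terms, a small C sketch (the helper name is hypothetical) converts them to throughput. Bear in mind that each timed iteration in the test also performs a GFXLIB_DWORDCopy of the whole bitmap, so the figures cover copy plus desaturation, and the ratio between the two routines on their own is probably somewhat larger than the measured one.

```c
/* Throughput in megapixels per second for `ops` passes over a w x h bitmap. */
double mpixels_per_second(int w, int h, int ops, double seconds)
{
    return (double)w * h * ops / seconds / 1e6;
}
```

Called with (640, 480, 1000, 4.84) this gives roughly 63.5 Mpixels/s for the MMX version, and with 9.22 s roughly 33.3 Mpixels/s for the GR version, a ratio of about 1.9.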
As I mentioned yesterday, I'll be dropping the Fisher-Price routine name (ColourDrain) and calling it DesaturateColour.
I won't just grab your MMX code and learn nothing from it, of that you can be assured.
Thanks for the code.
David.
---
For the sake of completeness only, I'll list the source for the timed test here:
Code:
HIMEM = LOMEM + 5*&100000
HIMEM = (HIMEM + 3) AND -4
PROCfixWindowSize
ON ERROR PROCerror( REPORT$, TRUE )
WinW% = 640
WinH% = 480
VDU 23, 22, WinW%; WinH%; 8, 16, 16, 0 : OFF
INSTALL @lib$ + "GFXLIB2"
PROCInitGFXLIB( d{}, 0 )
INSTALL @lib$ + "GFXLIB_modules\ColourDrain"
PROCInitModule
INSTALL @lib$ + "GFXLIB_modules\MMXDesaturateColour"
PROCInitModule
GetTickCount% = FNSYS_NameToAddress( "GetTickCount" )
flowers% = FNLoadImg( @dir$ + "flowers_640x480.JPG", 0 )
flowers_copy% = FNmalloc( 4 * 640*480 )
timeA_0% = 0
timeA_1% = 0
timeB_0% = 0
timeB_1% = 0
PRINT
PRINT " Conducting timed tests (MMXDesaturateColour vs. ColourDrain)"'
PRINT " (1000 colour desaturations of a 640x480 ARGB32 bitmap)"'
SYS "GetCurrentProcess" TO hprocess%
SYS "SetPriorityClass", hprocess%, &80
PRINT " Timing MMXDesaturateColour..."
df = 0.01
f = 0.0
G% = GFXLIB_MMXDesaturateColour%
SYS GetTickCount% TO timeA_0%
FOR I% = 1 TO 1000
SYS GFXLIB_DWORDCopy%, flowers%, flowers_copy%, 640*480
SYS G%, flowers_copy%, 640*480, f*&100000
f += df
IF f >= 1.0 THEN f = 0.0
NEXT I%
SYS GetTickCount% TO timeA_1%
PRINT " Timing ColourDrain..."
df = 0.01
f = 0.0
G% = GFXLIB_ColourDrain%
SYS GetTickCount% TO timeB_0%
FOR I% = 1 TO 1000
SYS GFXLIB_DWORDCopy%, flowers%, flowers_copy%, 640*480
SYS G%, flowers_copy%, 640*480, f*&100000
f += df
IF f >= 1.0 THEN f = 0.0
NEXT I%
SYS GetTickCount% TO timeB_1%
SYS "GetCurrentProcess" TO hprocess%
SYS "SetPriorityClass", hprocess%, &20
timeA = (timeA_1% - timeA_0%) / 1000
timeB = (timeB_1% - timeB_0%) / 1000
SOUND OFF : SOUND 1, -10, 226, 1
COLOUR 11 : ON
PRINT '" Results" : PRINT " -------"'
PRINT " MMXDesaturateColour took "; timeA; " s."'
PRINT " ColourDrain took "; timeB; " s."''
COLOUR 3 : PRINT " Finished!";
REPEAT UNTIL INKEY(1)=0
END
:
:
:
:
DEF PROCfixWindowSize
LOCAL GWL_STYLE, WS_THICKFRAME, WS_MAXIMIZEBOX, ws%
GWL_STYLE = -16
WS_THICKFRAME = &40000
WS_MAXIMIZEBOX = &10000
SYS "GetWindowLong", @hwnd%, GWL_STYLE TO ws%
SYS "SetWindowLong", @hwnd%, GWL_STYLE, ws% AND NOT (WS_THICKFRAME+WS_MAXIMIZEBOX)
ENDPROC
:
:
:
:
DEF PROCerror( msg$, L% )
OSCLI "REFRESH ON" : ON
COLOUR 1, &FF, &FF, &FF
COLOUR 1
PRINT TAB(1,1)msg$;
IF L% THEN
PRINT " at line "; ERL;
ENDIF
VDU 7
REPEAT UNTIL INKEY(1)=0
ENDPROC
admin
Administrator
Re: Setting a register to zero if it's < zero
« Reply #39 on: Oct 10th, 2011, 08:32am » |
on Oct 10th, 2011, 06:55am, David Williams wrote:| Okay, I had to make one little correction because I discovered that only one (or two?) pixels were being processed in the image. |
Ah yes, was that the edi that should have been an esi? Oddly, it worked here despite the error.
There's another change you should really make. The code as listed affects all four bytes of the resulting pixel (including the most-significant 'alpha' byte). Presumably you would prefer it to leave that byte unchanged, in which case you should alter the third line here as shown:
Code:
shr edi, 5
movd mm6, edi
pshufw mm6, mm6, %11000000
Quote:| As I mentioned yesterday, I'll be dropping the Fisher-Price routine name (ColourDrain) and calling it DesaturateColour |
I know, but I only had the original version to work from. It seemed safer not to make any unnecessary changes.
Richard.
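The effect of that pshufw immediate is easy to check by hand. Each pair of bits in the immediate, starting from the low end, picks the source word for one destination word. A small C model (the function name is hypothetical) makes the selection explicit:

```c
#include <stdint.h>

/* Model of pshufw: destination word i takes source word (imm >> 2*i) & 3.
   A temporary is used so that dst and src may be the same array,
   as they are in pshufw mm6, mm6, imm. */
void pshufw_model(uint16_t dst[4], const uint16_t src[4], uint8_t imm)
{
    uint16_t tmp[4];
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm >> (2 * i)) & 3];
    for (int i = 0; i < 4; i++)
        dst[i] = tmp[i];
}
```

After movd mm6, edi the source words are f, 0, 0, 0 (lowest first). With %11000000 (0xC0) the result is f, f, f, 0: the blue, green and red words get the factor, and the alpha word gets 0, so alpha is left alone. An immediate of %00000000 would instead copy f into all four words; whether that was the value in the listing as first posted is not shown in the thread, so treat it as a guess.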
David Williams
Developer
Re: Setting a register to zero if it's < zero
« Reply #40 on: Oct 13th, 2011, 7:40pm » |
GFXLIB_MMXDesaturateColour & GFXLIB_BoxBlur3x3:
http://www.bb4wgames.com/misc/mmxdesaturatecolour_example2c.zip (EXE; 163 Kb)
I can imagine using that kind of effect on the title page of some creepy RPG just before the game begins.
David.
======================================
Code:
*ESC OFF
REM Make 3 MB available for this program
M%=3 : HIMEM = LOMEM + M%*&100000
MODE 8 : OFF
INSTALL @lib$ + "GFXLIB2" : PROCInitGFXLIB
INSTALL @lib$ + "GFXLIB_modules\MMXDesaturateColour" : PROCInitModule
INSTALL @lib$ + "GFXLIB_modules\BoxBlur3x3" : PROCInitModule
bm% = FNLoadImg( @lib$ + "GFXLIB_media\bg1_640x512x8.bmp", 0 )
*REFRESH OFF
REPEAT
REM. Display the image normally for two seconds
SYS GFXLIB_BPlot%, dispVars{}, bm%, 640, 512, 0, 0
PROCdisplay
WAIT 200
FOR I% = 1 TO 280
SYS GFXLIB_BoxBlur3x3%, dispVars.bmBuffAddr%, dispVars.bmBuffAddr%, 640, 512
IF I% MOD 2 = 0 THEN SYS GFXLIB_MMXDesaturateColour%, dispVars.bmBuffAddr%, 640*512, 0.01*&100000
PROCdisplay
NEXT I%
UNTIL FALSE