BBC BASIC for Windows

Topic: GFXLIB (Read 2252 times)

David Williams
Developer

Re: GFXLIB
« Reply #73 on: May 23rd, 2009, 9:50pm »

on May 23rd, 2009, 9:26pm, Richard Russell wrote:

What kind of trouble do you anticipate?

You might remember that some time ago, in an early pre-release version of GFXLIB, so many variables were declared that when the main program was compiled with the 'Abbreviate names' option set, one of the variables in the assembler section of the main program was renamed to (IIRC) esi or edi, which of course happens to be a register name! You did mention that well over a thousand variables would have to be declared before such 'collisions' (with names of registers) occur, and you also asked why on earth I needed to declare so many variables in the first place. The number of variables was drastically reduced, but the numbers are creeping up again...

I was going to suggest (or had I already suggested?) that perhaps you could modify the relevant code in the compiler to not replace variables with register names.

on May 23rd, 2009, 9:26pm, Richard Russell wrote:

Have you considered less drastic solutions? For example you could arrange to assemble the code using CALL filename$ (which would mean the memory occupied by the 'source' would be required only transitorily) and even - with care - discard the memory used by your temporary variables.

Yes, I started to jot down ideas (actually, I made a start on the code a few weeks ago) for a possible fully modularised GFXLIB II, whereby the user can install only the routines he or she requires. There would be a core set of routines, mostly for internal use by GFXLIB, and the rest could be chosen as and when needed.

Regards,

David.

Logged

admin
Administrator

Re: GFXLIB
« Reply #74 on: May 24th, 2009, 10:09am »

Quote:

Yes, I remember that, but I don't believe the cruncher can ever create one of the 32-bit extended register names (eax, ebx, ecx...) because, since they start with the valid hexadecimal character e (in *LOWERCASE mode), they are specifically disallowed.

The first valid register name created by the cruncher is 'GS' which is the 1273rd variable (I think). That really is a bug, because register names like SI and SP are already explicitly tested for and disallowed. I'll make a note to correct that if I ever release another version.

In the meanwhile I'm sure you can keep your number of label names below 1273 by sensible use of macros (with 'local' or 'private' labels as appropriate) or even using array elements as labels as documented on the Wiki.

Richard.

Logged

David Williams
Developer

Re: GFXLIB
« Reply #75 on: May 25th, 2009, 12:34am »

on May 24th, 2009, 10:09am, Richard Russell wrote:

Yes, right you are. I had tried to find the e-mail that I originally sent to you which mentioned the actual register name, but it appears that Hotmail has either deleted it from their system, or has made it unavailable to me (I doubt they actually erase any e-mails from their servers).

on May 24th, 2009, 10:09am, Richard Russell wrote:

In the meanwhile I'm sure you can keep your number of label names below 1273 by sensible use of macros (with 'local' or 'private' labels as appropriate) or even using array elements as labels as documented on the Wiki.

Array elements as labels sounds like a good idea, so I'll consider going that route; I'll consult the Wiki.

David.

Logged

David Williams
Developer

Re: GFXLIB
« Reply #76 on: May 25th, 2009, 12:42am »

A quick demo of a new routine called PlotBMColumn (plots a single 1-pixel-wide column of pixels from a bitmap):

http://www.bb4w-games.com/138519651/gfxlib_vplot_demo.zip (500 Kb)

If it seems a bit sluggish then bear in mind that the BB4W interpreter is doing a lot of work!

This routine will form the basis of several other routines.

« Last Edit: Jan 20th, 2012, 11:39pm by David Williams »

Logged

David Williams
Developer

Re: GFXLIB
« Reply #77 on: May 28th, 2009, 02:48am »

GFXLIB's alpha blending routines are set to become a hell of a lot faster thanks to this sweet bit of
code I recently discovered on Avery Lee's VirtualDub site (http://tinyurl.com/obqpyt):

Code:

unsigned blend2(unsigned src, unsigned dst) {
    unsigned alpha = src >> 24;
    alpha += (alpha > 0);

    unsigned srb = src & 0xff00ff;
    unsigned sg  = src & 0x00ff00;
    unsigned drb = dst & 0xff00ff;
    unsigned dg  = dst & 0x00ff00;

    unsigned orb = (drb + (((srb - drb) * alpha + 0x800080) >> 8)) & 0xff00ff;
    unsigned og  = (dg  + (((sg  - dg ) * alpha + 0x008000) >> 8)) & 0x00ff00;

    return orb + og;
}

It works very well, and my ASM implementation of the above code is a big improvement over how
GFXLIB's routines currently perform the task (although here the code is modified to work with
a constant alpha value, rather than a per-pixel one as is done in the original code):

Code:

        .blend
        
        ;REM. ESP+4  -> src pxl (RGB32)
        ;REM. ESP+8  -> dst pxl (RGB32)
        ;REM. ESP+12 -> alpha (0-255)
        
        mov ebp, [esp + 12]          ; alpha value (0 to 255)
        mov esi, [esp + 4]           ; src pxl &xxRRGGBB
        mov edi, [esp + 8]           ; dest pxl &xxRRGGBB
        
        mov eax, esi                 ; copy ESI
        and eax, &FF00FF             ; EAX = srb
        and esi, &00FF00             ; ESI = sg
        
        mov edx, edi                 ; copy EDI
        and edi, &FF00FF             ; EDI = drb
        and edx, &00FF00             ; EDX = dg
        
        ;REM. EAX = srb
        ;REM. ESI = sg
        ;REM. EDI = drb
        ;REM. EDX = dg
        
        sub eax, edi                 ; srb - drb
        sub esi, edx                 ; sg - dg
        
        imul eax, ebp                ; (srb - drb)*alpha
        imul esi, ebp                ; (sg - dg)*alpha
        
        add eax, &800080             ; (srb - drb)*alpha + &800080
        add esi, &008000             ; (sg - dg)*alpha + &008000
        
        shr eax, 8                   ; ((srb - drb)*alpha + &800080) >> 8
        shr esi, 8                   ; ((sg - dg)*alpha + &008000) >> 8
        
        add eax, edi                 ; drb + (((srb - drb)*alpha + &800080) >> 8)
        add esi, edx                 ; dg + (((sg - dg)*alpha + &008000) >> 8)
        
        and eax, &FF00FF             ; (drb + (((srb - drb)*alpha + &800080) >> 8)) AND &FF00FF
        and esi, &00FF00             ; (dg + (((sg - dg)*alpha + &008000) >> 8)) AND &00FF00
        
        add eax, esi
        
        ret 12

If anyone can spot any optimisations that can be made, perhaps shaving off an instruction or two,
then please let me know!
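
One detail that may be worth checking before trimming instructions: the C original adds 1 to any non-zero alpha, so that 255 maps to 256 and a fully opaque source pixel is reproduced exactly. The assembler listing above uses alpha as passed (0 to 255). The following is only a sketch in C that mirrors the listing's arithmetic; the names blend_const, adjust_alpha and check_blend_const are made up for illustration and are not part of GFXLIB.

```c
#include <assert.h>
#include <stdint.h>

/* Constant-alpha blend of two xxRRGGBB pixels, following the same steps as
   the assembler listing: alpha is used exactly as passed in. */
static uint32_t blend_const(uint32_t src, uint32_t dst, uint32_t alpha)
{
    uint32_t srb = src & 0xff00ffu, sg = src & 0x00ff00u;
    uint32_t drb = dst & 0xff00ffu, dg = dst & 0x00ff00u;

    uint32_t orb = (drb + (((srb - drb) * alpha + 0x800080u) >> 8)) & 0xff00ffu;
    uint32_t og  = (dg  + (((sg  - dg ) * alpha + 0x008000u) >> 8)) & 0x00ff00u;
    return orb + og;
}

/* Hypothetical helper: blend2's "alpha += (alpha > 0)" step, applied once
   outside the pixel loop. Maps 1..255 to 2..256 and leaves 0 alone. */
static uint32_t adjust_alpha(uint32_t alpha)
{
    return alpha + (alpha > 0);
}

/* White over black at a few alpha values. */
void check_blend_const(void)
{
    assert(blend_const(0xffffffu, 0x000000u, 0)   == 0x000000u);
    assert(blend_const(0xffffffu, 0x000000u, 128) == 0x808080u);
    /* Without the adjustment, alpha = 255 falls one step short of white. */
    assert(blend_const(0xffffffu, 0x000000u, 255) == 0xfefefeu);
    /* With it, the source colour comes through exactly. */
    assert(blend_const(0xffffffu, 0x000000u, adjust_alpha(255)) == 0xffffffu);
}
```

Because the alpha value is constant across the whole plot, the adjustment would cost one instruction per call rather than one per pixel, assuming callers expect 255 to mean fully opaque.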

Regards,

David.

« Last Edit: May 28th, 2009, 02:50am by David Williams »

Logged

David Williams
Developer

Re: GFXLIB
« Reply #78 on: Jun 2nd, 2009, 11:16pm »

More trivial nonsense:

~~http://www.bb4w-games.com/138519651/example48.zip~~ (529Kb)

« Last Edit: Jan 20th, 2012, 11:40pm by David Williams »

Logged

admin
Administrator

Re: GFXLIB
« Reply #79 on: Jun 3rd, 2009, 08:21am »

Quote:

More trivial nonsense:
http://www.bb4w-games.com/138519651/example48.zip (529Kb)

Very nice.

Richard.

Logged

David Williams
Developer

Re: GFXLIB
« Reply #80 on: Jun 3rd, 2009, 5:45pm »

Example 49:

~~http://www.bb4w-games.com/138519651/example49.zip~~ (557Kb)

Just demo'ing a few new routines, namely:

GFXLIB_BrushBlur
GFXLIB_BPlotBMRowList
GFXLIB_PlotBMColumnList
GFXLIB_Line

David.

« Last Edit: Jan 20th, 2012, 11:40pm by David Williams »

Logged

David Williams
Developer

Re: GFXLIB
« Reply #81 on: Jun 6th, 2009, 8:01pm »

Another day, another new routine for an already bloated GFXLIB. This one's called GFXLIB_RotateScaleTile:

~~http://www.bb4w-games.com/138519651/rotatescaletile.zip~~

It uses quite a fast method of rotating a bitmap, although my code is far from optimal. Still, this
demo averages only 16% CPU load on my laptop, which is rather good in my opinion.

David.

« Last Edit: Jan 20th, 2012, 11:41pm by David Williams »

Logged

Michael Hutton
Developer

Re: GFXLIB
« Reply #82 on: Jun 7th, 2009, 6:41pm »

Another great routine! When is the next version of GFXLIB coming out?

Michael

Logged

David Williams
Developer

Re: GFXLIB
« Reply #83 on: Jun 7th, 2009, 8:34pm »

on Jun 7th, 2009, 6:41pm, Michael Hutton wrote:

Another great routine! When is the next version of GFXLIB coming out?

Thanks, Michael.

The next version (i.e., the second publicly released version) will probably be out
in a few months; plenty of work to do until then.

Currently trying to get an MMX-powered 'alpha blending' routine up and running.

David.

Logged

admin
Administrator

Re: GFXLIB
« Reply #84 on: Jun 7th, 2009, 9:58pm »

Quote:

Currently trying to get an MMX-powered 'alpha blending' routine up and running.

I presume you are aware of (and are probably using) this document:

ftp://download.intel.com/ids/mmx/MMX_App_Alpha_Blending.pdf

Richard.

Logged

David Williams
Developer

Re: GFXLIB
« Reply #85 on: Jun 7th, 2009, 10:31pm »

on Jun 7th, 2009, 9:58pm, Richard Russell wrote:

I presume you are aware of (and are probably using) this document:

ftp://download.intel.com/ids/mmx/MMX_App_Alpha_Blending.pdf

Yes, thanks, I've downloaded that document twice now... first time was some months ago,
and then again a few days ago. I was put off by the fact that it 'outputs' 16-bit 5:5:5 colour
values, rather than 32-bit 8:8:8:8.

I'm trying to adapt the alpha-blending code I posted here a few days ago -- it should be easy,
or at least it will be when I've worked out how to do MMX multiplies.
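
As a rough illustration of what an MMX multiply has to cope with, here is a plain C model of one 16-bit word lane. It assumes the usual MMX pattern of widening each 8-bit channel to a 16-bit word, multiplying with pmullw (which keeps only the low 16 bits of each product), shifting with psrlw and narrowing again. The function names are hypothetical and the code emulates the lane arithmetic rather than using real MMX instructions, so treat it as a sketch of the idea, not a drop-in routine.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one 8-bit channel using only 16-bit wrap-around arithmetic, as a
   single MMX word lane would. alpha is expected to be in the range 0..256. */
static uint8_t blend_lane(uint8_t s, uint8_t d, uint16_t alpha)
{
    uint16_t diff = (uint16_t)(s - d);         /* like psubw: s - d modulo 2^16   */
    uint16_t prod = (uint16_t)(diff * alpha);  /* like pmullw: low 16 bits only   */
    uint16_t sum  = (uint16_t)(prod + 0x80);   /* like paddw: rounding bias       */
    uint16_t hi   = (uint16_t)(sum >> 8);      /* like psrlw by 8: logical shift  */
    /* The final add must wrap modulo 256 (as paddb does). A saturating
       word-to-byte pack of d + hi would clamp instead and give wrong results
       whenever the source channel is darker than the destination. */
    return (uint8_t)(d + hi);
}

/* Apply the lane blend to the B, G and R channels of an xxRRGGBB pixel. */
static uint32_t blend_pixel_lanes(uint32_t src, uint32_t dst, uint16_t alpha)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 24; shift += 8) {
        uint8_t s = (uint8_t)(src >> shift);
        uint8_t d = (uint8_t)(dst >> shift);
        out |= (uint32_t)blend_lane(s, d, alpha) << shift;
    }
    return out;
}

void check_blend_lanes(void)
{
    assert(blend_pixel_lanes(0xffffffu, 0x000000u, 128) == 0x808080u);
    assert(blend_pixel_lanes(0x000000u, 0xffffffu, 128) == 0x808080u);
}
```

The point of the model is that a negative difference does not need a signed multiply: the low 16 bits of the product are the same either way, and the wrap-around cancels out once the destination channel is added back.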

David.

Logged

admin
Administrator

Re: GFXLIB
« Reply #86 on: Jun 8th, 2009, 09:10am »

Quote:

I was put off by the fact that it 'outputs' 16-bit 5:5:5 colour values, rather than 32-bit 8:8:8:8.

To be precise, both the 'background' input and the output are 16-bpp, whereas the 'foreground' input is 32-bpp ARGB. No doubt this is because, when that article was written, 16-bpp was a common setting for PC displays.

It's not too difficult to strip out the code that unpacks and packs the 16-bpp pixels and replace it with code for 32-bpp RGB (with the 'alpha' byte unused). The resulting code is simpler, too.
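
For anyone unfamiliar with the 16-bpp format, the unpack and pack steps being stripped out amount to something like the following C sketch. It assumes the common 5:5:5 layout with red in bits 10 to 14, green in bits 5 to 9 and blue in bits 0 to 4; the function names are illustrative only and the Intel document's own code may arrange things differently.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a 5-bit channel to 8 bits by replicating its top bits, so that
   0 maps to 0 and 31 maps to 255. */
static uint32_t expand5(uint32_t c)
{
    return (c << 3) | (c >> 2);
}

/* Unpack a 5:5:5 pixel into xxRRGGBB. */
static uint32_t unpack555(uint16_t p)
{
    uint32_t r = expand5((p >> 10) & 31u);
    uint32_t g = expand5((p >> 5) & 31u);
    uint32_t b = expand5(p & 31u);
    return (r << 16) | (g << 8) | b;
}

/* Pack xxRRGGBB back into 5:5:5 by keeping the top five bits per channel. */
static uint16_t pack555(uint32_t rgb)
{
    return (uint16_t)(((rgb >> 9) & 0x7c00u) |
                      ((rgb >> 6) & 0x03e0u) |
                      ((rgb >> 3) & 0x001fu));
}

/* Every 5:5:5 value should survive an unpack/pack round trip unchanged. */
void check_555_round_trip(void)
{
    for (uint32_t p = 0; p < 0x8000u; ++p) {
        assert(pack555(unpack555((uint16_t)p)) == p);
    }
}
```

With 32-bpp input and output none of this shuffling is needed, which is consistent with the resulting code being simpler.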

Later: I've modified the Intel code to work with 32-bpp input and output. On my PC it's taking 2.2ms for a 640x480 image. Are you interested in it, or have you got your own working?

Richard.

« Last Edit: Jun 8th, 2009, 1:46pm by admin »

Logged

David Williams
Developer

Re: GFXLIB
« Reply #87 on: Jun 8th, 2009, 4:05pm »

on Jun 8th, 2009, 09:10am, Richard Russell wrote:

Later: I've modified the Intel code to work with 32-bpp input and output. On my PC it's taking 2.2ms for a 640x480 image. Are you interested in it... ?

In one word: Yes !

I am definitely interested in it, thank you.

Now...

I recently knocked up an MMX version of a supposedly optimised '50%' alpha blender (simply averages
the RGB32 colour values of corresponding foreground and background pixels). Here's the inner loop
from the non-MMX version:

Code:

        mov edx, [edi + 4*esi]                  ; load RGB32 pixel from source bitmap
        mov ebx, [ecx + 4*esi]                  ; load RGB32 pixel from dest addr
        and edx, &FEFEFE
        and ebx, &FEFEFE
        shr edx, 1
        shr ebx, 1
        add edx, ebx
        mov [ecx + 4*esi], edx                  ; write RGB32 pixel to destination bitmap buffer

Here's my MMX version, operating on four pixels per iteration of the inner X-loop:

Code:

        .GFXLIB_MMXBPlotAvgNC__xloop
        
        mov ebx, ecx
        shl ebx, 4
        
        movq mm1, [edi + ebx + 0]             ; load 2 pxls from bg      \
        movq mm2, [esi + ebx + 0]             ; load 2 pxls from srcBm    \
        ;                                     ;                            > 4 pixels
        movq mm3, [edi + ebx + 8]             ; load 2 pxls from bg       /
        movq mm4, [esi + ebx + 8]             ; load 2 pxls from srcBm   /
        
        pand mm1, mm0
        pand mm2, mm0
        pand mm3, mm0
        pand mm4, mm0
        
        psrld mm1, 1
        psrld mm2, 1
        psrld mm3, 1
        psrld mm4, 1
        
        paddd mm1, mm2
        paddd mm3, mm4
        
        movq [edi + ebx + 0], mm1
        movq [edi + ebx + 8], mm3
        
        dec ecx
        jge GFXLIB_MMXBPlotAvgNC__xloop

To my amazement, it's really no faster (or just marginally so) than the non-MMX version.

I was expecting something approaching a 2x speed improvement. :-(
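
For reference, the arithmetic in both listings can be written in C as follows. This is only a sketch: avg_pixel follows the non-MMX loop, and avg_two_pixels models one movq's worth of data, assuming mm0 holds the mask &00FEFEFE00FEFEFE (the listing does not show how mm0 is loaded). The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 50% blend of two xxRRGGBB pixels: clear the low bit of every channel,
   halve both pixels, then add. No carry can cross a channel boundary. */
static uint32_t avg_pixel(uint32_t a, uint32_t b)
{
    return ((a & 0xfefefeu) >> 1) + ((b & 0xfefefeu) >> 1);
}

/* The same operation on two pixels packed into 64 bits. Because the mask
   clears bit 32 as well, a plain 64-bit shift gives the same result as the
   per-dword shift (psrld) used in the MMX listing. */
static uint64_t avg_two_pixels(uint64_t a, uint64_t b)
{
    const uint64_t mask = 0x00fefefe00fefefeull;
    return ((a & mask) >> 1) + ((b & mask) >> 1);
}

void check_avg(void)
{
    assert(avg_pixel(0xffffffu, 0x000000u) == 0x7f7f7fu);
    /* Truncation: the low bit of each channel is discarded before halving,
       so two identical odd values average to one less than themselves. */
    assert(avg_pixel(0x010101u, 0x010101u) == 0x000000u);
    assert(avg_two_pixels(0x00ffffff00000000ull, 0x0000000000ffffffull)
           == 0x007f7f7f007f7f7full);
}
```

Written this way it is easy to see how little work there is per pixel: two masks, two shifts and an add, against two loads and a store.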

David.

Logged
