Topic: GFXLIB
admin
Administrator
Re: GFXLIB
« Reply #3 on: Aug 28th, 2008, 4:45pm » |
Quote: Perhaps the folk at MS could have a word with a certain Ms. Wilson whose legendary colour-matching algorithm (as employed in her image mastering software ChangeFSI) was reportedly very fast.
I'm guessing that Sophie's method may have been fast because of her unique ARM coding skills rather than a clever algorithm. Your approach sounds like the right one; what makes it so slow? I would imagine MMX instructions may well be of value in doing the computations.
Richard.
|
David Williams
Developer
Re: GFXLIB
« Reply #4 on: Aug 28th, 2008, 5:28pm » |
on Aug 28th, 2008, 4:45pm, Richard Russell wrote: I'm guessing that Sophie's method may have been fast because of her unique ARM coding skills rather than a clever algorithm. Your approach sounds like the right one; what makes it so slow? I would imagine MMX instructions may well be of value in doing the computations.
Richard.
GFXLIB_Plot32as8 calls an external colour matching function for each pixel that it plots, so time is wasted in performing this CALL (since this flushes the pipeline, I believe), and more clock cycles are eaten up by register preservation (PUSHAD) in said external function. There are six memory accesses (reads) for each palette entry tested per plotted pixel, although three of these are almost certainly read from cached locations (ESP+offset), and I'm hoping the palette entries get cached quickly since they're accessed multiple times in most cases.
Here is the code pasted straight out of GFXLIB:
Code:
.GFXLIB_ColourMatch
; SYS GFXLIB_ColourMatch, palAddr, numCols, R`, G`, B`
pushad
; ESP+36 = palAddr
; ESP+40 = numCols
; ESP+44 = R`
; ESP+48 = G`
; ESP+52 = B`
;----*----*----*----*----*----*----*----|
mov edi, &7FFFFFFF ; EDI = least squares max sum (initially set to &7FFFFFFF)
xor ecx, ecx ; ECX = least squares index
mov edx, [esp + 36] ; EDX = palette addr
xor ebp, ebp ; EBP = loop counter (palette index)
.GFXLIB_ColourMatch__lp
movzx eax, BYTE [edx + 4*ebp + 2] ; load palette R byte
movzx ebx, BYTE [edx + 4*ebp + 1] ; load palette G byte
movzx esi, BYTE [edx + 4*ebp + 0] ; load palette B byte
sub eax, [esp + 44] ; = R-R`
sub ebx, [esp + 48] ; = G-G`
sub esi, [esp + 52] ; = B-B`
imul eax, eax ; = (R-R`)^2
imul ebx, ebx ; = (G-G`)^2
imul esi, esi ; = (B-B`)^2
add eax, ebx ; = (R-R`)^2 + (G-G`)^2
add eax, esi ; = (R-R`)^2 + (G-G`)^2 + (B-B`)^2
cmp eax, edi ; compare current sum with least squares sum
jge GFXLIB_ColourMatch__skip
mov edi, eax ; least squares sum = current sum
mov ecx, ebp ; lsq index = ebp
.GFXLIB_ColourMatch__skip
inc ebp
cmp ebp, [esp + 40] ; compare loop counter with numCols
jne GFXLIB_ColourMatch__lp
mov BYTE [varsblk], cl ; store final lsq index
popad
mov al, BYTE [varsblk]
ret (5*4)
I'm thinking those IMULs could be pre-calc'd (squares looked-up from a table), but perhaps that might prove more 'expensive'.
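To make the table idea concrete, here is a rough C model of the loop above with the three multiplies replaced by look-ups. It is only a sketch: the names `sq_table`, `init_sq_table` and `colour_match` are hypothetical and not part of GFXLIB, the palette layout (4 bytes per entry in B, G, R, unused order) is assumed from the byte offsets in the assembler, and R`, G`, B` are assumed to lie in 0..255.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the colour-matching loop; not GFXLIB code. */

/* sq_table[d + 255] holds d*d for every difference d in -255..255. */
static int32_t sq_table[511];

static void init_sq_table(void)
{
    for (int d = -255; d <= 255; d++)
        sq_table[d + 255] = d * d;
}

/* Return the index of the palette entry nearest to (r, g, b).
   pal is assumed to hold num_cols entries of 4 bytes: B, G, R, unused.
   r, g and b are assumed to be in 0..255 so the table index stays in range.
   On a tie the lowest index wins, as with the JGE in the assembler. */
static size_t colour_match(const uint8_t *pal, size_t num_cols,
                           int r, int g, int b)
{
    int32_t best_sum = INT32_MAX;
    size_t best = 0;

    for (size_t i = 0; i < num_cols; i++) {
        const uint8_t *e = pal + 4 * i;
        int32_t sum = sq_table[e[2] - r + 255]   /* (R-R`)^2 */
                    + sq_table[e[1] - g + 255]   /* (G-G`)^2 */
                    + sq_table[e[0] - b + 255];  /* (B-B`)^2 */
        if (sum < best_sum) {
            best_sum = sum;
            best = i;
        }
    }
    return best;
}
```

Call `init_sq_table()` once before the first `colour_match()`. Note that the table swaps three multiplies for three extra memory reads per palette entry, so whether it wins would need measuring.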
Regards,
David.
|
admin
Administrator
Re: GFXLIB
« Reply #5 on: Aug 28th, 2008, 10:27pm » |
Quote: time is wasted in performing this CALL (since this flushes the pipeline, I believe)
Are you sure? I can't see any mention of that in the Intel Architecture Optimization Reference Manual. Inlining CALLs is recommended, but only for 'peripheral' reasons:
• Parameter passing overhead can be eliminated.
• In a compiler, inlining a function exposes more opportunity for optimization.
• If the inlined routine contains branches, the additional context of the caller may improve branch prediction within the routine.
• A mispredicted branch can lead to larger performance penalties inside a small function than if that function is inlined.
I doubt that any of these apply significantly in your case. In general the CPU is "optimized specifically for calls and returns" (e.g. the trace cache) so I don't think you need worry too much about the overhead.
Richard.
|
David Williams
Developer
Re: GFXLIB (Fast Text Drawing)
« Reply #6 on: Aug 29th, 2008, 02:53am » |
The next release of GFXLIB will feature some fast text drawing subroutines.
Here's a demo:
http://www.bb4w-games.com/fastfontdemo.zip
The screen redraw is supposed to be sync'd with the monitor's VBlank, but if the synchronisation is not good then please don't form the impression that the text drawing routine is slow!
Regards,
David.
|
David Williams
Developer
Re: GFXLIB (demo of PlotDissolve3 routine)
« Reply #7 on: Aug 31st, 2008, 8:21pm » |
The next public release of GFXLIB will include a new routine called PlotDissolve3.
Watch this demo to see what it does:
http://www.bb4w-games.com/plotdissolve3demo.zip
The routine is currently very suboptimal -- it calls Richard's pseudo-random number generator for every d**ned pixel, so some kind of shortcut needs to be devised, even if that means A) a huge table of random numbers, or B) a faster but lower quality random number generator.
(Not suggesting Richard's routine is slow -- it isn't -- just that I'm happy to compromise high quality pseudo-randomness for speed in this case).
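For option B, something along these lines might do. This is a hedged sketch in C rather than BB4W assembler, and none of it is GFXLIB code: `xorshift32` is Marsaglia's well-known 32-bit xorshift generator (three shifts and three XORs per number), while `dissolve_row` and its `level` parameter are hypothetical names that assume a dissolve simply copies each source pixel with a given probability.

```c
#include <stdint.h>

/* Hypothetical sketch, not GFXLIB code.
   Marsaglia's 32-bit xorshift generator. *state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Assumed dissolve rule: copy a source pixel when the top byte of the
   next random number is below level (0 copies nothing, 256 copies all). */
static void dissolve_row(uint32_t *dst, const uint32_t *src, int n,
                         int level, uint32_t *state)
{
    for (int i = 0; i < n; i++) {
        if ((int)(xorshift32(state) >> 24) < level)
            dst[i] = src[i];
    }
}
```

The generator needs only one register of state, so in assembler it could be inlined into the pixel loop and avoid a per-pixel call altogether. Its statistical quality is well below that of a good general-purpose generator, which seems an acceptable trade here.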
Regards,
David.
|
admin
Administrator
Re: GFXLIB
« Reply #8 on: Aug 31st, 2008, 8:46pm » |
Quote: Watch this demo to see what it does
You may not like David Tennant as Doctor Who, but at least you have the satisfaction of knowing that BBC BASIC for Windows may end up having a significant (retrospective) contribution to make to Jon Pertwee's depiction of the role! For more details see the September 2008 edition of Everyday Practical Electronics (page 16).
Richard.
|
David Williams
Developer
Re: GFXLIB (full screen demo)
« Reply #9 on: Sep 4th, 2008, 12:38am » |
A simple full screen demo:
http://www.bb4w-games.com/fullscreendemo.zip
I was surprised to get the 'ideal' (VBlank-sync'd) frame rate of 60 fps on my 1.86GHz Centrino-based laptop. However, the CPU load was rather high at approx. 50%. Also, the VBlank synchronisation isn't perfect, but it's better than no sync, IMO.
Regards,
David.
|
admin
Administrator
Re: GFXLIB
« Reply #11 on: Sep 4th, 2008, 2:36pm » |
Quote: Some very fast -- albeit low quality nearest-neighbour -- bitmap scaling
This appears to be broken on my PC: the 'GFXLIB' text, which I presume is intended to be in the foreground, is partially hidden most of the time:

Richard.
|
David Williams
Developer
Re: GFXLIB
« Reply #12 on: Sep 4th, 2008, 4:51pm » |
on Sep 4th, 2008, 2:36pm, Richard Russell wrote: This appears to be broken on my PC: the 'GFXLIB' text, which I presume is intended to be in the foreground, is partially hidden most of the time:
Oops... yes, I had REM'd out the *REFRESH statement and forgot to un-REM it prior to compilation.
It should work o.k. now.
http://www.bb4w-games.com/fastscalingdemo.zip
David.
|
David Williams
Developer
Re: GFXLIB (text squashing)
« Reply #13 on: Sep 5th, 2008, 12:05am » |
This will be the last GFXLIB demo for a month or two because I really must get the documentation and example programs written...
http://www.bb4w-games.com/textsquash.zip
I intend to release the next version of GFXLIB (with lots of new routines plus decent docs) by the end of this month, or early October. I hope then that it'll not just be me and Simon writing games based on it 
Check out Simon's game 'Blast' which promises some frantic arcade action (you'll probably need to extract the files from the ZIP folder first before running it):
http://www.bb4w-games.com/blast.zip
Regards,
David.
|
admin
Administrator
Re: GFXLIB
« Reply #14 on: Sep 5th, 2008, 08:22am » |
Quote: Check out Simon's game 'Blast' which promises some frantic arcade action (you'll probably need to extract the files from the ZIP folder first before running it)
Do you happen to know why he doesn't package all the 'resource' files into the executable? Personally I can't be bothered to download the zip and find somewhere suitable to extract all the files.
Your programs are so much easier to run; I don't even have to download them (explicitly), I just 'open' the link in your post then double-click on the executable. Wonderful!
|
David Williams
Developer
Re: GFXLIB (Scaled game graphics)
« Reply #15 on: Sep 8th, 2008, 12:47am » |
I wanted to try an experiment with a view to perhaps creating a game that is largely independent of screen resolution. The method used in this demo (link below) involves the pre-scaling of bitmaps using simple nearest-neighbour scaling, and then these pre-scaled bitmaps are drawn in the usual way using the reasonably fast standard GFXLIB_Plot routine.
http://www.bb4w-games.com/scaledgamegraphicsdemo.zip
The demonstration 'game' doesn't do much -- use the arrow keys to move around and collect objects. Not much fun... but then, the point of the program is to demonstrate an idea/concept, not to entertain 
You have to re-start the program in order to change the resolution.
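The pre-scaling step described above can be sketched roughly as follows. This is an illustrative C model, not the GFXLIB routine: the function name `scale_nearest` and the assumption of tightly packed 32-bit pixels are mine. Each destination pixel simply takes the source pixel whose coordinates are the destination coordinates multiplied by the size ratio and rounded down.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not GFXLIB code: nearest-neighbour scaling of a
   32-bit bitmap from src_w x src_h pixels to dst_w x dst_h pixels.
   Rows are assumed to be tightly packed and all dimensions positive. */
static void scale_nearest(const uint32_t *src, int src_w, int src_h,
                          uint32_t *dst, int dst_w, int dst_h)
{
    for (int y = 0; y < dst_h; y++) {
        /* Source row that this destination row samples from. */
        int sy = (int)((int64_t)y * src_h / dst_h);
        const uint32_t *src_row = src + (size_t)sy * (size_t)src_w;
        uint32_t *dst_row = dst + (size_t)y * (size_t)dst_w;

        for (int x = 0; x < dst_w; x++) {
            /* Source column that this destination pixel samples from. */
            int sx = (int)((int64_t)x * src_w / dst_w);
            dst_row[x] = src_row[sx];
        }
    }
}
```

Because the scaling is done once per bitmap at start-up, its cost does not affect the frame rate; only the ordinary plot of the already-scaled bitmap happens per frame.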
Regards,
David.
|
81RED
Guest
Re: GFXLIB
« Reply #16 on: Sep 10th, 2008, 07:02am » |
on Sep 5th, 2008, 08:22am, Richard Russell wrote: Do you happen to know why he doesn't package all the 'resource' files into the executable? Personally I can't be bothered to download the zip and find somewhere suitable to extract all the files.
Your programs are so much easier to run; I don't even have to download them (explicitly), I just 'open' the link in your post then double-click on the executable. Wonderful!
To quote what I wrote to David on that subject: "Can only speak from personal experience, but in my end of the world, users have a nasty tendency to download stuff directly to their desktop. Now having a Blast.exe that "explodes" into 29 additional items on said desktop is not the ideal way to make friends." And I could go on and on about that particular subject, but I guess I'm as opposed to programs that uncritically and without warning clutter up the folder they happen to be run in, as you are to unzipping anything.
Simon
|
admin
Administrator
Re: GFXLIB
« Reply #17 on: Sep 10th, 2008, 09:14am » |
Quote: Now having a Blast.exe that "explodes" into 29 additional items on said desktop is not the ideal way to make friends
I suggested that you "package the resource files into the executable", not that you 'explode' 29 items onto the desktop. One doesn't follow from the other!
For a start, I would always recommend putting the resource files into a single sub-directory, not keeping them in the same directory as the executable. Thus if one were to download the executable to the desktop and run it there the most that would happen is that a single additional folder icon would appear.
Arguably the appearance of that icon isn't in itself a bad thing, since it would draw attention to what is in any case bad practice - putting an executable file on the desktop. However it could easily be removed by setting the resource directory's attributes to 'hidden' early in your program.
But what I think is more important is that David's method of embedding all the resource files means that you don't have to (explicitly) download the programs at all. To run one of his programs I just 'open' it from the web site - the downloading and extraction of resource files to a temporary directory happens 'behind the scenes'. Literally his programs are four mouse-clicks away from a message on this forum.
Anyway it's ultimately up to you. I've marvelled at David's programs but I've not even looked at yours because I can't be bothered with the hassle of downloading, extracting and subsequently deleting it.
Richard.
|