BBC BASIC for Windows
« Fast PRNG (suitable for games + demos) »

Welcome Guest. Please Login or Register.
Apr 5th, 2018, 11:25pm



ATTENTION MEMBERS: Conforums will be closing it doors and discontinuing its service on April 15, 2018.
Ad-Free has been deactivated. Outstanding Ad-Free credits will be reimbursed to respective payment methods.

If you require a dump of the post on your message board, please come to the support board and request it.


Thank you Conforums members.

BBC BASIC for Windows Resources
Online BBC BASIC for Windows documentation
BBC BASIC for Windows Beginners' Tutorial
BBC BASIC Home Page
BBC BASIC on Rosetta Code
BBC BASIC discussion group
BBC BASIC for Windows Programmers' Reference

« Previous Topic | Next Topic »
Pages: 1  Notify Send Topic Print
 thread  Author  Topic: Fast PRNG (suitable for games + demos)  (Read 389 times)
admin
Administrator
ImageImageImageImageImage


member is offline

Avatar




PM


Posts: 1145
xx Re: Fast PRNG (suitable for games + demos)
« Reply #7 on: Apr 7th, 2010, 1:30pm »

on Apr 7th, 2010, 09:53am, David Williams wrote:
Here's the proof:

Really? Two points to make: firstly, the reason the assembler code is so much slower than RND is because you've failed to ensure the necessary gap between the data and code to avoid cache thrashing.

Modify this line as shown and you should find the assembler code runs much more quickly (depending on your CPU):

Code:
      DIM code% 45, gap% 2047 

Secondly, your so-called 'fast' PRNG runs no faster than the assembler version of RND listed on the Wiki! Try this extended program, which compares BASIC RND, your FastRnd and the assembler Rnd. On my PC, Rnd and FastRnd take virtually the same time (sometimes one is slightly faster, sometimes the other):

Code:
      MODE 8
      PROCasm
      
      DIM seed% 4
      !seed% = RND
      
      PRINT "Running test 1 (BASIC RND)..."
      A% = 0
      SYS "GetTickCount" TO timeA0%
      FOR I% = 1 TO 1000000
        IF RND(2)=1 THEN
          A% += 1
        ELSE
          A% -= 1
        ENDIF
      NEXT I%
      SYS "GetTickCount" TO timeA1%
      
      PRINT "Running test 2 (FastRnd)..."
      A% = 0
      F% = FastRnd%
      C% = seed%
      D% = 2
      SYS "GetTickCount" TO timeB0%
      FOR I% = 1 TO 1000000
        IF USR(F%)=1 THEN
          A% += 1
        ELSE
          A% -= 1
        ENDIF
      NEXT I%
      SYS "GetTickCount" TO timeB1%
      
      PRINT "Running test 3 (assembler Rnd)..."
      A% = 0
      R% = Rnd%
      C% = seed%
      D% = 2
      SYS "GetTickCount" TO timeC0%
      FOR I% = 1 TO 1000000
        IF USR(R%)=1 THEN
          A% += 1
        ELSE
          A% -= 1
        ENDIF
      NEXT I%
      SYS "GetTickCount" TO timeC1%
      
      PRINT "BASIC RND took ";timeA1%-timeA0%;" ms."
      PRINT "FastRnd took ";timeB1%-timeB0%; " ms."
      PRINT "Assembler Rnd took ";timeC1%-timeC0%; " ms."
      END
      
      DEF PROCasm
      LOCAL I%, L%, P%, code%
      DIM code% 100, L% -1, gap% 2047
      
      FOR I% = 8 TO 10 STEP 2
        P% = code%
        [OPT I%
        
        .FastRnd%
        
        ; REM. seed% = 36969*(seed% AND 65535) + (seed% >> 16)
        
        ; ecx = ^seed
        ; edx = range
        
        mov eax, [ecx]
        mov ebx, eax
        shr ebx, 16
        and eax, 65535
        imul eax, 36969
        add eax, ebx
        mov [ecx], eax
        and eax, &FFFF
        imul eax, edx
        shr eax, 16
        add eax, 1
        ret
        
        .Rnd%
        
        ; REM. Identical algorithm to RND
        
        ; ecx = ^seed
        ; edx = range
        
        mov  esi,ecx
        mov  eax,[esi]
        mov  cl,[esi+4]
        mov  ebx,eax
        shr  cl,1
        rcr  ebx,1
        rcl  cl,1
        shl  eax,12
        xor  ebx,eax
        mov  eax,ebx
        shr  eax,20
        xor  eax,ebx
        mov  [esi+4],cl
        mov  [esi],eax
        mul  edx
        lea  eax,[edx+1]
        ret
        ]
      NEXT I%
      ENDPROC 

Richard.
« Last Edit: Apr 7th, 2010, 4:29pm by admin » User IP Logged

David Williams
Developer

member is offline

Avatar

meh


PM

Gender: Male
Posts: 452
xx Re: Fast PRNG (suitable for games + demos)
« Reply #8 on: Apr 7th, 2010, 6:44pm »

on Apr 7th, 2010, 1:30pm, Richard Russell wrote:
Really? Two points to make: firstly, the reason the assembler code is so much slower than RND is because you've failed to ensure the necessary gap between the data and code to avoid cache thrashing.


On my system (Intel Centrino), at least on this occasion, the inclusion of a 2Kb (or 4Kb) gap makes no difference to execution speed (I do normally include a gap, actually).


on Apr 7th, 2010, 1:30pm, Richard Russell wrote:
Secondly, your so-called 'fast' PRNG runs no faster than the assembler version of RND listed on the Wiki! Try this extended program ...


Yes, I confirm that the performance of assembler RND and FastRnd are about the same.

BB4W's standard RND function is still the clear winner on my system.

Many of my ASM programs include a function which returns a random sign integer (+1 or -1), and so I think it's possible to shave a few instructions off FastRnd to make it a dedicated, fast pseudo-random binary (50/50) decision maker. That's really all I'd use it for. For other things, I'd probably use (and have several times used) your RND (and associated) assembler routines.


David.


User IP Logged

admin
Administrator
ImageImageImageImageImage


member is offline

Avatar




PM


Posts: 1145
xx Re: Fast PRNG (suitable for games + demos)
« Reply #9 on: Apr 7th, 2010, 10:18pm »

on Apr 7th, 2010, 6:44pm, David Williams wrote:
Yes, I confirm that the performance of assembler RND and FastRnd are about the same.

Incidentally, they are within one byte of being the same size too (and I could have made the assembler Rnd slightly smaller than FastRnd by changing the registers in which the parameters are supplied, thus avoiding the initial mov esi,ecx).

Quote:
BB4W's standard RND function is still the clear winner on my system.

The actual code executed is of course the same as the assembler version. The difference in execution speed comes from the overhead of the USR function.

Quote:
I think it's possible to shave a few instructions off FastRnd to make it a dedicated, fast pseudo-random binary (50/50) decision maker.

If all you want is a fast binary decision maker I reckon I could write one significantly faster than either FastRnd or Rnd. What minimum sequence length do you require?

Richard.
User IP Logged

David Williams
Developer

member is offline

Avatar

meh


PM

Gender: Male
Posts: 452
xx Re: Fast PRNG (suitable for games + demos)
« Reply #10 on: Apr 8th, 2010, 7:32pm »

on Apr 7th, 2010, 10:18pm, Richard Russell wrote:
If all you want is a fast binary decision maker I reckon I could write one significantly faster than either FastRnd or Rnd.


Thanks, that would be interesting to see.


on Apr 7th, 2010, 10:18pm, Richard Russell wrote:
What minimum sequence length do you require?


If for use with gaming (which is what I'd be inclined to use it for), wouldn't a sequence length in the range 2^16 to 2^32 suffice?


David.
User IP Logged

admin
Administrator
ImageImageImageImageImage


member is offline

Avatar




PM


Posts: 1145
xx Re: Fast PRNG (suitable for games + demos)
« Reply #11 on: Apr 8th, 2010, 11:05pm »

on Apr 8th, 2010, 7:32pm, David Williams wrote:
Thanks, that would be interesting to see.


It may be possible to do better than this, but as a naive first attempt:

Code:
      mov eax,[esp]
      mov ebx,eax
      shr ebx,28
      adc eax,eax
      shr ebx,2
      and ebx,1
      xor eax,ebx
      mov [esp],eax
      and eax,1
      ret 

This returns 0 or 1 on a pseudo-random basis, with a sequence length of 2^31-1. Here the 'state' is stored on the stack, but of course could be anywhere.

Richard.
« Last Edit: Apr 9th, 2010, 02:50am by admin » User IP Logged

admin
Administrator
ImageImageImageImageImage


member is offline

Avatar




PM


Posts: 1145
xx Re: Fast PRNG (suitable for games + demos)
« Reply #12 on: Apr 9th, 2010, 02:56am »

Further, if you're happy to test the sign of the result as your 'binary decision' then you can eliminate the and eax,1 (effectively you're testing the MSB rather than the LSB):

Code:
      mov eax,[esp]
      mov ebx,eax
      shr ebx,28
      adc eax,eax
      shr ebx,2
      and ebx,1
      xor eax,ebx
      mov [esp],eax
      ret  

This returns >0 or <0 on a pseudo-random basis, with a sequence length of 2^31-1.

Richard.
« Last Edit: Apr 9th, 2010, 03:03am by admin » User IP Logged

David Williams
Developer

member is offline

Avatar

meh


PM

Gender: Male
Posts: 452
xx Re: Fast PRNG (suitable for games + demos)
« Reply #13 on: Apr 9th, 2010, 7:00pm »

on Apr 8th, 2010, 11:05pm, Richard Russell wrote:
It may be possible to do better than this, but as a naive first attempt:

Code:
      mov eax,[esp]
      mov ebx,eax
      shr ebx,28
      adc eax,eax
      shr ebx,2
      and ebx,1
      xor eax,ebx
      mov [esp],eax
      and eax,1
      ret 


This returns 0 or 1 on a pseudo-random basis, ...



Thanks very much for that bit of code (which I haven't yet tested but I feel certain will work just fine and is a keeper!).

I'm just trying to figure out how I might turn a pseudo-random 0 or 1 into a random sign (-1 or +1) in an efficient way without resorting to a CMP (or TST) + jump instruction.


David.
User IP Logged

David Williams
Developer

member is offline

Avatar

meh


PM

Gender: Male
Posts: 452
xx Re: Fast PRNG (suitable for games + demos)
« Reply #14 on: Aug 20th, 2014, 03:09am »

For a current project, I needed fast (but not 'high quality') generation of pseudo-random floats between 0.0 and 1.0. I'm sharing this code here in case it may possibly be of some interest to others.


David.


Code:
      REM Fast -- but not high quality -- PRNG (Pseudo-Random Number Generator)
      REM Suited to certain game applications, but NOT statistical simulations.
      REM It is NOT intended to replace BB4W's RND function (that would be silly),
      REM but it is meant to be called from other assembly language routines.
      REM Returns a 64-bit float between 0.0 and 1.0 (exclusive, probably!)
      REM
      REM Calling this RNG from BASIC is slower than simply using RND(1)
      REM (which also returns values in the range 0 to 1)
      REM
      REM Thanks to George Marsaglia for his 'congruent RNG' on which this code is based:
      REM
      REM     x = 69069*x + 362437
      REM
      REM Obscenely simple, isn't it?

      *FLOAT 64

      MODE 8 : OFF

      DIM code% 2048+128

      FOR I% = 0 TO 2 STEP 2
        P% = code%
        [OPT I%
  
        .rand
        sub   esp, 12
        xor   edx, edx
        imul  eax, DWORD [seed], 69069
        mov   DWORD [esp+4], edx
        add   eax, 362437
        mov   DWORD [esp], eax
        fild  QWORD [esp]
        fmul  QWORD [const]
        mov   DWORD [seed], eax
        mov   eax, DWORD [esp+16]
        fstp  QWORD [eax]
        add   esp, 12
        ret   4
  
        ;
        ; aligned 2Kb code/data separation
        ;
        .gap% : ] : P%+=2048 : P%=(P%+15) AND -16 : [OPT I%
  
        .const dd 1048576 : dd 1039138816 ; 64-bit double representation of 1/(2^32 - 1)
        .seed  dd 1234567897              ; initial seed value
  
        ]
      NEXT I%

      N% = ^n#
      F% = rand

      FOR I% = 1 TO 1000000
        SYSF%,N% : X%=1280*n#
        SYSF%,N% : Y%=1024*n#
        PLOTX%,Y%
      NEXT

      CLS

      FOR I% = 1 TO 1000000
        SYSF%,N% : X%=1280*n#
        SYSF%,N% : Y%=1024*n#
        SYSF%,N% : GCOL 16*n#
        PLOTX%,Y%
      NEXT

 
« Last Edit: Aug 20th, 2014, 03:15am by David Williams » User IP Logged

Pages: 1  Notify Send Topic Print
« Previous Topic | Next Topic »

| |

This forum powered for FREE by Conforums ©
Terms of Service | Privacy Policy | Conforums Support | Parental Controls