1. Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by "Christian Cuvier" <Christian.CUVIER at agriculture.gouv.fr> Nov 08, 2004
- 530 views
>> > i want to release a cool library i made which profiles code. here's an
>> > example output:
>> >>
>> >> Okay, are you planning to do this using the profile_time function
>> >> within registered Euphoria?
> >
> > No.
> I have already written this profile library. It simply measures how much time
> was spent in a block of code and how many times it was run. Of course I have to
> put two procedures around the block of code I want to profile, a little more work
> than using with profile, but there are some other advantages (besides, I made myself a
> macro in msdev which automatically puts this code in, so it's not much trouble):
>

Does it work under Linux and al. as well?

- Under DOS32, you can trap int#70 and sort the CMOS clock ticks out.
- Under WIN32, there are a couple of performance API functions that work just like time(), and are almost as easy to use. They are precise to the microsecond, but may not be functional on 486 and older machines.
- Under Linux/BSD, I don't know.

I can't understand some earlier posts on this list about profile_time() not being available in Win32, among which one from RC. I'm using my own start() and stop() precision stopwatch to profile_time() a possible next submission for the contest. And I didn't reverse engineer anything for that purpose (it would take too much trouble), and I am not using modified source of any kind.

CChris
2. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by Tone Škoda <tskoda at email.si> Nov 08, 2004
- 489 views
Christian Cuvier wrote:
>
> >> > i want to release a cool library i made which profiles code. here's an
> >> > example output:
> >> >>
> >> >> Okay, are you planning to do this using the profile_time function
> >> >> within registered Euphoria?
> > >
> > > No.
> > I have already written this profile library. It simply measures how much
> > time was spent in a block of code and how many times it was run. Of course
> > I have to put two procedures around the block of code I want to profile,
> > a little more work than using with profile, but there are some other
> > advantages (besides, I made myself a macro in msdev which automatically
> > puts this code in, so it's not much trouble):
> >
> Does it work under Linux and al. as well?

It is generic. It uses time(). What's al.?

> - Under DOS32, you can trap int#70 and sort CMOS clocks ticks out.
> - Under WIN32, there's a couple of performance API functions that work just
> like time(), almost as easy. It's precise up to the microsecond, may not be
> functional on 486- machines.
> - Under Linux/BSD, I don't know.

time() is precise enough for me.

> I can't understand some earlier posts on this list about profile_time() not
> available in Win32, among which one from RC. I'm using my own start() and
> stop() precision stopwatch to profile_time() a possible next submission for
> the contest. And I didn't reverse engineer anything for that purpose, would
> take too much trouble, and am not using modified source of any kind.

I don't know about that. I haven't tried profile_time for a long time, because it is not available in the unregistered version.
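The approach described in this post (wrap a block in two procedure calls, then accumulate elapsed time() and a run count) can be sketched roughly as follows. This is only an illustration under assumptions: the names MAX_SECTIONS, section_start() and section_stop() are hypothetical and are not taken from TSProfile.e, and the busy loop merely stands in for the code being profiled.

```euphoria
-- Minimal time()-based section profiler (illustrative sketch only;
-- all names here are hypothetical, not the TSProfile.e API).
constant MAX_SECTIONS = 4

sequence began, total, runs
began = repeat(-1, MAX_SECTIONS)  -- -1 means "section not running"
total = repeat(0, MAX_SECTIONS)   -- accumulated seconds per section
runs  = repeat(0, MAX_SECTIONS)   -- completed runs per section

procedure section_start(integer id)
    began[id] = time()
end procedure

procedure section_stop(integer id)
    atom now
    now = time()
    if began[id] < 0 then
        return  -- stop without a matching start: ignore it
    end if
    total[id] = total[id] + (now - began[id])
    runs[id] = runs[id] + 1
    began[id] = -1
end procedure

-- usage: profile an arbitrary busy loop as section 1, three times
atom x
for run = 1 to 3 do
    section_start(1)
    x = 0
    for i = 1 to 200000 do
        x = x + sqrt(i)
    end for
    section_stop(1)
end for
printf(1, "section 1: %d runs, %.2f seconds in total\n", {runs[1], total[1]})
```

Because it relies only on time(), a sketch like this should run on any platform, but it inherits whatever resolution time() has there.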
3. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by "Christian Cuvier" <Christian.CUVIER at agriculture.gouv.fr> Nov 08, 2004
- 490 views
>> Does it work under Linux and al. as well?
>
> It is generic. It uses time().

OK, I had understood otherwise from your previous post. When you want to time() portions of code that are executed a large number of times, but are not contiguous, time()'s resolution starts looking quite coarse.

> What's al.?

"al." is short for "alia". Read: Linux or other more or less Unix-like OSes I don't know enough about.

CChris
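One way to see how coarse time() is on a given machine is to busy-wait until its return value changes and print the size of the step. The following is a small sketch using only built-in routines; the variable names are arbitrary, and the step it reports will depend on the platform and interpreter version.

```euphoria
-- Estimate the granularity of time() by waiting for it to tick twice.
atom t0, t1, t2

t0 = time()
t1 = time()
while t1 = t0 do t1 = time() end while  -- align to a tick boundary
t2 = time()
while t2 = t1 do t2 = time() end while  -- wait for the next tick

printf(1, "time() advanced by %.6f seconds\n", t2 - t1)
```

Any section that runs for less than the reported step will usually be measured as 0, which is why many short, scattered runs are hard to total up accurately with time() alone.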
4. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by CoJaBo <cojabo at suscom.net> Nov 08, 2004
- 459 views
- Last edited Nov 09, 2004
Christian Cuvier wrote:
>
> >> Does it work under Linux and al. as well?
> >
> > It is generic. It uses time().
>
> Ok, I had understood otherwise from your previous post. When you want to
> time() portions of code that are executed a large number of times, but are not
> contiguous, time()'s resolution starts looking quite coarse.

Especially on newer computers. I used time() to measure performance in CJBN server on the old 500MHz, 256MB RAM system I have, but it always times 0 on my new 2.8GHz, 2GB RAM 64-bit laptop...
Is there a more accurate library somewhere (I only need Windows and maybe Linux, since I rarely use DOS anymore)?

> > What's al.?
>
> al. short for "alia". Read: Linux or other more or less Unix-like OSes I don't
> know enough about.
>
> CChris
5. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by "Christian Cuvier" <Christian.CUVIER at agriculture.gouv.fr> Nov 09, 2004
- 514 views
> Subject: Re: Unregistered profile_time() (was: Re: copying global stuff to one file
>
> posted by: CoJaBo <cojabo at suscom.net>
>
> Christian Cuvier wrote:
>
>>>> Does it work under Linux and al. as well?
>>>
>>> It is generic. It uses time().
>>
>> Ok, I had understood otherwise from your previous post. When you want to
>> time() portions of code that are executed a large number of times, but are
>> not contiguous, time()'s resolution starts looking quite coarse.
>
> Especially on newer computers, I used time to measure performance
> in CJBN server on the old 500MHz, 256MB RAM system I have, but it always times
> 0 on my new 2.8GHz 2GB RAM 64-bit laptop...
> Is there a more accurate library somewhere (I only need windows and maybe
> linux, since I rarely use DOS anymore)?

Here's what I use:
include dll.e      --open_dll(), define_c_func(), define_c_proc(), C_UINT, C_INT
include machine.e  --allocate()

constant k32=open_dll("kernel32.dll"),
         qpf=define_c_func(k32,"QueryPerformanceFrequency",{C_UINT},C_INT),
         qpc=define_c_proc(k32,"QueryPerformanceCounter",{C_UINT})
--the latter is a function actually, but we don't care about the boolean
--status code it returns

if k32=0 then --open_dll() returns 0 on failure: you're in trouble, or under Linux, or...
elsif qpf=-1 or qpc=-1 then --your Windows version doesn't support this
end if

constant p232=power(2,32)

--helper function to retrieve results
function int64ptr_to_atom(atom ptr)
    sequence s
    s=peek4u({ptr,2}) --unsigned read, so a low dword >= power(2,31) stays positive
    return p232*s[2]+s[1]
end function

constant timeFactorPtr=allocate(8),
         timeRC=c_func(qpf,{timeFactorPtr})
if timeRC=0 then --your hardware doesn't support hi-res timers
end if
--at this point, the timer is functional
constant timeFactor=int64ptr_to_atom(timeFactorPtr)
--counts are given in ticks, and there are timeFactor ticks per second

--now some variables
constant maxSections=6 --whatever positive integer suits you
sequence perfptr,total,times,started
perfptr=repeat(0,maxSections)
total=perfptr --total execution time
times=total --number of runs
started=times --flags
for i=1 to maxSections do perfptr[i]=allocate(16) end for
--each structure will store two pairs of integers, for a total of 4*4=16 bytes

--ok, now the two procedures that start/end a timed section

procedure start(integer section)
    c_proc(qpc,{perfptr[section]}) --arguments must be passed as a sequence
    started[section]=1
end procedure

procedure stop(integer section)
    c_proc(qpc,{perfptr[section]+8})
    if not started[section] then return end if --start time not valid
    started[section]=0
    times[section]+=1
    total[section] += (int64ptr_to_atom(perfptr[section]+8)-int64ptr_to_atom(perfptr[section]))/timeFactor
end procedure

--not sure you gain any real precision (in theory, you do) by keeping
--total[section] as a number of counts rather than an actual time.
--And you'd have to implement addition for int64s using the 31-bit Eu integer type,
--not the best idea I'd think of.
--however: 1/ atoms don't lose arithmetic accuracy while less than power(2,53);
--you can implement addition for int64s:

constant p229=power(2,29) --largest Eu-integer power of 2
type int32(atom x)
    return integer(remainder(x,p229))
end type

type int64(object x)
    return integer(x) or
        (sequence(x) and length(x)=2 and int32(x[1]) and int32(x[2]))
end type

function addandwrap(int64 x,int64 y)
    sequence hibitsx,hibitsy
    hibitsx=floor(x/p229)
    hibitsy=floor(y/p229)
    y=remainder(y,p229)+remainder(x,p229)
    hibitsy+=hibitsx
    hibitsy+=floor(y/p229)
    y=remainder(y,p229)
    hibitsy[2]+=floor(hibitsy[1]/8) --8=p232/p229
    hibitsy[2]=remainder(hibitsy[2],8) --there's the wrap,
    hibitsy[1]=remainder(hibitsy[1],8)
    return y+p229*hibitsy
end function
--this replaces a pair of machine code instructions that give you a wrap flag
--in CF as a bonus <sigh and shudder>

--your code there, with all the start() and stop()
--your output routines there, to retrieve and inspect the results
That's all it takes. profile() and profile_time() under Windows, both for free.

You can be quite creative, as a section may have several start points and/or several end points, or may start after it stops (in which case you'll miss one run out of a zillion iterations).

I forgot to say that, to watch a section, you must insert a start() statement before each starting statement, and a stop() after each ending statement. That means one of each kind per section most of the time, but... see the previous comment.

Enjoy!

CChris
6. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by Tone Škoda <tskoda at email.si> Nov 09, 2004
- 480 views
- Last edited Nov 10, 2004
Christian Cuvier wrote:
> Here's what I use:
>
> [snip]
>
> That's all it takes. profile() and profile_time() under Windows, both for
> free.
>
> [snip]
>
> Enjoy!
>
> CChris

it looks like time() is only precise to two decimal places.

so for example:
time(): 0.560000
your win32 api: 0.5550946964

is this the only difference?

your code looks rather complex. isn't it possible to just make one wrapper function, timew32(), which would work exactly the same as time() but be more precise?

what does addandwrap() do?
7. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by Tone Škoda <tskoda at email.si> Nov 09, 2004
- 480 views
- Last edited Nov 10, 2004
I uploaded my profile lib. I copied almost all the stuff from my other include files into TSProfile.e. It came to 500 lines, which is a lot. http://www10.brinkster.com/tskoda/euphoria.asp#TSProfile
8. Re: Unregistered profile_time() (was: Re: copying global stuff to one file
- Posted by CChris <christian.cuvier at agriculture.gouv.fr> Nov 10, 2004
- 535 views
Tone Škoda wrote:
>
> Christian Cuvier wrote:
> [snip]
>
> it looks like time() returns only two decimal places precise.
>
> so for example:
> time(): 0.560000
> your win32 api: 0.5550946964
>
> is this the only difference?

I have no idea. Rob, or anyone with access to the source code, could answer that.

> your code looks rather complex. isn't it possible to just make one wrapper
> function
> timew32(), which would work exactly the same like time() but more precise?

It is possible:
include dll.e      --open_dll(), define_c_func(), define_c_proc(), C_UINT, C_INT
include machine.e  --allocate()

constant hirestime=allocate(8),
         k32=open_dll("kernel32.dll"),
         qpf=define_c_func(k32,"QueryPerformanceFrequency",{C_UINT},C_INT),
         qpc=define_c_proc(k32,"QueryPerformanceCounter",{C_UINT})
constant p232=power(2,32)

--helper function to retrieve results
function int64ptr_to_atom(atom ptr)
    sequence s
    s=peek4u({ptr,2}) --unsigned read, so a low dword >= power(2,31) stays positive
    return p232*s[2]+s[1]
end function

constant timeFactorPtr=allocate(8),
         timeRC=c_func(qpf,{timeFactorPtr}),
         timeFactor=int64ptr_to_atom(timeFactorPtr)

function timew32()
    c_proc(qpc,{hirestime}) --read the current tick count into the buffer
    return int64ptr_to_atom(hirestime)/timeFactor --ticks -> seconds
end function
but I need the more sophisticated function for the sort of timing I'm performing, so I provided it.

> what does addandwrap() do?

You may not have read the comments; try again please. addandwrap() is useful only if you're keen on keeping time as a long long integer rather than a decimal number of seconds. And it is far less elegant in Eu than in C (yes, this is rare enough to be mentioned), as I noted in these comments. By the way, addandwrap() adds two large integers and tries to keep the result as an integer as much as it can.

Regards

CChris
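To make the trade-off concrete, here is a small sketch of the plain-atom alternative to addandwrap(): join the two 32-bit halves of a tick count into one atom and add atoms directly. The function name join_dwords() and the sample values are made up for illustration. It relies on the point made in the code comments earlier in the thread, that atom arithmetic on whole numbers stays exact below power(2,53).

```euphoria
-- Join the unsigned low and high 32-bit halves of a 64-bit count
-- into a single atom (hypothetical helper, for illustration only).
constant P232 = power(2, 32)

function join_dwords(atom lo, atom hi)
    return hi * P232 + lo
end function

atom a, b, big
a = join_dwords(#FFFFFFFF, 1)  -- power(2,33) - 1
b = join_dwords(1, 0)          -- 1
printf(1, "%.0f\n", a + b)     -- the carry out of the low half is handled by atom addition

-- whole numbers just below power(2,53) still differ by exactly 1
big = power(2, 53) - 2
if (big + 1) - big = 1 then
    puts(1, "still exact below power(2,53)\n")
end if
```

At a few million ticks per second, a count kept in an atom would take a very long time to reach power(2,53), which is one reason keeping totals as atoms (or as seconds) is usually good enough and the integer-only addition is rarely worth the trouble.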