better flatten()?

new topic     » topic index » view thread      » older message » newer message

I decided it was about time I added flatten() to Phix, so I had a look at the one in std\sequence.e...

public function flatten(sequence s, object delim = "") 
sequence ret 
object x 
integer len 
integer pos 
sequence temp 
 
    if atom(delim) then 
        delim = {delim} 
    end if 
    ret = s 
    pos = 1 
    len = length(ret) 
    while pos<=len do 
        x = ret[pos] 
        if sequence(x) then 
            if length(delim)=0 then 
                ret = ret[1..pos-1] & flatten(x) & ret[pos+1..$] 
            else 
                temp = ret[1..pos-1] & flatten(x) 
                if pos!=length(ret) then 
                    ret = temp &  delim & ret[pos+1..$] 
                else 
                    ret = temp & ret[pos+1..$] 
                end if 
            end if 
            len = length(ret) 
        else 
            pos += 1 
        end if 
    end while 
 
    return ret 
end function 

Hmmm... after staring at it incredulously for a while, I finally grasped that what it was doing was to flatten(x) then quite pointlessly check that the just processed flatten(x)[2..$] were indeed all atoms and do nothing to them. Unless it is part of the plan that delim can be a nested sequence which you want to it to repeatedly flatten, which is equally ridiculous. It is also shunting the end of s/ret around on every iteration for no good reason. Anyway, I decided it was utter shite, so I wrote my own. Not entirely surprisingly (to me) it ran between 50 and 250 times faster.

Does anyone want to check that this does what the docs say it should and is indeed significantly faster?

function flatten2(sequence s, object delim = "") 
sequence res = "" 
object si 
 
    if length(s)=0 then return s end if 
    si = s[1] 
    if atom(si) then 
        res = append(res,si) 
    else 
        res = flatten2(si) 
    end if 
    for i=2 to length(s) do 
        res &= delim 
        si = s[i] 
        if atom(si) then 
            res = append(res,si) 
        else 
            res &= flatten2(si) 
        end if 
    end for 
    return res 
end function 

It seems much clearer to me as well. The specific test I used was:

sequence st = repeat(repeat(repeat("1",1000),100-0),100-0) 
atom t0, t1, t2 
sequence t 
for i=1 to 3 do 
    t0 = time() 
    t = flatten(st,'\n') 
    t1 = time() 
    t = flatten2(st,'\n') 
    t2 = time() 
    printf(1,"f:%3.2fs, f2:%3.2fs\n",{t1-t0,t2-t1}) 
end for 

The results on Phix varied depending on which level of repeat had 1000, whereas on eui (a somewhat dated r3412) it was about 25 times faster wherever the 1000 was.

I aslo tried

sequence st = repeat("this",100000) 

and got 248s for flatten vs 0.02s for flatten2... which is 12,400 times faster.

Pete

EDIT: I just spotted that neither flatten nor flatten2() pass delim when making their recursive calls - should they?

new topic     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu