better flatten()?
- Posted by petelomax Jul 26, 2015
- 1625 views
I decided it was about time I added flatten() to Phix, so I had a look at the one in std\sequence.e...
public function flatten(sequence s, object delim = "") sequence ret object x integer len integer pos sequence temp if atom(delim) then delim = {delim} end if ret = s pos = 1 len = length(ret) while pos<=len do x = ret[pos] if sequence(x) then if length(delim)=0 then ret = ret[1..pos-1] & flatten(x) & ret[pos+1..$] else temp = ret[1..pos-1] & flatten(x) if pos!=length(ret) then ret = temp & delim & ret[pos+1..$] else ret = temp & ret[pos+1..$] end if end if len = length(ret) else pos += 1 end if end while return ret end function
Hmmm... after staring at it incredulously for a while, I finally grasped that what it was doing was to flatten(x) then quite pointlessly check that the just processed flatten(x)[2..$] were indeed all atoms and do nothing to them. Unless it is part of the plan that delim can be a nested sequence which you want to it to repeatedly flatten, which is equally ridiculous. It is also shunting the end of s/ret around on every iteration for no good reason. Anyway, I decided it was utter shite, so I wrote my own. Not entirely surprisingly (to me) it ran between 50 and 250 times faster.
Does anyone want to check that this does what the docs say it should and is indeed significantly faster?
function flatten2(sequence s, object delim = "") sequence res = "" object si if length(s)=0 then return s end if si = s[1] if atom(si) then res = append(res,si) else res = flatten2(si) end if for i=2 to length(s) do res &= delim si = s[i] if atom(si) then res = append(res,si) else res &= flatten2(si) end if end for return res end function
It seems much clearer to me as well. The specific test I used was:
sequence st = repeat(repeat(repeat("1",1000),100-0),100-0) atom t0, t1, t2 sequence t for i=1 to 3 do t0 = time() t = flatten(st,'\n') t1 = time() t = flatten2(st,'\n') t2 = time() printf(1,"f:%3.2fs, f2:%3.2fs\n",{t1-t0,t2-t1}) end for
The results on Phix varied depending on which level of repeat had 1000, whereas on eui (a somewhat dated r3412) it was about 25 times faster wherever the 1000 was.
I aslo tried
sequence st = repeat("this",100000)
and got 248s for flatten vs 0.02s for flatten2... which is 12,400 times faster.
Pete
EDIT: I just spotted that neither flatten nor flatten2() pass delim when making their recursive calls - should they?