Re: better flatten()?


I noticed a peculiar example in the documentation for flatten:

Example 3:

Using the delimiter argument. 
s = flatten({"abc", "def", "ghi"}, ", ") 
-- s is "abc, def, ghi" 

Pete's flatten2() handles this correctly, but Spock's flattenX() does not: it splits every string into individual characters and also leaves a trailing delimiter, which should not be part of the output.

s = flatten2({"abc", "def", "ghi"}, ", ") 
-- s is "abc, def, ghi" 
 
s = flattenX({"abc", "def", "ghi"}, ", ") 
-- s is "a, b, c, d, e, f, g, h, i, " 

However, taking this a step further, I noticed that flatten() and flatten2() do not seem to handle nested strings correctly. Notice how the nested strings get merged together:

s = flatten({"abc", "def", "ghi", {"jkl", "mno", "pqr"}, "stu", "vwx", "yz"}, ", ") 
-- s is "abc, def, ghi, jklmnopqr, stu, vwx, yz" 
 
s = flatten2({"abc", "def", "ghi", {"jkl", "mno", "pqr"}, "stu", "vwx", "yz"}, ", ") 
-- s is "abc, def, ghi, jklmnopqr, stu, vwx, yz" 

So I made my own attempt to better handle this behavior and I wrote two functions: flatten_all() and flatten_seq().

include std/sequence.e -- for join() 
 
-- 
-- string type borrowed from std/types.e 
-- 
type string( object x ) 
     
    if not sequence(x) then 
        return 0 
    end if 
     
    for i = 1 to length(x) do 
        if not integer(x[i]) then 
            return 0 
        end if 
        if x[i] < 0 then 
            return 0 
        end if 
        if x[i] > 255 then 
            return 0 
        end if 
    end for 
     
    return 1 
end type 
 
-- 
-- an array of only string objects 
-- 
type string_array( object x ) 
     
    if atom( x ) then return 0 end if 
     
    for i = 1 to length( x ) do 
        if not string( x[i] ) then 
            return 0 
        end if 
    end for 
     
    return 1 
end type 
 
-- 
-- flatten a sequence into its raw atoms 
-- 
function flatten_all( object s1, object delim = "" ) 
     
    if atom( s1 ) then return {s1} end if 
     
    sequence s2 = {} 
     
    for i = 1 to length( s1 ) do 
        s2 &= flatten_all( s1[i] ) 
    end for 
     
    return join( s2, delim ) 
end function 
 
-- 
-- flatten a sequence, preserving nested strings 
-- 
function flatten_seq( sequence s1, object delim = "" ) 
     
    sequence s2 = {} 
     
    for i = 1 to length( s1 ) do 
         
        if string( s1[i] ) then 
            -- append string item 
            s2 &= {s1[i]} 
             
        elsif string_array( s1[i] ) then 
            -- append the whole array 
            s2 &= s1[i] 
             
        else 
            -- append the raw atoms 
            s2 &= flatten_all( s1[i] ) 
             
        end if 
         
    end for 
     
    return join( s2, delim ) 
end function 

s = flatten_all({"abc", "def", "ghi", {"jkl", "mno", "pqr"}, "stu", "vwx", "yz"}, ", ") 
-- s is "a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z" 
 
s = flatten_seq({"abc", "def", "ghi", {"jkl", "mno", "pqr"}, "stu", "vwx", "yz"}, ", ") 
-- s is "abc, def, ghi, jkl, mno, pqr, stu, vwx, yz" 

I believe these produce the most "correct" output thus far.

I did some testing on large random sequences: flatten_seq() seems to be just as fast as flatten2(), while flatten_all() is about 3-5 times slower.
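
For anyone who wants to repeat that kind of comparison, here is a rough sketch of a timing harness. It is not the code behind the numbers above; random_word(), random_data() and time_it() are hypothetical names, and the sizes and nesting odds are arbitrary assumptions.

```euphoria
-- Hypothetical timing harness (an assumption, not the original test code).

-- Build a random lowercase word of 1..max_len characters.
function random_word( integer max_len )
    sequence word = repeat( 0, rand( max_len ) )
    for i = 1 to length( word ) do
        word[i] = 'a' + rand( 26 ) - 1
    end for
    return word
end function

-- Build a sequence of count items. Each item is either a random word
-- or, roughly one time in four while depth allows, a nested sequence.
function random_data( integer count, integer depth )
    sequence result = repeat( 0, count )
    for i = 1 to count do
        if depth > 0 and rand( 4 ) = 1 then
            result[i] = random_data( rand( 5 ), depth - 1 )
        else
            result[i] = random_word( 8 )
        end if
    end for
    return result
end function

-- Call the routine identified by rid reps times with (data, ", ")
-- and return the elapsed time in seconds.
function time_it( integer rid, sequence data, integer reps )
    object result
    atom t0 = time()
    for i = 1 to reps do
        result = call_func( rid, {data, ", "} )
    end for
    return time() - t0
end function

-- Example call, once flatten_seq() is defined:
-- ? time_it( routine_id( "flatten_seq" ), random_data( 1000, 3 ), 100 )
```

Running each function against the same random_data() result keeps the comparison fair. Note that time() can have a coarse resolution on some platforms, so reps needs to be large enough to get a meaningful reading.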

I'll admit I'm probably taking a performance hit by using join(), but it made the code come out a lot cleaner.
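
One thing worth noting is that flatten_all() goes through join() with an empty delimiter at every level of recursion, not just once at the top. A possible way to trim that, sketched below, is to collect the raw atoms first and insert the delimiter in a single pass. The names collect_atoms() and flatten_all_nojoin() are hypothetical, and I have not measured whether this is actually faster.

```euphoria
-- Hypothetical helper (not from the original post): gather the raw
-- atoms of x in order, without inserting any delimiter.
function collect_atoms( object x )
    if atom( x ) then
        return {x}
    end if

    sequence result = {}
    for i = 1 to length( x ) do
        result &= collect_atoms( x[i] )
    end for

    return result
end function

-- Hypothetical join-free variant of flatten_all(): the delimiter is
-- inserted between atoms once, after all recursion has finished.
function flatten_all_nojoin( object s1, object delim = "" )
    sequence atoms = collect_atoms( s1 )

    if length( atoms ) = 0 then
        return {}
    end if

    sequence result = {atoms[1]}
    for i = 2 to length( atoms ) do
        result &= delim & atoms[i]
    end for

    return result
end function
```

It is meant to return the same result as flatten_all() for the same arguments, at the cost of a second function, so the trade-off against the cleaner join() version still applies.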

-Greg
