Re: How to get list of output of find command of Linux

ghaberek said...
petelomax said...

Maybe something like this? http://rosettacode.org/wiki/Generator/Exponential#Phix (but with only one task)

Sort of? I don't think you'd need to use a task for that though. I was thinking of a recursive version of something like FindFirstFile, which basically uses a cursor to continue generating items until it has reached the end of the list.

I had some time to sit down and knock out an example of what I was describing. Basically, my first version above was an eager scanner (it gets all of the files at once), this is a lazy scanner (it only gets more files when it needs to).

To use it, simply call dir_open() to start scanning the directory, then loop until dir_next() returns an empty sequence, and call dir_close() to clean up when you're done.

This is the most efficient method in terms of both speed (you don't have to wait for it to find all the files before starting the loop) and memory (it only keeps a subset of directories and file paths in memory at any given time).

Edit: actually, this is probably only faster as a matter of perception: you can get started on the body of the loop sooner, so it feels faster. Running the numbers quickly, the eager method (see above) comes out about 20-30% faster, at least on my system. So if memory is of no concern, then scanning the entire directory first is probably faster, but real-world cases will vary wildly. My tests are doing nothing but printing the path to the console. YMMV

Also, the find command is like, a bazillion times faster than any of this. Not sure what it's doing but there's got to be some lower-level operating system or file system calls going on.

-- dir_scan.e 
include std/filesys.e 
include std/wildcard.e 
 
sequence d = {} 
 
enum PATTERN,DIRS,FILES 
 
-- 
-- prepare a new directory scanner instance 
-- 
public function dir_open( sequence path = ".", sequence pattern = "*" ) 
 
    if path[$] != SLASH then path &= SLASH end if 
    path = canonical_path( path ) 
 
    integer id = find( {}, d ) 
 
    if id = 0 then 
        d = append( d, {} ) 
        id = length( d ) 
    end if 
 
    d[id] = {pattern,{path},{}} -- PATTERN,DIRS,FILES 
 
    return id 
end function 
 
-- 
-- clean up an unused directory scanner 
-- 
public procedure dir_close( integer id ) 
    d[id] = {} 
end procedure 
 
-- 
-- get the next available file in the queue 
-- (returns "" when queues are exhausted) 
-- 
public function dir_next( integer id ) 
 
    sequence pattern = d[id][PATTERN] 
 
    if length( d[id][FILES] ) != 0 then 
        -- we have files, return the next one 
 
        -- pop a file off the queue 
        sequence next_file = d[id][FILES][1] 
        d[id][FILES] = d[id][FILES][2..$] 
 
        return next_file 
 
    end if 
 
    while length( d[id][DIRS] ) != 0 do 
        -- we need to find more files 
 
        -- pop a directory off the queue 
        sequence next_dir = d[id][DIRS][1] 
        d[id][DIRS] = d[id][DIRS][2..$] 
 
        -- get the directory items 
        object items = dir( next_dir ) 
        if atom( items ) then 
            continue 
        end if 
 
        for i = 1 to length( items ) do 
 
            -- get the item details 
            sequence item_name = items[i][D_NAME] 
            sequence item_attr = items[i][D_ATTRIBUTES] 
            sequence full_path = next_dir & item_name 
 
            if find( item_name, {".",".."} ) then 
                -- skip these items 
                continue 
 
            elsif find( 'd', item_attr ) then 
                -- add this to the directory queue 
                d[id][DIRS] = append( d[id][DIRS], full_path & SLASH ) 
 
            elsif wildcard:is_match( pattern, item_name ) then 
                -- add this to the files queue 
                d[id][FILES] = append( d[id][FILES], full_path ) 
 
            end if 
 
        end for 
 
        if length( d[id][FILES] ) != 0 then 
            -- we have files now, return the next one 
 
            -- pop a file off the queue 
            sequence next_file = d[id][FILES][1] 
            d[id][FILES] = d[id][FILES][2..$] 
 
            return next_file 
        end if 
 
    end while 
 
    return "" 
end function 

-- find.ex 
include dir_scan.e 
 
procedure main() 
 
    integer id = dir_open( "/home/greg", "*.e" ) 
 
    sequence path 
 
    while length( path ) with entry do 
        printf( 1, "%s\n", {path} ) 
    entry 
        path = dir_next( id ) 
    end while 
 
    dir_close( id ) 
 
end procedure 
 
main() 
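
For comparison, draining the lazy scanner into a single sequence gives the eager behaviour mentioned earlier (collect everything first, loop afterwards). This is only a minimal sketch, assuming dir_scan.e from above is saved alongside it; the helper dir_all() and the file name dir_all.ex are made-up names for illustration and are not part of the original code.

```euphoria
-- dir_all.ex (hypothetical file name)
include dir_scan.e

--
-- hypothetical helper: drain a lazy scanner into one sequence,
-- so the caller gets every matching path at once (eager style)
--
function dir_all( sequence path = ".", sequence pattern = "*" )

    sequence files = {}
    integer id = dir_open( path, pattern )

    sequence next_path

    -- same loop shape as find.ex: fetch first, then test for ""
    while length( next_path ) with entry do
        files = append( files, next_path )
    entry
        next_path = dir_next( id )
    end while

    -- release the scanner slot so dir_open() can reuse it
    dir_close( id )

    return files
end function

sequence files = dir_all( ".", "*.e" )
printf( 1, "found %d files\n", length( files ) )
```

The trade-off is the one described in the edit above: this holds every matching path in memory before the caller sees the first one, whereas calling dir_next() directly lets the loop body start right away.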

-Greg
