1. Help needed with walk_dir()

It might help if I explain what I want to do first. I would like to select a file from a folder, and then use walk_dir() to select and process all files in the same directory, but in a particular order based on the timestamp of the file. So I will select the oldest and then need the program to select progressively "newer" files which will be processed in turn.

The default in walk_dir() is to visit files in alphabetical order, and the manual says if you want to use a different order you should:

"set the global integer my_dir to the routine id of your own modified dir() function that sorts the directory entries differently. See the default dir() function in file.e."

I've looked at the example in "search.ex" but I'm not sure how to write the dir() function which will achieve what I want. Do I need to use custom_sort()? Any help appreciated, thanks in advance!


2. Re: Help needed with walk_dir()

Hello Hacker,

I believe you need dir() more than walk_dir().

walk_dir() is more for searching through all the subdirectories of a folder or drive.


Syntax: include file.e
        x = dir(st)

Description: Return directory information for the file or directory named by st. If there is no file or directory with this name then -1 is returned. On Windows and DOS st can contain * and ? wildcards to select multiple files.

This information is similar to what you would get from the DOS DIR command. A sequence is returned where each element is a sequence that describes one file or subdirectory.

If st names a directory you may have entries for "." and "..", just as with the DOS DIR command. If st names a file then x will have just one entry, i.e. length(x) will be 1. If st contains wildcards you may have multiple entries.

Each entry contains the name, attributes and file size as well as the year, month, day, hour, minute and second of the last modification. You can refer to the elements of an entry with the following constants defined in file.e:

global constant D_NAME = 1, D_ATTRIBUTES = 2, D_SIZE = 3,
                D_YEAR = 4, D_MONTH = 5, D_DAY = 6,
                D_HOUR = 7, D_MINUTE = 8, D_SECOND = 9

The attributes element is a string sequence containing characters chosen from:

'd' directory
'r' read only file
'h' hidden file
's' system file
'v' volume-id entry
'a' archive file

A normal file without special attributes would just have an empty string, "", in this field.

Comments: The top level directory, e.g. c:\ does not have "." or ".." entries.

This function is often used just to test if a file or directory exists.

Under WIN32, st can have a long file or directory name anywhere in the path.

Under Linux, the only attribute currently available is 'd'.

DOS32: The file name returned in D_NAME will be a standard DOS 8.3 name. (See Archive Web page for a better solution).

WIN32: The file name returned in D_NAME will be a long file name.

Example:

d = dir(current_dir())

d might have:
  {
    {".",    "d",     0, 1994, 1, 18,  9, 30, 02},
    {"..",   "d",     0, 1994, 1, 18,  9, 20, 14},
    {"fred", "ra", 2350, 1994, 1, 22, 17, 22, 40},
    {"sub",  "d",     0, 1993, 9, 20,  8, 50, 12}
  }

d[3][D_NAME] would be "fred"

Example Programs: bin\search.ex, bin\install.ex

See Also: wildcard_file, current_dir, open


Copied from F1 Jockey by Dan Everingham.

Don Cole


3. Re: Help needed with walk_dir()

hacker said...

It might help if I explain what I want to do first. I would like to select a file from a folder, and then use walk_dir() to select and process all files in the same directory, but in a particular order based on the timestamp of the file. So I will select the oldest and then need the program to select progressively "newer" files which will be processed in turn.

The default in walk_dir() is to visit files in alphabetical order, and the manual says if you want to use a different order you should:

"set the global integer my_dir to the routine id of your own modified dir() function that sorts the directory entries differently. See the default dir() function in file.e."

I've looked at the example in "search.ex" but I'm not sure how to write the dir() function which will achieve what I want. Do I need to use custom_sort()? Any help appreciated, thanks in advance!

Basically, walk_dir() calls a function to get the directory listing. By default that function calls dir() and then sort(), so the entries come back in alphabetical order by name.

You can write your own function to order the listing differently. If you don't want any sorting at all, the following function returns the listing in the "native" order that dir() provides. Note that the function receives the directory path and must return the listing, so a plain identity function that just returns its argument will not work here:

function unsorted_dir(sequence path) 
    return dir(path) 
end function 
constant unsorted_dir_id = routine_id("unsorted_dir") 

You can use custom_sort() with walk_dir() if you want. custom_sort() requires that you write a comparator function (that looks at two directory entries and decides which one is greater than the other), and uses that to determine how to sort the directory listing.
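
As a rough sketch of the comparator idea, here is a version that sorts newest first instead of oldest first. It is written in 3.1.1 style and assumes the entry layout from file.e quoted earlier in the thread. The name newest_first and the sample entries are made up for illustration; in real use the entries would come from dir().

```euphoria
include sort.e   -- custom_sort()
include file.e   -- D_NAME, D_YEAR .. D_SECOND

-- Hypothetical comparator: returns -1, 0 or 1 like compare(),
-- but with the arguments swapped so that newer entries sort first.
function newest_first(sequence a, sequence b)
    return compare(b[D_YEAR..D_SECOND], a[D_YEAR..D_SECOND])
end function

-- Made-up entries in the same layout that dir() returns.
sequence entries
entries = {
    {"old.txt", "", 10, 1994, 1, 18,  9, 30,  2},
    {"new.txt", "", 20, 1994, 1, 22, 17, 22, 40},
    {"mid.txt", "", 30, 1994, 1, 18,  9, 45,  0}
}

entries = custom_sort(routine_id("newest_first"), entries)

-- Print the names in their sorted order.
for i = 1 to length(entries) do
    puts(1, entries[i][D_NAME] & "\n")
end for
```

With these sample entries the names should come out as new.txt, mid.txt, old.txt. Swapping a and b back inside compare() gives the oldest-first order the original question asks for.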


4. Re: Help needed with walk_dir()

hacker said...

It might help if I explain what I want to do first. I would like to select a file from a folder, and then use walk_dir() to select and process all files in the same directory, but in a particular order based on the timestamp of the file. So I will select the oldest and then need the program to select progressively "newer" files which will be processed in turn.

Here is some example code, written using Euphoria v4 ...

include std/filesys.e 
include std/sort.e 
include std/io.e 
 
function process_file(sequence path_name, sequence item) 
-- this function accepts two sequences as arguments 
-- it displays all C/C++ source files and their sizes 
    if find('d', item[D_ATTRIBUTES]) then 
        return 0 -- Ignore directories 
    end if 
    sequence t 
 
	t = fileext(item[D_NAME]) 
    if not find(t, {"c","h","cpp","hpp","cp"}) then 
        return 0 -- ignore non-C/C++ files 
    end if 
    writefln(1, "[][][]: [] []/[z:2]/[z:2] [z:2]:[z:2]:[z:2].[z:3]",  
            {path_name, {SLASH}, item[D_NAME], item[D_SIZE], 
             item[D_YEAR], item[D_MONTH], item[D_DAY], 
             item[D_HOUR], item[D_MINUTE], item[D_SECOND], item[D_MILLISECOND]}) 
    return 0 -- keep going 
end function 
 
function my_dir(sequence path) 
    object d 
	 
    d = dir(path) 
    if atom(d) then 
        return d 
    end if 
    -- Sort in ascending time stamp. 
   return sort_columns(d, {D_YEAR, D_MONTH, D_DAY,  
                           D_HOUR, D_MINUTE, D_SECOND, D_MILLISECOND}) 
end function 
 
integer exit_code  
sequence cmds 
 
cmds = command_line() 
exit_code = walk_dir(cmds[3], routine_id("process_file"), 1, routine_id("my_dir")) 

5. Re: Help needed with walk_dir()

Thanks Derek and Jim, this is great!

I actually found an example on the old forum, posted by Pete Eberlein.

include sort.e 
include file.e 
 
function by_date(sequence file1, sequence file2) 
-- compare two files by date, for custom_sort 
	return compare(file1[D_YEAR..D_SECOND], file2[D_YEAR..D_SECOND]) 
end function 
 
function dir_oldest_first(sequence path) 
-- Custom directory sorting function for walk_dir(). 
	object d 
	d = dir(path) 
	if atom(d) then 
	    return d end if 
	return custom_sort(routine_id("by_date"), d) 
end function 
 
my_dir = routine_id("dir_oldest_first") -- for walk_dir 
 
function process(sequence path, sequence dirinfo) 
	path = dirinfo[1..1] & reverse(dirinfo[4..6]) & dirinfo[7..9] 
	printf(1,"%s %02d/%02d/%04d %02d:%02d:%02d\n", path) 
	return 0        -- carry on 
end function 
 
if walk_dir(".", routine_id("process"), 0) then 
    puts(1, "walk_dir error") 
end if 


Problem is, I don't really understand how it works, I'll have to trace through the code a few times. I've never really got my head around routine_id() and functions like call_back(), what I really need is a sort of dummies guide to "advanced" euphoria programming.

I'm using version 3.1.1, I notice that in the ver 4.0 example walk_dir() has 4 parameters and not 3...


6. Re: Help needed with walk_dir()

hacker said...

Thanks Derek and Jim, this is great!

I actually found an example on the old forum, posted by Pete Eberlein.


Wow, that was a long time ago. I don't recognize the code at all.

hacker said...

Problem is, I don't really understand how it works, I'll have to trace through the code a few times. I've never really got my head around routine_id() and functions like call_back(), what I really need is a sort of dummies guide to "advanced" euphoria programming.


If all you want is a single directory, you can just use the dir_oldest_first() function from my example, and ignore walk_dir() altogether. I updated it below to move the routine_id() call to a constant, since I suspect it could allocate memory each time you call it. The routine_id is used as a sort of pointer-to-a-function, that you can use to tell another function to use this function for a certain operation. The custom_sort() is a great example of this - the sorting algorithm stays the same, but you can use a custom comparison function for the items being sorted. So to sort directory items by date, we need a function that compares the date fields from two directory items, and then use the routine_id of that function with custom_sort().

include sort.e 
include file.e 
 
function compare_by_date(sequence file1, sequence file2) 
-- compare two files by date, for custom_sort 
	return compare(file1[D_YEAR..D_SECOND], file2[D_YEAR..D_SECOND]) 
end function 
 
constant routine_id__compare_by_date = routine_id("compare_by_date") 
 
function dir_oldest_first(sequence path) 
-- Custom directory sorting function for walk_dir(). 
	object d 
	d = dir(path) 
	if atom(d) then 
	    return d 
	end if 
	return custom_sort(routine_id__compare_by_date, d) 
end function 
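
For the single-directory case, a minimal sketch of how the sorted listing might be used without walk_dir() follows. It repeats the two functions above so that it runs on its own, and it assumes 3.1.1-style includes. The procedure process_one is a hypothetical stand-in for whatever per-file work is needed, and the constant name is shortened here for readability.

```euphoria
include sort.e
include file.e

function compare_by_date(sequence file1, sequence file2)
-- compare two directory entries by date, for custom_sort
    return compare(file1[D_YEAR..D_SECOND], file2[D_YEAR..D_SECOND])
end function

constant COMPARE_BY_DATE_ID = routine_id("compare_by_date")

function dir_oldest_first(sequence path)
-- returns -1 if path cannot be read, else the entries sorted by date
    object d
    d = dir(path)
    if atom(d) then
        return d
    end if
    return custom_sort(COMPARE_BY_DATE_ID, d)
end function

-- Hypothetical stand-in for the real per-file processing.
procedure process_one(sequence item)
    printf(1, "%s (%d bytes)\n", {item[D_NAME], item[D_SIZE]})
end procedure

object files
files = dir_oldest_first(".")
if atom(files) then
    puts(1, "could not read directory\n")
else
    for i = 1 to length(files) do
        -- skip subdirectories, including "." and ".."
        if not find('d', files[i][D_ATTRIBUTES]) then
            process_one(files[i])
        end if
    end for
end if
```

The loop visits the oldest file first and works forward in time, which is the order asked for in the original post.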
 
hacker said...

I'm using version 3.1.1, I notice that in the ver 4.0 example walk_dir() has 4 parameters and not 3...

Based on Derek's example, the global variable my_dir used by walk_dir() went away in 4.0, and it is now the 4th parameter to walk_dir()


7. Re: Help needed with walk_dir()

PeteE said...
hacker said...

I'm using version 3.1.1, I notice that in the ver 4.0 example walk_dir() has 4 parameters and not 3...

Based on Derek's example, the global variable my_dir used by walk_dir() went away in 4.0, and it is now the 4th parameter to walk_dir()

Well, it's deprecated rather than removed. The my_dir approach still works, but it's no longer documented, and the optional 4th parameter is now the preferred way of doing this.


8. Re: Help needed with walk_dir()

PeteE said...

The routine_id is used as a sort of pointer-to-a-function, that you can use to tell another function to use this function for a certain operation. The custom_sort() is a great example of this - the sorting algorithm stays the same, but you can use a custom comparison function for the items being sorted. So to sort directory items by date, we need a function that compares the date fields from two directory items, and then use the routine_id of that function with custom_sort().

Thanks Peter, this is useful. Having never learned C or lower-level stuff my understanding of pointers is hazy, but I can see that it's needed at times to get the most out of Euphoria.


9. Re: Help needed with walk_dir()

hacker said...
PeteE said...

The routine_id is used as a sort of pointer-to-a-function...

... Having never learned C or lower-level stuff my understanding of pointers is hazy ...

You can think of routine ids as a kind of bookmark or placeholder; it's a way to call a routine when you don't know its name.

In Euphoria, every routine is given a number when the application is run. You can find out what that number is by using the routine_id() function, and you can call the 'anonymous' routine by using its number in the call_proc() or call_func() routine.

For example, the custom_sort function knows how to sort elements in a sequence but it doesn't know how to compare elements to work out which element should go before another element. Instead, it calls a routine that is written by you to get that information; but custom_sort does not know the name of your routine. So it calls your routine 'indirectly' using the call_func() routine with the routine id you initially passed to custom_sort().
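
The indirect call described above can be sketched in a few lines. The function names here (double_it, square_it, apply_to_all) are invented for illustration; only routine_id() and call_func() are built in.

```euphoria
-- Two ordinary functions with the same signature.
function double_it(atom x)
    return x * 2
end function

function square_it(atom x)
    return x * x
end function

-- Hypothetical helper: applies whichever function `id` refers to
-- to every element of s, without knowing that function's name.
function apply_to_all(integer id, sequence s)
    for i = 1 to length(s) do
        s[i] = call_func(id, {s[i]})
    end for
    return s
end function

? apply_to_all(routine_id("double_it"), {1, 2, 3})   -- prints {2,4,6}
? apply_to_all(routine_id("square_it"), {1, 2, 3})   -- prints {1,4,9}
```

apply_to_all plays the role of custom_sort() here: it owns the loop, and the caller supplies the behaviour through a routine id.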


10. Re: Help needed with walk_dir()

Derek,

You say that routine_id() is "sort of" a pointer to a function, is this how it is actually implemented in the C source code for Euphoria?

I have to confess this is making my head hurt, but I'd really like to understand it. I'm not going to attempt reading the source code, but I have a book on C which has been gathering dust on my shelf for years, time to take a look at it I think. I know that "The C Programming Language" is highly recommended, but maybe tough going for newbies like me...


11. Re: Help needed with walk_dir()

hacker said...

You say that routine_id() is "sort of" a pointer to a function, is this how it is actually implemented in the C source code for Euphoria?

It's a pointer in that it "points to" your routine. It does not refer to a particular place in memory (which is what is commonly meant when someone talks about a pointer). Effectively, you can think of the back end as maintaining a list of the routines, with the routine id being the index into that list. It's basically equivalent to:

 
procedure foo() 
    -- ...do stuff 
end procedure 
 
procedure bar() 
    -- ...do stuff 
end procedure 
 
procedure baz() 
    -- ...do stuff 
end procedure 
 
sequence ROUTINES = {"foo", "bar", "baz"} 
 
function my_routine_id( sequence name ) 
    return find( name, ROUTINES ) 
end function 
 
procedure my_call_proc( integer id ) 
    if id = 1 then 
        foo() 
    elsif id = 2 then 
        bar() 
    elsif id = 3 then 
        baz() 
    else 
        -- crash, bad routine id! 
    end if 
end procedure 
 
integer foo_id 
foo_id = my_routine_id("foo") 
 
my_call_proc( foo_id ) 
 
hacker said...

I have to confess this is making my head hurt, but I'd really like to understand it. I'm not going to attempt reading the source code, but I have a book on C which has been gathering dust on my shelf for years, time to take a look at it I think. I know that "The C Programming Language" is highly recommended, but maybe tough going for newbies like me...

You can also get a better feel by looking at the Euphoria-based back end. It's still pretty dense, and not terribly easy to jump right into, but it contains an implementation of routine ids written in pure Euphoria.

Matt

