1. dir()


Is it possible to know the size of the sequence dir() will return before i call dir()? Some of these dirs have 500,000+ files in them, and the memory Eu uses for dir() isn't returned to the OS when that var is cleared, pushing Eu to use 300 megabytes or more of memory. This is unacceptable.

useless

2. Re: dir()

useless said...


Is it possible to know the size of the sequence dir() will return before i call dir()? Some of these dirs have 500,000+ files in them, and the memory Eu uses for dir() isn't returned to the OS when that var is cleared, pushing Eu to use 300 megabytes or more of memory. This is unacceptable.

useless

Hmm, I do not have an answer to the question you have posed.

However, I think the reason the memory used by dir() is not released immediately is that the Windows binaries (eui.exe, euiw.exe, etc.) are compiled with MANAGED_MEM by default. (This is the default for two reasons: (a) it allows the binaries to run under Windows 9x, and (b) the caching speeds up programs even when running on Windows XP.)

Binaries using ESIMPLE_MALLOC (aka System Memory) should not exhibit this problem. (You can tell the difference by running euiw.exe with no parameters and checking to see if it outputs "Using Managed Memory" or "Using System Memory".)

I don't know where to get one, as afaik there are no prebuilt System Memory binaries. Someone with Windows would probably have to compile from source.

3. Re: dir()

Hi

I agree this is problematic. I thought the memory was returned to the system after use (hence no need to clear up allocated memory). Perhaps there should be an explicit command to clear and return memory allocated to these large sequences.

Chris

4. Re: dir()

ChrisB said...

Hi

I agree this is problematic. I thought the memory was returned to the system after use (hence no need to clear up allocated memory). Perhaps there should be an explicit command to clear and return memory allocated to these large sequences.

Chris

Well, the memory is kept in cache and reused, so if the sequence variable is cleared (with var = {}) and then reused on another dir(), we still only use 300MB instead of 600MB.

I'm not sure I like adding a Windows-specific keyword for this situation (it only happens on Windows afaik).

System memory does return the memory to the system immediately after use.

5. Re: dir()

useless said...


Is it possible to know the size of the sequence dir() will return before i call dir()? Some of these dirs have 500,000+ files in them, and the memory Eu uses for dir() isn't returned to the OS when that var is cleared, pushing Eu to use 300 megabytes or more of memory. This is unacceptable.

useless

The Euphoria 4 dir_size() function should do what you want. This will report the number of top level folders and files in a specified folder. This would add to run time but you would know the number of entries that would be returned by dir().

6. Re: dir()

LarryMiller said...

The Euphoria 4 dir_size() function should do what you want. This will report the number of top level folders and files in a specified folder. This would add to run time but you would know the number of entries that would be returned by dir().

But this won't help as dir_size() calls dir() (via default_dir() through walk_dir()).

7. Re: dir()

useless said...


Is it possible to know the size of the sequence dir() will return before i call dir()? Some of these dirs have 500,000+ files in them, and the memory Eu uses for dir() isn't returned to the OS when that var is cleared, pushing Eu to use 300 megabytes or more of memory. This is unacceptable.

Assuming that you're on Windows, I think you could use FindFirstFile and FindNextFile with wildcards.

Matt
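
A minimal sketch of that suggestion, for illustration only: it counts the entries in a folder with FindFirstFileA/FindNextFileA without ever building the large sequence that dir() returns. The helper name count_entries() is hypothetical, and the INVALID_HANDLE constant assumes 32-bit Euphoria on Windows, where c_func() returns C_POINTER results as unsigned values.

```euphoria
include dll.e
include machine.e

constant
    kernel32       = open_dll("kernel32.dll"),
    xFindFirstFile = define_c_func(kernel32, "FindFirstFileA", {C_POINTER, C_POINTER}, C_POINTER),
    xFindNextFile  = define_c_func(kernel32, "FindNextFileA", {C_POINTER, C_POINTER}, C_LONG),
    xFindClose     = define_c_func(kernel32, "FindClose", {C_POINTER}, C_LONG),
    FIND_DATA_SIZE = 320,        -- sizeof(WIN32_FIND_DATAA)
    INVALID_HANDLE = #FFFFFFFF   -- assumption: 32-bit Euphoria (unsigned C_POINTER results)

-- Hypothetical helper: returns the number of entries matching path\*
-- (including "." and ".." where Windows reports them), or -1 if the
-- search could not be started.
function count_entries(sequence path)
    atom pattern, data, handle, ok
    integer count

    if length(path) and path[$] != '\\' then
        path &= '\\'
    end if
    pattern = allocate_string(path & "*")
    data = allocate(FIND_DATA_SIZE)
    handle = c_func(xFindFirstFile, {pattern, data})
    free(pattern)

    count = -1
    if handle != INVALID_HANDLE then
        count = 1                               -- FindFirstFile already found one entry
        while c_func(xFindNextFile, {handle, data}) do
            count += 1                          -- only one entry is in memory at a time
        end while
        ok = c_func(xFindClose, {handle})
    end if
    free(data)
    return count
end function

printf(1, "%d entries\n", count_entries("C:\\Windows"))
```

This still walks the whole directory once, so it costs run time, but the only memory it needs is the single 320-byte find-data buffer.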

8. Re: dir()

I feel like I'm tooting my own horn here...

My win_dir routine will not consume lots of memory when "walking" a directory. It calls your routine each time it encounters a file, rather than collecting all the files first and then calling your routine for each one. Maximum memory usage is roughly 1-4 KB, give or take, depending on the length of the current path.

-Greg

9. Re: dir()

ghaberek said...

I feel like I'm tooting my own horn here...

My win_dir routine will not consume lots of memory when "walking" a directory. It calls your routine each time it encounters a file, rather than collecting all the files first and then calling your routine for each one. Maximum memory usage is roughly 1-4 KB, give or take, depending on the length of the current path.

-Greg



The key point that keeps me from using it is that it's *your* code that's "walking" the directory, not mine. Why doesn't win_dir() allow that, with a call such as:

thisdir = win_dir(path,ndxptr) 



to get the 1st ndxptr or the 250,346th one?

There are actually several apps that use this approach, which i found after Matt reminded me of the calls. It seems there are about a dozen Eu apps with variants on FindFirstFile(), and a dozen variants of that call in the WinAPI. The problem is, if you call their dir functions, they emulate the Eu built-in dir() and return the same massive structure.

I still have not jumped into Windows "call backs", and the instant i see anything containing "call back", i close the document. When i call code, i want something from it besides it recursively calling my code, for me to call it again, for it to call me, for me to call it, etc. In the DOS days, i very quickly ate up all the stack space doing things like that. It's the big reason i left Turbo Pascal for Euphoria, and i don't want to get into that pit again.

useless

10. Re: dir()

Kat said...
ghaberek said...

My win_dir routine will not consume lots of memory when "walking" a directory.

Why doesn't win_dir() allow that? such as:

thisdir = win_dir(path,ndxptr) 



to get the 1st ndxptr or the 250,346th one?

I'll start working on the Eu library to implement these ideas. Thanks for the suggestions.

11. Re: dir()



Derek,

win32tutor, issued in 1999 for win32lib 0.45r and Eu v2.2, has an example of FindFirstFileA, but oddly it is used only to get the timestamp of the one file in question. There's no FindNextFileA, and no _W versions of the API calls.

useless

12. Re: dir()

Hi, this is a function I wrote some time ago to remove a directory and all of its contents, including subdirectories. You may be able to adapt it to your needs (it wraps the ANSI APIs).

include machine.e 
include dll.e 
 
without warning 
 
constant 
    wKernel32           = open_dll("kernel32.dll"), 
    wDeleteFile         = define_c_func(wKernel32, "DeleteFileA", {C_POINTER}, C_LONG), 
    wFindFirstFile      = define_c_func(wKernel32, "FindFirstFileA", {C_POINTER, C_POINTER}, C_POINTER), 
    wFindNextFile       = define_c_func(wKernel32, "FindNextFileA", {C_POINTER, C_POINTER}, C_LONG), 
    wFindClose          = define_c_func(wKernel32, "FindClose", {C_POINTER}, C_LONG), 
    wRemoveDirectory    = define_c_func(wKernel32, "RemoveDirectoryA", {C_POINTER}, C_LONG), 
 
    sizeOfDWORD = 4, 
    sizeOfFILETIME = sizeOfDWORD * 2, 
    wMAX_PATH   = 260, 
     
    sizeOfWIN32_FIND_DATA = sizeOfDWORD * 7 + sizeOfFILETIME * 3 + wMAX_PATH + 14, 
    WIN32_FIND_DATA_dwFileAttributes = 0, 
    WIN32_FIND_DATA_cFileName =  sizeOfFILETIME * 3 + sizeOfDWORD * 5, 
 
    wFILE_ATTRIBUTE_DIRECTORY   = 16, 
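    -- c_func() returns C_POINTER results unsigned, so INVALID_HANDLE_VALUE (-1) 
    -- comes back as #FFFFFFFF on 32-bit Euphoria 
    wINVALID_HANDLE_VALUE       = #FFFFFFFF 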
 
atom rid_removeDirectoryCheckFile 
 
function peek_string(atom addr, atom maxbytes)  -- Read a 0 (null) delimited string from memory 
    sequence text 
    atom chr 
    maxbytes += addr 
    text = {} 
    if addr > 0 then 
        chr = peek(addr) 
        while chr != 0 and addr < maxbytes do 
            text &= chr 
            addr += 1 
            chr = peek(addr) 
        end while 
    end if 
    return text 
end function 
 
function poke_string(atom addr, atom maxbytes, sequence string) -- Poke a null terminated string @ address 
    atom len 
    len = length(string) 
    if addr != 0 and len < maxbytes then 
        poke(addr, string) 
        poke(addr+len,0) 
        return 1 
    end if 
    return 0 
end function 
 
-- It will stop processing on error returning 0 
-- name must be a path 
global function removeDirectory(sequence name) 
 
    atom fHandle, pwin32finddata, retval, pname, pfile 
    retval = 0 
     
    pname = allocate(wMAX_PATH) 
    -- Allocate a WIN32_FIND_DATA structure that's used by FindFile 
    pwin32finddata = allocate(sizeOfWIN32_FIND_DATA) 
     
    if pwin32finddata != 0 and pname != 0 then 
        mem_set(pwin32finddata,0,sizeOfWIN32_FIND_DATA) 
        if length(name) then 
            if name[$] != '\\' then 
                name &= "\\" 
            end if 
            pfile = allocate_string(name & "*") 
            fHandle = c_func(wFindFirstFile,{pfile, pwin32finddata})    -- There's always at least one directory "." 
            free(pfile) 
            if fHandle != wINVALID_HANDLE_VALUE then 
                retval = call_func(rid_removeDirectoryCheckFile, {name, pwin32finddata, pname}) 
                -- Keep going while deletions succeed and the directory still has content 
                while retval and c_func(wFindNextFile,{fHandle, pwin32finddata}) do 
                    retval = call_func(rid_removeDirectoryCheckFile, {name, pwin32finddata, pname}) 
                end while 
                -- Always close the find handle, even after a failed deletion 
                if not c_func(wFindClose, {fHandle}) then 
                    puts(1,"Error when calling FindClose in removeDirectory()\n") 
                end if 
                if poke_string(pname, wMAX_PATH, name) then 
                    retval = c_func(wRemoveDirectory,{pname}) 
                else 
                    retval = 0  -- The directory was not removed, so report failure 
                    puts(1,"poke_string failed in removeDirectory(). Data: " & name & "\n") 
                end if 
            end if 
        end if 
        free(pwin32finddata) 
        free(pname) 
    end if 
    return retval 
end function 
 
-- Used by removeDirectory. This is the routine that checks and performs deletion of files inside 
-- a directory. 
function removeDirectoryCheckFile(sequence oriname, atom pwin32finddata, atom pname) 
    sequence filename 
    atom retval 
    retval = 1 
    filename = peek_string(pwin32finddata+WIN32_FIND_DATA_cFileName, wMAX_PATH) 
    if not equal(filename,".") and not equal(filename,"..") then -- Skip special directories 
        -- If it's a directory, call removeDirectory recursively 
        if and_bits( 
                peek4u(pwin32finddata+WIN32_FIND_DATA_dwFileAttributes), 
                wFILE_ATTRIBUTE_DIRECTORY 
                ) != 0 then 
            retval = removeDirectory(oriname & filename) 
 
        -- Otherwise it's a regular file, so delete it 
        else 
            if poke_string(pname, wMAX_PATH, oriname & filename) then 
                retval = c_func(wDeleteFile,{pname}) 
            else 
                puts(1,"poke_string failed in removeDirectoryCheckFile(). Data: " & oriname & filename & "\n") 
            end if 
        end if 
    end if 
    return retval 
end function 
 
rid_removeDirectoryCheckFile = routine_id("removeDirectoryCheckFile") 
 
if wKernel32 = 0 then 
    puts(1,"Error in open_dll(\"kernel32.dll\")\n") 
    abort(1) 
end if 


Regards,

Guillermo

13. Re: dir()

useless said...

The key point that keeps me from using it is that it's *your* code that's "walking" the directory, not mine. Why doesn't win_dir() allow that, with a call such as:

thisdir = win_dir(path,ndxptr) 



to get the 1st ndxptr or the 250,346th one?

There are actually several apps that use this approach, which i found after Matt reminded me of the calls. It seems there are about a dozen Eu apps with variants on FindFirstFile(), and a dozen variants of that call in the WinAPI. The problem is, if you call their dir functions, they emulate the Eu built-in dir() and return the same massive structure.

I still have not jumped into Windows "call backs", and the instant i see anything containing "call back", i close the document. When i call code, i want something from it besides it recursively calling my code, for me to call it again, for it to call me, for me to call it, etc. In the DOS days, i very quickly ate up all the stack space doing things like that. It's the big reason i left Turbo Pascal for Euphoria, and i don't want to get into that pit again.

useless

The problem is that FindFirstFile and FindNextFile return a "find handle" and it's expected that one enumerates the entire directory, then closes the handle. I don't know if there's a way to get the number of files and jump around randomly in the list.

-Greg

14. Re: dir()

ghaberek said...

The problem is that FindFirstFile and FindNextFile return a "find handle" and it's expected that one enumerates the entire directory, then closes the handle. I don't know if there's a way to get the number of files and jump around randomly in the list.

-Greg



So i increment from 1 to 234,346, ok. Or i tell dir() to increment till it finds it, or i tell dir a range of indexes, or i give dir some wildcard. I would be using the existing tools made available in the last 10 years, i do not see the problem?

useless

15. Re: dir()

useless said...

So i increment from 1 to 234,346, ok. Or i tell dir() to increment till it finds it, or i tell dir a range of indexes, or i give dir some wildcard. I would be using the existing tools made available in the last 10 years, i do not see the problem?

useless

It seemed you wanted something like give_me_file_number( 234346 ), and I don't think that exists.

-Greg

16. Re: dir()

ghaberek said...
useless said...

So i increment from 1 to 234,346, ok. Or i tell dir() to increment till it finds it, or i tell dir a range of indexes, or i give dir some wildcard. I would be using the existing tools made available in the last 10 years, i do not see the problem?

useless

It seemed you wanted something like give_me_file_number( 234346 ), and I don't think that exists.

-Greg

I agree, such a function does not seem to exist.

Fortunately, kat only wanted a function that returned each file one at a time (and not any particular file or any particular order), which was an easier condition to fulfill.

17. Re: dir()

jimcbrown said...
ghaberek said...
useless said...

So i increment from 1 to 234,346, ok. Or i tell dir() to increment till it finds it, or i tell dir a range of indexes, or i give dir some wildcard. I would be using the existing tools made available in the last 10 years, i do not see the problem?

useless

It seemed you wanted something like give_me_file_number( 234346 ), and I don't think that exists.

-Greg

I agree, such a function does not seem to exist.

Fortunately, kat only wanted a function that returned each file one at a time (and not any particular file or any particular order), which was an easier condition to fulfill.



Correct.

Since the entries are not guaranteed to be in any sorted order when returned one by one, the best you could do is to ask for the next one.

useless
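
The "ask for the next one" approach could be sketched as a pull-style interface with no callbacks: the caller opens a search, asks for one name at a time, and closes the search when done. This is only an illustration under stated assumptions. The names find_open(), current_name(), find_next() and find_close() are hypothetical (they are not part of win_dir or the standard library), and INVALID_HANDLE assumes 32-bit Euphoria on Windows.

```euphoria
include dll.e
include machine.e

constant
    kernel32       = open_dll("kernel32.dll"),
    xFindFirstFile = define_c_func(kernel32, "FindFirstFileA", {C_POINTER, C_POINTER}, C_POINTER),
    xFindNextFile  = define_c_func(kernel32, "FindNextFileA", {C_POINTER, C_POINTER}, C_LONG),
    xFindClose     = define_c_func(kernel32, "FindClose", {C_POINTER}, C_LONG),
    FIND_DATA_SIZE = 320,        -- sizeof(WIN32_FIND_DATAA)
    NAME_OFFSET    = 44,         -- offset of cFileName within that structure
    MAX_NAME       = 260,        -- MAX_PATH
    INVALID_HANDLE = #FFFFFFFF   -- assumption: 32-bit Euphoria (unsigned C_POINTER results)

-- Read the name of the entry currently held in the find-data buffer.
-- state is {find handle, buffer address} as returned by find_open().
function current_name(sequence state)
    sequence text
    atom addr
    integer ch
    text = ""
    addr = state[2] + NAME_OFFSET
    ch = peek(addr)
    while ch != 0 and length(text) < MAX_NAME do
        text &= ch
        addr += 1
        ch = peek(addr)
    end while
    return text
end function

-- Start a search. Returns {handle, buffer} positioned on the first entry,
-- or -1 if the search could not be started.
function find_open(sequence path)
    atom pattern, data, handle
    if length(path) and path[$] != '\\' then
        path &= '\\'
    end if
    pattern = allocate_string(path & "*")
    data = allocate(FIND_DATA_SIZE)
    handle = c_func(xFindFirstFile, {pattern, data})
    free(pattern)
    if handle = INVALID_HANDLE then
        free(data)
        return -1
    end if
    return {handle, data}
end function

-- Advance to the next entry. Returns its name, or -1 when there are no more.
function find_next(sequence state)
    if c_func(xFindNextFile, {state[1], state[2]}) then
        return current_name(state)
    end if
    return -1
end function

-- Close the find handle and release the buffer.
procedure find_close(sequence state)
    atom ok
    ok = c_func(xFindClose, {state[1]})
    free(state[2])
end procedure

-- Usage: the caller drives the loop, one name at a time.
object state, name
state = find_open("C:\\Windows")
if sequence(state) then
    name = current_name(state)
    while sequence(name) do
        puts(1, name & "\n")
        name = find_next(state)
    end while
    find_close(state)
end if
```

Because the caller's own loop does the walking, nothing recurses and only one entry is held in memory at a time. Reaching the Nth entry still means calling find_next() N-1 times, since the find handle offers no random access.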
