Euphoria Ticket #324: Gather include files

We need a utility to gather all of the files used by a program. At minimum it should generate a list, but better still would be to copy them all into one directory, with subdirectories as appropriate. It would probably be useful to exclude the standard library by default, and to let the user exclude specific files.

This would make distributing programs with all of their dependencies much easier, as well as submitting code for bug reports.
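
The listing half of this request comes down to recognising include statements. A minimal sketch, assuming unquoted file names and a single space after the keyword; include_target is a hypothetical helper name, not part of any Euphoria library:

```euphoria
include std/text.e    -- trim()

-- Hypothetical helper: return the file named by an include statement,
-- or "" when the line is not an include statement.
function include_target(sequence line)
    sequence s = trim(line)
    -- "public include x.e" is handled by dropping the scope keyword first
    if match("public ", s) = 1 then
        s = trim(s[8..$])
    end if
    if match("include ", s) != 1 then
        return ""
    end if
    s = trim(s[9..$])
    -- keep only the file name; drop any "as ns" clause or trailing comment
    integer sp = find(' ', s)
    if sp then
        s = s[1..sp-1]
    end if
    return s
end function

puts(1, include_target("include foo/bar.e as bar\n") & "\n")
```

Calling this on every line of a source file, and then recursively on each file it names, would give the list of dependencies. Quoted file names and tab-separated statements are deliberately left out of the sketch.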

As for subdirectories, if the program did:

-- myapp.ex 
include foo/bar.e 

...then the utility would go something like:

$ eudist myapp.ex 
Euphoria distribution helper v1.0 
Outputting files to directory: myapp 
...or something similar. Basically, we should be able to distribute that directory, and anyone with a working Euphoria install should be able to run it.
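
Keeping the subdirectory part of an include name is mostly a matter of path arithmetic. A rough sketch, assuming relative include names only; dest_path is a hypothetical helper, and creating the intermediate directories is left out:

```euphoria
include std/filesys.e    -- SLASH

-- Hypothetical helper: map an include name such as "foo/bar.e" to a path
-- under the output directory, keeping the subdirectory part.
function dest_path(sequence out_dir, sequence include_name)
    sequence rel = include_name
    -- normalise both kinds of separator to the platform's own
    for i = 1 to length(rel) do
        if rel[i] = '/' or rel[i] = '\\' then
            rel[i] = SLASH
        end if
    end for
    return out_dir & SLASH & rel
end function

puts(1, dest_path("myapp", "foo/bar.e") & "\n")
```

With the example above, bar.e would land in a foo subdirectory of myapp, so the include statement in myapp.ex would keep working unchanged.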

We'd probably also need a way to specify other resources that need to be included.


Type: Feature Request
Severity: Normal
Category: Bundled Utility
Assigned To: jimcbrown
Status: Fixed
Reported Release:
Fixed in SVN #: 3860, 3880
View VCS: 3860, 3880
Milestone: 4.0.0RC1

1. Comment by euphoric Nov 05, 2010

Tone Skoda did something like this. See "Copy Included Files".

2. Comment by jimcbrown Nov 05, 2010

I have a working parser that will print the required files.

It still needs some polishing.

--global scopes aren't redefined 
    --this MIGHT cause an error, however, euphoria should normally have an 
    --error in that case too. 
--procedure-scope vars aren't saved, this makes code more readable 
    --N/A if shroud is on 
--you can either choose not to combine files in a list, or 
    --you can ignore include files on 1 dir (default: eu's include dir) 
--shrouding symbols is optional. 
--bug fixed: local symbols were being treated as global. 
--bug fixed: strings with \x were changed to \x\x. 
--bug fixed: c:\myfile.e and myfile.e were treated as different files. 
--added platform to sequence builtin 
--bug: using routine_id("abort_") gives error 
    --"Unable to resolve routine id for abort" 
    --to fix made routine_id-ing of builtin routines legal. 
-- concat.ex 
-- version 1.0 
-- replacement for Euphoria's bind routine 
include std/get.e 
include std/io.e 
include std/text.e 
include std/console.e 
include std/sequence.e 
include std/filesys.e 
-- code from RDS's ED color coding routines 
sequence charClass 
-- character classes 
constant 
    DIGIT       = 1, 
    OTHER       = 2, 
    LETTER      = 3, 
    BRACKET     = 4, 
    QUOTE       = 5, 
    DASH        = 6, 
    WHITE_SPACE = 7, 
    NEW_LINE    = 8 
    charClass = repeat( OTHER, 255 ) 
    charClass['a'..'z'] = LETTER 
    charClass['A'..'Z'] = LETTER 
    charClass['_']      = LETTER 
    charClass['0'..'9'] = DIGIT 
    charClass['[']      = BRACKET 
    charClass[']']      = BRACKET 
    charClass['(']      = BRACKET 
    charClass[')']      = BRACKET 
    charClass['{']      = BRACKET 
    charClass['}']      = BRACKET 
    charClass['\'']     = QUOTE 
    charClass['"']      = QUOTE 
    charClass[' ']      = WHITE_SPACE 
    charClass['\t']     = WHITE_SPACE 
    charClass['\n']     = NEW_LINE 
    charClass['-']      = DASH 
stdlib40 = { 
oldstdlib = { 
euphorialib = { 
sequence 
    included = {}, 
    includedNewNames = {}, 
    excludedIncludes = {} 
object outputDir=-1 
constant 
    EuPlace = getenv( "EUDIR" ) 
    --Place = { "", EuPlace & "\\", EuPlace & "\\INCLUDE\\" } 
sequence Place = { current_dir()&SLASH, EuPlace & SLASH, EuPlace & SLASH&"include"&SLASH, "" }, 
mainPath = "" 
function findFile( sequence fName ) 
    -- returns where a file is 
    -- looks in the usual places 
    if find(fName[length(fName)], {10, 13}) then 
	fName = fName[1..length(fName)-1] 
    end if 
    for i = 1 to length( Place ) do 
	if sequence( dir( Place[i] & fName ) ) then 
	    -- platform() = 3 is LINUX, where file names are case-sensitive 
	    if platform() = 3 then 
		return Place[i] & fName 
	    else 
		return upper( Place[i] & fName ) 
	    end if 
	end if 
    end for 
    printf( 1, "Unable to locate file %s.\n", {fName} ) 
    -- a function may not fall off its end without returning, so stop here 
    abort( 1 ) 
end function 
function getIncludeName( sequence data ) 
    -- if the statement is an include statement, return the file name 
    integer at 
    -- include statement missing? 
    if not match( "include ", data ) then 
	return {"","",""} 
    end if 
    -- trim white space 
    while charClass[ data[1] ] = WHITE_SPACE do 
	data = data[2..length( data ) ] 
    end while       
    -- line feed? 
    if find( '\n', data ) then 
	data = data[1..length(data)-1] 
    end if 
    if find( '\r', data ) then 
	data = data[1..length(data)-1] 
    end if 
    sequence includeType 
    -- not first statement? 
    if equal( data[1..8], "include " ) then 
	-- remove statement 
	includeType = data[1..8] 
	data = data[9..length(data)] 
    elsif length(data) > 15 and equal( data[1..15], "public include " ) then 
	-- remove statement 
	includeType = data[1..15] 
	data = data[16..length(data)] 
    else 
	-- not an include statement 
	return {"","",""} 
    end if 
    sequence nameSpace = "" 
    -- remove data after space 
    at = find( ' ', data ) 
    if at then 
	nameSpace = data[at..$] 
	data = data[1..at-1] 
    end if 
    return {data,nameSpace,includeType} 
end function 
function trimer(sequence s) 
    sequence t 
    integer u 
    if s[length(s)] = '\n' then 
	s = s[1..length(s)-1] 
    end if 
    if s[length(s)] = '\r' then 
	s = s[1..length(s)-1] 
    end if 
    --t = reverse(s) 
    --u = find(SLASH, t) 
    --if not u then 
	--return s 
    --end if 
    --t = t[1..u-1] 
    --s = reverse(t) 
    return s 
end function 
function includable(sequence name) 
    return not find(name, excludedIncludes) 
end function 
without warning 
function parseFile( sequence fName ) 
    integer inFile, outFile 
    sequence newIncludeName, includeName, newfName, nameSpace, includeType 
    object data 
	included = append( included, fName ) 
    -- find the file 
    fName = findFile( fName ) 
    inFile = open( fName, "r" ) 
    newfName = filename(fName) 
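    if sequence(outputDir) then 
	-- copying is enabled: pick an output name that is not already taken 
	while file_exists( outputDir & SLASH & newfName ) do 
	    newfName &= sprintf("%d", rand(10)) 
	end while 
	outFile = open( outputDir & SLASH & newfName, "w" ) 
    else 
	-- list-only mode: nothing is written 
	outFile = -1 
    end if 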
	includedNewNames = append( includedNewNames, newfName ) 
    while 1 do         
	-- read a line 
	data = gets( inFile ) 
	-- end of file? 
	if integer( data ) then 
	    exit 
	end if 
	-- include file? 
	includeName = getIncludeName( data ) 
	includeType = includeName[3] 
	nameSpace = includeName[2] 
	includeName = includeName[1] 
	if length( includeName ) and includable(trimer(includeName)) then 
	    -- already part of the file? 
	    if find( includeName, included ) then 
		newIncludeName = includedNewNames[find(includeName, included)] 
	    else 
		-- include the file 
		newIncludeName = parseFile( includeName ) 
	    end if 
	    -- rewrite the include statement to use the (possibly renamed) copy 
	    if outFile != -1 then 
		puts( outFile, includeType & newIncludeName & nameSpace & "\n") 
	    end if 
	else 
	    -- ordinary line: copy it through unchanged 
	    if outFile != -1 then 
		puts( outFile, data ) 
	    end if 
	end if 
    end while 
    close( inFile ) 
    if outFile != -1 then 
    close( outFile ) 
    end if 
    return newfName 
end function 
with warning 
function getListOfFiles(sequence dir) 
	-- TODO implement XXX 
	return "" 
end function 
procedure run()    
    object cmd, inFileName 
    inFileName = -1 
    -- read the command line 
    cmd = command_line() 
    for i = 3 to length(cmd) do 
    	if equal(cmd[i], "-i") then 
		if i = length(cmd) then 
			puts(1, "Expected filename to follow -i!\n") 
			abort(1) 
		else 
			inFileName = cmd[i+1] 
		end if 
    	elsif equal(cmd[i], "-d") then 
		if i = length(cmd) then 
			puts(1, "Expected output dir to follow -d!\n") 
			abort(1) 
		else 
			outputDir = cmd[i+1] 
		end if 
    	elsif equal(cmd[i], "-e") then 
		if i = length(cmd) then 
			puts(1, "Expected excluded include file to follow -e!\n") 
			abort(1) 
		else 
			excludedIncludes = append(excludedIncludes, cmd[i+1]) 
		end if 
    	elsif equal(cmd[i], "-ed") then 
		if i = length(cmd) then 
			puts(1, "Expected excluded include dir to follow -ed!\n") 
			abort(1) 
		else 
			excludedIncludes &= getListOfFiles(cmd[i+1]) 
		end if 
    	elsif equal(cmd[i], "--no-copy") or 
    	      equal(cmd[i], "-nc") then 
	      	outputDir = -1 
	end if 
    end for 
    -- get input file 
    if atom(inFileName) then 
	inFileName = prompt_string( "File to parse? " ) 
	if length( inFileName ) = 0 then 
	    -- nothing entered: quit quietly 
	    abort(0) 
	end if 
    end if 
    mainPath = pathname(canonical_path(inFileName)) 
    Place &= {mainPath&SLASH} 
    -- process the input file 
    parseFile( inFileName ) 
    printf(1, "%d files were found. These are:\n", {length(included)}) 
    for i = 1 to length(included) do 
    	printf(1, "%s\n", {included[i]}) 
    end for 
end procedure 

run() 

3. Comment by jeremy Nov 05, 2010

It would be nice if all of our bundled utilities used the new std/cmdline.e, so that they share a common feel and operation. This would also provide -help support automatically, which any bundled utility should include.
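
For illustration, a sketch of what option handling through std/cmdline.e might look like. The option names and descriptions below are made up for this example and are not the ones the utility actually uses; the record layout (short name, long name, description, flags, callback) follows my reading of the 4.0 std/cmdline.e documentation and should be checked against it:

```euphoria
include std/cmdline.e    -- cmd_parse(), HAS_PARAMETER
include std/map.e        -- map:get()

-- Illustrative option table only; -1 means "no callback".
constant OPTS = {
    { "d", "directory", "Output directory", { HAS_PARAMETER, "dir" }, -1 },
    { "e", "exclude-file", "Exclude a file", { HAS_PARAMETER, "file" }, -1 }
}

-- cmd_parse() reads command_line() by default and returns a map
-- keyed by the long option names.
map:map args = cmd_parse(OPTS)
object out_dir = map:get(args, "directory", -1)

if sequence(out_dir) then
    printf(1, "Outputting files to directory: %s\n", {out_dir})
else
    puts(1, "No output directory given; listing only.\n")
end if
```

The main gain over a hand-written loop is that missing parameters and help output are handled by the library rather than by each utility.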

4. Comment by jimcbrown Nov 05, 2010

Agreed. I'll assign that task to jimcbrown immediately.

5. Comment by DerekParnell Nov 06, 2010

Moved to RC1

6. Comment by jimcbrown Nov 06, 2010

I'm done working on this and marking it as fixed.

What it does not do:

Automatically exclude the stdlib. Right now you can manually exclude the stdlib dir just like any other directory. Also, it's not clear what should be excluded - just the std directory? All of include? What about include/euphoria? Finally, there are cases where you'll want the user to include the version of the stdlib that they are using (in case it's a very old version no longer easily obtainable, or one that might have local modifications, for example).

Copy subdirectory structure. All files are copied flat, and a file is renamed if its name conflicts with an earlier one. I like having a flat option, but otherwise have no objection to this feature. It's just more effort than it's worth.

Use eu.cfg to specify include locations. Only EUINC/EUDIR and the -I option are supported. Again, this is more effort than it's worth, especially since reusing pathopen.e does not look possible (it has too many dependencies on other parts of the interpreter).
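
To make the flat-copy renaming concrete, here is a small sketch of conflict handling. It is a deterministic variant that appends a counter, not the random-digit approach in the code posted earlier; flat_name is a hypothetical helper, and like that code it puts the suffix after the extension:

```euphoria
-- Hypothetical helper: pick a flat output name that does not collide
-- with any name already used in the output directory.
function flat_name(sequence name, sequence used)
    sequence candidate = name
    integer n = 1
    while find(candidate, used) do
        candidate = sprintf("%s%d", {name, n})
        n += 1
    end while
    return candidate
end function

sequence used = {}
used = append(used, flat_name("bar.e", used))   -- first bar.e keeps its name
used = append(used, flat_name("bar.e", used))   -- second one gets a suffix
for i = 1 to length(used) do
    puts(1, used[i] & "\n")
end for
```

Because a renamed file no longer matches the name in the original include statement, the include lines in the copied sources have to be rewritten to match, which is what the posted parser does when it writes each file out.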


