1. Reading comma delimited .csv data files
- Posted by JAYBEEDEE <daviesjb at liv?ac.u?> Jan 18, 2008
- 613 views
I'm having difficulty in getting Eu to read multiple column data tables created via Excel and saved as comma delimited text files.

I'm confused by the get(), gets(), getc() and value() commands.

Using gets() I have no problem in reading data in a single column, but if there are more than one, then gets() reads each row (line) as a single element.

For example, the csv file might be of the form:

    Month,,
    6,,
    8,,
    1988,,


    2,6,8
    5,12,25.7
    .
    etc

Note 2 blank lines after 1988,, and blank cells in the grid as ,,

I would like to write the data into a sequence emulating 3 horizontal elements and n rows like

data{{}{}{}} so that I can extract values using a statement like cell_value=data[row][col]

So far I'm defeated! Any suggestions?
2. Re: Reading comma delimited .csv data files
- Posted by c.k.lester <euphoric at ckleste?.co?> Jan 18, 2008
- 600 views
JAYBEEDEE wrote:
> For example, the csv file might be of the form:
>
> Month,,
> 6,,
> 8,,
> 1988,,
>
>
> 2,6,8
> 5,12,25.7
> .
> etc

So should that data be like

    { { "Month", "" }
     ,{ 6, 0 }
     ,{ 8, 0 }
     ,{ 1988, 0 } }

Why is 1988 in the month column?! Anyway, this is very easy to do. Basically:
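    integer fn
    object line      -- gets() returns a sequence, or the atom -1 at end of file
    sequence cols, grid

    grid = {}
    fn = open("myExcelOutputFile.csv", "r")
    line = gets(fn)
    while sequence(line) do
        cols = parse(line, ",")  --<-- search Euphoria archive for this functionality
        grid = append(grid, cols)
        line = gets(fn)
    end while
    close(fn)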
Now you can get values with grid[row][col]. parse() takes a line of text and separates it using the separator you provide.
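As a rough, self-contained illustration of the loop above, here is a minimal sketch written in Euphoria 3.x style. It is not the parse() routine from the archive: parse_line() and read_csv() are hypothetical names invented for this example, and "sample.csv" is a made-up file that the snippet writes itself so that it can be run as is. It assumes plain comma-separated text with no quoted fields.

```euphoria
-- Hypothetical stand-in for parse(): split one line on a separator character.
function parse_line(sequence line, integer sep)
    sequence fields, field
    fields = {}
    field = ""
    for i = 1 to length(line) do
        if line[i] = sep then
            fields = append(fields, field)
            field = ""
        elsif line[i] != '\n' and line[i] != '\r' then
            field &= line[i]   -- gets() keeps the line terminator, so drop it
        end if
    end for
    return append(fields, field)
end function

-- Read a whole file into a sequence of rows; returns {} if it cannot be opened.
function read_csv(sequence file_name)
    integer fn
    object line
    sequence grid
    grid = {}
    fn = open(file_name, "r")
    if fn = -1 then
        return grid
    end if
    line = gets(fn)
    while sequence(line) do    -- gets() returns the atom -1 at end of file
        grid = append(grid, parse_line(line, ','))
        line = gets(fn)
    end while
    close(fn)
    return grid
end function

-- Write a small sample file (made-up name and contents), then read it back.
integer out
sequence grid
out = open("sample.csv", "w")
puts(out, "Month,,\n6,,\n\n2,6,8\n5,12,25.7\n")
close(out)

grid = read_csv("sample.csv")
puts(1, grid[5][3] & "\n")     -- prints 25.7
```

Every cell comes back as text, and in this sketch a blank line becomes a row holding a single empty string, so grid[row][col] should only be used after checking length(grid[row]).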
3. Re: Reading comma delimited .csv data files
- Posted by CChris <christian.cuvier at agric?lt?re.gouv.fr> Jan 18, 2008
- 621 views
JAYBEEDEE wrote:
> Using gets() I have no problem in reading data in a single column, but if there are
> more than one, then gets() reads each row (line) as a single element.

Euphoria doesn't have standard routines to read formatted input, contrary to most other languages. Use the strtok library by Kat (in the archive) to split a text string using commas as delimiters.

gets(your_file) will return say "8,," (a full line as a text string), and applying the right function in strtok to this will split it to {"8","",""} (a sequence of substrings some of which may be empty).

Now, if any of these substrings is known to represent a number, you can call value() to perform the conversion.

FYI, win32lib has routines (w32split(), w32TextToNumber() and others) to do the job.

A simple splitting function could be coded like this:
    function split(sequence s)
        integer pos, prev_pos
        sequence result

        pos = find(',', s)
        if not pos then
            return {s} -- no splitting took place
        end if
        result = {}
        prev_pos = 0
        while pos do
            -- another substring (substrings may well be empty);
            -- a substring extends from prev delim+1 to next delim-1
            result = append(result, s[prev_pos+1..pos-1])
            -- find next
            prev_pos = pos
            pos = find_from(',', s, prev_pos+1)
        end while
        -- get tail substring and return the whole array
        return append(result, s[prev_pos+1..$])
    end function
CChris
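The value() conversion mentioned in this reply could look roughly like the sketch below. value() and GET_SUCCESS come from the standard get.e include file. The helper name to_cell() and the sample row are assumptions made for illustration only. The idea is to turn a cell into a number when value() reports success and to keep the original text otherwise.

```euphoria
include get.e   -- provides value() and the GET_SUCCESS status code

-- Hypothetical helper: return a number if the text parses as one,
-- otherwise return the text unchanged (including empty cells).
function to_cell(sequence field)
    sequence result
    result = value(field)      -- {status, value}
    if result[1] = GET_SUCCESS and atom(result[2]) then
        return result[2]
    end if
    return field
end function

sequence row
row = {"5", "12", "25.7", "Month", ""}   -- made-up cells, already split
for i = 1 to length(row) do
    row[i] = to_cell(row[i])
end for

? row[1] + row[2]              -- prints 17
```

After the loop the first three elements are atoms, while "Month" and the empty cell are still sequences, so atom(row[i]) tells numeric and text cells apart.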
4. Re: Reading comma delimited .csv data files
- Posted by JAYBEEDEE <daviesjb at liv??c.uk> Jan 19, 2008
- 617 views
CChris wrote:
> A simple splitting function could be coded like this:

Thanks, Chris

I had tried using find and find_from to slice up the strings but always got a zero result. I note that you included the comma in single quotes ',' whereas I used double quotes ",". Was this where I went wrong? There doesn't seem to be anything in the Euphoria Manual about this distinction.

Your code looks as if it should do the job, but I haven't tried it yet.

Incidentally - is there an index or database of "include" files and the procedures they contain? Searching the Archives comes up with a lot of chat, and files with unhelpful titles such as "Routines I wish had been included with Euphoria", but no indication as to what they contain. Makes life hard for us newbies.
5. Re: Reading comma delimited .csv data files
- Posted by ChrisBurch2 <crylex at freeuk.c?.?k> Jan 19, 2008
- 626 views
JAYBEEDEE wrote:
> I had tried using find and find_from to slice up the strings but always got a zero result.
> I note that you included the comma in single quotes ',' whereas I used double quotes ",".
> Was this where I went wrong? There doesn't seem to be anything in the Euphoria Manual
> about this distinction.

Yes, I had a great deal of difficulty with that distinction too when I first started with Eu.

"," represents a string, or list of things, and ',' represents one thing. So "," is a list of one comma only, whereas ',' is only a comma. To take it a small step further, ",," is valid, but ',,' is invalid.

> Incidentally - is there an index or database of "include" files and the procedures
> they contain?

What a fantastic idea!

> Searching the Archives comes up with a lot of chat, and files with unhelpful titles
> such as "Routines I wish had been included with Euphoria", but no indication
> as to what they contain.
> Makes life hard for us newbies.

Persevere - the rewards are great.

Chris
6. Re: Reading comma delimited .csv data files
- Posted by CChris <christian.cuvier at ag?iculture.g?uv.fr> Jan 19, 2008
- 594 views
JAYBEEDEE wrote:
> I had tried using find and find_from to slice up the strings but always got a zero result.
> I note that you included the comma in single quotes ',' whereas I used double quotes ",".
> Was this where I went wrong? There doesn't seem to be anything in the Euphoria Manual
> about this distinction.

It obviously didn't help.

',' is a quoted character, which is a plain number, 44, just written in such a way that you don't need to look up an ASCII table or write Chr$(,) or whatever it is in Basic.

"," is a sequence, and is the same as {44}.

Since the line you retrieve from gets() is made of byte sized integers, you should represent , by a byte sized integer, and accordingly write it either ',' or 44.

You'll find this topic covered in section 2.1.2, "Character strings and individual characters", in the reference manual.

> Incidentally - is there an index or database of "include" files and the procedures
> they contain?
> Searching the Archives comes up with a lot of chat, and files with unhelpful titles
> such as "Routines I wish had been included with Euphoria", but no indication
> as to what they contain.
> Makes life hard for us newbies.
This issue has been raised from time to time, but nothing has been done so far. With like 2,000 entries currently in the Archive, this has to be a pretty big community job - with the support of RDS -, and requires a very orderly follow-up as updates or new files come in every other day. But that project would have its uses indeed. CChris
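On the single- versus double-quote question answered earlier in this post, a few lines make the difference visible. This is only an illustrative sketch: the sample string is made up, and the calls are the built-in find(), match() and equal().

```euphoria
sequence line
line = "2,6,8"          -- made-up sample row, as gets() would return it (minus the newline)

? ',' = 44              -- prints 1: a quoted character is just a number
? equal(",", {44})      -- prints 1: a quoted string is a sequence of numbers
? find(',', line)       -- prints 2: looks for the element 44 in line
? find(",", line)       -- prints 0: no element of line is the sequence {44}
? match(",", line)      -- prints 2: looks for the sub-sequence {44} inside line
```

So find() with ',' and match() with "," both locate the comma. find() with "," returns 0 because it compares the one-element sequence against each individual character.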
7. Re: Reading comma delimited .csv data files
- Posted by Robert Craig <rds at RapidEu?hori?.com> Jan 19, 2008
- 626 views
CChris wrote:
> JAYBEEDEE wrote:
> > Incidentally - is there an index or database of "include" files and the procedures
> > they contain?
> > Searching the Archives comes up with a lot of chat, and files with unhelpful titles
> > such as "Routines I wish had been included with Euphoria", but no indication
> > as to what they contain.
> > Makes life hard for us newbies.
>
> This issue has been raised from time to time, but nothing has been done so far.

Actually I did something about it a couple of years ago, but the information got buried in the "Extra Stuff from RDS" section. unzip.txt gets automatically updated at the end of each month. I've now copied the link to the main page in the Search area on the right side. See the new link:

"Search 1,700 contributed programs (files contained in .zip/.tar)"
http://www.rapideuphoria.com/unzip.txt

This is just file names though, not routine names. The .zips/.tgz's are in alphabetical order. You can also see the file dates and sizes, so you can try to find the latest version of an include file.

> With like 2,000 entries currently in the Archive, this has to be a pretty big
> community job - with the support of RDS -, and requires a very orderly follow-up
> as updates or new files come in every other day. But that project would have
> its uses indeed.

In addition to the normal *file description* search,
http://www.rapideuphoria.com/archive.htm
there is also Aku's source file *content* search for the whole Archive (also on the main Euphoria page). It seems to be about one year out of date:

http://www.kejut.com/prog/eusearch.php?aksi=cari&carian=parse&Submit=EuSearch&filterJenisE=1

Maybe Aku (are you out there?) can tell us how to maintain this.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
8. Re: Reading comma delimited .csv data files
- Posted by Aku <akusaya at ?mx?net> Jan 20, 2008
- 598 views
Robert Craig wrote:
> In addition to the normal *file description* search,
> http://www.rapideuphoria.com/archive.htm
> there is also Aku's source file *content* search for the
> whole Archive, (also on the main Euphoria page).
> It seems to be about one year out of date:
>
> http://www.kejut.com/prog/eusearch.php?aksi=cari&carian=parse&Submit=EuSearch&filterJenisE=1
>
> Maybe Aku (are you out there?) can tell us how to maintain this.

Hi!

Wow, time flies so fast, I didn't realize it has been more than one year since the last update. I thought it was just several months ago.

So I have just updated it from the archive :)

I also added a new feature, which is grouping of duplicate files. Therefore, the same files in different archives (contributions) will only be shown once, but the file names in which the keywords appear will be shown and can be opened.

Actually what I did was:
1. mirror www.rapideuphoria.com using wget
2. extract all zip, tgz, tar, rar files
3. put all file contents to mysql database
4. put a fulltext index on the file contents

Is it possible for someone to maintain this?
9. Re: Reading comma delimited .csv data files
- Posted by Robert Craig <rds at Rap?dEuphoria.co?> Jan 20, 2008
- 616 views
Aku wrote:
> Wow, time flies so fast, I didn't realize it has been more than one year
> since last update. I thought it was just several months ago.
>
> So I have just updated it from the archive :)

Thanks.

> I also added a new feature which is grouping of duplicate files.
> Therefore, same files in different archive (contributions) will only be
> shown once, but the file names in which the keywords appear will be shown
> and can be opened.
>
> Actually what I did was:
> 1. mirror www.rapideuphoria.com using wget
> 2. extract all zip, tgz, tar, rar files
> 3. put all file contents to mysql database
> 4. put a fulltext index on the file contents

Great.

> Is it possible for someone to maintain this?

I hope so, but if no one steps forward in the next month or so, maybe I can develop yet another search facility for the site, perhaps using a Euphoria database instead of SQL, and maybe adapting the EUforum message search or the contributed programs search.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
10. Re: Reading comma delimited .csv data files
- Posted by Kat <kat12 at co?sahs.n?t> Jan 21, 2008
- 625 views
- Last edited Jan 22, 2008
JAYBEEDEE wrote:
> I'm having difficulty in getting Eu to read multiple column data tables created via Excel
> and saved as comma delimited text files.

Strtok was made for this task. It can retrieve, insert, find, match, and sort such records.

Kat