Re: Reading comma delimited .csv data files
- Posted by ChrisBurch2 <crylex at freeuk.c?.?k> Jan 19, 2008
JAYBEEDEE wrote:
>
> CChris wrote:
> >
> > JAYBEEDEE wrote:
> > >
> > > I'm having difficulty in getting Eu to read multiple column data tables
> > > created via Excel and saved as comma delimited text files.
> > >
> > > I'm confused by the get(), gets(), getc() and value() commands.
> > >
> > > Using gets() I have no problem in reading data in a single column, but if
> > > there are more than one, then gets() reads each row (line) as a single
> > > element.
> > >
> > > For example, the csv file might be of the form:
> > >
> > > Month,,
> > > 6,,
> > > 8,,
> > > 1988,,
> > >
> > >
> > > 2,6,8
> > > 5,12,25.7
> > > .
> > > etc
> > >
> > > Note 2 blank lines after 1988,, and blank cells in the grid as ,,
> > >
> > > I would like to write the data into a sequence emulating 3 horizontal
> > > elements and n rows like
> > >
> > > data{{}{}{}} so that I can extract values using a statement like
> > > cell_value=data[row][col]
> > >
> > > So far I'm defeated! Any suggestions?
> >
> > Euphoria doesn't have standard routines to read formatted input, contrary to
> > most other languages. Use the strtok library by Kat (in the archive) to split
> > a text string using commas as delimiters.
> >
> > gets(your_file) will return say "8,," (a full line as a text string), and
> > applying the right function in strtok to this will split it to {"8","",""}
> > (a sequence of substrings some of which may be empty).
> >
> > Now, if any of these substrings is known to represent a number, you can call
> > value() to perform the conversion.
> >
> > FYI, win32lib has routines (w32split(), w32TextToNumber() and others) to do
> > the job.
> >
> > A simple splitting function could be coded like this:
> > }}} <eucode>
> > function split(sequence s)
> >     integer pos,prev_pos
> >     sequence result
> >
> >     pos=find(',',s)
> >     if not pos then
> >         return {s} -- no splitting took place
> >     end if
> >     result={}
> >     prev_pos=0
> >     while pos do
> >         result=append(result,s[prev_pos+1..pos-1]) -- another substring
> >         -- substrings may well be empty
> >         -- a substring extends from prev delim+1 to next delim-1
> >         -- find next
> >         prev_pos=pos
> >         pos=find_from(',',s,prev_pos+1)
> >     end while
> >     -- get tail substring and return the whole array
> >     return append(result,s[prev_pos+1..$])
> > end function
> > </eucode> {{{
> > CChris
>
> Thanks, Chris
>
> I had tried using find and find_from to slice up the strings but always got
> a zero result.
> I note that you included the comma in single quotes ',' whereas I used double
> quotes ",". Was this where I went wrong? There doesn't seem to be anything
> in the Euphoria Manual about this distinction.

Yes, I had a great deal of difficulty with that distinction too when I first
started with Eu. "," is a string, that is, a sequence (a list of things), while
',' is a single character. So "," is a list containing one comma, whereas ','
is just the comma itself. find() compares its first argument with each element
of the sequence it searches, and no element of a line of text is itself a list,
so find(",",s) always returns 0. To take it a small step further, ",," is
valid, but ',,' is invalid.

>
> Your code looks as if it should do the job, but I haven't tried it yet.
>
> Incidentally - is there an index or database of "include" files and the
> procedures they contain?

What a fantastic idea!

> Searching the Archives comes up with a lot of chat, and files with unhelpful
> titles such as "Routines I wish had been included with Euphoria", but no
> indication as to what they contain.
> Makes life hard for us newbies.

Persevere - the rewards are great.

Chris
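
For anyone following along, here is a rough sketch that ties the pieces of this thread together: the single-quote versus double-quote behaviour of find(), and a reader that builds a data[row][col] sequence from a comma delimited file. It is only an illustration under some assumptions. The names split_commas() and read_csv() and the file name "table.csv" are made up for this example and are not library routines, and treating every cell that value() accepts as a number is one possible choice, not the only one.

<eucode>
include get.e  -- value() and GET_SUCCESS

sequence sample
sample = "5,12,25.7"
? find(',', sample)   -- 2: ',' is one character, and it is an element of sample
? find(",", sample)   -- 0: "," is a sequence, and no element of sample is a sequence
? match(",", sample)  -- 2: match() looks for a sub-sequence, so a string works

-- Split one line at each comma; cells may be empty.
-- (hypothetical helper, same idea as the split() quoted above)
function split_commas(sequence s)
    sequence result
    integer start
    result = {}
    start = 1
    for i = 1 to length(s) do
        if s[i] = ',' then
            result = append(result, s[start..i-1])
            start = i + 1
        end if
    end for
    return append(result, s[start..length(s)])
end function

-- Read a whole file into a sequence of rows, each row a sequence of cells.
-- Returns -1 if the file cannot be opened. (hypothetical helper)
function read_csv(sequence filename)
    integer fn
    object line
    sequence data, cells, v

    fn = open(filename, "r")
    if fn = -1 then
        return -1
    end if
    data = {}
    while 1 do
        line = gets(fn)
        if atom(line) then
            exit  -- gets() returns -1 at end of file
        end if
        if length(line) and line[length(line)] = '\n' then
            line = line[1..length(line)-1]  -- drop the trailing newline
        end if
        cells = split_commas(line)
        for i = 1 to length(cells) do
            v = value(cells[i])
            if v[1] = GET_SUCCESS then
                cells[i] = v[2]  -- numeric cell: store the number
            end if               -- otherwise keep the text, e.g. "Month" or ""
        end for
        data = append(data, cells)
    end while
    close(fn)
    return data
end function

object table
table = read_csv("table.csv")  -- hypothetical file name
if sequence(table) then
    printf(1, "%d rows read\n", length(table))
end if
</eucode>

With this layout a cell is reached as table[row][col]. A completely blank line in the file comes back as a row holding a single empty cell, so check length(table[row]) before indexing a column if the file contains blank lines like the example above.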