Re: Reading comma delimited .csv data files

JAYBEEDEE wrote:
> 
> CChris wrote:
> > 
> > JAYBEEDEE wrote:
> > > 
> > > 
> > > I'm having difficulty in getting Eu to read multiple column data tables
> > > created via Excel and saved as comma delimited text files.
> > > 
> > > I'm confused by the get(), gets(), getc() and value() commands.
> > > 
> > > Using gets() I have no problem in reading data in a single column, but if
> > > there is more than one, then gets() reads each row (line) as a single element.
> > > 
> > > For example, the csv file might be of the form:
> > > 
> > > Month,,
> > > 6,,
> > > 8,,
> > > 1988,,
> > > 
> > > 
> > > 2,6,8
> > > 5,12,25.7
> > > . 
> > > etc
> > > 
> > > Note 2 blank lines after 1988,, and blank cells in the grid as ,,
> > > 
> > > I would like to write the data into a sequence emulating 3 horizontal
> > > elements and n rows like
> > > 
> > > data{{}{}{}}  so that I can extract values using a statement like
> > > cell_value=data[row][col]
> > > 
> > > So far I'm defeated!  Any suggestions?
> > 
> > Euphoria doesn't have standard routines to read formatted input, contrary to
> > most other languages. Use the strtok library by Kat (in the archive) to split
> > a text string using commas as delimiters.
> > 
> > gets(your_file) will return say "8,," (a full line as a text string), and
> > applying the right function in strtok to this will split it to {"8","",""}
> > (a sequence of substrings, some of which may be empty).
> > 
> > Now, if any of these substrings is known to represent a number, you can call
> > value() to perform the conversion.
> > 
> > FYI, win32lib has routines (w32split(), w32TextToNumber() and others) to do
> > the job.
> > 
> > A simple splitting function could be coded like this:
> > }}}
<eucode>
> > function split(sequence s)
> >   integer pos,prev_pos
> >   sequence result
> > 
> >   pos=find(',',s)
> >   if not pos then 
> >     return {s} -- no splitting took place
> >   end if 
> >   result={}
> >   prev_pos=0
> >   while pos do
> >     result=append(result,s[prev_pos+1..pos-1]) -- another substring
> >     -- substrings may well be empty: a substring extends from
> >     -- prev delim+1 to next delim-1
> >     -- find next
> >     prev_pos=pos
> >     pos=find_from(',',s,prev_pos+1)
> >   end while
> > -- get tail substring and return the whole array
> >   return append(result,s[prev_pos+1..$]) 
> > end function
> > </eucode>
{{{

> > CChris
> Thanks, Chris
> 
> I had tried using find and find_from to slice up the strings but always got
> a zero result. 
> I note that you included the comma in single quotes ',' whereas I used double
> quotes ",".  Was this where I went wrong? There doesn't seem to be anything
> in the Euphoria Manual about this distinction.

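Yes, I had a great deal of difficulty with that distinction too when I first
started with Eu. "," is a string, that is, a sequence (a list of things), and ','
is one thing, a single character (an atom). So "," is a list containing one comma
only, whereas ',' is just the comma itself. find() looks for an element of the
string equal to whatever you give it, and no element of a line of text is itself
a list, so find(",", line) always returns zero.
To take it a small step further, ",," is valid, but ',,' is invalid.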
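
To make that concrete, here is a small sketch (the variable name line is only for illustration) of what find() returns for each kind of quote. It also shows match(), which is the routine that takes a string as its search pattern:

<eucode>
sequence line
line = "5,12,25.7"

? find(',', line)   -- prints 2: ',' is an atom, equal to the element line[2]
? find(",", line)   -- prints 0: "," is the sequence {','}; no element of line is a sequence
? match(",", line)  -- prints 2: match() looks for "," as a slice of line
</eucode>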


> 
> Your code looks as if it should do the job, but I haven't tried it yet.
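
For what it's worth, here is a rough sketch of how it might be put together to get the data[row][col] layout you describe. Treat it as a starting point under some assumptions rather than a finished routine: read_csv() and the file name are made up for the example, split() is a compact copy of the function quoted above, and each field that value() accepts as a number is converted while everything else (text such as Month, or an empty cell) is left as a string.

<eucode>
include get.e   -- value() and GET_SUCCESS

-- compact copy of the split() quoted above, so this sketch stands alone
function split(sequence s)
    integer pos, prev_pos
    sequence result
    result = {}
    prev_pos = 0
    pos = find(',', s)
    while pos do
        result = append(result, s[prev_pos+1..pos-1])
        prev_pos = pos
        pos = find_from(',', s, prev_pos+1)
    end while
    return append(result, s[prev_pos+1..$])
end function

-- hypothetical helper: returns one sequence per line of the file
function read_csv(sequence file_name)
    integer fn
    object line
    sequence data, fields, v

    data = {}
    fn = open(file_name, "r")
    if fn = -1 then
        return data              -- file could not be opened
    end if
    while 1 do
        line = gets(fn)
        if atom(line) then       -- gets() returns -1 at end of file
            exit
        end if
        -- gets() keeps the newline at the end of the line, so drop it
        if length(line) and line[$] = '\n' then
            line = line[1..$-1]
        end if
        fields = split(line)
        for i = 1 to length(fields) do
            v = value(fields[i])
            if v[1] = GET_SUCCESS then
                fields[i] = v[2] -- numeric field: store the number
            end if
        end for
        data = append(data, fields)
    end while
    close(fn)
    return data
end function

sequence data
data = read_csv("mydata.csv")    -- hypothetical file name
if length(data) then
    ? data[1]                    -- first row of the table
end if
</eucode>

One thing to watch: a completely blank line comes back as a row holding a single empty string, so check length(data[row]) before asking for data[row][col].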
> 
> Incidentally - is there an index or database of "include" files and the
> procedures they contain?

What a fantastic idea!

> Searching the Archives comes up with a lot of chat, and files with unhelpful
> titles such as "Routines I wish had been included with Euphoria", but no
> indication as to what they contain.
> Makes life hard for us newbies.

Persevere - the rewards are great.

Chris
