Re: Reading comma delimited .csv data files


JAYBEEDEE wrote:
> 
> CChris wrote:
> > 
> > JAYBEEDEE wrote:
> > > 
> > > 
> > > I'm having difficulty in getting Eu to read multiple column data tables
> > > created via Excel and saved as comma delimited text files.
> > > 
> > > I'm confused by the get(), gets(), getc() and value() commands.
> > > 
> > > Using gets() I have no problem in reading data in a single column, but if
> > > there are more than one, then gets() reads each row (line) as a single
> > > element.
> > > 
> > > For example, the csv file might be of the form:
> > > 
> > > Month,,
> > > 6,,
> > > 8,,
> > > 1988,,
> > > 
> > > 
> > > 2,6,8
> > > 5,12,25.7
> > > . 
> > > etc
> > > 
> > > Note 2 blank lines after 1988,, and blank cells in the grid as ,,
> > > 
> > > I would like to write the data into a sequence emulating 3 horizontal
> > > elements and n rows like
> > > 
> > > data{{}{}{}}  so that I can extract values using a statement like
> > > cell_value=data[row][col]
> > > 
> > > So far I'm defeated!  Any suggestions?
> > 
> > Euphoria doesn't have standard routines to read formatted input, contrary to
> > most other languages. Use the strtok library by Kat (in the archive) to
> > split a text string using commas as delimiters.
> > 
> > gets(your_file) will return say "8,," (a full line as a text string), and
> > applying the right function in strtok to this will split it to {"8","",""}
> > (a sequence of substrings, some of which may be empty).
> > 
> > Now, if any of these substrings is known to represent a number, you can call
> > value() to perform the conversion.
> > 
> > FYI, win32lib has routines (w32split(), w32TextToNumber() and others) to do
> > the job.
> > 
> > A simple splitting function could be coded like this:
> > }}}
<eucode>
> > function split(sequence s)
> >   integer pos,prev_pos
> >   sequence result
> > 
> >   pos=find(',',s)
> >   if not pos then 
> >     return {s} -- no splitting took place
> >   end if 
> >   result={}
> >   prev_pos=0
> >   while pos do
> >     -- a substring extends from prev delim+1 to next delim-1;
> >     -- substrings may well be empty
> >     result=append(result,s[prev_pos+1..pos-1]) -- another substring
> >     -- find next
> >     prev_pos=pos
> >     pos=find_from(',',s,prev_pos+1)
> >   end while
> > -- get tail substring and return the whole array
> >   return append(result,s[prev_pos+1..$]) 
> > end function
> > </eucode>
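
As a rough sketch only, this is one way split() could be used to build the data[row][col] table asked for above. The names read_csv and data.csv are made up for illustration, and the sketch assumes value() from get.e for the numeric conversion. Cells that don't parse as a number are left as strings, and a blank line comes back as the one-element row {""}, so check length(data[row]) before indexing a column.

<eucode>
include get.e  -- value() and GET_SUCCESS

-- Assumes the split() function shown above is defined earlier in the file.
-- read_csv is a hypothetical helper name, not a library routine.
function read_csv(sequence file_name)
    integer fn
    object line
    sequence data, fields, v

    fn = open(file_name, "r")
    if fn = -1 then
        return {}  -- file could not be opened
    end if

    data = {}
    while 1 do
        line = gets(fn)
        if atom(line) then
            exit  -- gets() returns -1 at end of file
        end if
        -- gets() keeps the trailing newline; drop it before splitting
        if length(line) and line[$] = '\n' then
            line = line[1..$-1]
        end if
        fields = split(line)
        for i = 1 to length(fields) do
            v = value(fields[i])
            if v[1] = GET_SUCCESS then
                fields[i] = v[2]  -- the cell held a number
            end if
            -- otherwise leave the cell as text (possibly empty)
        end for
        data = append(data, fields)
    end while
    close(fn)
    return data
end function

sequence data
data = read_csv("data.csv")  -- hypothetical file name
</eucode>

After this, a cell can be fetched as data[row][col], provided that row really has that many columns.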
{{{

> > CChris
> Thanks, Chris
> 
> I had tried using find and find_from to slice up the strings but always got
> a zero result.
> I note that you included the comma in single quotes ',' whereas I used double
> quotes ",".  Was this where I went wrong? There doesn't seem to be anything
> in the Euphoria Manual about this distinction.
> 

It obviously didn't help.
',' is a quoted character, which is a plain number, 44, just written in such a
way that you don't need to look up an ASCII table or write Chr$(44) or whatever
it is in Basic.
"," is a sequence, and is the same as {44}.
The line you retrieve from gets() is a sequence of byte-sized integers, and
find() compares what you pass in against each of those integers in turn. The
sequence {44} never equals a plain integer, so find(",", line) always returns 0.
You should therefore represent , by a byte-sized integer, and accordingly write
it either ',' or 44.

You'll find this topic covered in section 2.1.2, "Character strings and
individual characters", in the reference manual.
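
A small sketch of the difference, assuming a line such as "2,6,8" has already been read (the variable name line is just for illustration):

<eucode>
sequence line
line = "2,6,8"

? find(',', line)             -- 2: the integer 44 is the second element of line
? find(",", line)             -- 0: no element of line is the sequence {44}
? find(",", {"2", ",", "6"})  -- 2: here the elements are themselves strings
? match(",", line)            -- 2: match() looks for a slice, so a string works
</eucode>

So double quotes are fine with match(), which searches for a sub-sequence, but find() and find_from() need the single character.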

> Your code looks as if it should do the job, but I haven't tried it yet.
> 
> Incidentally - is there an index or database of "include" files and the
> procedures they contain?
> Searching the Archives comes up with a lot of chat, and files with unhelpful
> titles such as "Routines I wish had been included with Euphoria", but no
> indication as to what they contain.
> Makes life hard for us newbies.

This issue has been raised from time to time, but nothing has been done so far.
With something like 2,000 entries currently in the Archive, this would have to
be a pretty big community job, with the support of RDS, and it would require
very orderly follow-up, as updates or new files come in every other day. Still,
such a project would certainly have its uses.

CChris
