1. OK what does ` mean - seriously seems to be not documented.

yes I mean seriously.

2. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

yes I mean seriously.

It's a raw string quote, and it is documented.

See http://openeuphoria.org/docs/lang_def.html#_86_characterstringsandindividualcharacters

3. Re: OK what does ` mean - seriously seems to be not documented.

OK

so, what is it for, why is it there, etc.

And yes, I thought it was undocumented.

4. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

OK

so, what is it for, why is it there, etc.

And yes, I thought it was undocumented.

RTFM.

5. Re: OK what does ` mean - seriously seems to be not documented.

RTFM = Read The Friendly Manual ;)

6. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
gimlet said...

OK

so, what is it for, why is it there, etc.

And yes, I thought it was undocumented.

RTFM.

Yes, I mean seriously, RTFM; there are thousands of results at http://openeuphoria.org/search/results.wc?s=%60&manual=1

useless

7. Re: OK what does ` mean - seriously seems to be not documented.

eukat said...
jimcbrown said...
gimlet said...

OK

so, what is it for, why is it there, etc.

And yes, I thought it was undocumented.

RTFM.

Yes, I mean seriously, RTFM; there are thousands of results at http://openeuphoria.org/search/results.wc?s=%60&manual=1

That's why I never said to SEARCH the friendly manual.

This is probably an issue with the way MySQL's AGAINST works - when combining a word and a space like "function " or "procedure " with a backquote, it gives the same results as just searching for "function" or "procedure". We probably need special handling for characters like this (single quote, double quote, plus sign, minus sign, and the space character are similarly affected).

Of course, dealing with quotes and searching in mysql is always hard. http://forums.mysql.com/read.php?52,216660,216794#msg-216794

The other bit is that, for the search columns used internally in the database, we don't seem to have characters like the backquote linked to the pages that describe them. So if the first issue was fixed, the search would return zero results ...

8. Re: OK what does ` mean - seriously seems to be not documented.

I guess knowing that `...` was a raw string would have made my question superfluous.

However reading the manual doesn't give me what I want exactly.

My question is what are raw strings used for, what limitations are placed on them, etc. I am asking this as I want to know whether they need to be accounted for when reading and writing from a file.

The Wikipedia entry says they are a literal string format (which seems to imply that they are fully the equivalent of quoted strings). It certainly would seem odd to write raw strings to file on Windows as the carriage returns normally present would appear to be missing.

9. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

I guess knowing that `...` was a raw string would have made my question superfluous.

However reading the manual doesn't give me what I want exactly.

My question is what are raw strings used for, what limitations are placed on them, etc. I am asking this as I want to know whether they need to be accounted for when reading and writing from a file.

The Wikipedia entry says they are a literal string format (which seems to imply that they are fully the equivalent of quoted strings). It certainly would seem odd to write raw strings to file on Windows as the carriage returns normally present would appear to be missing.

I believe that raw strings (unlike normal quoted strings, created in Euphoria with the double quotes) can span multiple lines. In addition, using backslash as an escape character isn't allowed. A lot of characters that'd normally have to be escaped don't need to be escaped in Euphoria's raw string format, but the flip side (limitation) is that you can't escape anything - thus there's no way to represent the backquote itself inside of a backquote-generated raw string.

(That's why we have two raw string specifiers, the backquote and the triple doublequote format. If that's still not good enough, you can split the raw string into two raw strings and concatenate a normal string in between them.)
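A minimal sketch of those points, assuming Euphoria 4.x. The constant names are made up for illustration:

```euphoria
-- Backquote raw string: backslashes are ordinary characters.
constant pattern = `\d+\.\d+`
-- The same text as a normal string needs every backslash doubled.
constant cooked = "\\d+\\.\\d+"
? equal(pattern, cooked)   -- prints 1

-- Each raw form can hold the other form's delimiter.
constant has_triple = `say """hi"""`
constant has_tick   = """a ` in the middle"""

-- When one string needs both, split it and concatenate a normal string.
constant both = `a backquote: ` & "`" & ` and a triple quote: """`
puts(1, both & "\n")
```

The last constant is the split-and-concatenate workaround: the backquote itself travels in an ordinary double-quoted string, and the pieces on either side stay raw.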

10. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

yes I mean seriously.

If you can't find something, or don't understand something--then it is the fault of the documentation.

RTFM means "Redact The Fine Manual" (Outside of Euphoria, this has a different meaning).

So yes, backtick ` needs a better presentation; this has been added to the backlog of documentation improvements...

_tom

11. Re: OK what does ` mean - seriously seems to be not documented.

_tom said...

So yes, backtick ` needs a better presentation; this has been added to the backlog of documentation improvements...

I agree Tom. I've just re-read the documentation on this topic (mostly written by myself) and it is pretty bad really.

12. Re: OK what does ` mean - seriously seems to be not documented.

I am sorry that this started a war.

How is one supposed to talk to you if a simple question starts a war?

My problem with raw strings aka '`' aka `"""` has not been properly addressed.

1 It is extremely unclear what the purpose of raw strings is  
  in Eu: 
  given that  
     1. Eu raw strings remove carriage returns and therefore are  
        of limited use as regex source on Windows. 
     2. Because they cannot contain '`' and `"""` raw strings  
        cannot contain `"""` they cannot be text (they need to  
        be read in as source and interpreted). 
     3. The documentation says nothing about purpose except as  
        being multi-line - but if they do not do carriage  
        returns then how useful are they? 
 
2 This is really my question. Given that `"$,"` is not just some 
  hacky back-end stuff allowing you to say "these characters" 
  what is the point? It seems they can be read as source, they  
  could be written as data, but they are really only strings and  
  have no separate existence. 

13. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

I am sorry that this started a war.

How is one supposed to talk to you if a simple question starts a war?

The deleted posts thread ( http://openeuphoria.org/forum/m/124091.wc ) had nothing to do with you and was not your fault, nor did it start a "war".

Please do not be afraid to ask questions. We'll do our best to answer them here.

gimlet said...

My problem with raw strings aka '`' aka `"""` has not been properly addressed. It is extremely unclear what the purpose of raw strings is in Eu given that they cannot be text (they need to be read in as source and interpreted).

That's just plain wrong. They're text. There's no interpretation of escape codes or anything like that.

gimlet said...

Because they cannot contain '`' and `"""`

Raw strings using the backquote can contain the triple double quote within them.

Raw strings using the triple double quote can contain the backquote within them.

The triple double quote form is more common, as one is more likely to need a single character (the backquote) than the same three characters in a row (the triple double quote). The point though is that if your text needs to include one of those two things, you can use the other raw string form to handle it.

gimlet said...

raw strings cannot contain `"""`

Granted. This is the price for handling things as raw text, with no interpretation.

I suppose, if you really needed it, we could add a triple single quote raw string form.

gimlet said...

This is really my question. Given that `"$,"` is not just some hacky back-end stuff allowing you to say "these characters" what is the point?

Actually, that's all it is.

gimlet said...

It seems they can be read as source, they could be written as data, but they are really only strings and have no separate existence.

The raw string format is not really meant to represent raw Euphoria code or anything like that. The primary use in Euphoria's implementation and stdlib has been for help screens and the like, where a screenful of text that doesn't really ever change is displayed.

It just happens to be easier to do:

constant helptext =
"""
This is your multi-line help text.
Some commands here.

And even more detail here.
"""

instead of

constant helptext =
"This is your multi-line help text.\n" &
"Some commands here.\n" &
"\n" &
"And even more detail here.\n"

gimlet said...

Eu raw strings remove carriage returns and therefore are of limited use as regex source on Windows.

The documentation says nothing about purpose except as being multi-line - but if they do not do carriage returns then how useful are they?

I sense this is really the main issue you are having. What's going on is that the operating system libraries for Windows below Eu strip off the carriage returns before handing the source code to Eu. So the raw string is only as raw as the OS will allow. I can see the C library imposing other limitations (e.g. no null characters). We could probably work around this by opening everything in binary instead of text mode.

Even so, we might want to add a with-statement option to control this behavior, as most of the time, if your Euphoria source code is stored in DOS CR-LF format, and you are only using raw strings for help text and such, you actually do want to strip the carriage returns out before displaying that text on the screen.

I'd like to blame M$ for this nonsense. On the PDP systems where C was born, binary mode meant reading a file one byte (9 bits) at a time, whereas text mode meant reading a file one word (18 bits) at a time. IIRC this was because files were actually written to disk and stored in word-sized units instead of byte-sized units. So if you opened a file in text mode, it would sometimes pad the data with nulls. Binary mode was a way to see the raw file without padding if you needed to for some reason.

On the computers that MS-DOS ran on, files were stored in byte-sized units, so there was no need to do this. The same is true on modern unix systems, so today POSIX says that binary and text mode have no difference. However, M$ had this terrible idea of presenting text to programs using the unix text file format (newlines only), but storing it on disk in CRLF format. So text mode got you the file in unix text format (stripping out and inserting carriage returns behind the scenes) and binary mode got you what the file actually looked like on disk.

I don't know why this decision was made, but the only suggestion I've heard is that some early serial printers allowed you to send a text file down the serial port, but you had to add in the carriage return in addition to the linefeed/newline or else the line would move down but the column wouldn't reset (so the next line could start in the middle or even at the end of the line, for example). Anyways, modern printers generally don't let you do that anymore and now that bad decision is still causing problems today.

14. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

yes I mean seriously.

A draft for a new introduction to Euphoria strings:

http://openeuphoria.org/wiki/view/NewDocs_Strings_One.wc

http://openeuphoria.org/wiki/view/NewDocs_Strings_Two.wc

There are two parts because the wiki does not allow for long articles.

_tom

15. Re: OK what does ` mean - seriously seems to be not documented.

Tom:

a couple of things  
 
1. Unfortunately in my browser the euro example displays '?' rather than the glyph. 
 
2. raw strings are not 'raw' as carriage returns are stripped. Instead of crlf you get  
   plain lf.  

16. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

raw strings are not 'raw' as carriage returns are stripped. Instead of crlf you get plain lf.

I think this is close enough.

If you disagree, then please demonstrate a use case where it is necessary to preserve carriage returns.

17. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
gimlet said...

raw strings are not 'raw' as carriage returns are stripped. Instead of crlf you get plain lf.

I think this is close enough.

If you disagree, then please demonstrate a use case where it is necessary to preserve carriage returns.

That said, maybe it makes sense to come up with new terminology to distinguish between them, e.g. to compare between different types of raw strings from different languages.

Cooked strings - the standard double quotes, does escaping and etc.

Preheated strings - Euphoria's version, which is almost raw but strips off the carriage returns.

Extra raw - a version that is so raw that it even keeps carriage returns in the string.

18. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Tom:

a couple of things  
 
1. Unfortunately in my browser the euro example displays '?' rather than the glyph. 
 
2. raw strings are not 'raw' as carriage returns are stripped. Instead of crlf you get  
   plain lf.  

Point One

Since I do not use Windows...

It means that Windows does not recognize the de facto standard that Unicode strings are encoded as UTF-8. When you see ? (or a diamond) it means "unrecognized character." In my example the Euro is UTF-8 encoded.

It does mean explaining how Windows works needs more effort--I will have to come up with something. But in my draft I do mention that Windows does not follow standard protocol.

Point Two

Raw strings are about input into Euphoria. Raw strings are not about preserving original Windows strings. I mention that all input methods, and all string formats, result in exactly the same strings; you cannot tell what the original looked like. From this viewpoint removing carriage returns is valid.

Someone with Windows skills may explain how to use Unicode on Windows. I need help here.


Carriage Return

What is the value of CR LF ( 13 10 ) at the end of each line?

My computer does not have a serial port! Yes, I do have a serial port dot matrix printer that hasn't been tossed out yet. The CR gimmick was used to make ersatz bold and accents when printing. But that is still not a valid reason for CR LF on every line.

Wikipedia mentions that CR and LF were used, somewhere, for soft and hard paragraph breaks. Still, I don't think that is a valid reason for having CR LF at the end of each line.

Pragmatic viewpoint. Use an editor that fixes line endings to LF.


Keep up the dialog. RTFM does mean fix the documentation.

_tom

19. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
jimcbrown said...
gimlet said...

raw strings are not 'raw' as carriage returns are stripped. Instead of crlf you get plain lf.

I think this is close enough.

If you disagree, then please demonstrate a use case where it is necessary to preserve carriage returns.

That said, maybe it makes sense to come up with new terminology to distinguish between them, e.g. to compare between different types of raw strings from different languages.

Cooked strings - the standard double quotes, does escaping and etc.

Preheated strings - Euphoria's version, which is almost raw but strips off the carriage returns.

Extra raw - a version that is so raw that it even keeps carriage returns in the string.

I have so much to learn.

Someone please show me an example of Extra Raw and how it must be used.

I was told that Windows now has a start button. I am tempted to rip out my hard drive and upgrade to Windows 8.1.1, I really do need to understand what CR LF is all about.

_tom

20. Re: OK what does ` mean - seriously seems to be not documented.

_tom said...

Someone please show me an example of Extra Raw and how it must be used.

The 'extra raw' I think is Jim having a bit of a joke. There is no real reason for it to exist nowadays. And if one actually MUST HAVE it, then run your output text through a filter that converts all LF to CRLF first. But really, why bother?
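If someone really did need it, such a filter could look something like the sketch below. The name lf_to_crlf is hypothetical, and the sketch assumes the input is a flat string containing only bare LF line endings (an existing CR LF pair would come out as CR CR LF):

```euphoria
-- Hypothetical helper: replace every LF (10) with CR LF (13, 10).
-- Assumes the input is a flat string with bare LF line endings only.
function lf_to_crlf(sequence text)
    sequence result
    result = ""
    for i = 1 to length(text) do
        if text[i] = '\n' then
            result &= "\r\n"
        else
            result &= text[i]
        end if
    end for
    return result
end function

? equal(lf_to_crlf("one\ntwo\n"), "one\r\ntwo\r\n")   -- prints 1
```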

_tom said...

I really do need to understand what CR LF is all about.

Ok ... back in the old days, when output went to devices such as a teletype the two bytes had a specific meaning for the device. The CR device control code would cause the print head to move back to the leftmost position on the print carriage (thus the name CARRIAGE_RETURN). Now this physical act took a second or two or three to actually happen, depending on the device in use. So while the device was moving the carriage back into position, it would get a LF device control code from the program, which would cause the device to scroll the paper up one line's worth (thus the name LINE_FEED).

This means that if your program only did a CR, then the next printed line would over type the previous one (no LF). And if you only did a LF then the printer would attempt to print at the same cursor position, but one line down (no CR).

But now that we have modern devices that don't rely on such line-based, mechanical operations, the use of CRLF is not really required, however most printers still emulate the device code operations. As for use in text files to delimit or mark the end of lines, it really is anachronistic and redundant.

21. Re: OK what does ` mean - seriously seems to be not documented.

Selgor here.

DerekParnell said...

Ok ... back in the old days, when output went to devices such as a teletype the two bytes had a specific meaning for the device. The CR device control code would cause the print head to move back to the leftmost position on the print carriage (thus the name CARRIAGE_RETURN). Now this physical act took a second or two or three to actually happen, depending on the device in use. So while the device was moving the carriage back into position, it would get a LF device control code from the program, which would cause the device to scroll the paper up one line's worth (thus the name LINE_FEED).

This means that if your program only did a CR, then the next printed line would over type the previous one (no LF). And if you only did a LF then the printer would attempt to print at the same cursor position, but one line down (no CR).

But now that we have modern devices that don't rely on such line-based, mechanical operations, the use of CRLF is not really required, however most printers still emulate the device code operations. As for use in text files to delimit or mark the end of lines, it really is anachronistic and redundant.

Brilliantly explained Derek.

Those were the days when one really had to know coding and programming: BASIC, COBOL, FORTRAN, Pascal, Prolog, etc. And punch cards and mark-sense cards, even before them.

The "bad old days" (?) of computing.

The pioneer years , as someone once quoted.

Again Derek, very well written .

Cheers , Selgor.

22. Re: OK what does ` mean - seriously seems to be not documented.

A couple of things...

  1. The single character literal (stuff inside of a single-quote pair) can also be specified using the \b and \x prefix for binary and hexadecimal values respectively. eg. '\b100001', '\x21' and '!' are all the same character.
  2. In the output function section, you should also mention the writef() and writefln() functions as they are more versatile than printf(), in my opinion.

23. Re: OK what does ` mean - seriously seems to be not documented.

DerekParnell said...

In the output function section, you should also mention the writef() and writefln() functions as they are more versatile than printf(), in my opinion.

A quick pointer on how/why they are more versatile might be appropriate here.

DerekParnell said...
_tom said...

Someone please show me an example of Extra Raw and how it must be used.

The 'extra raw' I think is Jim having a bit of a joke.

Yep. Actually, that should have been Extra Rare, not Extra Raw.

24. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
DerekParnell said...
_tom said...

Someone please show me an example of Extra Raw and how it must be used.

The 'extra raw' I think is Jim having a bit of a joke.

Yep. Actually, that should have been Extra Rare, not Extra Raw.

But. I take Jim seriously!

It makes me think that not removing CR from a string produces a half-baked string.

Microsoft just copied CP/M and ended up with CR+LF (which is explained in turn by teletype machines, DEC, and typewriters.)

But a "line ending" is a newline, which is \n. It makes sense that all line endings result in the same character even if the source is CR+LF or just LF. If "raw string" means raw text input into the Euphoria text format, understanding a character string should come easier.

The problem comes from expecting a string data-type and expecting that raw means preserving the text you input exactly.

_tom

25. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
DerekParnell said...

In the output function section, you should also mention the writef() and writefln() functions as they are more versatile than printf(), in my opinion.

A quick pointer on how/why they are more versatile might be appropriate here.

Something like http://openeuphoria.org/forum/m/109887.wc and http://openeuphoria.org/forum/109932.wc#109932

26. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...
jimcbrown said...
DerekParnell said...

In the output function section, you should also mention the writef() and writefln() functions as they are more versatile than printf(), in my opinion.

A quick pointer on how/why they are more versatile might be appropriate here.

Something like http://openeuphoria.org/forum/m/109887.wc and http://openeuphoria.org/forum/109932.wc#109932

Speaking of which, I found http://openeuphoria.org/wiki/view/tutWriteln.wc from http://openeuphoria.org/wiki/view/tutorial.wc but it doesn't exist. (In fact, every single tutorial link on that page doesn't exist.)

27. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...

... (In fact, every single tutorial link on that page doesn't exist.)

I'd forgotten I started that page. However, just to point out the obvious, this is a wiki, so anyone can improve the material (or lack of material) on a page.

I'd created that as a Table-Of-Contents for a tutorial for Euphoria. It is quite incomplete as a TOC and as you note, none of the topic areas mentioned in the TOC have been started either. I guess I could give them another kick start and hope for additional contributors.

28. Re: OK what does ` mean - seriously seems to be not documented.

DerekParnell said...
jimcbrown said...

... (In fact, every single tutorial link on that page doesn't exist.)

I'd forgotten I started that page. However, just to point out the obvious, this is a wiki, so anyone can improve the material (or lack of material) on a page.

Agreed. I meant this as a reminder to Tom (or any other interested contributor) that there was already a good wiki page that he might want to help fill in.

DerekParnell said...

I'd created that as a Table-Of-Contents for a tutorial for Euphoria. It is quite incomplete as a TOC and as you note, none of the topic areas mentioned in the TOC have been started either. I guess I could give them another kick start and hope for additional contributors.

That would be an incredibly useful contribution, especially considering that Tom has already explained why he is hesitant to immediately start working on the enhanced writefln() documentation.

29. Re: OK what does ` mean - seriously seems to be not documented.

Here is my makeover for write and writef

http://openeuphoria.org/wiki/view/NewDocs_Strings_Three.wc

I have changed lots of details so I need someone to review what I have written so far.


Bill:

Writing documentation is an infinite loop.

I only claim to know less than everyone else.

  • The CR LF problem is due to DOS being copied from CP/M. Microsoft guessed wrong when they copied.
  • Notice that \n is "newline." That suggests CR LF, LF, or CR, as suits your operating system.
  • There seems to be no value in keeping CR in modern software.
  • Programming editors let you select line endings and do conversions for you.
  • Euphoria does not have a "raw string" but does have "raw string input," which explains why CR characters are not preserved.
  • The character ` is the grave accent and is an alternative to using """ for raw string input.
  • UTF-8 is the de facto standard for Unicode text. Euphoria has partial immediate support for UTF-8; routines do exist for extensive Unicode operations.

Keep asking questions, I will come up with an answer eventually.

_tom

30. Re: OK what does ` mean - seriously seems to be not documented.

Tom,

If \r\n is a relic then it is quite a recent relic. And wasn't it true that Mac lines were terminated with \r not \n? (This has no doubt changed now).

My concern originally was whether my program needed to support ` and """ when reading in values from text files.

I would like it to know the \r was there.

The other point is that neither ` nor """ allow escaping so the delimiter cannot be included in the string. Of course one can enter

  `this is a grave accent ` & "`." 
but this has to be interpreted rather than read in.

My question comes down to what is the full intention of `-strings in Euphoria? They could be used as clean regex source as the \ doesn't need to be escaped, but then what is the point of raw newlines? And how does one get a ` into a regex literal without resorting to the patch above?

Being able to write multi-line text easily is useful but these uses conflict.

31. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

If \r\n is a relic then it is quite a recent relic. And wasn't it true that Mac lines were terminated with \r not \n? (This has no doubt changed now).

Correct. The Macintosh operating system used to use CR as the end-of-line marker in text files. It now uses LF since it adopted a modified unix base.

gimlet said...

My concern originally was whether my program needed to support ` and """ when reading in values from text files.

I would like it to know the \r was there.

And this is where we might have been confused. The back-tick and double-quote notation is only applicable to string LITERALS in your source code text, not in text files in general.

Normally, an application doesn't have to know if the text file it's reading/writing has CR, LF, or CRLF line endings because the library's functions handle all that for you. Your application always 'reads' a LF regardless of the platform it's running on and always outputs the appropriate line ending for the platform you are running on.

If, however, your application must actually know if a CR etcetera was on the disk text file, you need to open the file as a 'binary' file and process the incoming bytes yourself. I realize that there may be real applications that need to know, but really, why bother? Why do you need this?
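For anyone who does need to know, here is a rough sketch of processing the bytes yourself, assuming Euphoria 4.x. The function name count_cr and the file name sample.txt are placeholders, not anything from the standard library:

```euphoria
-- Count the CR bytes (13) actually stored in a file.
-- "rb" opens in binary mode, so nothing is translated on Windows.
function count_cr(sequence path)
    integer fn, ch, count
    fn = open(path, "rb")
    if fn = -1 then
        return -1            -- could not open the file
    end if
    count = 0
    ch = getc(fn)
    while ch != -1 do        -- getc() returns -1 at end of file
        if ch = 13 then
            count += 1
        end if
        ch = getc(fn)
    end while
    close(fn)
    return count
end function

? count_cr("sample.txt")     -- "sample.txt" is a placeholder name
```

Opening the same file with "r" instead of "rb" on Windows would be expected to hide the CRs, which is the translation described above.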

32. Re: OK what does ` mean - seriously seems to be not documented.

Derek,

Well say you have a string like:

.."crlf" = `\r\n`.. 
- this is a very contrived example but ..

The idea is you would read in lines like this and using value() or a user-defined function put them into key-value pairs.

In this particular case you should end up with a key which refers to the regex \r\n.

It would be nice to also be able to refer to the binary value \r\n but the \r is stripped.

It would be a major inconvenience to have to parse the string to get this functionality.

It seems perfectly reasonable that someone (given the functionality of value()) may want to be able to read in multi-line strings and raw byte-strings.

Of course it is a slippery slope.

33. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Derek,

Well say you have a string like:

.."crlf" = `\r\n`.. 
- this is a very contrived example but ..

The idea is you would read in lines like this and using value() or a user-defined function put them into key-value pairs.

You remain confused. Text like that is illegal in source code. If you are reading lines, then you need to read the text file in as input.

As Derek just said,

DerekParnell said...

And this is where we might have been confused. The back-tick and double-quote notation is only applicable to string LITERALS in your source code text, not in text files in general.

...

you need to open the file ... and process the incoming bytes yourself.

gimlet said...

It would be a major inconvenience to have to parse the string to get this functionality.

It seems perfectly reasonable that someone (given the functionality of value()) may want to be able to read in multi-line strings and raw byte-strings.

Of course it is a slippery slope.

Incidentally, I just tested this, and it seems that if you are using get() with a text file opened in binary mode, or you are using value() with a sequence, then any heredoc strings are read in preserving the carriage return. So the behavior of the heredoc strings in get.e and the raw strings handled by the Euphoria parser differs.

So if you are using get() to handle string literals, then the behavior you desire is already there.

gimlet said...

It would be nice to also be able to refer to the binary value \r\n but the \r is stripped.

Just use {13, 10}
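A quick sketch of how the spellings compare in source code, assuming Euphoria 4.x (the constant name CRLF is just for illustration):

```euphoria
-- The two bytes CR and LF, written as numbers.
constant CRLF = {13, 10}

-- A normal (escaped) string holds the same two values.
? equal(CRLF, "\r\n")                        -- prints 1

-- The raw string is four characters: backslash, r, backslash, n.
? length(`\r\n`)                             -- prints 4
? equal(`\r\n`, {'\\', 'r', '\\', 'n'})      -- prints 1
```

So the raw form keeps the text as typed, which is what a regex engine would want to receive, while {13, 10} and "\r\n" are the byte values themselves.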

gimlet said...

In this particular case you should end up with a key which refers to the regex \r\n.

Unless you are just checking for CRLF, that's not actually a regex. A regex should be more like ".*\r\n.*" or something, depending on what exactly you are trying to do.


34. Re: OK what does ` mean - seriously seems to be not documented.

I will spell it out.

Assume the text read in contains `\r\n` then the person writing this meant \r\n not {13,10}. You can say they are the same as 65 = 'A' but that doesn't make them the same. Just because in Euphoria 'A' and 65 are the same byte values doesn't make them the same values with the same meaning.

This is like saying a while loop is only a label loop pair.

Apart from anything else why should `\r\n` be illegal - I am not talking about source code but the textual representation of source to be read into a map.

How coherent would key: 63 = "assignment" be to any person reading the file?

You read in a piece of text containing "crlf". You pass this piece of text "crlf" including the quotes to value and it returns "crlf" a sequence.

You say I can't do something similar to `\r\n` and introduce the red herring that I can't match for \r\n but can match for "*\\r\\n*". Really?


35. Re: OK what does ` mean - seriously seems to be not documented.

I will preface my response with this:

If you're on Windows, then you have to be careful how you open the file, as Jim alluded. If you open a file in text mode on Windows (the default if you pass "r" to open) then Windows will strip out the \r characters before your program sees anything. Likewise, it will insert them if you output just a \n character. Other platforms don't do this, so you'll get exactly what's there. Text editors often have their own ideas about how to deal with all of this, and the bottom line is that if this stuff is critical to what you're doing, you have to be extra careful.

For Euphoria source files, we strip out carriage returns, since these are really extra fluff that isn't needed and can actually be harmful in some cases.
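A minimal sketch of the text-mode versus binary-mode difference described above. The file name crlf_demo.txt and the helper slurp() are made up for illustration; only the built-in open(), getc(), puts() and close() are used.

```euphoria
-- Sketch only: "crlf_demo.txt" is a hypothetical scratch file.
constant FNAME = "crlf_demo.txt"

-- Read the whole file byte by byte using the given open() mode.
function slurp(sequence mode)
    sequence bytes = {}
    integer fn, c
    fn = open(FNAME, mode)
    if fn = -1 then
        return bytes            -- could not open; return nothing
    end if
    c = getc(fn)
    while c != -1 do            -- getc() returns -1 at end of file
        bytes &= c
        c = getc(fn)
    end while
    close(fn)
    return bytes
end function

-- Write 'a', CR, LF, 'b' in binary mode so nothing is translated.
integer out = open(FNAME, "wb")
puts(out, {'a', 13, 10, 'b'})
close(out)

? slurp("rb")   -- binary mode: all four bytes, including the 13
? slurp("r")    -- text mode: on Windows the 13 is normally missing
```

On platforms other than Windows both calls should print the same four values, which is why code that cares about carriage returns is usually safer with "rb".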

gimlet said...

I will spell it out.

Assume the text read in contains `\r\n` then the person writing this meant \r\n not {13,10}. You can say they are the same as 65 = 'A' but that doesn't make them the same. Just because in Euphoria 'A' and 65 are the same byte values doesn't make them the same values with the same meaning.

This is like saying a while loop is only a label loop pair.

Apart from anything else why should `\r\n` be illegal - I am not talking about source code but the textual representation of source to be read into a map.

How coherent would key: 63 = "assignment" be to any person reading the file?

You read in a piece of text containing "crlf". You pass this piece of text "crlf" including the quotes to value and it returns "crlf" a sequence.

You say I can't do something similar to `\r\n` and introduce the red herring that I can't match for \r\n but can match for "*\\r\\n*". Really?

There is no reason why loading any valid Euphoria object as a map key or value would be illegal. If you're trying to use the provided text-file-to-map loading routines, then you have to follow their format. If that doesn't answer your questions, then I'm not sure what you're asking.

Matt


36. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Assume the text read in contains `\r\n` then the person writing this meant \r\n not {13,10}.

Err, right. That's {92, 114, 92, 110}.
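To make the byte values concrete, here is a small sketch comparing the two literal forms in source code. The variable names raw and cooked are only for illustration.

```euphoria
-- Raw (backtick) literal: each backslash is an ordinary character.
sequence raw = `\r\n`
-- Double-quoted literal: \r and \n are escape sequences.
sequence cooked = "\r\n"

? raw                       -- {92,114,92,110}
? cooked                    -- {13,10}
? equal(raw, "\\r\\n")      -- 1
? equal(cooked, {13, 10})   -- 1
```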

gimlet said...

You can say they are the same as 65 = 'A' but that doesn't make them the same. Just because in Euphoria 'A' and 65 are the same byte values doesn't make them the same values with the same meaning.

How coherent would key: 63 = "assignment" be to any person reading the file?

They have the same value. Point taken about meaning though.

gimlet said...

This is like saying a while loop is only a label loop pair.

Semantics.

gimlet said...

Apart from anything else why should `\r\n` be illegal - I am not talking about source code but the textual representation of source to be read into a map.

In that case, that's already legal TODAY.

gimlet said...

You read in a piece of text containing "crlf". You pass this piece of text "crlf" including the quotes to value and it returns "crlf" a sequence.

You say I can't do something similar to `\r\n` and introduce the red herring that I can't match for \r\n but can match for "*\\r\\n*". Really?

If you want you can search for `\r\n` which is the same as "\\r\\n" or {92, 114, 92, 110}.

If you are reading from a text file using get() then you can search for "\r\n" as well as

` 

` 

which preserves both the CR and LF. Just make sure you open the file in binary mode first.


37. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

I will spell it out.

Assume the text read in contains `\r\n` then the person writing this meant \r\n not {13,10}. You can say they are the same as 65 = 'A' but that doesn't make them the same. Just because in Euphoria 'A' and 65 are the same byte values doesn't make them the same values with the same meaning.

I'm very sorry gimlet, but you are simply wrong, because `\r\n`, "\r\n", and {13,10} actually are the same thing. They are the same thing because they are stored in RAM in exactly the same way. How you wish to interpret those two numbers is up to you.

The number 13 is just a number, but it can be interpreted as being the Carriage_Return device control code, or anything else your application chooses to. It's just a number.

The same applies to the number 123. If one wants to, it can be used to represent the ASCII character '{', or the EBCDIC character '#'. But it's just a number.

The meaning one gives a number depends on your requirements, not on how the number is stored.


38. Re: OK what does ` mean - seriously seems to be not documented.

DerekParnell said...
gimlet said...

I will spell it out.

Assume the text read in contains `\r\n` then the person writing this meant \r\n not {13,10}. You can say they are the same as 65 = 'A' but that doesn't make them the same. Just because in Euphoria 'A' and 65 are the same byte values doesn't make them the same values with the same meaning.

I'm very sorry gimlet, but you are simply wrong, because `\r\n`, "\r\n", and {13,10} actually are the same thing. They are the same thing because they are stored in RAM in exactly the same way. How you wish to interpret those two numbers is up to you.

A simple caveat here - `\r\n` is actually the same as "
r
n", not "\r\n". It's an easy mistake to make - I made it myself a few posts earlier. But for text files (not source), there is a version using the backquote that is the same as "\r\n" and {13,10}. As Derek says, these three things are all the same value.


39. Re: OK what does ` mean - seriously seems to be not documented.

No Derek,

The representation is the same.

The meaning is different.

If what you said is correct then a = b and a = b are the same when a = b means a := b and a = b means a == b.

I don't see anyone writing 'A' + 'B'.


40. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Derek,

Well say you have a string like:

.."crlf" = `\r\n`.. 
- this is a very contrived example but ..

The idea is you would read in lines like this and using value() or a user-defined function put them into key-value pairs.

In this particular case you should end up with a key which refers to the regex \r\n.

It would be nice to also be able to refer to the binary value \r\n but the \r is stripped.

It would be a major inconvenience to have to parse the string to get this functionality.

It seems perfectly reasonable that someone (given the functionality of value()) may want to be able to read in multi-line strings and raw byte-strings.

Of course it is a slippery slope.

Why are you using the value() function? It has a specific purpose, namely to convert the text representation of Euphoria objects into actual Euphoria objects.
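As a rough illustration of that purpose, the sketch below passes the text "crlf" (quotes included) to value() and looks at what comes back. It assumes the usual std/get.e return shape, a sequence whose first element is a status code and whose second is the converted object.

```euphoria
include std/get.e

-- The text being converted is "crlf" with its double quotes included.
sequence result = value("\"crlf\"")

if result[1] = GET_SUCCESS then
    ? result[2]                 -- {99,114,108,102}
    puts(1, result[2] & '\n')   -- crlf
end if
```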

If you want to create key-value pairs from a text file that contains lines like this ...

"crlf" = `\r\n` 
then have a look at the function keyvalues in the std/text.e library.

Or if you wish to map these key-value pairs, consider the function new_from_string in the std/map.e library.
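A hedged sketch of the keyvalues() route, using the routine's default delimiters and quote characters. The input line is built with escapes so the example stands alone; how keyvalues() treats the backslashes inside the back-quoted value is not confirmed here, so the code just prints whatever comes back.

```euphoria
include std/text.e

-- The line as it would appear in the text file:  "crlf" = `\r\n`
sequence line = "\"crlf\" = `\\r\\n`"

-- keyvalues() returns a sequence of {key, value} pairs.
sequence pairs = keyvalues(line)

for i = 1 to length(pairs) do
    printf(1, "key: %s\n", {pairs[i][1]})
    puts(1, "value as numbers: ")
    ? pairs[i][2]
end for
```

Printing the value as numbers makes it easy to see whether you received the four characters of the raw text or something else.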


41. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

No Derek,

The representation is the same.

The meaning is different.

If what you said is correct then a = b and a = b are the same when a = b means a := b and a = b means a == b.

I don't see anyone writing 'A' + 'B'.

I have no idea what you are now talking about. Your words are not making any sense to me.

I was talking about numbers and how they mean different things depending on your application's requirements.

I have no idea at all what you mean by "a = b". And what has 'A' + 'B' got to do with anything at all?


42. Re: OK what does ` mean - seriously seems to be not documented.

Why use value() rather than new_from_string()?

I guess it is that the manual doesn't really explain the intent of functions and things are scattered through the manual so that unless you know what you are looking for you can only find it by examining all the libraries.

Re a = b, 'A' + 'B'.

You are not making a firm distinction between what a set of bytes means and the actual values.

a = b in one place means assign the value of b to a, in another place it means test a is equal to b.

'A' + 'B' is nonsense 65 + 66 is not. Their representations are the same (and in Euphoria presumably if you wrote c = 'A' + 'B' then c would be assigned 131) - but shouldn't that be nonsense?


43. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Why use value() rather than new_from_string()?

No reason at all. Both work the same way.

However, calling value() or new_from_string() requires you to first read the file in, line by line. You'd still have to handle multi-line values specially. get() does it for you automatically.

gimlet said...

You are not making a firm distinction between what a set of bytes means and the actual values.

Neither does a CPU.

gimlet said...

'A' + 'B' is nonsense 65 + 66 is not. Their representations are the same (and in Euphoria presumably if you wrote c = 'A' + 'B' then c would be assigned 131) - but shouldn't that be nonsense?

It's very convenient to be able to convert from 1 to '1' and back just by adding or subtracting '0'.

For the English alphabet, a similar trick works: 'a' - 'A'. The resulting value can be used to convert between uppercase and lowercase.
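A short sketch of both tricks, assuming ASCII character codes. The variable names are only for illustration.

```euphoria
-- Digit character <-> digit value, via '0' (48).
integer digit = '7' - '0'        -- 7
integer ch    = 3 + '0'          -- 51, which is '3'

-- Case conversion for A..Z / a..z, via the offset 'a' - 'A'.
integer offset  = 'a' - 'A'      -- 32
integer lower_q = 'Q' + offset   -- 113, which is 'q'
integer upper_q = 'q' - offset   -- 81, which is 'Q'

? digit                               -- 7
puts(1, {ch, lower_q, upper_q, '\n'}) -- 3qQ
```

The case offset only makes sense for letters; applying it to digits or punctuation gives unrelated characters.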

gimlet said...

I guess it is that the manual doesn't really explain the intent of functions and things are scattered through the manual so that unless you know what you are looking for you can only find it by examining all the libraries.

The documentation is not perfect. IMVHO it's pretty good overall, especially with cross-linking to related functions and pointing out how the function is meant to be used/what it is meant to be used for via examples. But certainly, if you have areas where you can point out specific examples that need to be addressed, you are more than welcome to do so. Likewise, if you have already written improved docs (or are at least willing to do so), you are more than welcome to share that work with us.


44. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

Why use value() rather than new_from_string()?

I guess it is that the manual doesn't really explain the intent of functions and things are scattered through the manual so that unless you know what you are looking for you can only find it by examining all the libraries.

I agree that it is hard to know everything in the manual, even when one has had a major part in writing it and the functionality it talks about. It is also hard to know the intent of people asking questions unless they help us by telling us what they are trying to achieve rather than how they are attempting to achieve it.

gimlet said...

Re a = b, 'A' + 'B'.

You are not making a firm distinction between what a set of bytes means and the actual values.

a = b in one place means assign the value of b to a, in another place it means test a is equal to b.

'A' + 'B' is nonsense 65 + 66 is not. Their representations are the same (and in Euphoria presumably if you wrote c = 'A' + 'B' then c would be assigned 131) - but shouldn't that be nonsense?

Thank you. This way of explaining things is a lot more helpful. I don't have to read your mind so much.

Anyhow, I thought I was making a firm distinction between meaning and value. A value (number) is absolute and a meaning is relative. As in the example using the '=', the meaning depends on the context. The source text 'a = b' is absolute but the meaning is relative to the context that text is placed in.

Consider the English language. It gives you the freedom to place together words that, when taken as a phrase, are nonsense. The author Lewis Carroll in his story "Alice in Wonderland" made great use of this feature, even managing to invent new words for the occasion. Euphoria, like many other programming languages, also gives you freedom to write nonsense. Your job as a programmer is to ensure that any 'nonsense' you write has a valid purpose in your application, and to ensure that other nonsense is not present in your code. Sure, some languages make it hard to write 'nonsense' code (I'm looking at you, Ada) but there are still plenty of ways to create nonsense algorithms.

Ok, so Euphoria doesn't have a datatype for characters. Instead it uses integers which, when used in the right context, mean the same as characters. This isn't going to change so either get used to it or don't use Euphoria.

By the way, have a look at this nonsense code ...

integer Lower 
integer Upper 
 
    Lower = 'A' + ' ' 
    Upper = 'a' - ' ' 

45. Re: OK what does ` mean - seriously seems to be not documented.

Bill

What programming language did you learn before Euphoria? Each language has a mindset.

Euphoria has no string data-type. But Euphoria has the flexibility to work with strings as if there were a string data-type. The "as if" part is interesting; everything works like a sequence without regard for data values.

If you "think string data-type" (lots of language examples here):

word = "dogs" + "bone" 
   --> "dogsbone" 

But, Euphoria is about sequences!

word = "dogs" & "bone" 
   --> "dogsbone" 

but

word = "dogs" + "bone" 
  --> {198,222,213,216} 

The "get it" part of Euphoria is seeing the flexibility of having everything as an object.

The lack of a string data-type has a few limitations. For example 1:1 indexing of a Unicode string is not always simple.

What is nonsense x = 'a' + 'b' in some languages is flexibility in Euphoria.

I am beginning to appreciate some of the "irritation" (I do not want to call you confused) you are having with the Euphoria way of doing things. Coming up with a tutorial to clarify things will take some time...

To end with humor: Stop using Windows.

_tom


46. Re: OK what does ` mean - seriously seems to be not documented.

There are two background issues here:

  1. No string data-type.
  2. How Euphoria works with text: internal, raw input, files.

Explaining item two will take some time. For now, does this explain item one?

Text Data

Text is "data composed of characters or strings of characters."

When designing a computer language you choose (and compromise) how text will be represented and processed.

Binary operator, language, and example:

Operator: space (Snobol)

dog = 'K' 9
--> K9

Operator: + (Python)

dog = 'K' + 9
--> TypeError: Can't convert 'int' object to str implicitly

Operator: & (Euphoria)

object dog = 'K' & 9
--> {75,9}

Operator: + (Euphoria)

dog = 'K' + 9
--> 84

The lesson is: learn more than one language and pick the tool that best matches your current problem.

Snobol was designed to work with text. That is why a character 'K' and a number 9 concatenate to form a string. In a language designed to process strings a blank space is a very convenient concatenation operator.

Python was designed to isolate text and numbers using separate data-types. That means you can not mix a character 'K' and a number 9 and get a result. In Python a + adds numbers but + concatenates strings. In a language designed to teach programming the isolation of text and numbers can be a good thing.

Euphoria was designed so that all data is an object and objects are composed of numbers. That means 'K' is the number 75. That means there is no conflict in concatenating two numbers to form a two element sequence. That means there is no conflict in adding two numbers to form a sum. In a language designed to be flexible this is a good thing.

Euphoria lets you choose how to display an object value:

print(1, 'K') 
    --> 75 
puts(1, 'K' ) 
    --> K 

In Euphoria 'K' is always converted into the number 75; you then choose to display 75 as a number or as a character.

"Euphoria has no string data-type" means all character values are encoded as numbers. It turns out you can read text files, use literal text in expressions, and conveniently display text without remembering that you are just working with an object composed of numbers. Often Euphoria behaves the same as languages that have string data-types; this makes Euphoria simple.

Euphoria was designed to be simple and flexible.

  • The same operators and routines work on all data--text or numbers.
  • You have the freedom to do anything with your data.
  • The cost of flexibility is you have to take responsibility for your data.

For example, you can always display a Euphoria object as text or as numbers; however, Euphoria cannot know what you want to see. Actually, console:display does a good job of distinguishing numbers used as text from numbers used as numbers, but not always. Contrast this with a language that does have a string data-type: text is always displayed as text, but you do not have the flexibility to display and process text in any way you choose.

_tom


47. Re: OK what does ` mean - seriously seems to be not documented.

Sometimes the explanations are more complicated than the issues they are explaining.

All computers store text as numbers. At their core, numbers are all that a computer understands. It is up to a human to give contextual meaning to those numbers.

Some programming languages abstract this idea away and the number of data types proliferates. One reason for this is that it makes it easier for the interpreter or compiler to flag logic errors earlier in the development process. However, it can also make code less expressive and/or more verbose. It takes away some flexibility on the part of the programmer.

Euphoria simplifies this: every piece of data is either a number, a list of numbers, or a list of lists of numbers. How those numbers are interpreted is completely up to the programmer. Some numbers can be interpreted as human-readable text instead of their literal value. There exist some conventions, such as ASCII/ANSI/Unicode, as well as different internal representations (and slightly different behavior) of integral values vs. real values.

However, the abstraction still works and makes many routines generic. Whereas in some programming languages you have to write a different function for each different kind of value that it can take as input, or worse, have to write ugly templating code, in Euphoria your functions can generally take one kind of input, perform an operation, and produce the correct output.
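As a small sketch of that genericity, the hypothetical function largest() below is written once and used on numbers, on the characters of a string, and on a list of strings. It relies only on the built-in compare().

```euphoria
-- Return the greatest element of a non-empty sequence.
-- compare() orders atoms numerically and sequences element by element.
function largest(sequence s)
    object best = s[1]
    for i = 2 to length(s) do
        if compare(s[i], best) > 0 then
            best = s[i]
        end if
    end for
    return best
end function

? largest({3, 9, 4})                                 -- 9
puts(1, largest("hello") & '\n')                     -- o
puts(1, largest({"pear", "apple", "zebra"}) & '\n')  -- zebra
```

The caller, not the function, decides whether the result is shown as a number (with ?) or as text (with puts).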

For input and output, Euphoria routines try to guess whether any given piece of data can be interpreted as text or as a numerical value. That's a tool for the programmer to decide to use.


48. Re: OK what does ` mean - seriously seems to be not documented.

jaygade said...

Sometimes the explanations are more complicated than the issues they are explaining.

Nice way to put it.

I will have to steal some of your words and blend them into the documentation.

_tom


49. Re: OK what does ` mean - seriously seems to be not documented.

jimcbrown said...

A simple caveat here - `\r\n` is actually the same as "
r
n", not "\r\n".

Ah, I see what happened there. In creole, \\ is a forced line break. What you actually meant to say was:
`\r\n` is actually the same as "\\r\\n"


50. Re: OK what does ` mean - seriously seems to be not documented.

petelomax said...
jimcbrown said...

A simple caveat here - `\r\n` is actually the same as "
r
n", not "\r\n".

Ah, I see what happened there. In creole, \\ is a forced line break. What you actually meant to say was:
`\r\n` is actually the same as "\\r\\n"

Bleargh. Yes.


51. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

[...]
'A' + 'B' is nonsense 65 + 66 is not. Their representations are the same (and in Euphoria presumably if you wrote c = 'A' + 'B' then c would be assigned 131) - but shouldn't that be nonsense?

Hello

I have tried to explain this; maybe it helps:
http://openeuphoria.org/wiki/view/andi49.wc

Andreas


52. Re: OK what does ` mean - seriously seems to be not documented.

gimlet said...

'A' + 'B' is nonsense 65 + 66 is not. Their representations are the same (and in Euphoria presumably if you wrote c = 'A' + 'B' then c would be assigned 131) - but shouldn't that be nonsense?

One of Euphoria's main attributes is its tiny set of built-in datatypes. Anything more complex than scalars (atoms/integers) and lists (sequences) is the responsibility of the programmer and not the language. This is something we have to come to terms with and once one does, it is quite liberating.

Here is another example of nonsense that is the programmer's responsibility - and this also applies to some other languages as well.

enum 
  RED, 
  BLUE, 
  YELLOW, 
  GREEN, 
  ORANGE, 
  PURPLE, 
  BROWN, 
  WHITE, 
  BLACK 
 
     set_color(Field, RED    + BLUE)       -- Is the field now purple or yellow? 
     set_color(Field, RED    + YELLOW)     -- Is the field now orange or green? 
     set_color(Field, YELLOW + BLUE)       -- Is the field now green or orange? 
     set_color(Field, YELLOW + BLUE + RED) -- Is the field now white or purple? 

Ok, if that's a problem then try this modification ...

enum  
  RED = 1, 
  BLUE = 2, 
  YELLOW = 4, 
  GREEN = 6, 
  ORANGE = 5, 
  PURPLE = 3, 
  BROWN = 8, 
  WHITE = 7, 
  BLACK = 0 
 
     set_color(Field, RED    + BLUE)       -- Purple 
     set_color(Field, RED    + YELLOW)     -- Orange 
     set_color(Field, YELLOW + BLUE)       -- Green 
     set_color(Field, YELLOW + BLUE + RED) -- White 

So, is the concept of adding colors nonsense? No. The limitation comes with having colors represented by numbers (ORANGE + GREEN != BROWN) ... just as there are also limitations when numbers represent characters, such as 65 represents 'A'.

It is our responsibility as Euphoria programmers to manage these limitations.
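One possible way to manage that particular limitation, offered only as a sketch and not as part of the original example: treat the three primaries as separate bits and combine them with or_bits() instead of +. Note that under this scheme ORANGE combined with GREEN comes out as WHITE rather than the BROWN = 8 used in the table above.

```euphoria
-- Primary colours as separate bits, as in the second enum.
enum RED = 1, BLUE = 2, YELLOW = 4

constant
    PURPLE = or_bits(RED, BLUE),        -- 3
    ORANGE = or_bits(RED, YELLOW),      -- 5
    GREEN  = or_bits(YELLOW, BLUE),     -- 6
    WHITE  = or_bits(PURPLE, YELLOW)    -- 7

? ORANGE + GREEN                -- 11: matches no colour in the table
? or_bits(ORANGE, GREEN)        -- 7: every primary present, i.e. WHITE
? and_bits(ORANGE, RED) != 0    -- 1: orange contains red
```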


53. Re: OK what does ` mean - seriously seems to be not documented.

I notice that I said in an earlier post 'A' + 'B' = 131 is nonsense. I should not have used the word 'nonsense' as it isn't quite what I mean and people seem to be taking it to mean that I believe 'A' + 'B' = 131 is (or should be) invalid Euphoria.

A much better word is surprising - something that may or may not be correct but draws your attention as being odd.

Vide Derek's post:

As the colours are numbers, addition is not surprising.

RED + GREEN = WHITE is somewhat **surprising** mixing colours. 
 
RED & GREEN & BLUE makes sense as {RED,GREEN,BLUE} i.e. as the composition of white. 
 
Sadly, RED + GREEN + BLUE != WHITE 

Surprising is not the same as incorrect.

Some other examples

  -60 mph is incorrect (60 mph forward or back is 60 mph) 
  RED/3 is surprising  (RED is a colour. We don't divide colours do we?) 
  RED*3 is also surprising. 
  Neither is surprising if RED is a (badly named) atomic variable 
  -60 dollars has meaning with respect to debits and credits. 

You write 'A' + 'B' and it makes sense as the addition of 2 numbers. (OK?)

Writing lower('B') = 'B' + 'a' - 'A' makes sense as defining the relation between upper and lower case: only applicable with domain 'A'..'Z'.

Pascal would have lower('B') = chr(ord('B') + ord('a') - ord('A')) which is explicit about types but is otherwise identical.

If you mix types indiscriminately you have to be aware that you may be either incorrect or surprising. Too much surprising is a bad thing.

Defining TRUE = 0, FALSE = 1 is likely to cause confusion. Of course, you can do it if you like.
