1. Should to_number return a sequence?
- Posted by xecronix 1 week ago
- 524 views
Should to_number return a sequence?
include std/convert.e
include std/sequence.e

if atom(to_number("(")) then
    puts(1, "to_number returned an atom\n")
end if
if sequence(to_number("(")) then
    puts(1, "to_number returned a sequence\n")
end if
$ eui test_to_number.ex
to_number returned an atom
$ eui -v
Euphoria Interpreter v4.1.0 development
64-bit Windows, Using System Memory
Revision Date: 2015-02-02 14:18:53, Id: 6300:57179171dbed
2. Re: Should to_number return a sequence?
- Posted by SDPringle 1 week ago
- 516 views
It seems like to_number would return a number by the name itself. For erroneous inputs, it just returns zero unless you specify a non-zero second parameter.
https://openeuphoria.org/docs/std_convert.html#_2293_to_number
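For anyone skimming, here is a rough sketch of the three modes as the linked docs describe them. The value 1 in the last call is just an arbitrary non-zero choice for illustration, and the variable names are made up.

```euphoria
include std/convert.e

-- Default (second parameter 0): the result is always an atom,
-- and bad input simply comes back as 0.
object a = to_number("(")
if atom(a) then
    puts(1, "default mode returned an atom\n")
end if

-- Second parameter -1: an atom when the whole string converts,
-- a sequence when it does not.
object b = to_number("(", -1)
if sequence(b) then
    puts(1, "-1 mode returned a sequence for bad input\n")
end if

-- Any other non-zero value: per the docs, a sequence holding the
-- converted value and a bad position (0 when nothing went wrong).
object c = to_number("12.5", 1)
if sequence(c) then
    puts(1, "non-zero mode returned a sequence\n")
end if
```

The practical upshot is that only the default mode guarantees an atom, so the caller has to know which mode was requested before touching the result.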
3. Re: Should to_number return a sequence?
- Posted by xecronix 1 week ago
- 518 views
I slowed down reading the docs. I see what I missed the first time. This seems to do what I need now.
Thanks for your help.
include std/convert.e
include std/sequence.e

if atom(to_number("(", -1)) then
    puts(1, "to_number returned an atom\n")
end if
if sequence(to_number("(", -1)) then
    puts(1, "to_number returned a sequence\n")
end if

$ eui test_to_number.ex
to_number returned a sequence
4. Re: Should to_number return a sequence?
- Posted by xecronix 1 week ago
- 487 views
I added an is_number function to my wiki. It might be useful to someone. https://openeuphoria.org/wiki/view/Is%20Number%20Function.wc
5. Re: Should to_number return a sequence?
- Posted by petelomax 1 week ago
- 476 views
Phix is a bit different: to_number(s,failure={},inbase=10) is in fact a mini-wrapper to scanf() [see below] and never returns a bad_pos.
There is a separate and simpler to_integer(s,def_value=0,base=10) and an is_integer(s,base=10) to go with that.
scanf(s,fmt) tries to reconstitute p such that sprintf(fmt,p) would have produced s, and collects all p it can think of before trimming that set down to those that match exactly, as long as that does not get rid of all of them. [Not very relevant to to_number, that part.]
I added an is_number function to my wiki. It might be useful to someone. https://openeuphoria.org/wiki/view/Is%20Number%20Function.wc
Your is_number("-") crashes.
I'll also suggest:
and (i + 1 > length(maybe)) then
==>
and i = length(maybe) then
and
while i < (length(maybe) + 1) do
==>
while i <= length(maybe) do
and
if equal(c, '.') and found_dot then
    return 0
end if
if equal(c, '.') then
    found_dot = 1
==>
if equal(c, '.') then
    if found_dot then
        return 0
    end if
    found_dot = 1
6. Re: Should to_number return a sequence?
- Posted by xecronix 1 week ago
- 464 views
Your is_number("-") crashes.
Thank you for the bug report. I'm appreciative for sure. I made a fix and a test. The wiki is updated. I'll look at all your suggestions. In the meantime, this was the stopgap I applied.
-- we'll skip the sign if it has one
if find(maybe[i], "+-") then
    i += 1
end if
-- The index might have just moved. Make sure there is at least 1 more character to consider.
if i > length(maybe) then -- <<-- New code
    return 0
end if
$ eui is_number.ex
Testing for FALSE (0):
Testing is_number function with input [ ] result: [0]
Testing is_number function with input [.] result: [0]
Testing is_number function with input [-] result: [0]
Testing is_number function with input [+] result: [0]
Testing is_number function with input [..] result: [0]
Testing is_number function with input [.-] result: [0]
Testing is_number function with input [.+] result: [0]
Testing is_number function with input [-.] result: [0]
Testing is_number function with input [+.] result: [0]
Testing is_number function with input [0..] result: [0]
Testing is_number function with input [.0.] result: [0]
Testing is_number function with input [--] result: [0]
Testing is_number function with input [++] result: [0]
Testing is_number function with input [1 4] result: [0]
Testing is_number function with input [A] result: [0]
Testing is_number function with input [-a] result: [0]
Testing is_number function with input [+a] result: [0]
Testing is_number function with input [-.a] result: [0]
Testing is_number function with input [+.a] result: [0]
Testing is_number function with input [.a] result: [0]
Testing is_number function with input [1a] result: [0]
Testing is_number function with input [.1a ] result: [0]
Testing is_number function with input [] result: [0]
Testing for TRUE (1):
Testing is_number function with input [0] result: [1]
Testing is_number function with input [0.] result: [1]
Testing is_number function with input [.0] result: [1]
Testing is_number function with input [-.0] result: [1]
Testing is_number function with input [+.0] result: [1]
Testing is_number function with input [0.0] result: [1]
Testing is_number function with input [-0.0] result: [1]
Testing is_number function with input [+0.0] result: [1]
Testing is_number function with input [123] result: [1]
Testing is_number function with input [123.] result: [1]
Testing is_number function with input [.123] result: [1]
Testing is_number function with input [-.123] result: [1]
Testing is_number function with input [+.123] result: [1]
Testing is_number function with input [123.123] result: [1]
Testing is_number function with input [-123.123] result: [1]
Testing is_number function with input [+123.123] result: [1]
Testing is_number function with input [00032] result: [1]
Testing is_number function with input [-00032] result: [1]
Testing is_number function with input [+00032] result: [1]
Testing is_number function with input [00032.000] result: [1]
Testing is_number function with input [-00032.050] result: [1]
Testing is_number function with input [+00032.0] result: [1]

What would Euphoria do??
atom d = atom(to_number(test_str, -1))
Testing for FALSE (0):
What would Euphoria do with input [ ] result: [0]
What would Euphoria do with input [.] result: [0]
What would Euphoria do with input [-] result: [0]
What would Euphoria do with input [+] result: [0]
What would Euphoria do with input [..] result: [0]
What would Euphoria do with input [.-] result: [0]
What would Euphoria do with input [.+] result: [0]
What would Euphoria do with input [-.] result: [0]
What would Euphoria do with input [+.] result: [0]
What would Euphoria do with input [0..] result: [0]
What would Euphoria do with input [.0.] result: [0]
What would Euphoria do with input [--] result: [0]
What would Euphoria do with input [++] result: [0]
What would Euphoria do with input [1 4] result: [0]
What would Euphoria do with input [A] result: [0]
What would Euphoria do with input [-a] result: [0]
What would Euphoria do with input [+a] result: [0]
What would Euphoria do with input [-.a] result: [0]
What would Euphoria do with input [+.a] result: [0]
What would Euphoria do with input [.a] result: [0]
What would Euphoria do with input [1a] result: [0]
What would Euphoria do with input [.1a ] result: [0]
What would Euphoria do with input [] result: [0]
Testing for TRUE (1):
What would Euphoria do with input [0] result: [1]
What would Euphoria do with input [0.] result: [1]
What would Euphoria do with input [.0] result: [1]
What would Euphoria do with input [-.0] result: [1]
What would Euphoria do with input [+.0] result: [1]
What would Euphoria do with input [0.0] result: [1]
What would Euphoria do with input [-0.0] result: [1]
What would Euphoria do with input [+0.0] result: [1]
What would Euphoria do with input [123] result: [1]
What would Euphoria do with input [123.] result: [1]
What would Euphoria do with input [.123] result: [1]
What would Euphoria do with input [-.123] result: [1]
What would Euphoria do with input [+.123] result: [1]
What would Euphoria do with input [123.123] result: [1]
What would Euphoria do with input [-123.123] result: [1]
What would Euphoria do with input [+123.123] result: [1]
What would Euphoria do with input [00032] result: [1]
What would Euphoria do with input [-00032] result: [1]
What would Euphoria do with input [+00032] result: [1]
What would Euphoria do with input [00032.000] result: [1]
What would Euphoria do with input [-00032.050] result: [1]
What would Euphoria do with input [+00032.0] result: [1]
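For readers who don't want to click through to the wiki, the rules implied by those results (optional leading sign, at most one dot, at least one digit, nothing else) can be sketched roughly as follows. This is not the wiki code, just one possible shape of it; the variable names other than maybe and found_dot are my own.

```euphoria
-- Sketch only: returns 1 if maybe looks like an undecorated base-10 number.
function is_number(sequence maybe)
    integer i = 1
    integer found_dot = 0
    integer digit_count = 0
    object c

    if length(maybe) = 0 then
        return 0
    end if
    -- skip a single leading sign if there is one
    if find(maybe[1], "+-") then
        i = 2
    end if
    while i <= length(maybe) do
        c = maybe[i]
        if equal(c, '.') then
            if found_dot then
                return 0    -- a second dot is never valid
            end if
            found_dot = 1
        elsif integer(c) and c >= '0' and c <= '9' then
            digit_count += 1
        else
            return 0        -- spaces, letters, extra signs, nested sequences
        end if
        i += 1
    end while
    -- a lone sign or a lone dot has no digits, so it is rejected here
    return digit_count > 0
end function

? is_number("-")
? is_number("-00032.050")
```

Counting digits instead of checking the index after the sign means inputs like "-", "." and "+." all fall out of the same final test.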
7. Re: Should to_number return a sequence?
- Posted by xecronix 1 week ago
- 462 views
I'll also suggest:
if equal(c, '.') and found_dot then
    return 0
end if
if equal(c, '.') then
    found_dot = 1
==>
if equal(c, '.') then
    if found_dot then
        return 0
    end if
    found_dot = 1
Last week I would naturally have written my "if" block the way you're suggesting. I'm trying something new this week. https://www.youtube.com/watch?v=CFRhGnuXG-4
8. Re: Should to_number return a sequence?
- Posted by euphoric (admin) 1 week ago
- 449 views
Yikes! That is a terrible implementation of a to_number() function*.
If you want all that other functionality, create proper functions for them. Then do,
if can_be_a_number(can_this_be_a_number_var) then
    return to_number(can_this_be_a_number_var)
else
    return Null
end if
to_number() should return a number or Null (?) if it cannot convert the given object. Or maybe even crash?
If you want to wrap it yourself, do something like, safe_to_number(can_this_be_a_number_var,X,Y,...).
*I could be wrong.
9. Re: Should to_number return a sequence?
- Posted by xecronix 6 days ago
- 418 views
It's a little bit of a pickle. to_number, given its name, should simply return a number. But since 0 is a number, that opens up the door for confusion about the meaning of the return value. I think return values that differ in size and type seem peculiar though. And since try/catch isn't an option...
So, what are the options?
- to_number crashes on failure. Valid option. But, without an is_number function, that seems a little unforgiving.
- to_number always returns a sequence. It's debatable about what's in the sequence, but 0 or a failure position in element 1 and a number or 0 in element 2 seems reasonable. Or maybe just 1 or 0 in the first element? Or maybe the first element is a number type? (0=NAN, 1=Integer, 2=Float, 3=Currency maybe?) Then, every call has a consistent return value. The next question is if to_number is a valid name for the function. (maybe parse_num might be better?)
- Do nothing. It's fine the way it is.
hmm... Does to_number have too many valid number formats and/or do too much? IDK, opinions, I guess. I think so. But, there isn't anything stopping someone from writing: (not tested... probably works)
function parse_num(sequence s)
    -- ok is 1 when the whole string converts, 0 otherwise
    -- (the -1 belongs to to_number, not to atom)
    atom ok = atom(to_number(s, -1))
    atom val = to_number(s)
    return {ok, val}
end function
Or if someone were inclined to be Phix and Euphoria compatible: (also not tested)
function parse_num(sequence s)
    atom b = is_number(s) -- adopting my is_number function or writing a better one.
    if b then
        return {b, to_number(s)} -- maybe return a type checked sequence instead? worth considering.
    end if
    return {b, 0}
end function
10. Re: Should to_number return a sequence?
- Posted by euphoric (admin) 6 days ago
- 400 views
It's a little bit of a pickle. to_number, given its name, should simply return a number. But since 0 is a number, that opens up the door for confusion about the meaning of the return value. I think return values that differ in size and type seem peculiar though. And since try/catch isn't an option...
So, what are the options?
- to_number crashes on failure. Valid option. But, without an is_number function, that seems a little unforgiving.
- to_number always returns a sequence. It's debatable about what's in the sequence, but 0 or a failure position in element 1 and a number or 0 in element 2 seems reasonable. Or maybe just 1 or 0 in the first element? Or maybe the first element is a number type? (0=NAN, 1=Integer, 2=Float, 3=Currency maybe?) Then, every call has a consistent return value. The next question is if to_number is a valid name for the function. (maybe parse_num might be better?)
- Do nothing. It's fine the way it is.
I think parse_num() is a great replacement name for that function. It can return a sequence. { TRUE/FALSE, RETURNED_VALUE }. RETURNED_VALUE is only relevant if the first element is TRUE.
Isn't there a value() function that basically does this already?
(I searched the manual for value but could not find anything. Not even a search through Google worked.)
I've always wondered if every return value in Euphoria should be of { SUCCESS_BOOL, RETURNED_VALUE }, but that seems way too verbose.
11. Re: Should to_number return a sequence?
- Posted by petelomax 5 days ago
- 364 views
Or if someone were inclined to be Phix and Euphoria compatible: (also not tested)
I'd suggest:
function parse_num(sequence s)
    atom b = is_number(s), v = 0
    if b then
        v = to_number(s)
    end if
    return {b, v}
end function
OTOH, I'm not at all sure how exactly that is supposed to beat just calling is/to inline directly.
Isn't there a value() function that basically does this already?
(I searched the manual for value but could not find anything. Not even a search through Google worked.)
https://openeuphoria.org/docs/std_get.html#_2336_value
http://phix.x10.mx/docs/html/value.htm (incidentally, Example 1 of those docs is an isNumber() function...)
I've always wondered if every return value (in Euphoria) should be of { SUCCESS_BOOL, RETURNED_VALUE }, but that seems way too verbose.
That's pretty much Go (aka golang) in a nutshell.
12. Re: Should to_number return a sequence?
- Posted by irv 5 days ago
- 347 views
How about returning the original string when the conversion fails?
include std/get.e
include std/types.e
include std/console.e

function val (object s)
    if atom(s) then
        return s
    end if
    for i = 1 to length(s) do
        if s[i] > '9' then
            return s
        end if
    end for
    sequence v = value(s)
    if v[1] = GET_SUCCESS then
        return v[2]
    else
        return s
    end if
end function

object a
object tests = {"3.14","-1","+2.34",42,"99hello","Hi 44"}
for i = 1 to length(tests) do
    a = val(tests[i])
    if string(a) then
        display("[] failed!",{a}) -- show a warning or something
    else
        display("[] => [] success",{tests[i],a}) -- use it.
    end if
end for

irv@irv-desktop:~$ eui test
3.14 => 3.14 success
-1 => -1 success
+2.34 => 2.34 success
42 => 42 success
99hello failed!
Hi 44 failed!
13. Re: Should to_number return a sequence?
- Posted by xecronix 5 days ago
- 324 views
Why shouldn't it return a number or NaN on failure?
Or does NaN not exist in Eu?
I can't speak definitively to whether or not NaN exists in Euphoria or Phix. But, I can say, I haven't seen that yet... sort of.
-- Phix
to_number("x") -- this returns a sequence. Which is Not a Number.
-- euphoria
to_number("x") -- this returns an atom, 0. Which is number. And given the function name, makes sense.
-- euphoria
to_number("x", -1) -- this voodoo returns a sequence. Which is Not a Number.
I think SDPringle said it best:
It seems like to_number would return a number by the name itself.
So, even if NaN exists, and somehow NaN was represented by something that isn't a number, a dev would not be wrong to expect a number to be returned from a function named to_number.
I've always wondered if every return value in Euphoria should be of { SUCCESS_BOOL, RETURNED_VALUE }, but that seems way too verbose.
I've thought about this too over the years. Not specifically for Euphoria but for my code in any language I use. I've never fully committed to this. But looking back on the industry as a whole I think there is a strong argument for returning {success, value} for most functions that return values. Especially numeric returns and object returns. (since objects can be null in many languages).
{success, value} isn't arbitrary in position. For a dev to get to the value, he has to first step over the success. Hopefully he checks it along the way. Honestly, I could make a good living just going from shop to shop fixing null ref exceptions. This return pattern might have helped drive the "trust but verify" point home. But now, I'm starting to slip past the topic at hand.
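To make the "step over the success" point concrete, here is a small sketch of what calling code ends up looking like. The parse_num shown here is only a stand-in built on the -1 mode of to_number discussed earlier in the thread, not a finished API.

```euphoria
include std/convert.e

-- Hypothetical wrapper: {1, value} on success, {0, 0} on failure.
function parse_num(sequence s)
    object r = to_number(s, -1)   -- atom on success, sequence on failure
    if atom(r) then
        return {1, r}
    end if
    return {0, 0}
end function

sequence res = parse_num("3.14")
if res[1] then
    -- the value is only meaningful once the flag has been checked
    printf(1, "parsed %g\n", res[2])
else
    puts(1, "not a number\n")
end if
```

Because the flag sits in element 1, nobody can reach element 2 without at least seeing that a flag exists.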
14. Re: Should to_number return a sequence?
- Posted by xecronix 5 days ago
- 314 views
I'd suggest:
function parse_num(sequence s)
    atom b = is_number(s), v = 0
    if b then
        v = to_number(s)
    end if
    return {b, v}
end function
OTOH, I'm not at all sure how exactly that is supposed to beat just calling is/to inline directly.
Thanks for the code suggestion. Your version is clearer than what I wrote, for sure. I especially like the single exit point of the function. My version skipped 1 additional variable allocation, 2 assignment operations, and one variable lookup, but that probably wasn't worth giving up the clarity your version provides.
I don't think it's a matter of one approach beating another. In fact, I now know exactly how to find out whether or not a string is a number. I know how to do it in Euphoria or Phix. I know how to do it cross-dialect if I wanted to. Honestly, to everyone reading this who actually worked on the to_number function, thank you. It works and does what I need.
And, if you're interested in my personal UX, this is my 2 cents, FWIW. Functions that vary in return types seem confusing. That's not to say varied return types are invalid patterns. Just that it will require the end user to spend more time in the documentation learning about the cool features (and nuances) Euphoria/Phix has to offer detracting from the time spent using the cool features. Not a bad thing. Simply a matter of fact.
What follows is no one's fault but my own. But I'm going to describe my UX when looking up how to convert a string to a number. First, I was programming in Phix. So, I looked there first for docs but didn't find any function that did what I wanted. Maybe it exists. I didn't find it. So, I hopped over to OpenEuphoria and found to_number. I thought to myself, that seems like it will work but, what if the string is not a number? So I started skimming the docs. What I read sort of went like this. Words, words, words. Blah, blah, blah. More words. Ah, there it is. -1 returns this data in a sequence, something else the sequence has blah blah. Ok, got it. It returns a sequence. Cool! And moved on. Just being honest. That's what happened.
So, basically, the fact that it worked as I expected in Phix was just pure luck. And when I tried to use the function in Euphoria I was surprised, but validated by my Phix experience. Thankfully, I had written tests that caught the problem. Anyway, I wrote a quick is_number function to get past the moment and started this thread.
My takeaway is to slow down when reading the docs. Assume nothing. I can only hope this conversation had value to you, the readers. To ensure that I've contributed in some meaningful way, I created 2 wiki pages. You gave me your time and expertise. I tried to give back.
Also, FWIW. Euphoria has been my favorite language for a long time. Sadly, I haven't used it much over the years. Professionally, the languages I needed most were (in order) PHP, JavaScript, C#, Java, C, Perl, Ruby, Cpp, VB Script, VB, and Python. For fun, I program in Euphoria, now Phix (which I really like), and Pascal. I've stopped programming in BASIC. But I've tried many of those including QBasic, BlitzBASIC, DarkBASIC (under 2K lines), PureBASIC (under 2K lines), RapidQ (loved this one), FreeBASIC (also great), and LibertyBASIC. In most languages I just mentioned I wrote at least 2000 lines of code. Some of the languages 100000 lines or more. Perhaps 300K+ in one of them. The reason I mention my experience is because I want you to know when I say I think Euphoria/Phix could possibly be the best language out there, the opinion is coming from a place of authority. I've tried a bunch.
Thank you.
15. Re: Should to_number return a sequence?
- Posted by petelomax 5 days ago
- 305 views
Why shouldn't it return a number or NaN on failure?
Or does NaN not exist in Eu?
I can't speak definitively to whether or not NaN exists in Euphoria or Phix. But, I can say, I haven't seen that yet... sort of.
-- inf = 1e300*1e300
#ilASM{ fld1
        fldz
        fdivp
    [32]
        lea edi,[inf]
    [64]
        lea rdi,[inf]
    []
        call :%pStoreFlt }
-- Erm, this one is a bit bizarre...
-- On the one hand it seems RDS Eu does not support nan properly, but then it somehow does...
-- If you try testing for nan, it seems to go all pear-shaped, but avoiding the tests
-- seems to make it happy again, and yet print "nan" and "inf" like a good little boy...
-- Of course, you shouldn't be using this code on RDS Eu anyway.
--
--/**/ nan = -(inf/inf)      --/* Phix
       nan = 3.245673689e243 --   RDS --*/
So that's how I've been creating nan for the past 15 years or more, but I cannot remember why I replaced 1e300*1e300 and I'm sure I still use that method in a few other places (including 64-bit) - it might very well have been a multithreading issue.
I suspect there's a fairly strong probability that specific value for nan only works on 32-bit RDS Eu.
-- Phix to_number("x") -- this returns a sequence. Which is Not a Number.
-- Phix to_number("x",0) -- this returns 0. Which is a Number.
PS: I would have loved to have been a fly on the wall and see how you found to_number in the Euphoria docs after failing to find it in Phix.chm...
PPS: At least in my book, to_number() [optionally/sometimes] delivering a typecheck is [often] a perfectly good debugging technique.
17. Re: Should to_number return a sequence?
- Posted by xecronix 4 days ago
- 245 views
I'm building 3 new api functions. I don't know if they'll make the cut (as in ever be included with OpenEuphoria or Phix) but they are:
- parse_int : this one is done. It will parse any undecorated string representation of an int either signed or unsigned base2 - base36
- parse_num : this one is almost done. It will parse any undecorated base10 signed or unsigned string representation of a number. I allow for either a comma or a period for a decimal delimiter. The period is the default. (To all of Europe, sorry.)
- parse_decorated_num : this one tries to parse all the silly decorations we pepper number strings with. underscores, commas, spaces, and thin spaces. In fact, I think there are many space types. Whatever. That's what this one does. One day, maybe if I somehow become less jaded about needing such a function, I might add currency and percentages.
All of these functions return {0,0} on failure or {1, atom} on success. (even the parse_int)
What does decoration mean to me? For a base-10 number, a decoration is any character that is not a digit, a decimal point, or a sign, and it gets rejected.
For a base-X number, where X is 2 to 36, reject any character that is not a valid digit for that base. Allow signs. Consider this string:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
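As a rough illustration of how that string can drive the digit check, here is a sketch of a parse_int along the lines described above. It is not the actual implementation; the constant name, the default base, and the decision to accept lowercase letters are all assumptions made for the example.

```euphoria
include std/text.e   -- for upper()

constant DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

-- Sketch only: returns {1, value} on success, {0, 0} on failure.
function parse_int(sequence s, integer base = 10)
    integer i = 1
    integer negative = 0
    integer d
    atom val = 0

    if base < 2 or base > 36 then
        return {0, 0}
    end if
    -- allow one leading sign
    if length(s) > 0 and find(s[1], "+-") then
        negative = equal(s[1], '-')
        i = 2
    end if
    if i > length(s) then
        return {0, 0}       -- empty string or a lone sign
    end if
    while i <= length(s) do
        -- position in DIGITS, minus 1, is the digit's value; -1 means not found
        d = find(upper(s[i]), DIGITS) - 1
        if d < 0 or d >= base then
            return {0, 0}   -- decoration, or a digit too big for this base
        end if
        val = val * base + d
        i += 1
    end while
    if negative then
        val = -val
    end if
    return {1, val}
end function

? parse_int("ff", 16)
? parse_int("0x1F", 16)
```

With this approach a prefix such as 0x is rejected without any special casing, because 'X' maps to a digit value that is too large for base 16.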
Specific decorations of note:
- Is 0x a decoration? ATM, yes.
- Is # a decoration? ATM, yes.
- Is $ a decoration? ATM, yes.
- Is , a decoration? By default, yes. But I'll allow it once if it's specified as a decimal delimiter.