1. POLL: What should value do with underscores?

The interpreter can read literals such as "2e10_000" and "#DEAD_BEEF", and it accepts hexadecimal, decimal, octal, and even binary notation. That the interpreter can handle this special formatting does not necessarily mean that an application written in Euphoria should treat '_' the same way, or give special meaning to a prefix like "0b" at the beginning of a number, when it calls value() or get().

So my question to you all is: should any literal that the interpreter can read be read the same way by the routines value() and get()?

Should the following be equivalent?

  • value("1024"),
  • value("1_024"),
  • value("0b100_00000000"), -- 1024 in binary BTW
  • value("#400"), -- 1024 in hex
  • value("0t2000") -- 1024 in octal

Currently, value does not regard '_' as a possible part of a number and parses decimal, hexadecimal and scientific notation but neither octal nor binary.
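
For an application that wants underscore-tolerant input without any change to the library, one option is to remove the underscores before calling value(). The following is only a minimal sketch of that idea: value_ignoring_underscores() is a hypothetical helper, not part of std/get.e, and it assumes the text holds a single number, since it would also strip underscores inside a quoted string.

```euphoria
include std/get.e

-- Hypothetical helper (not in the standard library): copy the text
-- without its '_' characters, then let value() parse what is left.
function value_ignoring_underscores(sequence text)
    sequence cleaned = ""
    for i = 1 to length(text) do
        if text[i] != '_' then
            cleaned &= text[i]
        end if
    end for
    return value(cleaned)
end function

? value_ignoring_underscores("1_024")
? value_ignoring_underscores("#4_00")
```

Octal and binary prefixes are not covered by this sketch; they would still need support inside value() itself.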


2. Re: POLL: What should value do with underscores?

Maybe, as a low-priority thing.

Yes: Consistency is good. That implies triple quotes, backticks, embedded values in strings (\xHH, \uHHHH, \UHHHHHHHH) and quite a few others should be added to the pot.
No: You ain't gonna need it might well apply.

I suppose that counts as a +0.1 for.

Pete


3. Re: POLL: What should value do with underscores?

It seems to me there is no controversy in leaving value() and get() as they are, without special handling of underscores. The front end presently parses 1.7976931348623159e308 as 1.348269851e+308. My intention is to make the scanner throw an error when the largest representable value is exceeded.

Shawn


4. Re: POLL: What should value do with underscores?

SDPringle said...

It seems to me there is no controversy in leaving value() and get() as they are, without special handling of underscores. The front end presently parses 1.7976931348623159e308 as 1.348269851e+308. My intention is to make the scanner throw an error when the largest representable value is exceeded.

Shawn

Can't this be prevented by a user-defined type?

As far as parsing goes, allowing various forms of values just adds more complications and defeats the simplicity of Euphoria.

Bad coding errors should just fail!

The better solution is to improve the error reporting so that it explains what caused the error.
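
To make the user-defined type idea concrete, here is a rough sketch of a type that accepts only finite atoms. The names POS_INF and finite_atom are hypothetical, and the sketch assumes that 1e308 * 10 overflows to positive infinity. Note the limit of this approach: a type like this can reject an infinity or a NaN, but it cannot detect a literal that was silently read as a different finite number.

```euphoria
-- Assumption: this product overflows to +infinity.
constant POS_INF = 1e308 * 10

-- Hypothetical type: true only for atoms strictly between -inf and +inf.
-- A NaN fails both comparisons, so it is rejected as well.
type finite_atom(object x)
    if not atom(x) then
        return 0
    end if
    return x > -POS_INF and x < POS_INF
end type

finite_atom ok = 1.5    -- passes the type check
? finite_atom(ok)
? finite_atom(POS_INF)
```

Assigning an infinite value to a variable declared as finite_atom would then stop the program with a type-check failure instead of letting the value propagate.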


Forked into: Interpreter erroneously reading some numbers


5. Re: POLL: What should value do with underscores?

 
function get_number()  
-- read a number or a comment  
-- ch is "live" at entry and exit  
	plus_or_minus sign, e_sign  
	integer ndigits, base  
	integer base_digit  
	atom mantissa, dec, e_mag  
	sequence number_string = {}  
  
	sign = +1  
	mantissa = 0  
	ndigits = 0  
	base = 10  
	  
	-- process sign  
	if ch = '-' then  
		number_string &= ch 
		get_ch()  
		if ch='-' then  
			return read_comment()  
		end if  
		sign = -1  
	elsif ch = '+' then  
		number_string &= ch 
		get_ch()  
	end if  
  
	while ch = '_' do 
		get_ch() 
	end while 
	 
	-- get base  
	if ch = '#' then  
		number_string &= ch  
		get_ch()  
		base = 16  
	elsif ch = '0' then  
		number_string &= ch  
		get_ch()  
		if find(ch, "btx") then  
			if ch = 'b' then  
				base = 2  
			elsif ch = 't' then  
				base = 8  
			elsif ch ='x' then  
				base = 16  
			end if  
			number_string &= ch  
			get_ch()  
		end if  
	end if  
  
	-- process integer part  
	while TRUE do  
		base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
		if base_digit >= 0 then  
			if base_digit < base then  
	 			ndigits += 1  
				mantissa = mantissa * base + base_digit  
				number_string &= ch  
			end if  
			get_ch()  
		else  
			exit  
		end if  
	end while  
  
	-- process decimal part  
	if ch = '.' then  
		number_string &= ch  
		get_ch()  
		dec = base  
		while TRUE do  
			base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
			if base_digit >= 0 then  
				if base_digit < base then  
					ndigits += 1  
					mantissa += base_digit / dec  
					dec *= base  
					number_string &= ch  
				end if  
				get_ch()  
			else  
				exit  
			end if  
		end while  
	end if  
  
	mantissa = sign * mantissa  
  
	if ch = 'e' or ch = 'E' then  
		-- get exponent sign  
		number_string &= ch  
		get_ch()  
		while ch = '_' do 
			get_ch() 
		end while 
		if ch = '-' or ch = '+' then  
			number_string &= ch  
			get_ch()  
		end if  
		  
		-- get exponent magnitude  
		while TRUE do  
			base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
			if base_digit >= 0 then  
				if base_digit < base then  
					ndigits += 1  
					number_string &= ch  
				end if  
				get_ch()  
			else  
				exit  
			end if  
		end while  
		mantissa = scientific_to_atom( number_string )  
	end if  
  
	if ndigits = 0 then  
		return {GET_FAIL, 0}  
	end if  
  
	return {GET_SUCCESS, mantissa}  
end function  
 

With this get_number(), value() reads a binary, octal, decimal, or hexadecimal number (integer or floating point) from a string:

 
[$]: cat valnum.ex  
 
include std/get.e.new 
 
? value("-1024.345") 
? value("-1_02___4.3_45_ ") 
? value("-#4_00__.5851_EB85_2") 
? value("-#4_00__.5851_eb85_2") 
? value("-0x4_0____0.__58_51EB852") 
? value("-0x4_0____0.__58_51eb852") 
? value("-0t2__0_00.260507534") 
? value("-0b100_00__00_0000.0101_1000_1") 
? value("-1_024.3_45__e-_2_5") 
? value("0x3.243F6A888") 
 
[$]: eui valnum.ex  
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345} 
{0,-1024.345703} 
{0,-1.024345e-22} 
{0,3.141592654} 
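
The three loops in get_number() above all rely on the same lookup expression, which can be hard to read at first glance. The sketch below isolates it so its three possible outcomes are visible. classify_digit() is a hypothetical wrapper written only for illustration, and HEX_DIGITS is assumed to be the upper-case string shown here.

```euphoria
-- Assumed to match the digit table used by std/get.e (upper case only).
constant HEX_DIGITS = "0123456789ABCDEF"

-- Hypothetical wrapper around the expression used in get_number():
--   0 .. base-1 : ch is a valid digit in this base
--   base        : ch is '_' (a separator that is skipped)
--   -1          : ch is anything else and ends the number
-- Subtracting 32 when ch >= 'a' folds lower-case letters to upper case.
function classify_digit(integer ch, integer base)
    return find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1
end function

? classify_digit('7', 8)    -- 7
? classify_digit('8', 8)    -- -1
? classify_digit('f', 16)   -- 15
? classify_digit('_', 10)   -- 10
? classify_digit('e', 10)   -- -1
```

Because '_' is appended after the valid digits, it always maps to the value base, which is why the loops test base_digit < base before counting a digit but call get_ch() either way.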
 


6. Re: POLL: What should value do with underscores?

quoting:
Should the following be equivalent?

value("1024") ,
value("1_024"),
value("0b100_00000000"), -- 1024 in binary BTW
value("#400"), -- 1024 in hex
value("0t2000") -- 1024 in octal

Currently, value does not regard '_' as a possible part of a number and parses decimal,
hexadecimal and scientific notation but neither octal nor binary.

Just use these standards:

for nat: value(1[.]024) No decimal comma (as natural numbers cannot be broken)
for dec: value(1[.]024,0) Obligatory decimal comma
for hex: value(#[0]400)
for oct: value(*2000)
for bin: value(&[000000]1000000000)

For the sake of programming, however, three options will suffice: nat/dec and hex, and maybe
for sci: value(%1.024e3)


7. Re: POLL: What should value do with underscores?

Spanish number format:

     dddd[,dd...]  <== no group < 10_000  
   dd ddd[,dd...]  <== space as separator 
d ddd ddd[,dd...]  <== decimal comma 

Spanish also uses the long scale for number names.


8. Re: POLL: What should value do with underscores?

See Ticket:638, especially the title. The content of the ticket only refers to specific differences.

Arthur


9. Re: POLL: What should value do with underscores?

ArthurCrump said...

See Ticket:638, especially the title. The content of the ticket only refers to specific differences.

Arthur

Here is a patch for testing (new number styles and C-style comments).

--- get.e	2015-02-13 21:23:13.423620000 +0100 
+++ get.e	2015-02-13 22:02:58.924620000 +0100 
@@ -182,112 +182,160 @@ 
 	end if 
 end function 
  
-function get_number() 
--- read a number or a comment 
--- ch is "live" at entry and exit 
-	plus_or_minus sign, e_sign 
-	natural ndigits 
-	integer hex_digit 
-	atom mantissa, dec, e_mag 
-	sequence number_string = { ch } 
- 
-	sign = +1 
-	mantissa = 0 
-	ndigits = 0 
- 
-	-- process sign 
-	if ch = '-' then 
-		sign = -1 
-		get_ch() 
-		number_string &= ch 
-		if ch='-' then 
-			return read_comment() 
-		end if 
-	elsif ch = '+' then 
-		get_ch() 
-		number_string &= ch 
-	end if 
  
-	-- get mantissa 
-	if ch = '#' then 
-		-- process hex integer and return 
+function read_c_comment() 
+	if atom(input_string) then 
+		while ch != '*' and ch != -1 do 
+			get_ch() 
+		end while 
 		get_ch() 
-		number_string &= ch 
-		while TRUE do 
-			hex_digit = find(ch, HEX_DIGITS)-1 
-			if hex_digit >= 0 then 
-				ndigits += 1 
-				mantissa = mantissa * 16 + hex_digit 
-				get_ch() 
-				number_string &= ch 
+		if ch = '/' then 
+			return {GET_IGNORE, 0} 
+		else 
+			return {GET_EOF, 0} 
+		end if 
+	else 
+		for i = string_next to length(input_string) do 
+			ch = input_string[i] 
+			if ch != '*' then 
+				string_next = i + 1 
 			else 
-				if ndigits > 0 then 
-					return {GET_SUCCESS, sign * mantissa} 
-				else 
-					return {GET_FAIL, 0} 
+				ch = input_string[i+1] 
+				string_next = i + 1 
+				if ch = '/' then 
+					return {GET_IGNORE, 0} 
 				end if 
 			end if 
-		end while 
+		end for 
+		return {GET_EOF, 0} 
 	end if 
+end function 
  
-	-- decimal integer or floating point 
-	while ch >= '0' and ch <= '9' do 
-		ndigits += 1 
-		mantissa = mantissa * 10 + (ch - '0') 
-		get_ch() 
+function get_number()  
+-- read a number or a comment  
+-- ch is "live" at entry and exit  
+	plus_or_minus sign, e_sign  
+	integer ndigits, base  
+	integer base_digit  
+	atom mantissa, dec, e_mag  
+	sequence number_string = {}  
+  
+	sign = +1  
+	mantissa = 0  
+	ndigits = 0  
+	base = 10  
+	  
+	-- process sign  
+	if ch = '-' then  
 		number_string &= ch 
-	end while 
- 
-	if ch = '.' then 
-		-- get fraction 
-		get_ch() 
+		get_ch()  
+		if ch='-' then  
+			return read_comment()  
+		end if  
+		sign = -1  
+	elsif ch = '+' then  
 		number_string &= ch 
-		dec = 10 
-		while ch >= '0' and ch <= '9' do 
-			ndigits += 1 
-			mantissa += (ch - '0') / dec 
-			dec *= 10 
-			get_ch() 
-			number_string &= ch 
-		end while 
-	end if 
- 
-	if ndigits = 0 then 
-		return {GET_FAIL, 0} 
-	end if 
- 
-	mantissa = sign * mantissa 
- 
-	if ch = 'e' or ch = 'E' then 
-		 
-		-- get exponent sign 
+		get_ch()  
+	end if  
+  
+	while ch = '_' do 
 		get_ch() 
-		number_string &= ch 
-		if ch = '-' then 
-			get_ch() 
-			number_string &= ch 
-		elsif ch = '+' then 
+	end while 
+	 
+	-- get base  
+	if ch = '#' then  
+		number_string &= ch  
+		get_ch()  
+		base = 16  
+	elsif ch = '0' then  
+		number_string &= ch  
+		get_ch()  
+		if find(ch, "btx") then  
+			if ch = 'b' then  
+				base = 2  
+			elsif ch = 't' then  
+				base = 8  
+			elsif ch ='x' then  
+				base = 16  
+			end if  
+			number_string &= ch  
+			get_ch()  
+		end if  
+	end if  
+  
+	-- process integer part  
+	while TRUE do  
+		base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
+		if base_digit >= 0 then  
+			if base_digit < base then  
+	 			ndigits += 1  
+				mantissa = mantissa * base + base_digit  
+				number_string &= ch  
+			end if  
+			get_ch()  
+		else  
+			exit  
+		end if  
+	end while  
+  
+	-- process decimal part  
+	if ch = '.' then  
+		number_string &= ch  
+		get_ch()  
+		dec = base  
+		while TRUE do  
+			base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
+			if base_digit >= 0 then  
+				if base_digit < base then  
+					ndigits += 1  
+					mantissa += base_digit / dec  
+					dec *= base  
+					number_string &= ch  
+				end if  
+				get_ch()  
+			else  
+				exit  
+			end if  
+		end while  
+	end if  
+  
+	mantissa = sign * mantissa  
+  
+	if ch = 'e' or ch = 'E' then  
+		-- get exponent sign  
+		number_string &= ch  
+		get_ch()  
+		while ch = '_' do 
 			get_ch() 
-			number_string &= ch 
-		end if 
-		 
-		-- get exponent magnitude 
-		if ch >= '0' and ch <= '9' then 
-			 
-			while ch >= '0' and ch <= '9' with entry do 
-				number_string &= ch 
-			entry 
-				get_ch() 
-			end while 
-		else 
-			return {GET_FAIL, 0} -- no exponent 
-		end if 
-		 
-		mantissa = scientific_to_atom( number_string ) 
-	end if 
+		end while 
+		if ch = '-' or ch = '+' then  
+			number_string &= ch  
+			get_ch()  
+		end if  
+		  
+		-- get exponent magnitude  
+		while TRUE do  
+			base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
+			if base_digit >= 0 then  
+				if base_digit < base then  
+					ndigits += 1  
+					number_string &= ch  
+				end if  
+				get_ch()  
+			else  
+				exit  
+			end if  
+		end while  
+		mantissa = scientific_to_atom( number_string )  
+	end if  
+  
+	if ndigits = 0 then  
+		return {GET_FAIL, 0}  
+	end if  
+  
+	return {GET_SUCCESS, mantissa}  
+end function  
  
-	return {GET_SUCCESS, mantissa} 
-end function 
  
 function Get() 
 -- read a Euphoria data object as a string of characters 
@@ -345,7 +393,16 @@ 
 					skip_blanks() 
 					if ch = '}' then 
 						get_ch() 
-					return {GET_SUCCESS, s} 
+						return {GET_SUCCESS, s} 
+					elsif ch = '/' then -- comment starts after item and before comma 
+						get_ch() 
+						if ch = '*' then 
+							get_ch() 
+							e = read_c_comment() 
+							if e[1] != GET_IGNORE then  -- it wasn't a comment, this is illegal 
+								return {GET_FAIL, 0} 
+							end if 
+						end if 
 					elsif ch!='-' then 
 						exit 
 					else -- comment starts after item and before comma 
@@ -355,11 +412,11 @@ 
 						end if 
 						-- read next comment or , or } 
 					end if 
-			end while 
+				end while 
 				if ch != ',' then 
-				return {GET_FAIL, 0} 
+					return {GET_FAIL, 0} 
 				end if 
-			get_ch() -- skip comma 
+				get_ch() -- skip comma 
 			end while 
  
 		elsif ch = '\"' then 
@@ -368,6 +425,14 @@ 
 			return get_heredoc("`") 
 		elsif ch = '\'' then 
 			return get_qchar() 
+		elsif ch = '/' then 
+			get_ch() 
+			if ch = '*' then 
+				get_ch() 
+				return read_c_comment() 
+			else 
+				return {GET_FAIL, 0} 
+			end if 
 		else 
 			return {GET_FAIL, 0} 
  
@@ -442,7 +507,16 @@ 
 					skip_blanks() 
 					if ch = '}' then 
 						get_ch() 
-					return {GET_SUCCESS, s, string_next-1-offset-(ch!=-1), leading_whitespace} 
+						return {GET_SUCCESS, s, string_next-1-offset-(ch!=-1), leading_whitespace} 
+					elsif ch = '/' then -- comment starts after item and before comma 
+						get_ch() 
+						if ch = '*' then 
+							get_ch() 
+							e = read_c_comment() 
+							if e[1] != GET_IGNORE then  -- it wasn't a comment, this is illegal 
+								return {GET_FAIL, 0} 
+							end if 
+						end if 
 					elsif ch!='-' then 
 						exit 
 					else -- comment starts after item and before comma 
@@ -452,11 +526,11 @@ 
 						end if 
 						-- read next comment or , or } 
 					end if 
-			end while 
+				end while 
 				if ch != ',' then 
-				return {GET_FAIL, 0, string_next-1-offset-(ch!=-1), leading_whitespace} 
+					return {GET_FAIL, 0, string_next-1-offset-(ch!=-1), leading_whitespace} 
 				end if 
-			get_ch() -- skip comma 
+				get_ch() -- skip comma 
 			end while 
  
 		elsif ch = '\"' then 
@@ -468,9 +542,16 @@ 
 		elsif ch = '\'' then 
 			e = get_qchar() 
 			return e & {string_next-1-offset-(ch!=-1), leading_whitespace} 
+		elsif ch = '/' then 
+			get_ch() 
+			if ch = '*' then 
+				e = read_c_comment() 
+				return e & {string_next-1-offset-(ch!=-1), leading_whitespace} 
+			else 
+				return {GET_FAIL, 0, string_next-1-offset-(ch!=-1), leading_whitespace} 
+			end if 
 		else 
 			return {GET_FAIL, 0, string_next-1-offset-(ch!=-1), leading_whitespace} 
- 
 		end if 
  
 	end while 
@@ -545,11 +626,6 @@ 
 -- A single call to ##get## will read in this entire sequence,  return its value as a result, and 
 -- return complementary information. 
 -- 
--- The function get() is guaranteed to read a value produced by print().  It is often 
--- necessary to leave a new line between objects printed though.  The user should not use 
--- get() or value() to read number notations that use bases other than ten or sixteen. 
--- Hexadecimal values will not be read unless they are preceeded by a hash mark (#). 
--- 
 -- If a nonzero offset is supplied, it is interpreted as an offset to the current file 
 -- position; the file will start ##seek## from there first. 
 -- 
 

[$]: cat getpru.ex  
 
include std/get.e 
 
sequence s = """{ 2, 2.5 /*  
ho * / 
 la */, 
14.69   -- segundo 
,31 
/*ul 
timo*/}""" 
 
? value(s) 
 
[$]: eui getpru.ex  
{ 
  0, 
  {2,2.5,14.69,31} 
} 
 
 


10. Re: POLL: What should value do with underscores?

I forgot to count a leading zero as a digit when it is not part of a base prefix (0b, 0t, 0x).

--- get.e	2015-02-14 08:37:35.982026000 +0100 
+++ get.e	2015-02-14 08:39:14.970026000 +0100 
@@ -192,7 +192,7 @@ 
 		if ch = '/' then 
 			return {GET_IGNORE, 0} 
 		else 
-			return {GET_EOF, 0} 
+			return read_c_comment() 
 		end if 
 	else 
 		for i = string_next to length(input_string) do 
@@ -203,7 +203,7 @@ 
 				ch = input_string[i+1] 
 				string_next = i + 1 
 				if ch = '/' then 
-					return {GET_IGNORE, 0} 
+					return read_c_comment() 
 				end if 
 			end if 
 		end for 
@@ -260,6 +260,8 @@ 
 			end if  
 			number_string &= ch  
 			get_ch()  
+		else 
+			ndigits += 1  
 		end if  
 	end if  
   
@@ -301,7 +303,7 @@ 
   
 	mantissa = sign * mantissa  
   
-	if ch = 'e' or ch = 'E' then  
+	if base = 10 and (ch = 'e' or ch = 'E') then  
 		-- get exponent sign  
 		number_string &= ch  
 		get_ch()  
@@ -314,6 +316,7 @@ 
 		end if  
 		  
 		-- get exponent magnitude  
+		ndigits = 0 
 		while TRUE do  
 			base_digit = find(ch - 32 * (ch >= 'a'), HEX_DIGITS[1..base] & '_') - 1  
 			if base_digit >= 0 then  
@@ -326,7 +329,9 @@ 
 				exit  
 			end if  
 		end while  
-		mantissa = scientific_to_atom( number_string )  
+		if ndigits > 0 then  
+			mantissa = scientific_to_atom( number_string )  
+		end if  
 	end if  
   
 	if ndigits = 0 then  
 

Sorry for the inconvenience.

Thanks.


11. Re: POLL: What should value do with underscores?

A patch for testing C-style comments and the new numeric formats in get() and value():

http://openeuphoria.org/pastey/266.wc
