1. How to convert sequence of characters to numbers
- Posted by casey Oct 18, 2012
- 1572 views
I need to convert a date/time string (sequence) of the format "20121018123010" into a sequence of numeric values of the format {2012,10,18,12,30,10}. I'm looking for the fastest possible execution time. It looks like I could use breakup() and then maybe to_integer() or value() on each of the six elements. Am I missing something, or is there a better way? Ultimately I need to poke2 each of the word values into memory for a c_func call.
Thanks, casey
2. Re: How to convert sequence of characters to numbers
- Posted by Spock Oct 19, 2012
- 1471 views
Hi Casey,
I imagine that the fastest way would be to poke the string into memory and call a custom machine code routine that did the conversion AND called your C function.
However, if you wanted to use Euphoria then a custom routine would be fastest. You would not have to call breakup() or any other Eu functions as you could address and convert each element directly:
    sequence s = "20121018123010"
    s -= '0'
    integer year = s[1]*1000 + s[2]*100 + s[3]*10 + s[4] -- year is now 2012
I think you get the idea.
Spock
3. Re: How to convert sequence of characters to numbers
- Posted by useless_ Oct 19, 2012
- 1495 views
Umm, s[1] is '2', not 2, and '2' = 50, 2 = 2. Likewise s[1..4] = "2012".
I think you get the idea.
useless
4. Re: How to convert sequence of characters to numbers
- Posted by jaygade Oct 19, 2012
- 1492 views
Yes, and '0' is 48.
The expression "20121018123010" - '0' is equivalent to {50, 48, 49, 50, 49, 48, 49, 56, 49, 50, 51, 48, 49, 48} - 48 which equals {2, 0, 1, 2, 1, 0, 1, 8, 1, 2, 3, 0, 1, 0}
Now s[1..4] is {2, 0, 1, 2} which multiplied out gives you the year as an integer.
Edit: Corrected conversion error
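The "multiplied out" step can also be written as a small loop. This is only a sketch: digits_to_int() is a hypothetical helper name, not something posted in the thread or provided by the standard library, and it assumes the input has already had '0' subtracted so every element is 0..9.

```euphoria
-- Hypothetical helper (not from the thread): fold a run of digit values
-- (0..9, i.e. after subtracting '0') into a single integer.
function digits_to_int( sequence digits )
    integer total = 0
    for i = 1 to length( digits ) do
        total = total * 10 + digits[i]
    end for
    return total
end function

sequence s = "20121018123010" - '0'        -- {2,0,1,2,1,0,1,8,1,2,3,0,1,0}
integer year  = digits_to_int( s[1..4] )   -- same as 2*1000 + 0*100 + 1*10 + 2
integer month = digits_to_int( s[5..6] )   -- same as 1*10 + 0

? year    -- prints 2012
? month   -- prints 10
```

The unrolled arithmetic from the earlier post avoids the function call and the slice, so it is probably the quicker of the two; the loop is just easier to reuse for fields of other widths.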
5. Re: How to convert sequence of characters to numbers
- Posted by casey Oct 19, 2012
- 1467 views
Thanks very much. Your solution is simpler and indeed 2x+ faster than calling breakup() and then to_number().
Casey
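For the poke2 step mentioned in the original question, a minimal sketch might look like this. It assumes Euphoria 4.x (where poke2() and peek2u() are built in and allocate()/free() come from std/machine.e); the buffer name is made up, and the actual c_func() call is left out because the C routine and its signature are not given in the thread.

```euphoria
include std/machine.e   -- allocate(), free()

sequence parts = {2012, 10, 18, 12, 30, 10}

-- Six 16-bit words need 12 bytes.
atom buf = allocate( 2 * length( parts ) )

-- poke2() accepts a sequence and writes its elements as consecutive 16-bit words.
poke2( buf, parts )

-- Read the words back to confirm the layout.
? peek2u( {buf, length( parts )} )   -- prints {2012,10,18,12,30,10}

-- buf would be passed as a pointer argument to c_func() at this point.

free( buf )
```

Allocating the buffer once and reusing it across calls should keep the allocation cost out of the hot path.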
6. Re: How to convert sequence of characters to numbers
- Posted by useless_ Oct 19, 2012
- 1496 views
Weird, I had to look twice again to see his "s -= '0'" line, but I saw your "- 48" code immediately. My mistake, sorry.
EDIT:
I'd like to blame this on the font. The other day, Derek used what looked like a lower case 'o' in some math, and I copy/pasted it to a text editor which displays a greater difference between 0, O, and o. I'd have typed it just as you did, using 48 instead of '0'.
useless
7. Re: How to convert sequence of characters to numbers
- Posted by ghaberek (admin) Oct 19, 2012
- 1521 views
Obligatory XKCD reference: Regular Expressions
    include "std/convert.e"
    include "std/regex.e"

    regex pattern = regex:new( "([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})" )

    function get_parts( sequence string )
        sequence parts = {}
        if regex:is_match( pattern, string ) then
            object matches = regex:matches( pattern, string )
            parts = repeat( 0, length(matches)-1 )
            for i = 1 to length(matches)-1 do
                parts[i] = to_integer( matches[i+1] )
            end for
        end if
        return parts
    end function
Example:
    sequence string = "20121018123010"
    sequence parts = get_parts( string )
    printf( 1, "string = \"%s\"\n", {string} )
    puts( 1, "parts = " )
    ? parts
Output:
    $ eui regex-test.ex
    string = "20121018123010"
    parts = {2012,10,18,12,30,10}
-Greg
8. Re: How to convert sequence of characters to numbers
- Posted by jaygade Oct 20, 2012
- 1457 views
Right, but is it fast? That was one of the OP's original criteria.
9. Re: How to convert sequence of characters to numbers
- Posted by ghaberek (admin) Oct 20, 2012
- 1431 views
Dang it, you're right. I just did some tests on over 500,000 iterations; the method proposed by you, Spock, and Kat is over 25 times faster than using my regular expression.
However, using a regular expression makes it easier to:
- quickly parse the 'number' out of a string or text file
- validate the input on one pass: it either matches or it doesn't
- allow for any number of variations and/or strictness in the input, e.g.
  - 'year' can only be 1970-2012
  - 'month' can only be 01-12
  - 'day' can only be 01-31
  - 'hour' can only be 00-23
  - 'minute' and 'second' can only be 00-59
Here is the function I used against mine for testing...
    function get_parts( sequence string )
        string -= '0'
        sequence parts = repeat( 0, 6 )
        parts[1] = (string[1] * 1000) + (string[2] * 100) + (string[3] * 10) + string[4]
        parts[2] = (string[5] * 10) + string[6]
        parts[3] = (string[7] * 10) + string[8]
        parts[4] = (string[9] * 10) + string[10]
        parts[5] = (string[11] * 10) + string[12]
        parts[6] = (string[13] * 10) + string[14]
        return parts
    end function
-Greg
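The timing code itself was not posted, so the following is only a guess at how such a comparison could be set up. get_parts_arith() and time_calls() are hypothetical names; the arithmetic is the same as in the function above, and the regex version could be timed by passing its routine id to the same harness.

```euphoria
-- Arithmetic conversion from the thread, under a hypothetical name so it
-- could sit next to the regex version in one file.
function get_parts_arith( sequence s )
    s -= '0'
    return { s[1]*1000 + s[2]*100 + s[3]*10 + s[4],
             s[5]*10  + s[6],
             s[7]*10  + s[8],
             s[9]*10  + s[10],
             s[11]*10 + s[12],
             s[13]*10 + s[14] }
end function

-- Hypothetical harness: call routine `rid` `count` times with one argument
-- and return the elapsed time in seconds.
function time_calls( integer rid, integer count, sequence arg )
    object result
    atom start = time()
    for i = 1 to count do
        result = call_func( rid, {arg} )
    end for
    return time() - start
end function

constant ITERATIONS = 500000
atom secs = time_calls( routine_id( "get_parts_arith" ), ITERATIONS, "20121018123010" )
printf( 1, "%d calls took %.2f seconds\n", {ITERATIONS, secs} )
```

call_func() adds a little overhead to every call, but it is the same for each routine being compared, so the ratio between two timings should still be meaningful.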
10. Re: How to convert sequence of characters to numbers
- Posted by petelomax Oct 25, 2012
- 1393 views
the method proposed by you, Spock, and Kat is over 25 times faster than my regular expression.
Ha
However, using a regular expression makes it easier to:
Uh
- allow for any number of variations and/or strictness in the input, e.g.
  - 'year' can only be 1970-2012
  - 'month' can only be 01-12
  - 'day' can only be 01-31
  - 'hour' can only be 00-23
  - 'minute' and 'second' can only be 00-59
I'll believe that is a simple and easy regular expression when I see it and not before.
Pete
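For reference, a pattern enforcing the ranges listed earlier in the thread might look like the sketch below. It was not posted by anyone in the thread; the constant name STRICT is made up, and the pattern assumes the same 14-digit layout as the original question.

```euphoria
include std/regex.e

-- Hypothetical stricter pattern: each group encodes one of the ranges
-- from the list in the thread.
constant STRICT = regex:new(
    "^(19[7-9][0-9]|200[0-9]|201[0-2])" &   -- year   1970-2012
    "(0[1-9]|1[0-2])" &                      -- month  01-12
    "(0[1-9]|[12][0-9]|3[01])" &             -- day    01-31
    "([01][0-9]|2[0-3])" &                   -- hour   00-23
    "([0-5][0-9])" &                         -- minute 00-59
    "([0-5][0-9])$" )                        -- second 00-59

? regex:is_match( STRICT, "20121018123010" )   -- prints 1
? regex:is_match( STRICT, "20121318123010" )   -- prints 0 (month 13)
```

Whether this counts as simple is open to debate. It also still accepts impossible dates such as February 31, which a regular expression alone cannot rule out without becoming much longer.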