SplitToTypes

Split To Types

I could not find an existing function that parses geographic coordinates from a string in a generic way, so I wrote one. Perhaps someone will find it useful, or knows of a better way of doing it. The function uses recursion.

include std/pretty.e 
include std/convert.e 
include std/sequence.e 
 
function split_to_type( sequence s, sequence delims, sequence types ) 
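	-- s: the string to split 
	-- delims: delimiters to split on, outermost first ... e.g. {"|", " ", ","} 
	-- types: one entry per delimiter: "" means split further without converting, 
	--        "i" converts to integer, "n" converts to number ... e.g. {"", "", "n"} 
	-- delims and types must be of equal length to work as expected 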
	if length(delims) > 0 then 
		s = split(s, delims[1]) 
		for i = 1 to length(s) do 
			if equal(types[1], "") then 
				s[i] = split_to_type(s[i], remove(delims, 1), remove(types, 1)) 
			elsif equal(types[1], "i") then 
				s[i] = to_integer(s[i]) 
			elsif equal(types[1], "n") then 
				s[i] = to_number(s[i]) 
			end if 
		end for 
	end if 
	return s 
end function 
 
sequence s, a 
 
s = "34.5,124.1 333.9,82.4|3223442.78,12 33.9,8.4 342.78,12.4 353.9,48.4" 
puts (1,"input string: " & s & "\n") 
a = split_to_type(s,{"|"," ",","},{"","","n"}) 
pretty_print(1,a) 

The output looks like:

input string: 34.5,124.1 333.9,82.4|3223442.78,12 33.9,8.4 342.78,12.4 353.9,48.4 
{ 
  { 
    {34.5,124.1}, 
    {333.9,82.4} 
  }, 
  { 
    {3223442.78,12}, 
    {33.9,8.4}, 
    {342.78,12.4}, 
    {353.9,48.4} 
  } 
} 
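
A smaller, made-up example may help show how the types argument pairs with the delimiters. This is only a sketch: it assumes the lines are appended to the program above, so that split_to_type and std/pretty.e are already available, and the input string and the variable name grid are invented for illustration. The outer level is split on ";" and left unconverted (""), and the inner level is split on "," and converted with "i".

```euphoria
-- Assumes split_to_type() and "include std/pretty.e" from the program above.
-- The input string and the name "grid" are made up for this illustration.
sequence grid = split_to_type("1,2;3,4", {";", ","}, {"", "i"})

-- grid now holds two rows of integers: {{1,2},{3,4}}
pretty_print(1, grid)
```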


A couple of remarks.

1. It would seem better to use a single list of parameters like:

  enum INT, NUM, STR 
 
  par = {"|"," ",",",NUM} 

so that the typing of fields is done last - rather than passing a delimiter and a type together.

2. A question arises about quoted strings. I guess they are not handled because they will not occur in your data.
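
If quoted fields did turn up in the data, one possible approach would be a splitter that ignores delimiters inside double quotes. The following is a minimal sketch under that assumption; split_unquoted is a hypothetical helper, not part of std/sequence.e, and it keeps the quote characters in the field and does not handle escaped quotes.

```euphoria
-- Hypothetical helper: split s on delim, except where delim
-- appears between a pair of double quotes.
function split_unquoted(sequence s, integer delim)
    sequence parts = {}
    sequence field = ""
    integer in_quotes = 0
    integer c
    for i = 1 to length(s) do
        c = s[i]
        if c = '"' then
            -- toggle the quoted state and keep the quote in the field
            in_quotes = not in_quotes
            field &= c
        elsif c = delim and not in_quotes then
            -- an unquoted delimiter ends the current field
            parts = append(parts, field)
            field = ""
        else
            field &= c
        end if
    end for
    -- the last field has no trailing delimiter
    return append(parts, field)
end function

-- Made-up input: the comma inside the quotes does not split the field.
sequence fields = split_unquoted("a,\"b,c\",d", ',')
for i = 1 to length(fields) do
    puts(1, fields[i] & "\n")
end for
```

With this input the loop prints three fields, the second being "b,c" with its quotes kept.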

In the code that follows, the empty sequence {} is handled explicitly and apply is used in place of the for-loop; the recursion still happens, but through routine_id. I think it is a little clearer than the explicit for-loop and recursion.

  include std/sequence.e as seq 
  include std/convert.e 
 
  enum INT, NUM, STR 
 
  function split(sequence S, integer d) -- provide default parameters 
    return seq:split(S,d,0,0)            
  end function 
 
  constant _split = routine_id("split") 
  constant _Split = routine_id("Split") 
 
  function Split(sequence S, sequence DT) 
    if length(S) = 0 then 
      return S 
    elsif length(DT) = 1 then 
      switch DT[1] do 
      case INT then return to_integer(S) 
      case NUM then return to_number(S) 
      case STR then return S 
      end switch 
    elsif sequence(S[1]) then 
      return apply(apply(S, _split, DT[1]), _Split, DT[2..$]) 
    else 
      return apply(split(S, DT[1]), _Split, DT[2..$]) 
    end if 
  end function 
 
  ? Split("34.5,124.1 333.9,82.4|3223442.78,12 33.9,8.4 342.78,12.4 353.9,48.4","| ," & NUM) 

bj
