RE: A sequence, by any other name
Hi again,
I think it would be preposterous for someone to write a library
that returns ambiguous data making it impossible for the user
to determine what type it is that is being returned. Normally,
the type is determined from the context, such as
field1 is always a number
field2 is always a string of characters
etc.
or else the user has control over what is stored, and therefore
determines his own context beforehand.
I'm sure that if you ask the writer, they will provide you with that
information.
The second method also assumes you are the one doing the storing of the
data. All you really do is add 256 to any non-negative integer just
before storing, but leave your characters alone. When you read the
data back, you simply test the number to see if it's an integer. If it
is, you test it again to see if it's equal to or over 256. If it is,
you know it's an integer, not a character. If it's from 0 to 255, you
know it's a character and not an integer. In this way you only have to
store one number per integer (or character), so you don't use any more
storage space than you do when you normally store something.
To detect character strings fast, you simply follow one more simple
rule:
RULE #2:
you store character strings separately from integer sets like this:
to store the string "ABCDE"
n="ABCDE"
but to store the set
n={65,'B','C','D','E'} --the number 65 followed by string "BCDE"
you actually store it as:
n={{65+256},"BCDE"}
That way you only have to test the first number of each subsequence
to determine whether it is a character string or a set of
integers. Note also that negative numbers go unchanged, as do
floating point numbers. (see demo below)
Here are some functions to illustrate the idea, but you'll have to
expand on this idea to include character strings. (Shouldn't be too
hard).
Note that one of these functions is implemented using a sort of
pseudo polymorphism. You always pass a sequence. If the sequence is
two elements long, it's taken to be a character, but if it's one
element long, a number. This is mainly because you don't have to
convert character strings; they will always be stored exactly as they
normally appear in a sequence. You do have to convert all numbers
though, because a non-negative integer has to be augmented with
256 in order to detect that fact when reading back the data from the
database or whatever.
If you also follow rule #2 then you only have to test the first
element as stated before. If you don't follow rule #2 then you
really have to test every single element, which could get really
slow.
---------------------------------
with trace
trace(1)
sequence n,a
atom x
constant CHARACTER=0,NUMBER=1
function ConvertForStorage(sequence a)
    atom x
    x=a[1]
    if length(a)<2 then
        --of type NUMBER:
        if integer(x) then
            if x>=0 then
                x=x+256
                if integer(x) then
                    return x
                else
                    printf(1,"%s\n",{"Integer too large"})
                    abort(1)--modify this to suit application
                end if
            else
                return x
            end if
        else
            return x
        end if
    else
        --of type CHARACTER:
        --(don't really have to call this for characters,
        -- they always go unchanged)
        return x
    end if
end function
function ConvertBackToOriginal(atom x)
    if integer(x) then
        if x<0 then
            --it's a negative integer so just return it:
            return {NUMBER,x}
        elsif x>=256 then
            --it's a non-negative integer so subtract 256 to get the
            --original value:
            return {NUMBER,(x-256)}
        else
            --it's a character so don't subtract:
            return {CHARACTER,x}
        end if
    else
        --it's not an integer so just return it
        return {NUMBER,x}
    end if
end function
function Number(sequence a)
    --quick test to determine read back type
    if a[1]=NUMBER then
        return 1
    else
        return 0
    end if
end function
--this is what the test sequence will look like:
-- n={-65,65,321,65.1} -- store -65, 'A', +65, and +65.1
-- note: 321=65+256
n=repeat(0,4)
x=ConvertForStorage({-65}) --note: pass one element long for numbers
n[1]=x
x=ConvertForStorage({'A',CHARACTER}) --note: pass two elements for a char
n[2]=x
x=ConvertForStorage({65})
n[3]=x
x=ConvertForStorage({65.1})
n[4]=x
for k=1 to length(n) do
    x=n[k]
    a=ConvertBackToOriginal(x)
    x=a[2]
    if Number(a) then
        ?x --print the number
    else
        printf(1,"%s\n",{x}) --print the character
    end if
end for
---------------------------------
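Rule #2 could be added along the following lines. This is only a rough
sketch, not tested against any real data base: EncodeNumbers, IsString,
DecodeRun, OFFSET and the variable "stored" are names made up for this
example, and it assumes every run is a flat sequence of atoms (no nested
sequences) and that strings only hold characters 0 to 255.

```euphoria
-- Sketch of rule #2. All names here are made up for illustration.
constant OFFSET=256

function EncodeNumbers(sequence s)
    -- add OFFSET to every non-negative integer in a set of numbers;
    -- negative integers and floating point numbers go unchanged
    atom x
    for i=1 to length(s) do
        x=s[i]
        if integer(x) and x>=0 then
            x=x+OFFSET
            if not integer(x) then
                puts(1,"Integer too large\n")
                abort(1) -- modify this to suit application
            end if
            s[i]=x
        end if
    end for
    return s
end function

function IsString(sequence run)
    -- rule #2: only the first element is tested.
    -- an empty run is treated as an empty string (an assumption)
    atom x
    if length(run)=0 then
        return 1
    end if
    x=run[1]
    if integer(x) and x>=0 and x<OFFSET then
        return 1
    else
        return 0
    end if
end function

function DecodeRun(sequence run)
    -- strings come back untouched; number sets get OFFSET removed
    atom x
    if IsString(run) then
        return run
    end if
    for i=1 to length(run) do
        x=run[i]
        if integer(x) and x>=OFFSET then
            run[i]=x-OFFSET
        end if
    end for
    return run
end function

sequence stored
stored={EncodeNumbers({65}),"BCDE"} -- same as {{65+256},"BCDE"}
for k=1 to length(stored) do
    if IsString(stored[k]) then
        puts(1,stored[k]&'\n') -- print the character string
    else
        ?DecodeRun(stored[k]) -- print the set of numbers
    end if
end for
```

Strings are never passed through EncodeNumbers, so they are stored
exactly as written, and the cost of detection is one test per run
instead of one test per element.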
One last note:
the high end of the range of integers that can be stored is
effectively decreased by exactly 256. This means the top of the range
decreases from #3FFFFFFF to #3FFFFEFF (not much at all).
If you try to store a positive integer greater than #3FFFFEFF
you'll see an error printed on the screen just before the abort.
You can modify that to whatever you wish, but you really have to
include that test in the code somewhere in order to ensure you can
accurately detect the correct type when reading back the data,
because if the integer overflows into an atom it won't be detected as
an integer during read back and therefore won't get decreased by 256
back to the original number. This of course compromises the
integrity of the stored data.
Of course, as mentioned before, these methods slow down the code to
some degree. Usually you can keep track of what is where without
resorting to these types of methods, except maybe in a database
program made to store arbitrary types of data. In any case, you are
the only one who can decide what method is best for your application.
Good luck with it.
--Al