RE: A sequence, by any other name
- Posted by Al Getz <Xaxo at aol.com> Feb 28, 2001
- 377 views
Hi again,

I think it would be preposterous for someone to write a library that returns ambiguous data, making it impossible for the user to determine what type is being returned. Normally the type is determined from the context (field1 is always a number, field2 is always a string of characters, etc.), or else the user has control over what is stored and therefore determines his own context beforehand. I'm sure if you ask the writer they will provide you with that information.

The second method also assumes you are the one doing the storing of the data. All you really do is add 256 to any non-negative integer just before storing, but leave your characters alone. When you read the data back, you test the number to see if it's an integer; if it is, you test it again to see if it's equal to or over 256. If it is, you know it's an integer, not a character. If it's in the range 0 to 255, you know it's a character and not an integer. In this way you only have to store one number per integer (or character), so you don't use any more storage space than you do when you normally store something.

To detect character strings fast, you follow one more simple rule:

RULE #2: you store character strings separately from integer sets, like this:

to store the string "ABCDE":
    n="ABCDE"
but to store the set
    n={65,'B','C','D','E'} --the number 65 followed by string "BCDE"
you actually store it as:
    n={{65+256},"BCDE"}

That way you only have to test the first number of each sub-sequence to determine whether it is a character string or a set of integers. Note also that negative numbers go unchanged, as do floating point numbers (see demo below).

Here are some functions to illustrate the idea, but you'll have to expand on this idea to include character strings (shouldn't be too hard). Note that one of these functions is implemented using a sort of pseudo polymorphism.
You always pass a sequence: if the sequence is two elements long, it's taken to be a character, but if it's one element long, a number. This is mainly because you don't have to convert character strings; they will always be stored exactly as they normally appear in a sequence. You do have to convert all numbers though, because a non-negative integer has to be augmented with 256 in order to detect that fact when reading the data back from the database or whatever. If you also follow rule #2 then you only have to test the first element, as stated before. If you don't follow rule #2 then you really have to test every single element, which could get really slow.

---------------------------------
with trace
trace(1)

sequence n,a
atom x

constant CHARACTER=0,NUMBER=1

function ConvertForStorage(sequence a)
    atom x
    x=a[1]
    if length(a)<2 then
        --of type NUMBER:
        if integer(x) then
            if x>=0 then
                x=x+256
                if integer(x) then
                    return x
                else
                    printf(1,"%s\n",{"Integer too large"})
                    abort(1) --modify this to suit application
                end if
            else
                --negative integers go unchanged
                return x
            end if
        else
            --floating point numbers go unchanged
            return x
        end if
    else
        --of type CHARACTER:
        --(don't really have to call this for characters,
        -- they always go unchanged)
        return x
    end if
end function

function ConvertBackToOriginal(atom x)
    if integer(x) then
        if x<0 then
            --it's a negative integer so just return it:
            return {NUMBER,x}
        elsif x>=256 then
            --it's a non-negative integer so subtract 256 to get the
            --original value:
            return {NUMBER,(x-256)}
        else
            --it's a character so don't subtract:
            return {CHARACTER,x}
        end if
    else
        --it's not an integer so just return it
        return {NUMBER,x}
    end if
end function

function Number(sequence a)
    --quick test to determine read back type
    if a[1]=NUMBER then
        return 1
    else
        return 0
    end if
end function

--this is what the test sequence will look like:
-- n={-65,65,321,65.1}
-- store -65, 'A', +65, and +65.1
-- note: 321=65+256
n=repeat(0,4)

x=ConvertForStorage({-65}) --note: pass one element long for numbers
n[1]=x
x=ConvertForStorage({'A',CHARACTER}) --note: pass two elements for a char
n[2]=x
x=ConvertForStorage({65})
n[3]=x
x=ConvertForStorage({65.1})
n[4]=x

for k=1 to length(n) do
    x=n[k]
    a=ConvertBackToOriginal(x)
    x=a[2]
    if Number(a) then
        ?x --print the number
    else
        printf(1,"%s\n",{x}) --print the character
    end if
end for
---------------------------------

One last note: the high end of the range of integers that can be stored is effectively decreased by exactly 256. This means the top of the range drops from #3FFFFFFF to #3FFFFEFF (not much at all). If you try to store a positive integer greater than #3FFFFEFF you'll see an error printed on the screen just before the abort. You can modify that to whatever you wish, but you really have to include that test in the code somewhere in order to ensure you can accurately detect the correct type when reading back the data. If the integer overflows into an atom, it won't be detected as an integer during read back and therefore won't get decreased by 256 back to the original number. This of course compromises the integrity of the stored data.

As mentioned before, these methods slow down the code to some degree. Usually you can keep track of what is where without resorting to these types of methods, except maybe in a database program made to store arbitrary types of data. In any case, you are the only one who can decide what method is best for your application.

Good luck with it.
--Al
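
Addendum: the post leaves rule #2 (character strings stored separately from integer sets) as an exercise, so here is a minimal sketch of one way it might look. The names OFFSET, MAX_STORABLE, StoreNumberSet, IsNumberSet and RestoreNumberSet are hypothetical and not part of the functions above. The sketch assumes every sub-sequence is flat (no nesting), that strings contain only character codes 0 to 255, and that an empty sub-sequence is treated as an empty string.

```euphoria
-- Hypothetical helpers sketching rule #2; names are illustrative only.
constant OFFSET = 256
constant MAX_STORABLE = #3FFFFFFF - OFFSET  -- #3FFFFEFF

-- Add OFFSET to every non-negative integer in a set of numbers.
-- Negative integers and floating point numbers go unchanged.
function StoreNumberSet(sequence s)
    atom x
    for i = 1 to length(s) do
        x = s[i]
        if integer(x) and x >= 0 then
            if x > MAX_STORABLE then
                puts(1, "Integer too large\n")
                abort(1)  -- modify this to suit the application
            end if
            s[i] = x + OFFSET
        end if
    end for
    return s
end function

-- Test only the first element: an integer in 0..255 means a string.
function IsNumberSet(sequence s)
    atom x
    if length(s) = 0 then
        return 0  -- assumption: empty sub-sequence is an empty string
    end if
    x = s[1]
    if integer(x) and x >= 0 and x < OFFSET then
        return 0
    end if
    return 1
end function

-- Undo StoreNumberSet.
function RestoreNumberSet(sequence s)
    atom x
    for i = 1 to length(s) do
        x = s[i]
        if integer(x) and x >= OFFSET then
            s[i] = x - OFFSET
        end if
    end for
    return s
end function

-- The set {65,'B','C','D','E'} from rule #2, stored as {{321},"BCDE"}:
sequence stored, item
stored = {StoreNumberSet({65}), "BCDE"}

for k = 1 to length(stored) do
    item = stored[k]
    if IsNumberSet(item) then
        ? RestoreNumberSet(item)  -- print the set of numbers
    else
        puts(1, item & '\n')      -- print the character string
    end if
end for
```

Strings are never passed through a conversion function here, which matches the point above that characters are stored exactly as they normally appear. The range check is done before adding the offset rather than after; that is a choice made for this sketch, and either placement catches the same overflow.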