Re: Last Element Reference


On Sat, Sep 20, 2003 at 11:54:49AM -0400, jzeitlin at cloud9.net wrote:
> 
> 
> On Sat, 20 Sep 2003 08:12:04 -0700, Al Getz <Xaxo at aol.com> wrote:
> 
> >jzeitlin at cloud9.net wrote:
>  
> >> This is not using negative indexing; this is using a calculated index.
> >> Under the proposed format, "$" used as a subscript will always be a
> >> non-negative number (I suppose it could conceivably be zero), and if $-n
> >> is less than zero for a particular value of n, you would get an error
> >> from Euphoria (as you do now if you try to use a negative subscript on a
> >> sequence).
> 
> 
> >All I meant was that using a negative index is 'almost' the same
> >as using "$" minus some number.  Let's compare two examples.
> 
> >[1] using $
> >s="MyFile.exw"
> >Name=s[1..6]
> >Ext=s[$-3..$]
> 
> >[2] using (-) indexing
> >s="MyFile.exw"
> >Name=s[1..6]
> >Ext=s[-4..-1]
> 
> >This is why I thought Pete's idea was good.
> 
> But it does, as was previously indicated, remove the possibility of using a
> check for negative subscript as a flaggable error.

Well, not all negative subscripts, just those in the range -1..-length(s).
Anything less than -length(s) would still be flagged as an error.

>  This might not be a problem if sequences were internally (to the Euphoria
> interpreter) managed as association lists, but if they're managed as arrays
> (as seems likely), you're complicating things pretty much needlessly.

Not really. In implementation terms, just do

if index < 0 then index+=length(s)+1 end if

after saving the original value of index for debugging purposes (that
way we see "error: -5 is out of bounds" instead of "error: 18 is out of bounds"
or something).

> On the other hand, using "s[$]" isn't any more complicated from the
> interpreter's (and Rob's, as the writer of the interpreter) point of view
> than would be using the expression s[length(s)] - in fact, the _translator_
> could (and probably would) generate _exactly_ the same code for both.

Using s[$] is a parser change, not a run-time engine change. The two are
completely different, and which is more complicated depends on the
implementation (though a parser change is probably easier than a change to
the sequence engine in most cases).

The translator could generate the same code for negative indexes as well,
though at the cost of some overhead.
(Implementation note: wrap all index values with fix_neg(), which takes the
index and the length of the sequence as parameters and does the adjustment
I outlined above.)
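
For anyone who wants to play with it, here is a minimal sketch of what such a fix_neg() could look like. It is written in Python only so it can be run anywhere; it models Euphoria's 1-based subscripts and is not taken from any real interpreter or translator source. The parameter names are my own, and bounds checking is deliberately left to the caller.

```python
def fix_neg(index, length):
    """Map a negative 1-based subscript onto its positive equivalent.

    Non-negative values pass through untouched. No bounds checking is
    done here; that is assumed to happen afterwards, as usual.
    """
    if index < 0:
        index += length + 1
    return index


s = "MyFile.exw"
print(fix_neg(-1, len(s)))   # 10, the last element
print(fix_neg(-4, len(s)))   # 7, the "." in front of "exw"
print(fix_neg(3, len(s)))    # 3, unchanged
```

The overhead mentioned above is the extra comparison on every subscript, whether or not the index turns out to be negative.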

> 
> >The only thing I don't like is that the "-1"
> >after the two dots looks a little strange there.
> 
> I'm ignoring esthetics for now.  I'm interested in functionality and internal
> consistency.

Negative indexes do not affect that, except in debugging terms (yes, I
admit they are likely to introduce bugs). Other than that, they wouldn't
affect anything. I should know, I've implemented them before.

> 
> >What about both at the same time?
> 
> >[3] using (-) indexing AND "$" for the end
> 
> >s="MyFile.exw"
> >Name=s[1..6]
> >Ext=s[-4..$]
> 
> >This looks pretty clear, but it involves more of a change
> >to the language than just adding "$" to it.
> 
> Indeed it does - and it still possesses the disadvantages of allowing
> negative subscripting.  What's more, this idea requires the maintenance of
> _two_ internal pseudoconstants for each sequence, its length (for the use
> of $), and _one_more_than_its_length_, to support the negative subscripting,

No it wouldn't. I already showed that. (And I believe Pete L did as well in
other posts.) I can prove it too, since I've actually implemented negative
subscripts before. (The length of the sequence always has to be stored
anyway for bounds checking, so there would be no pseudoconstant kept
especially for the use of $, btw.)

> as the two expressions
> 
> s[$]
> s[-1]
> 
> would necessarily return the same (last) element of the sequence.

Yes. That's the idea.

>  It would be simpler - for Rob, as the author of the language and its
> normative interpreter - to maintain only the $ symbolic pseudoconstant, as
> all that would need be done is (essentially, pseudocode description)
> 
> if token-encountered = "$" AND expression-context = sequence-subscript THEN
> return sequence-length
> 
> (with appropriate use of variables to indicate _which_ sequence the current
> context is referring to).

Which would be done in the parser. No argument there.
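
To make the parser side concrete, here is a rough sketch of the substitution described in the quoted pseudocode. It is Python rather than Euphoria, and the token list and the expand_dollar() helper are invented for illustration; a real parser would emit opcodes rather than rewrite strings, and would need a stack of "current sequence" names to handle nested subscripts.

```python
def expand_dollar(seq_name, subscript_tokens):
    """Replace each "$" token inside a subscript with length(<seq_name>).

    seq_name is the sequence being subscripted; subscript_tokens is the
    already-tokenized text between the square brackets (hypothetical form).
    """
    return [f"length({seq_name})" if tok == "$" else tok
            for tok in subscript_tokens]


# The subscript of s[$-3..$], split into tokens by hand for the example.
tokens = ["$", "-", "3", "..", "$"]
print("".join(expand_dollar("s", tokens)))   # length(s)-3..length(s)
```

Nothing here touches the run-time engine: after the rewrite the subscript is an ordinary expression.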

But it's not more complex to do this in the sequence engine (presumably where
bounds checking is done) (also in pseudocode description):

Save the original index.
If index is negative then add (length of sequence+1) to it.
...NORMAL BOUNDS CHECKING STARTS...
If index is still negative or index is more than length of sequence, then
raise error and use the saved index value in the error message.
......
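
The pseudocode above can be sketched as runnable code roughly like this. Again it is Python standing in for the engine, with 1-based subscripts modeled on top of Python's 0-based strings; element_at() and the wording of the error message are my own assumptions, not anything from the actual interpreter. Note that a subscript of 0 is rejected too, since the normal bounds check for an element already treats anything below 1 as an error.

```python
def element_at(seq, index):
    """Return the element at a 1-based subscript, accepting negatives."""
    original = index              # saved only for the error message
    n = len(seq)
    if index < 0:
        index += n + 1            # -1 becomes n, -n becomes 1
    # Normal bounds checking starts here.
    if index < 1 or index > n:
        raise IndexError(f"subscript {original} is out of bounds (length {n})")
    return seq[index - 1]         # convert to Python's 0-based indexing


s = "MyFile.exw"
print(element_at(s, -1))          # w
try:
    element_at(s, -15)
except IndexError as err:
    print(err)                    # subscript -15 is out of bounds (length 10)
```

The error reports -15, the value the programmer actually wrote, rather than the adjusted -4.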

> 
> And for you and me, it doesn't add any real complexity to the language; it's
> what's often called "syntactic sugar", making "$" exactly equivalent to
> "length()" as a sequence subscript.

And -1 would be exactly equivalent to length() in a sequence subscript as well.
What of it?
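
As a quick check that the spellings from Al's examples really do agree, here is a small sketch of a 1-based slice that accepts negative subscripts. It is Python again, slice_1based() is a made-up helper, and the bounds rule is a simplified version of Euphoria's slice rule (an empty slice such as s[1..0] is allowed).

```python
def slice_1based(seq, start, stop):
    """Return seq[start..stop] with 1-based, inclusive, possibly negative bounds."""
    n = len(seq)
    if start < 0:
        start += n + 1
    if stop < 0:
        stop += n + 1
    # Simplified bounds rule; start == stop + 1 yields an empty slice.
    if start < 1 or stop > n or start > stop + 1:
        raise IndexError("slice is out of bounds")
    return seq[start - 1:stop]


s = "MyFile.exw"
n = len(s)                          # what "$" would stand for
print(slice_1based(s, 1, 6))        # MyFile
print(slice_1based(s, n - 3, n))    # .exw   i.e. s[$-3..$]
print(slice_1based(s, -4, -1))      # .exw   i.e. s[-4..-1]
print(slice_1based(s, -4, n))       # .exw   i.e. s[-4..$]
```

All three spellings of the extension slice come out the same, and only the one stored length is needed to support both "$" and the negative form.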

> 
> --
> Jeff Zeitlin
> jzeitlin at cyburban.com

-- 
Outlook Users, please don't put my email address in your address book. That way,
my email address won't appear in forged emails sent by email viruses. (Which are
technically worms btw :P)
