OpenEuphoria: Forum: RE: match() (not short, he he)

Posted by Al Getz <Xaxo at aol.com> May 26, 2003
Hello there,

In addition to the other interesting posts...


match("",s)
can't return a number greater than zero, because
that interpretation of match() would only be valid
if you assume that the first element of s is a null
sequence, and that's not really true at all.
Let's say we allowed match() to return a 1 for this case.
That would lead to code like this:

sequence s
integer i
s="xyz"
i=match("","xyz")  -- suppose this returned 1 instead of stopping with an error
--i is now equal to the number 1
printf(1,"%s",s[i])

which would print the first character
of the sequence s, which would not be
the null sequence "".
If we had found the correct sequence,
the printf statement would have printed
the null sequence, which is not the same
as printing an 'x'.


In the remainder of this post, I'll show why the function

match(s1,s2)

with either or both args equal to the null sequence ""
should return a zero.



A sequence, in its most basic definition, is a container just like
any other variable.
It can be an empty container, or a container with something inside it.
If it's an empty container, it can't hold any of the things that are
inside a container that is NOT empty.  This means that
an empty container can NEVER contain anything that a non-empty
container contains.

This applies even if the containers themselves can contain other
containers.  If there is nothing within the internal containers,
then the most external container also contains nothing.

In the case of 'match', we want to know what is the best value
to return for cases when one or both arguments are null, or
empty containers:

retv=match(container1,container2)

Since an empty container can't possibly have anything that a non-empty
container has in it,

match("abc","")

already returns a zero in Euphoria.

This leaves two cases. The first is

match("","")

which is almost like

match("abc","")

Both should return a zero, because you can't find anything
within a container that doesn't have anything in it, even if
the first container also doesn't have anything in it.

The second is

match("","abc")

which should also return a zero, because you can't find
the lack of anything inside a container that already
contains something.


In spite of this clear viewpoint, it might still be possible to 
argue against the simple conclusions drawn from this
on the basis that calling the sequences 'containers' is still just
another abstraction, no better than:

--------------------------------------------------------------------------


match("abc","")
"If you don't find something within that which has nothing,
you haven't found anything, and so you have found zero of anything".


match("","")
It's questionable whether we could really find nothing among nothing, so:

"If you couldn't find nothing within that which already has nothing,
you have still found nothing, which is still zero of anything",
and
"Even if you could find nothing among that which has nothing, you would 
have still found nothing, which is indeed still zero of anything".


match("","abc")
"If you are looking for nothing within that which has something, you
won't find it, and so you have found zero of anything"

--------------------------------------------------------------------------



So I'll present more conclusive evidence for why

match("","abc")
and
match("","")

should both also return a zero.

Main reasons for a change:
1. match("abc","") already returns a zero.
2. 'nothing' is not part of 'something', so it can't be there.

Validating supplementary statement:
3. No existing code would break.


[1]
match("abc","")
returns a zero already.
This means the language already admits that 'nothing' is a valid
argument, otherwise this call would trigger an error too.
So as it stands right now, the rule is
"It's ok to look for something within nothing and not find it,
but it's not ok to look for nothing within something".

[2]
'nothing' is not part of 'something' in the computer world,
so it can't ever be found, and so a zero return value is warranted.

[3]
No existing code would break, nor would any
lightly modified code break if it was written correctly in
the first place.  Existing code can't contain a match() call
whose first argument ever becomes "" (as in match("","abc")),
or else an error would have stopped the program and the author
would have corrected it.
Lightly modified code might later reach a state where the first
argument becomes "" within a match() call,
which would then return a zero.  Since the code already has
to be able to respond to a case like match("abc",""), it would
also be able to respond to a zero returned
from match() in the new case of "" as the first argument.


Really though, match(object1,object2) should be the ultimate goal
of the language.  I can't think of a reason why you shouldn't be
able to look for 'a' in "abc". Can anyone else?
Especially since you can fool it:

i=match('a'&"","abc") -- haha, fooled you 'match()' !
--i comes out to 1
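
The same trick can be wrapped up so the caller doesn't have to do it by hand.
This is only a rough sketch of the match(object1,object2) idea, not anything
that exists in the language: match_obj() is a made-up name, and returning
zero for an empty first argument is the behavior argued for in this post,
not what the built-in match() does.

```euphoria
-- match_obj() is a hypothetical helper, not part of Euphoria
function match_obj(object x, sequence s)
    if atom(x) then
        x = {x}        -- promote the atom to a one-element sequence, like 'a' & ""
    end if
    if length(x) = 0 then
        return 0       -- assumed behavior: an empty first argument is never found
    end if
    return match(x, s) -- ordinary case: defer to the built-in
end function

printf(1, "%d\n", match_obj('a', "abc"))   -- prints 1
printf(1, "%d\n", match_obj("bc", "abc"))  -- prints 2
printf(1, "%d\n", match_obj("", "abc"))    -- prints 0
```

For a plain atom the built-in find('a',"abc") gives the same position, so the
wrapper mostly saves the caller from having to pick between the two routines.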




I am therefore asserting two things to be correct:

1. Changing 'match' to return a zero when its first argument
   is "" is the best way to handle the return value.
2. Changing 'match' as #1 states won't break any existing code.

Note that I'm not just saying "change the match function because it's
   better",
I'm also asserting that this change won't break any existing code.




One last note: the function

global function Match(sequence a, sequence b)
  if length(a)<1 then
    return 0
  end if
  return match(a,b)
end function

can be used as a replacement for match() and also won't
break any existing code!
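
Here is a quick check of the cases discussed in this post. It is just a
sketch: safe_match() is a made-up name with the same body as Match() above,
repeated so the snippet runs on its own.

```euphoria
-- safe_match() is a hypothetical name; the body is the same as Match() above
function safe_match(sequence a, sequence b)
    if length(a) < 1 then
        return 0   -- empty first argument: report "not found" instead of an error
    end if
    return match(a, b)
end function

printf(1, "%d\n", safe_match("", "abc"))    -- nothing within something: prints 0
printf(1, "%d\n", safe_match("", ""))       -- nothing within nothing: prints 0
printf(1, "%d\n", safe_match("abc", ""))    -- something within nothing: prints 0
printf(1, "%d\n", safe_match("bc", "abc"))  -- ordinary case is unchanged: prints 2
```

Any caller that already tests for a zero return value handles all four cases
without changes.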


Any arguments?  


Take care for now,
Al