1. Re: Data validation (was Re: Stu--- Just how many times has this changed?)

see  *grins* ..if I didn't provide such interesting sounding subject
lines...
1.  First, no one would read it...and
2.  Then people couldn't have such fun changing them to other things..

Michelle Rogers
----- Original Message ----- 
From: "Patrick Barnes" <mistertrik at hotmail.com>
To: <EUforum at topica.com>
Sent: Tuesday, June 01, 2004 10:16 PM
Subject: RE: Data validation (was Re: Stu--- Just how many times has this changed?)


>
>
> MrTrick
>
>
> > > >From: Derek Parnell <guest at RapidEuphoria.com>
> > > >Subject: Re: Stupid Newbie-sounding question.
> > > > > Okay Rob... What sort of structure do these sequences have to
> > > > > have? I don't know! If we had the ability to use stronger typing,
> > > > > then function myFunc(text_picture r1, string r2, flag_list r3)
> > > > > would make a lot more sense. I could go and look at the type
> > > > > declaration, and figure out what to pass to the function.
> > > > > People don't use the types like this at the moment because it's so
> > > > > damn computationally expensive for sequences.
> > > >
> > > >This is true. And there is the trade off - speed & flexibility
> > > >against structure and complexity. Think of it in terms of a bicycle
> > > >without training wheels and one with training wheels ;-)
> > > >
> > > >RDS has chosen speed. Meaning that the coder has more responsibility
> > > >to provide quality assurance than the translator does. Yes this does
> > > >mean more work for the coder and better discipline.
> > >
> > > Well, if you're writing a library that other people would use, I would
> > > say that it's the library programmer's responsibility to check the data
> > > passed to it.
> >
> >This is a good issue to discuss. I must deal with this problem with the
> >Win32lib library. The question for me is, how *much* parameter validation
> >should the library perform?
> >
> >On one extreme, it could be said that I should not do any parameter
> >validation because it is the *user's* responsibility to provide
> >parameter values as documented. If they do not heed the documentation,
> >why is it considered to be my problem? This approach would speed up the
> >execution of win32lib applications, and slow down the development of
> >quality applications.
> >
> >On the other extreme, I should do everything in my power to protect
> >the user from using incorrect parameter values. As a service to the
> >coder (and end-user?) I should try to provide meaningful exception
> >messages and/or codes when I detect unusable parameter values being
> >passed.  This approach would speed up the development of
> >quality applications and slow down the execution of win32lib
> >applications.
> >
> >And there are many shades in between these two extremes. Maybe I should
> >provide two versions of the library - one with training wheels and one
> >without?
> >
> >At this stage I don't have the answer. The current library does some
> >checking but could do more.
> >
> > > I think that types could be implemented in a way that would run
> > > *faster* than without. Without these types, libraries and functions
> > > need to check that the data passed to them is valid. This results in
> > > redundant checks. Having types supplemented with the "of" command
> > > speeds it up many-fold over the old type system, because only data
> > > that has changed will be checked. And because if a variable is of a
> > > certain type, it is *known* already that it is valid, so it won't need
> > > to be checked multiple times.
> >
> >What you are saying is true, but it comes at a cost to RDS. It introduces
> >more potential bugs in the interpreter and translator. It also will cause
> >Euphoria apps to run slower than if they had no type checking.
>
>
> >BTW, how would the user defined type routine get to know which parts of
> >the variable have been modified? Currently, the entire object is passed
> >to the routine, but there is no indication which parts were changed.
>
> This is what forms the basis of this suggestion...
> Say you have this:
> type positive_int(integer t)
>     return (t >= 0)
> end type
> type index( sequence of positive_int x)
>    return length(x) < 10
> end type
>
> If an element or groups of elements change, the base type (index) is not
> checked, but each element that changed has the "of" type (positive_int)
> checked against it. If the aggregate properties of the index change (ie
> length), then the base type is checked, then the elements that changed.
> The base type should only check things that affect the entire sequence,
> like length.
>
> Example:
> index x
> x = {6}           -- checks base, and first element
> x &= 4            -- checks base (because length has changed) and new
>                   -- element, but not first
> x[1] = 5          -- checks element 1, but not base (length has not
>                   -- changed)
> x[1..2] = {4,1}   -- checks elements 1 and 2
> x &= {10, 0}      -- checks base, and the new elements, but not existing
> x = {0}           -- x is completely reassigned. If an existing index is
>                   -- assigned, no checking is done; otherwise check
>                   -- everything
> x[1] = -1         -- error here (element 1 is checked, and fails)
>
> That sounds a little convoluted, but to check the base type is very
> simple, and the interpreter could optimise the base type right out if all
> it does is return 1.
> Also, if you append a slice of an index to an existing index, then the
> types are the same - you don't need to recheck all those elements, just
> the literals and originally non-indexes.
>
> > > It's still a valid argument (slicing and seq errors). It's very easy
> > > to misplace a subscript, or make some error if you are breaking up and
> > > reassembling a sequence in one line.
> >
> >Yep. Any ideas how to make this less error prone?
>
> See rest of thread :o)
> The checking would not protect against mis-slicing (ie 2..n instead of
> 2..n-1), but it would protect the structure of your data, and you wouldn't
> get mysterious "Attempt to subscript atom" messages 100 lines further on.
> And, as shown above, reassembling these types would not cause much
> performance penalty. For example, to right(left?) shift a sequence, you
> just write myvar = myvar[2..length(myvar)] & {myvar[1]} ----(or append,
> or something). Because all of the elements are already known to be a
> certain type, and are being assigned to the same level as they were
> before, they don't need to be checked. Only the base type does.
>
> > > >Ummmm? Why not code so that it works?
> > > >
> > > >  type myStruct (object s)
> > > >      if sequence(s) then return 1
> > > >      else return 0
> > > >      end if
> > > >  end type
> > >
> > > Because it is non-intuitive.
> >
> >It is for me. It is saying that if 's' is a sequence then it's okay,
> >otherwise it's not okay.
>
> Well if it's not ok, it should shortcut and return 0 before processing the
> type body rather than crashing. This is more of an implementation thing,
> in that it would affect processing for the "of" system. It should be
> apparent from reading the above.
>
> > > Derek, if someone passes a badly-formed sequence into one of the
> > > Win32lib functions, the error is stated to be inside that function,
> > > when in fact it was the fault of the other programmer. The trace
> > > window or ex.err may show the value of the sequence (or maybe only the
> > > first part), but it may not be easy to elucidate that a) the error
> > > resulted from their own mistake. b) Exactly what was wrong with that
> > > sequence they passed, anyway.
> >
> >That's what documentation is useful for ;-)
>
> True, but sometimes it's there, sometimes it's not. :o)
>
>
>
>
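
For comparison, here is a minimal sketch of how the two types from the quoted example might be written in Euphoria as it stands, without the proposed "of" syntax (which is only a suggestion in this thread, not an existing feature). It assumes type checking is left on. As noted above, the whole object is passed to the type routine, so every check of x walks every element again, even when only one changed. The loop and the early returns are my own illustration, not code from anyone in the thread.

```euphoria
-- positive_int: a non-negative integer.
-- The parameter is declared as object so that a non-integer simply
-- fails the type instead of crashing before the body runs.
type positive_int(object t)
    if integer(t) then
        return t >= 0
    end if
    return 0
end type

-- index: a sequence of fewer than 10 positive_ints.
-- Without "of", the per-element checks have to live in the base type.
type index(object x)
    if not sequence(x) then
        return 0
    end if
    if length(x) >= 10 then
        return 0
    end if
    for i = 1 to length(x) do
        if not positive_int(x[i]) then
            return 0
        end if
    end for
    return 1
end type

index x
x = {6}     -- whole sequence checked: 1 element
x &= 4      -- whole sequence checked again: 2 elements
x[1] = 5    -- whole sequence checked again, though only x[1] changed
```

Under the "of" proposal, the last line would only need to test the new value of x[1] against positive_int, and the length test would be skipped because the length has not changed.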
