1. Re: Data validation (was Re: Stu--- Just how many times has this changed?)
- Posted by "Michelle Rogers" <michellerogers at bellsouth.net> Jun 02, 2004
- 531 views
see *grins* ..if I didn't provide such interesting sounding subject lines...

1. First, no one would read it...and
2. Then people couldn't have such fun changing them to other things..

Michelle Rogers

----- Original Message -----
From: "Patrick Barnes" <mistertrik at hotmail.com>
To: <EUforum at topica.com>
Sent: Tuesday, June 01, 2004 10:16 PM
Subject: RE: Data validation (was Re: Stu--- Just how many times has this changed?)

> MrTrick
>
> >From: Derek Parnell <guest at RapidEuphoria.com>
> >Subject: Re: Stupid Newbie-sounding question.
> >
> > > > > Okay Rob... What sort of structure do these sequences have to
> > > > > have? I don't know! If we had the ability to use stronger
> > > > > typing, then function
> > > > > myFunc(text_picture r1, string r2, flag_list r3) would make a
> > > > > lot more sense. I could go and look at the type declaration,
> > > > > and figure out what to pass to the function.
> > > > > People don't use the types like this at the moment because it's
> > > > > so damn computationally expensive for sequences.
> > > >
> > > >This is true. And there is the trade off - speed & flexibility
> > > >against structure and complexity. Think of it in terms of a bicycle
> > > >without training wheels and one with training wheels.
> > > >
> > > >RDS has chosen speed. Meaning that the coder has more responsibility
> > > >to provide quality assurance than the translator does. Yes this does
> > > >mean more work for the coder and better discipline.
> > >
> > > Well, if you're writing a library that other people would use, I
> > > would say that it's the library programmer's responsibility to check
> > > the data passed to it.
> >
> >This is a good issue to discuss. I must deal with this problem with the
> >Win32lib library. The question for me is, how *much* parameter
> >validation should the library perform?
> >
> >On one extreme, it could be said that I should not do any parameter
> >validation because it is the *user's* responsibility to provide
> >parameter values as documented. If they do not heed the documentation,
> >why is it considered to be my problem? This approach would speed up the
> >execution of win32lib applications, and slow down the development of
> >quality applications.
> >
> >On the other extreme, I should do everything in my power to protect
> >the user from using incorrect parameter values. As a service to the
> >coder (and end-user?) I should try to provide meaningful exception
> >messages and/or codes when I detect unusable parameter values being
> >passed. This approach would speed up the development of
> >quality applications and slow down the execution of win32lib
> >applications.
> >
> >And there are many shades in between these two extremes. Maybe I should
> >provide two versions of the library - one with training wheels and one
> >without?
> >
> >At this stage I don't have the answer. The current library does some
> >checking but could do more.
> >
> > > I think that types could be implemented in a way that would run
> > > *faster* than without. Without these types, libraries and functions
> > > need to check that the data passed to it is valid. This results in
> > > redundant checks.
> > > Having types supplemented with the "of" command speeds it up
> > > many-fold over the old type system, because data will only be
> > > checked that has changed. And because that if a variable is of a
> > > certain type, it is *known* already that it is valid, so it won't
> > > need to be checked multiple times.
> >
> >What you are saying is true, but it comes at a cost to RDS. It
> >introduces more potential bugs in the interpreter and translator. It
> >also will cause Euphoria apps to run slower than if they had no type
> >checking.
> >
> >BTW, how would the user defined type routine get to know which parts of
> >the variable have been modified?
> >Currently, the entire object is passed
> >to the routine, but there is no indication which parts were changed.
>
> This is what forms the basis of this suggestion...
> Say you have this:
> type positive_int(integer t)
>     return (t >= 0)
> end type
> type index( sequence of positive_int x)
>     return length(x) < 10
> end type
>
> If an element or groups of elements change, the base type (index) is not
> checked, but each element that changed has the "of" type (positive_int)
> checked against it. If the aggregate properties of the index change (ie
> length), then the base type is checked, then the elements that changed.
> The base type should only check things that affect the entire sequence,
> like length.
>
> Example:
> index x
> x = {6}          --1 checks base, and first element
> x &= 4           --3 checks base (cause length has changed) and new
>                  --  element, but not first
> x[1] = 5         --4 checks element 1, but not base (length has not
>                  --  changed)
> x[1..2] = {4,1}  --5 checks element 1 and 2
> x &= {10, 0}     --6 checks base, and the new elements, but not existing
> x = {0}          --7 x is completely reassigned. If existing index
>                  --  assigned, no checking is done, otherwise check
>                  --  everything
> x[1] = -1        --error here (element 1 is checked, and fails.)
>
> That sounds a little convoluted, but to check the base type is very
> simple, and the interpreter could optimise the base type right out if
> all it does is return 1.
> Also, if you append a slice of an index to an existing index, then the
> types are the same - you don't need to recheck all those elements, just
> the literals and originally non-indexes.
>
> > > It's still a valid argument (slicing and seq errors). It's very easy
> > > to misplace a subscript, or make some error if you are breaking up
> > > and reassembling a sequence in one line.
> >
> >Yep. Any ideas how to make this less error prone?
>
> See rest of thread :o)
> The checking would not protect against mis-slicing (ie 2..n instead of
> 2..n-1), but it would protect the structure of your data, and you
> wouldn't get mysterious "Attempt to subscript atom" messages 100 lines
> further on.
> And, as shown above, reassembling these types would not cause much
> performance penalty. For example, to right(left?) shift a sequence, you
> just write myvar = myvar[2..length(myvar)] & {myvar[1]} ----(or append,
> or something). Because all of the elements are already known to be a
> certain type, and are being assigned to the same level as they were
> before, they don't need to be checked. Only the base type does.
>
> > > >Ummmm? Why not code so that it works?
> > > >
> > > > type myStruct (object s)
> > > >   if sequence(s) then return 1
> > > >   else return 0
> > > >   end if
> > > > end type
> > >
> > > Because it is non-intuitive.
> >
> >It is for me. It is saying that if 's' is a sequence then it's okay
> >otherwise it's not okay.
>
> Well if it's not ok, it should shortcut and return 0 before processing
> the type body rather than crashing. This is more of an implementation
> thing, in that it would affect processing for the "of" system. It should
> be apparent from reading the above.
>
> > > Derek, if someone passes a badly-formed sequence into one of the
> > > Win32lib functions, the error is stated to be inside that function,
> > > when in fact it was the fault of the other programmer. The trace
> > > window or ex.err may show the value of the sequence (or maybe only
> > > the first part), but it may not be easy to elucidate that a) the
> > > error resulted from their own mistake. b) Exactly what was wrong
> > > with that sequence they passed, anyway.
> >
> >That's what documentation is useful for
>
> True, but sometimes it's there, sometimes it's not. :o)
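
For comparison with the proposed "of" syntax quoted above, here is a minimal sketch of how the same index constraint might be written using only ordinary user-defined types, with no "of" keyword. It is an illustration and not code from anyone in the thread. The parameters are declared as object, which is an assumption made here so that malformed data returns 0 from the type instead of failing on the parameter's own type. The whole sequence is rescanned every time the type is checked, which is the cost Patrick describes as "computationally expensive for sequences".

```euphoria
-- Sketch only: the 'index' constraint without the proposed "of" syntax.
-- Declaring the parameters as 'object' is an assumption made for this
-- example, so that bad data yields 0 rather than a failure on the
-- parameter type itself.

type positive_int(object t)
    if integer(t) then
        return t >= 0
    end if
    return 0
end type

type index(object x)
    if not sequence(x) then
        return 0
    end if
    -- "base" property: affects the sequence as a whole
    if length(x) >= 10 then
        return 0
    end if
    -- element property: every element is rechecked on every type check,
    -- whether or not it changed
    for i = 1 to length(x) do
        if not positive_int(x[i]) then
            return 0
        end if
    end for
    return 1
end type

index x
x = {6}        -- full scan
x &= 4         -- full scan again, including the unchanged first element
x[1] = 5       -- full scan again, although only element 1 changed
-- x[1] = -1   -- would fail the type check if uncommented
```

Under the proposal in the quoted message, the loop would drop out of index altogether, and only the elements that actually changed would be handed to positive_int.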