1. find/match not working

I have a list of urls, contained in urllist like:
urllist[1], urllist[2]..urllist[x]

and i tried eliminating duplicates before they are added, by using find, which 
didn't work. Now i am using 

found = 0
for loop = 1 to length(urllist) do
  if match(junk,urllist[loop]) then
     found = 1
     exit
   end if
 end for
 if not found then
   urllist = urllist & {junk}
  end if

And darned if i still don't get duplicates. I do trim off leading and trailing 
spaces from junk. Why is this happening?

Kat


2. Re: find/match not working

Hi Kat

Maybe there are some "enters" (line-ending characters) at the end of the sequences...
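
One way to test this idea is to strip any trailing line-ending characters before comparing. This is only a sketch: `strip_line_end` is a hypothetical helper, not part of the thread or of the Euphoria library, and the URL is a made-up value.

```euphoria
-- Hypothetical helper: drop trailing carriage returns (13) and line feeds (10).
function strip_line_end(sequence s)
    -- "and" short-circuits in a while condition, so s[length(s)] is
    -- never evaluated on an empty sequence.
    while length(s) > 0 and find(s[length(s)], {10, 13}) do
        s = s[1..length(s)-1]
    end while
    return s
end function

sequence junk
junk = strip_line_end("http://example.com" & {13, 10})
? length(junk)    -- 18: the two trailing characters are gone
```

If the lengths before and after stripping differ, the stored URLs and the new one were never identical in the first place.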

Rubens

At 07:50 17/8/2004, you wrote:
>
>
>I have a list of urls, contained in urllist like:
>urllist[1], urllist[2]..urllist[x]
>
>and i tried eliminating duplicates before they are added, by using find, 
>which
>didn't work. Now i am using
>
>found = 0
>for loop = 1 to length(urllist) do
>   if match(junk,urllist[loop]) then
>      found = 1
>      exit
>    end if
>  end for
>  if not found then
>    urllist = urllist & {junk}
>   end if
>
>And darned if i still don't get duplicates. I do trim off leading and 
>trailing
>spaces from junk. Why is this happening?
>
>Kat
>
>
>


3. Re: find/match not working

Kat wrote:
> 
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
> 
> and i tried eliminating duplicates before they are added, by using find, which
>
> didn't work. Now i am using 
> 
> found = 0
> for loop = 1 to length(urllist) do
>   if match(junk,urllist[loop]) then
>      found = 1
>      exit
>    end if
>  end for
>  if not found then
>    urllist = urllist & {junk}
>   end if
> 
> And darned if i still don't get duplicates. I do trim off leading and trailing
>
> spaces from junk. Why is this happening?
> 
> Kat
> 
> 


Hi

I seem to remember having a problem like this too - try taking off the first and
last character of junk and see what happens

eg
 found = 0
 for loop = 1 to length(urllist) do
   if match(junk[2..length(junk)-1],urllist[loop]) then
      found = 1
      exit
    end if
  end for
  if not found then
    urllist = urllist & {junk}
   end if


assuming that hunk is a full url, or the match isn't THAT critical

Chris


4. Re: find/match not working

Chris Burch wrote:
--snip--

> assuming that hunk is a full url, or the match isn't THAT critical
> 
> Chris
> 

the hunk should be junk!


5. Re: find/match not working

Kat wrote:
> 
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
> 
> and i tried eliminating duplicates before they are added, by using find, which
>
> didn't work. Now i am using 
> 
> found = 0
> for loop = 1 to length(urllist) do
>   if match(junk,urllist[loop]) then
>      found = 1
>      exit
>    end if
>  end for
>  if not found then
>    urllist = urllist & {junk}
>   end if
> 
> And darned if i still don't get duplicates. I do trim off leading and trailing
>
> spaces from junk. Why is this happening?
> 
> Kat
> 
> 
integer a
 sequence newlist,found
  newlist={}
  found={}
 for loop=1 to length(urllist) do
   a=find(urllist[loop],found)
   if a=0 then
     newlist=append(newlist,urllist[loop])
     found=append(found,urllist[loop])
   end if
 end for

newlist should have no duplicates.

don cole
SF


6. Re: find/match not working

I re-thought this.
integer a
 sequence newlist,found
  newlist={}
  found={}
 for loop=1 to length(urllist) do
   a=find(urllist[loop],found)
   if a=0 then
      found=append(found,urllist[loop])  
    -- newlist=append(newlist,urllist[loop])--you don't really need newlist
                                            --unless you are going to alter
   end if                                   --urllist[loop] before you add 
 end for                                    --it to your sequence.
 urllist=found
--now your list is free of duplicates--

for loop=1 to length(urllist) do
   if find(junk,urllist[loop]) then
     exit
   end if
end for
  urllist=urllist & {junk}


don cole
SF


7. Re: find/match not working

On 17 Aug 2004, at 8:17, don cole wrote:

> 
> 
> posted by: don cole <doncole at pacbell.net>
> 
> I re-thought this.
>  integer a
>  sequence newlist,found
>   newlist={}
>   found={}
>  for loop=1 to length(urllist) do
>    a=find(urllist[loop],found))
>    if a=0 then
>       found=append(found,urllist[loop])
>     -- newlist=append(newlist,urllist[loop])--you don't really need newlist
>                                         --unless you are going to alter
>    end if                                   --urllist[loop] before you add
>  end for                                    --it to your sequence.
>  urllist=found
> --now your list is free of duplicates--

The list started out with nothing in it, so i am somewhat sure it had no 
duplicates. The following code you wrote is exactly what i had before i went 
to using match() instead.

> for loop=1 to length(urllist) do
>    if find(junk,urllist[loop]) then
>      exit
>    end if
> end for
>   urllist=urlist & {junk}

> 
> don cole
> SF

Kat


8. Re: find/match not working

Kat wrote:
> 
> On 17 Aug 2004, at 8:17, don cole wrote:
> 
> > 
> > posted by: don cole <doncole at pacbell.net>
> > 
> > I re-thought this.
> >  integer a
> >  sequence newlist,found
> >   newlist={}
> >   found={}
> >  for loop=1 to length(urllist) do
> >    a=find(urllist[loop],found))
> >    if a=0 then
> >       found=append(found,urllist[loop])
> >     -- newlist=append(newlist,urllist[loop])--you don't really need newlist
> >                                         --unless you are going to alter
> >    end if                                   --urllist[loop] before you add
> >  end for                                    --it to your sequence.
> >  urllist=found
> > --now your list is free of duplicates--
> 
> The list started out with nothing in it, so i am somewhat sure it had no
> duplicates. The following code you wrote is exactly what i had before i went
> to using match() instead.
> 
> > for loop=1 to length(urllist) do
> >    if find(junk,urllist[loop]) then
> >      exit
> >    end if
> > end for
> >   urllist=urlist & {junk}
> > 
> > 
> > don cole
> > SF
> 
> Kat
> 
> 
Ok try
global function my_find(sequence a,sequence b)
  integer an
   a=trim(lower(a))
    b=trim(lower(b))
    an=find(a,b)
    return an
end function

global function my_match(sequence a,sequence b)
   integer an    -- matches 2 sequences of any length
    a=trim(lower(a))
    b=trim(lower(b))
    an=match(a,b)
   return an
end function

don cole
SF


9. Re: find/match not working

On 17 Aug 2004, at 9:42, don cole wrote:

> posted by: don cole <doncole at pacbell.net>
> 
> Kat wrote:
> > 
> > On 17 Aug 2004, at 8:17, don cole wrote:
> > 
> > > posted by: don cole <doncole at pacbell.net>
> > > 
> > > I re-thought this.
> > >  integer a
> > >  sequence newlist,found
> > >   newlist={}
> > >   found={}
> > >  for loop=1 to length(urllist) do
> > >    a=find(urllist[loop],found))
> > >    if a=0 then
> > >       found=append(found,urllist[loop])
> > >     -- newlist=append(newlist,urllist[loop])--you don't really need newlist
> > >                                         --unless you are going to alter
> > >    end if                                   --urllist[loop] before you add
> > >  end for                                    --it to your sequence.
> > >  urllist=found
> > > --now your list is free of duplicates--
> > 
> > The list started out with nothing in it, so i am somewhat sure it had no
> > duplicates. The following code you wrote is exactly what i had before i went
> > to using match() instead.
> > 
> > > for loop=1 to length(urllist) do
> > >    if find(junk,urllist[loop]) then
> > >      exit
> > >    end if
> > > end for
> > >   urllist=urlist & {junk}
> > > 
> > > don cole
> > > SF
> > 
> > Kat
> > 
> Ok try
> global function my_find(sequence a,sequence b)
>   integer an
>    a=trim(lower(a))
>     b=trim(lower(b))
>     an=find(a,b)
>     return an
> end function
> 
> global function my_match(sequence a,sequence b)
>    integer an    --maches 2 squences of anny length
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=match(a,b)
>    return an
> end function

The variables are already trimmed, and a printout to a file shows me they are 
the same case.

The code runs with match() and i am not going to take it down to try new 
find() code. I have 5280 urls to mine by tonite. Due to the telco here cutting 
me offline this morning to verify i can get online, and other assorted 
madness, i hope i make it. 

Kat

> don cole
> SF


10. Re: find/match not working

On Tue, 17 Aug 2004 16:30:00 +0000, Pete E <euphoria at eberlein.org>
wrote:

>Perhaps you want:
>
>  found = find (junk, urllist)
>  if not found then
>    urllist = urllist & {junk}
>  end if
>
>
Kat, perhaps you could trap your program at some point when it has
created a duplicate, and create an ex.err or print() the values so
this (the above code misbehaving) can be reproduced?


11. Re: find/match not working

Kat wrote:
> 
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
> 
> and i tried eliminating duplicates before they are added, by using find, which
>
> didn't work. Now i am using 
> 

> 
> Kat
> 
> 
Compile your complete list, duplicates and all.
Don't worry if they are added.
Then eliminate all duplicates with:
function clean_dups(sequence list)
  integer a
  sequence found
  found = {}   -- must be initialised before find() can search it
 for x=1 to length(list) do
  a=find(list[x],found)
  if a=0 then
     found=append(found,list[x])
  end if
 end for
 return found
end function

It might take longer, but it will work.
don cole 
SF


12. Re: find/match not working

Kat wrote:

> found = 0
> for loop = 1 to length(urllist) do
>   if match(junk,urllist[loop]) then
>      found = 1
>      exit
>    end if
>  end for
>  if not found then
>    urllist = urllist & {junk}
>   end if

When doing this sort of thing, I usually do this ...

 urltemp = upper(urllist)
 junk = trim(junk)
 found = find(upper(junk), urltemp)
 if not found then
    urllist = append(urllist, junk)
 end if

In other words, trim both data and do a case-insensitive search.

The find() function looks for a single element that exactly
matches the subject argument.

The match() function looks for a set of adjacent elements that
exactly matches all the elements in the subject argument.

To use match() in the way you have above might work better as ...

   if match(junk,urllist[loop]) = 1  and 
      length(junk) = length(urllist[loop]) then

in other words, ensure that if you get a match it starts
with the first character and is the same length. But that's
now equivalent to equal()! So it might be better to do ...

   if equal(junk,urllist[loop]) then

but if you are doing that, you may as well use find()
as it will work faster.
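
The distinction described above can be sketched with the built-in find(), match() and equal() alone. The URLs are made-up values for illustration, not data from this thread.

```euphoria
-- Hypothetical sample data for illustration only.
sequence urllist, junk
urllist = {"http://example.com/page", "http://example.org"}
junk = "http://example.com"

-- find() against the whole list compares junk with each complete URL.
? find(junk, urllist)        -- 0: no stored URL is exactly junk

-- find() against one URL compares junk with each single character,
-- so a string is never found this way.
? find(junk, urllist[1])     -- 0

-- match() reports a hit when junk is any slice of the URL, including a prefix.
? match(junk, urllist[1])    -- 1

-- equal() is the exact comparison.
? equal(junk, urllist[1])    -- 0
```

With match() inside the loop, a shorter URL that happens to be a slice of one already stored would be treated as a duplicate and skipped.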

-- 
Derek Parnell
Melbourne, Australia


13. Re: find/match not working

I am so sorry i brought up this thread too. Like i said days ago, i printed all 
the additions to a file, and printed the list they were supposedly not found in.
The additions list contained the same items over and over again,same case, 
same length, same everything,,, identical lines, more than once. And they 
got added to the list that find() didn't find them in.

Kat

On 19 Aug 2004, at 10:46, Bernard Ryan wrote:

> 
> 
> posted by: Bernard Ryan <xotron at bluefrog.com>
> 
> Kat:
> 
> include wildcard.e
> --
> function add2list( sequence new_url, sequence url_list )
> -- Euphoria passes arguments by value, so the updated list is returned
>   for loop = 1 to length(url_list) do
>     if equal(lower(new_url),lower(url_list[loop])) then return url_list end if
>   end for
>   return append(url_list, new_url)
> --
> end function

> 
> Bernie
> 
> My files in archive:
> http://www.rapideuphoria.com/w32engin.zip
> http://www.rapideuphoria.com/mixedlib.zip
> http://www.rapideuphoria.com/eu_engin.zip
> http://www.rapideuphoria.com/win32eru.zip
> 
> 
> 
>


14. Re: find/match not working

On 18 Aug 2004, at 18:05, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> Kat wrote:
> 
> > found = 0
> > for loop = 1 to length(urllist) do
> >   if match(junk,urllist[loop]) then
> >      found = 1
> >      exit
> >    end if
> >  end for
> >  if not found then
> >    urllist = urllist & {junk}
> >   end if
> 
> When doing this sort of thing, I usually do this ...
> 
>  urltemp = upper(urllist)
>  junk = trim(junk)
>  found = find(upper(junk), urltemp)
>  if not found then
>     urllist = append(urllist, junk)
>  end if
> 
> In other words, trim both data and do an case-insensitive search.
> 
> The find() function looks for a single element that exactly
> matches the subject argument.
> 
> The match() function looks for a set of adjacent elements that
> exactly matches all the elements in the subject argument.

According to the files i printed out, the items were the same. But find() didn't
find them. 

Kat

> To use match() in the way you have above might work better as ...
> 
>    if match(junk,urllist[loop]) = 1  and 
>       length(junk) = length(urllist[loop]) then
> 
> in other words, ensure that if you get a match it starts
> with the first character and is the same length. But that's
> now equivalent to equal()! So it might be better to do ...
> 
>    if equal(junk,urllist[loop]) then
> 
> but if you are doing that, you may as well use find()
> as it will work faster.
> 
> -- 
> Derek Parnell
> Melbourne, Australia
> 
> 
> 
>


15. Re: find/match not working

Kat wrote:

> According to the files i printed out, the items were the same.
> But find() didn't find them. 

That's like one of my users telling me my software doesn't work,
yet it's working on hundreds of other PCs. I'm incredulous at that point,
and usually, with a bit of investigation, the situation isn't exactly
as they claimed, it turns out to be a UE, and another case of UDRTFM.

I've been in the same situation: claiming that the software was
screwing up only to discover it was I who was doing the screwing.

Wait, that didn't come out right. :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


16. Re: find/match not working

cklester wrote:
> 
> Kat wrote:
> 
> > According to the files i printed out, the items were the same.
> > But find() didn't find them. 
> 
> That's like one of my users telling me my software doesn't work,
> yet it's working on hundreds of other PCs. I'm incredulous at that point,
> and usually, with a bit of investigation, the situation isn't exactly
> as they claimed, it turns out to be a UE, and another case of UDRTFM.

I had a client call once: "Your software quit working!"

Me: "Did it work yesterday?"

Client: "Yes"

Me: "Did you do anything on your computer since yesterday?"

Client: "No. Just cleaned out a bunch of useless files."

Me: "What were their names?"

Client: "I don't know!  Just a bunch of .dat and .exe stuff I never use.
You need to come fix this!"

Me: "I'll be right over. Get out your checkbook."

Next time they called, I was "out of the country" :)

Irv


17. Re: find/match not working

Kat wrote:

> According to the files i printed out, the items were the same. But find()
> didn't
> find them. 

Perhaps there were some control codes which don't get printed.
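
A small sketch of how that could be checked, assuming a stray carriage return is the culprit (an assumption, nothing in the thread confirms it). Printing lengths and raw character codes shows what a text printout hides.

```euphoria
-- Hypothetical values: b carries a trailing carriage return (13).
sequence a, b
a = "http://example.com"
b = a & 13

puts(1, a & '\n')    -- written as text, the two lines can look the same
puts(1, b & '\n')
? equal(a, b)        -- 0: the strings differ
? length(a)          -- 18
? length(b)          -- 19
? b[length(b)]       -- 13: the hidden character code
```

Using `? b` or print() on a whole URL lists every character code, so anything unprintable shows up as a number.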

Irv


18. Re: find/match not working

irv mullins wrote:
> cklester wrote:
> > That's like one of my users telling me my software doesn't work,
> > yet it's working on hundreds of other PCs. I'm incredulous at that point,
> > and usually, with a bit of investigation, the situation isn't exactly
> > as they claimed, it turns out to be a UE, and another case of UDRTFM.
> 
> I had a client call once: "Your software quit working!"
> Me: "Did it work yesterday?"
> Client: "Yes"
> Me: "Did you do anything on your computer since yesterday?"
> Client: "No. Just cleaned out a bunch of useless files."

That's exactly how a lot of my conversations go, except they say...

Client: "No. Nothing's changed at all whatsoever."

At this point, I have to tell them:

Me: "Software doesn't just stop working for no reason."

We later find that either 1) their application was upgraded, which
requires an update to our software as well, or 2) they upgraded their
version of Windows, which sometimes requires an upgrade of our software.

So much for "Nothing's changed at all whatsoever."

> Me: "I'll be right over. Get out your checkbook."

I'm memorizing that one! :D

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


19. Re: find/match not working

Kat wrote:

> According to the files i printed out, the items were the same. But find()
> didn't
> find them. 
> 

Is it possible for you to show us your source code? Not the 'sample'
code shown before, but the actual code you are really using.

So far, it sounds like everyone else but you can do this, so I'm 
guessing you have a mistake in your code somewhere. The only way
to prove otherwise is to expose your code to peer review.

Have you actually tried *any* of the code given to you by others? If
so, what were the results of that?

-- 
Derek Parnell
Melbourne, Australia


20. Re: find/match not working

On 19 Aug 2004, at 15:04, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> Kat wrote:
> 
> > According to the files i printed out, the items were the same. But find()
> > didn't find them. 
> > 
> 
> Is it possible for you to show us your source code? Not the 'sample'
> code shown before, but the actual code you are really using.

I did, i copy/pasted the code to this listserv days ago.

> So far, it sounds like everyone else but you can do this, so I'm 
> guessing you have a mistake in your code somewhere. The only way
> to prove otherwise is to expose your code to peer review.

Which is why i posted it here.

> Have you actually tried *any* of the code given to you by others? If
> so, what was the results of that?

Like i said, no. The code is running as is *now*, looping thru the list and 
using match on each item in the list. I don't want to interrupt it and waste 
internet bandwidth on a test, in case someone is waiting on the results of the 
code run.

Sorry i sound so unreasonable, but i gave you the code i was using that 
didn't run, said i do not feel i can stop the code for tests, and it's running 
with match(). In the files i printed out of the repeats, Textpad's search 
function found the items find() didn't.

Kat


21. Re: find/match not working

At 02:28 PM 8/19/04 -0500, you wrote:
>
>
>I am so sorry i brought up this thread too. Like i said days ago, i 
>printed all
>the additions to a file, and printed the list they were supposedly not 
>found in.
>The additions list contained the same items over and over again,same case,
>same length, same everything,,, identical lines, more than once. And they
>got added to the list that find() didn't find them in.
>
>Kat

<snip>

Kat,

         Have you looked at the data with a hex editor?  I'm sure that 
you're aware that large datasets often have unexpected "junk" in 
them.  Maybe a fresh pair of eyes would help?  I'd be more than happy to 
look at it, if you want to send me a sample that screws up.  Maybe 1 Megs 
worth.  Send it privately.

                 Bob


22. Re: find/match not working

Kat wrote:
> 
> On 19 Aug 2004, at 15:04, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > Kat wrote:
> > 
> > > According to the files i printed out, the items were the same. But find()
> > > didn't find them. 
> > > 
> > 
> > Is it possible for you to show us your source code? Not the 'sample'
> > code shown before, but the actual code you are really using.
> 
> I did, i copy/pasted the code to this listserv days ago.


The code you posted was this...


found = 0
for loop = 1 to length(urllist) do
  if match(junk,urllist[loop]) then
     found = 1
     exit
   end if
 end for
 if not found then
   urllist = urllist & {junk}
  end if
--------------

Try this instead...

-------------
if not find(junk, urllist) then 
   urllist = urllist & {junk}
end if
------------


-- 
Derek Parnell
Melbourne, Australia


23. Re: find/match not working

Kat

> On 17 Aug 2004, at 9:42, don cole wrote:
> Ok try
> global function my_find(sequence a,sequence b)
>   integer an
>    a=trim(lower(a))
>     b=trim(lower(b))
>     an=find(a,b)
>     return an
> end function
> 
> global function my_match(sequence a,sequence b)
>    integer an    --maches 2 squences of anny length
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=match(a,b)
>    return an
> end function

> Kat wrote:
> The code runs with match() and i am not going to take it down to try new
> find() code. I have 5280 urls to mine by tonite. Due to the telco here cutting
> me offline this morning to verify i can get online, and other assorted
> madness, i hope i make it.
>
> Kat

> On 19 Aug 2004, at 15:04, Derek Parnell wrote:
> 
> > Have you actually tried *any* of the code given to you by others? If
> > so, what was the results of that?
> 
> Kat wrote:
> Like i said, no. The code is running as is *now*, looping thru the list and 
> using match on each item in the list. I don't want to interrupt it and waste 
> internet bandwidth on a test, in case someone is waiting on the results of the
>
> code run.

I'm fairly new to this board and might be missing something here,
but I don't see how you expect to resolve this issue if you are
unwilling to CHANGE YOUR CODE.

don cole
SF


24. Re: find/match not working

On 20 Aug 2004, at 2:14, don cole wrote:

> > Kat wrote:
> > Like i said, no. The code is running as is *now*, looping thru the list and
> > using match on each item in the list. I don't want to interrupt it and waste
> > internet bandwidth on a test, in case someone is waiting on the results of
> > the
> > code run.
> 
> I'm a fairly newbbie to this board and might be missing something here,
> but I don't see how do you expect to resolve this issue if you are
> unwilling to CHANGE YOUR CODE.

I DID bloody change the code, i am not using find() anymore, i am using 
match(). I have said that repeatedly, what's the problem?

Kat


25. Re: find/match not working

Kat wrote:
> 
> On 20 Aug 2004, at 2:14, don cole wrote:
> 
> > > Kat wrote:
> > > Like i said, no. The code is running as is *now*, looping thru the list
> > > and
> > > using match on each item in the list. I don't want to interrupt it and
> > > waste
> > > internet bandwidth on a test, in case someone is waiting on the results of
> > > the
> > > code run.
> > 
> > I'm a fairly newbbie to this board and might be missing something here,
> > but I don't see how do you expect to resolve this issue if you are
> > unwilling to CHANGE YOUR CODE.
> 
> I DID bloody change the code, i am not using find() anymore, i am using 
> match(). I have said that repeatedly, what's the problem?

Problem: Kat says find() doesn't work. Gives example using match(),
which also is faulty.

Solution: Multiple examples in which find() is shown to work.

Result: Kat refuses to try find() again. That is Kat refuses to change her
code (yet again) to use find() as per any of the solutions.

Consequence: Confusion as to whether or not Kat really wants help.

Kat, if you don't want help with find(), why bring it up? If you do want
help with find(), why don't you accept it?

-- 
Derek Parnell
Melbourne, Australia


26. Re: find/match not working

On 20 Aug 2004, at 16:10, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> Kat wrote:
> > 
> > On 20 Aug 2004, at 2:14, don cole wrote:
> > 
> > > > Kat wrote:
> > > > Like i said, no. The code is running as is *now*, looping thru the list
> > > > and using match on each item in the list. I don't want to interrupt it
> > > > and
> > > > waste internet bandwidth on a test, in case someone is waiting on the
> > > > results of the code run.
> > > 
> > > I'm a fairly newbbie to this board and might be missing something here,
> > > but
> > > I don't see how do you expect to resolve this issue if you are unwilling
> > > to
> > > CHANGE YOUR CODE.
> > 
> > I DID bloody change the code, i am not using find() anymore, i am using 
> > match(). I have said that repeatedly, what's the problem?
> 
> Problem: Kat says find() doesn't work. Gives example using match(),
> which also is faulty.

How is that faulty?? For the strings i am using, it is working.

> Solution: Multiple examples in which find() is shown to work.
> 
> Result: Kat refuses to try find() again. That is Kat refuses to change her
> code (yet again) to use find() as per any of the solutions.

See below.

> Consequence: Confusion as to whether or not Kat really wants help.
> 
> Kat, if don't want help with find(), why bring it up? If you do want
> help with find(), why don't do accept it?

I had a problem with find(). I reported it. I recoded the apps to use match(). 
Programs are now running fine, as of when i gave up on find() and began 
using match() and reported the problem here. My mistakes were reporting 
the problem here, and not saving the code and data that didn't run; but i was 
in a hurry (i noticed the dupes after 4 hrs of it running, which put me 4 hrs 
behind), and gave up on find() after 30 min or so, and simply over-wrote the 
bad code with something which works. As an aside, the person i was 
counting on to have working demo code for the data i obtained, hasn't written 
a line of code YET. So much for a tight schedule and being reliable. And i 
am pretty sure i won't report any problems here again if i am in a hurry. 
This thread really takes the cake.

I just sent Derek a screen shot of how busy the computer is. At this time, i 
don't have any free cpu clocks, memory, or bandwidth to test anything. 
Unless i shut things down, and become the person who isn't getting things 
done.

Kat


27. Re: find/match not working

On Thu, 19 Aug 2004 14:28:02 -0500, Kat <gertie at visionsix.com> wrote:

>I am so sorry i brought up this thread too. Like i said days ago, i printed all
>the additions to a file, and printed the list they were supposedly not found
>in. The additions list contained the same items over and over again, same case,
>same length, same everything,,, identical lines, more than once. And they
>got added to the list that find() didn't find them in.
>
Saints preserve us.

Kat, if there is a problem in find(), like you *CLAIM*, then surely to
god you would like it resolved?

I am aware that in your personal life, you fervently believe that
everyone just shits on you for fun. But that is not in general the
case on this list is it? That is not the case, and you know that.
Fine, you want to claim *I* hate you or unfairly pick on you, no
problem - if you feel so, I can assure you it is not intentional - but
if you claim that everyone on this list just hates you, you need help.

If we can help, we will, but if you refuse to let us help, we can't.
(in various technical Euphoria-related problems, I mean)

I am aware that is really really patronising and rather shitty of me,
but I just don't know what else to say. It *is* meant to trigger a
reaction, but not meant to be offensive, my apologies for being
rather bad at this sort of thing...

Pete


28. Re: find/match not working

Listfilter blocked my reply to Pete Lomax. I'd prefer my answer be posted for 
equal time on the webpage, and sent in the listserv.

I don't see him going after Chris, who reported he also has to change code to 
make find() work. Neither of us asked for any help. I have said for the entire 
duration of this thread that i was not going to make changes in code that 
was running fine. So think i am paranoid, or delusional, or anything you want, 
but get off my case. This week-long running idiotic thread that i didn't ask
for, but everyone has to get in on, leading up to Lomax's personal flaming at
me, is ample evidence that i don't imagine the attacks.

Kat


29. Re: find/match not working

Kat wrote:
> 
> On 20 Aug 2004, at 16:10, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > Kat wrote:
> > > 
> > > On 20 Aug 2004, at 2:14, don cole wrote:
> > > 
> > > > > Kat wrote:
> > > > > > Like i said, no. The code is running as is *now*, looping thru the
> > > > > > list and using match on each item in the list. I don't want to
> > > > > > interrupt it and waste internet bandwidth on a test, in case someone
> > > > > > is waiting on the results of the code run.
> > > > > 
> > > > > I'm fairly new to this board and might be missing something here,
> > > > > but I don't see how you expect to resolve this issue if you are
> > > > > unwilling to CHANGE YOUR CODE.
> > > 
> > > I DID bloody change the code, i am not using find() anymore, i am using 
> > > match(). I have said that repeatedly, what's the problem?
> > 
> > Problem: Kat says find() doesn't work. Gives example using match(),
> > which also is faulty.
> 
> How is that faulty?? For the strings i am using, it is working.

Because it is possible to exclude some urls which should be added.

eg.   If "www.rds.com.au" is already in the list then "www.rds.com" 
will be excluded, even though it is a different url.
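
To make the difference concrete, here is a minimal sketch. The one-element urllist and the value of junk are made up for illustration and are not taken from Kat's data; it only assumes the standard built-ins match(), find() and equal().

```euphoria
-- Hypothetical list and candidate, for illustration only.
sequence urllist, junk
urllist = {"www.rds.com.au"}
junk = "www.rds.com"

-- match() looks for junk as a slice anywhere inside the longer string,
-- so it reports a hit at position 1 and junk would wrongly be skipped.
? match(junk, urllist[1])   -- prints 1

-- find() compares junk against each whole element of urllist,
-- so it reports no hit and junk would be added.
? find(junk, urllist)       -- prints 0

-- equal() is the whole-string test to use inside an explicit loop.
? equal(junk, urllist[1])   -- prints 0
```

So a loop that tests equal(junk, urllist[loop]) instead of match(junk, urllist[loop]) should behave the same way as a single call to find(junk, urllist).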

-- 
Derek Parnell
Melbourne, Australia


30. Re: find/match not working

On 21 Aug 2004, at 2:07, Pete Lomax wrote:



> On Thu, 19 Aug 2004 14:28:02 -0500, Kat <gertie at visionsix.com> wrote:
> 
> >I am so sorry i brought up this thread too. Like i said days ago, i printed all
> >the additions to a file, and printed the list they were supposedly not found
> >in. The additions list contained the same items over and over again, same case,
> >same length, same everything,,, identical lines, more than once. And they got
> >added to the list that find() didn't find them in.
> >
> Saints preserve us.
> 
> Kat, if there is a problem in find(), like you *CLAIM*, then surely to
> god you would like it resolved?

I don't care anymore. I only reported it, in case someone else was logging
problems and cared. I moved on before i wrote the first email. I didn't ask
for any help.

> I am aware that in your personal life, you fervently believe that
> everyone just shits on you for fun. But that is not in general the
> case on this list is it? That is not the case, and you know that.
> Fine, you want to claim *I* hate you or unfairly pick on you, no
> problem - if you feel so, I can assure you it is not intentional - but
> if you claim that everyone on this list just hates you, you need help.
> 
> If we can help, we will, but if you refuse to let us help, we can't.
> (in various technical Euphoria-related problems, I mean)
> 
> I am aware that is really really patronising and rather shitty of me,
> but I just don't know what else to say. It *is* meant to trigger a
> reaction, but not meant to be offensive, my apologies for being
> rather bad at this sort of thing...

My reaction is: answering this thread has taken too much time from
productive coding. I didn't ask for help. I'm sorry i reported a damned
thing, think as badly of me as you want, no argument from me. I've tried
being nice, saying i am now using match() in that code, and it's working,
and it was working BEFORE i reported the bug (as i CLAIM). But why, after
all the emails, you decided to waste 3 paragraphs on making this personal,
i dunno.

Kat,
getting as pissed as you are now.


31. Re: find/match not working

Well I feel that this has been a productive and enlightening discussion 
that future Euphorians and EUforum users will look back on and achieve 
a thorough understanding of the find() and match() functions.

don cole
SF


32. Re: find/match not working

>
>On 20 Aug 2004, at 16:10, Derek Parnell wrote:
>
> >
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> >
> > Kat wrote:
> > >
> > > On 20 Aug 2004, at 2:14, don cole wrote:
> > >
> > > > > Kat wrote:
> > > > > > Like i said, no. The code is running as is *now*, looping thru the
> > > > > > list and using match on each item in the list. I don't want to
> > > > > > interrupt it and waste internet bandwidth on a test, in case someone
> > > > > > is waiting on the results of the code run.
> > > > >
> > > > > I'm fairly new to this board and might be missing something here,
> > > > > but I don't see how you expect to resolve this issue if you are
> > > > > unwilling to CHANGE YOUR CODE.
> > >
> > > I DID bloody change the code, i am not using find() anymore, i am using
> > > match(). I have said that repeatedly, what's the problem?
> >
> > Problem: Kat says find() doesn't work. Gives example using match(),
> > which also is faulty.
>
>How is that faulty?? For the strings i am using, it is working.
>
> > Solution: Multiple examples in which find() is shown to work.
> >
> > Result: Kat refuses to try find() again. That is Kat refuses to change her
> > code (yet again) to use find() as per any of the solutions.
>
>See below.
>
> > Consequence: Confusion as to whether or not Kat really wants help.
> >
> > Kat, if you don't want help with find(), why bring it up? If you do want
> > help with find(), why don't you accept it?
>
>I had a problem with find(). I reported it. I recoded the apps to use match().
>Programs are now running fine, as of when i gave up on find() and began
>using match() and reported the problem here. My mistakes were reporting
>the problem here, and not saving the code and data that didn't run; but i was
>in a hurry (i noticed the dupes after 4 hrs of it running, which put me 4 hrs
>behind), and gave up on find() after 30 min or so, and simply over-wrote the
>bad code with something which works. As an aside, the person i was
>counting on to have working demo code for the data i obtained, hasn't written
>a line of code YET. So much for a tight schedule and being reliable. And i
>am pretty sure i won't report any problems here again if i am in a hurry.
>This thread really takes the cake.
>
>I just sent Derek a screen shot of how busy the computer is. At this time, i
>don't have any free cpu clocks, memory, or bandwidth to test anything.
>Unless i shut things down, and become the person who isn't getting things
>done.
>
>Kat

-- NODUPES.EX
include sort.e
include misc.e


sequence text  atom t

integer fn, l  sequence fname  object line
--fname = "findmatch not working.txt"
--fname = "DUPES.TXT"
fname = "FINDMACH.TXT" --file of URLs Kat posted
--fname = "randdata.txt"
fn = open(fname, "r")
if fn = -1 then
         printf(1, "Unable to open %s\n", {fname})
         abort(1)  -- non-zero exit status signals the failure
end if

-- with trace
-- trace(1)
text = {}
-- remove new-line
while 1 do
         line = gets(fn)
         if atom(line) then
                 exit
         end if
         l = length(line)
         if equal(line[l], '\n') then
                 line = line[1..l - 1]
         end if
         -- putting all the data into this is, of course, unnecessary.
         -- I just did it to aid development.
         text = append(text, line)
end while
close(fn)  -- finished reading; release the file handle
printf(1, "length(text) = %d\n", {length(text)})

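-- I don't know the significance of "<done>" in the data, so I'm removing it.
for i = 1 to length(text) do
         line = text[i]
         l = length(line) - 7  -- 7 = length(" <done>")
         -- l >= 0 guards lines too short to hold the marker; without it a
         -- 6-character line gives 0 = 0 below and an invalid slice [1..-1]
         if l >= 0 and match(" <done>", line) = l + 1 then
                 text[i] = line[1..l]
         end if
end for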

sequence uniqueUrls  uniqueUrls = {}
printf(1, "collecting uniqueUrls...\n", {})  t = time()
for i = 1 to length(text) do
         if not find(text[i], uniqueUrls) then
                 uniqueUrls = append(uniqueUrls, text[i])
         else
                 printf(1, "DUP FOUND: %s\n", {text[i]})
         end if
end for
t = time() - t  printf(1, "elapsed time is %f seconds.\n", {t})

printf(1, "length(text) = %d\n", {length(text)})
printf(1, "length(uniqueUrls) = %d\n", {length(uniqueUrls)})

uniqueUrls = sort(uniqueUrls)
pretty_print(1, uniqueUrls, {2})  puts(1, "\n\n")



This spits out one duplicate: DUP FOUND: http://www.ed.gov

Also, I noticed such things as:

"http:// www.buydirectory.com"  has an embedded space;
"http://www.hub..terc.edu"         2 periods in a row;
"http://www.iee org.uk"              has an embedded space;
"http://www.ipl,irg/reading/books" has an embedded comma
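
A rough sketch of how entries like these could be flagged before they reach the list. looks_malformed is a hypothetical helper name, not part of NODUPES.EX, and the three tests only cover the cases listed above; it is not a real URL validator.

```euphoria
-- Hypothetical helper: returns 1 if url shows one of the problems
-- listed above (embedded space, embedded comma, two periods in a row),
-- otherwise 0.
function looks_malformed( sequence url )
	return find(' ', url) != 0
	    or find(',', url) != 0
	    or match("..", url) != 0
end function

? looks_malformed("http://www.hub..terc.edu")  -- prints 1
? looks_malformed("http://www.ed.gov")         -- prints 0
```

find() is used for the single characters and match() for the two-character run, since match() expects a sequence as its first argument.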

Sorry for the delay.  I was trying to come up with a solution that wouldn't 
take hours to run.
Please tell me if you want me to continue.


Bob


33. Re: find/match not working

It's of course, very late in the discussion (been out of town), but here
is my contribution to the thread.  I couldn't find Kat's original code
(just her re-code using match), so like everyone else, I can't comment on 
what the problem is with find.  I've never had a problem with it, except for
the slowness when a sequence gets too big.

Here's a routine (with some modifications) that I use to maintain a 
non-duplicated list.  It's very fast, because it keeps the items sorted,
and uses a binary search to find duplicates.  I had some data that 
contained about 70,000 unique entries over 1,300,000 lines.  By switching
from find() to this, I went from a load time of ~1.5hr to ~1min.  The
routine itself does two things.  It either finds the item in the list, or
adds it to the list in the correct spot.

constant blank_add = repeat( 0, 1024 )  -- grow the list in chunks of 1024
sequence entry_list
integer entry_last                      -- number of entries currently stored

-- Returns the index of entry in entry_list. If entry is not there yet,
-- it is first inserted at the position that keeps the list sorted.
function no_dup( sequence entry )
	integer lo, hi, mid, c
	lo = 1
	hi = entry_last

	-- binary search over entry_list[1..entry_last]
	while lo <= hi do
		mid = floor( (lo + hi) / 2 )
		c = compare( entry, entry_list[mid] )
		if c > 0 then
			lo = mid + 1
		elsif c < 0 then
			hi = mid - 1
		else
			return mid  -- already in the list
		end if
	end while

	-- not found: lo is now the insertion point
	if entry_last = length( entry_list ) then
		entry_list &= blank_add
	end if

	-- shift the tail up by one slot, then store the new entry
	entry_list[lo+1..entry_last+1] = entry_list[lo..entry_last]
	entry_list[lo] = entry
	entry_last += 1

	return lo
end function

entry_list = {}
entry_list &= blank_add
entry_last = 0
? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )
? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )
? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )

include misc.e
pretty_print( 1, entry_list[1..entry_last], {})
include get.e
abort(wait_key())


Matt Lewis

