1. find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 17, 2004
- 512 views
I have a list of urls, contained in urllist like:
urllist[1], urllist[2]..urllist[x]

and i tried eliminating duplicates before they are added, by using find, which didn't work. Now i am using

    found = 0
    for loop = 1 to length(urllist) do
        if match(junk,urllist[loop]) then
            found = 1
            exit
        end if
    end for
    if not found then
        urllist = urllist & {junk}
    end if

And darned if i still don't get duplicates. I do trim off leading and trailing spaces from junk. Why is this happening?

Kat
2. Re: find/match not working
- Posted by Rubens Monteiro Luciano <rml at rubis.trix.net> Aug 17, 2004
- 481 views
Hi Kat

Maybe there are some "enters" (line breaks) at the end of the sequences...

Rubens

At 07:50 17/8/2004, you wrote:
>I have a list of urls, contained in urllist like:
>urllist[1], urllist[2]..urllist[x]
>
>and i tried eliminating duplicates before they are added, by using find,
>which didn't work. Now i am using
>
>found = 0
>for loop = 1 to length(urllist) do
>    if match(junk,urllist[loop]) then
>        found = 1
>        exit
>    end if
>end for
>if not found then
>    urllist = urllist & {junk}
>end if
>
>And darned if i still don't get duplicates. I do trim off leading and
>trailing spaces from junk. Why is this happening?
>
>Kat
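Rubens's guess about stray line endings can be checked with a small sketch. It assumes the urls arrive with a trailing newline (as lines read with gets() do) and that the trimming in the original program removes only spaces; neither is confirmed in the thread. strip_eol is a hypothetical helper name, and the example.com url is made up.

```euphoria
-- Hypothetical helper: drop trailing CR/LF characters from a line.
function strip_eol(sequence s)
    while length(s) > 0 and find(s[length(s)], "\r\n") do
        s = s[1..length(s)-1]
    end while
    return s
end function

sequence urllist, junk
urllist = {"http://example.com/a"}
junk = "http://example.com/a\n"   -- assumed: line still carries its newline

? find(junk, urllist)             -- 0: the newline makes it a different sequence
? find(strip_eol(junk), urllist)  -- 1: found once the line ending is removed
```

If the first call printed 0 in the real program, every "duplicate" would be appended even though it looks identical when written to a file.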
3. Re: find/match not working
- Posted by Chris Burch <chriscrylex at aol.com> Aug 17, 2004
- 499 views
Kat wrote:
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
>
> and i tried eliminating duplicates before they are added, by using find,
> which didn't work. Now i am using
>
> found = 0
> for loop = 1 to length(urllist) do
>     if match(junk,urllist[loop]) then
>         found = 1
>         exit
>     end if
> end for
> if not found then
>     urllist = urllist & {junk}
> end if
>
> And darned if i still don't get duplicates. I do trim off leading and
> trailing spaces from junk. Why is this happening?
>
> Kat

Hi

I seem to remember having a problem like this too - try taking off the first and last character of junk and see what happens, eg

    found = 0
    for loop = 1 to length(urllist) do
        if match(junk[2..length(junk)-1],urllist[loop]) then
            found = 1
            exit
        end if
    end for
    if not found then
        urllist = urllist & {junk}
    end if

assuming that hunk is a full url, or the match isn't THAT critical

Chris
4. Re: find/match not working
- Posted by Chris Burch <chriscrylex at aol.com> Aug 17, 2004
- 494 views
Chris Burch wrote:
--snip--
> assuming that hunk is a full url, or the match isn't THAT critical
>
> Chris

the hunk should be junk!
5. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 17, 2004
- 489 views
Kat wrote:
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
>
> and i tried eliminating duplicates before they are added, by using find,
> which didn't work. Now i am using
>
> found = 0
> for loop = 1 to length(urllist) do
>     if match(junk,urllist[loop]) then
>         found = 1
>         exit
>     end if
> end for
> if not found then
>     urllist = urllist & {junk}
> end if
>
> And darned if i still don't get duplicates. I do trim off leading and
> trailing spaces from junk. Why is this happening?
>
> Kat

    integer a
    sequence newlist,found
    newlist={}
    found={}
    for loop=1 to length(urllist) do
        a=find(urllist[loop],found)
        if a=0 then
            newlist=append(newlist,urllist[loop])
            found=append(found,urllist[loop])
        end if
    end for

newlist should have no duplicates

don cole
SF
6. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 17, 2004
- 477 views
I re-thought this.
    integer a
    sequence newlist,found
    newlist={}
    found={}
    for loop=1 to length(urllist) do
        a=find(urllist[loop],found)
        if a=0 then
            found=append(found,urllist[loop])
            -- newlist=append(newlist,urllist[loop]) --you don't really need newlist
                                                     --unless you are going to alter
        end if                                       --urllist[loop] before you add
    end for                                          --it to your sequence.
    urllist=found
    --now your list is free of duplicates--
    for loop=1 to length(urllist) do
        if find(junk,urllist[loop]) then
            exit
        end if
    end for
    urllist=urllist & {junk}

don cole
SF
7. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 17, 2004
- 472 views
On 17 Aug 2004, at 8:17, don cole wrote:
> posted by: don cole <doncole at pacbell.net>
>
> I re-thought this.
>
> integer a
> sequence newlist,found
> newlist={}
> found={}
> for loop=1 to length(urllist) do
>     a=find(urllist[loop],found)
>     if a=0 then
>         found=append(found,urllist[loop])
>         -- newlist=append(newlist,urllist[loop]) --you don't really need newlist
>                                                  --unless you are going to alter
>     end if                                       --urllist[loop] before you add
> end for                                          --it to your sequence.
> urllist=found
> --now your list is free of duplicates--

The list started out with nothing in it, so i am somewhat sure it had no duplicates. The following code you wrote is exactly what i had before i went to using match() instead.

> for loop=1 to length(urllist) do
>     if find(junk,urllist[loop]) then
>         exit
>     end if
> end for
> urllist=urllist & {junk}
>
> don cole
> SF

Kat
8. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 17, 2004
- 507 views
Kat wrote:
> The list started out with nothing in it, so i am somewhat sure it had no
> duplicates. The following code you wrote is exactly what i had before i went
> to using match() instead.

Ok try

    -- lower() comes from wildcard.e; trim() is not defined in this post and is
    -- assumed to be a user routine that strips surrounding whitespace
    include wildcard.e

    global function my_find(sequence a,sequence b)
        integer an
        a=trim(lower(a))
        b=trim(lower(b))
        an=find(a,b)
        return an
    end function

    global function my_match(sequence a,sequence b)
        integer an  --matches 2 sequences of any length
        a=trim(lower(a))
        b=trim(lower(b))
        an=match(a,b)
        return an
    end function

don cole
SF
9. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 17, 2004
- 486 views
On 17 Aug 2004, at 9:42, don cole wrote:
> posted by: don cole <doncole at pacbell.net>
>
> Ok try
>
> global function my_find(sequence a,sequence b)
>     integer an
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=find(a,b)
>     return an
> end function
>
> global function my_match(sequence a,sequence b)
>     integer an  --matches 2 sequences of any length
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=match(a,b)
>     return an
> end function

The variables are already trimmed, and a printout to a file shows me they are the same case.

The code runs with match() and i am not going to take it down to try new find() code. I have 5280 urls to mine by tonite. Due to the telco here cutting me offline this morning to verify i can get online, and other assorted madness, i hope i make it.

Kat

> don cole
> SF
10. Re: find/match not working
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Aug 17, 2004
- 491 views
On Tue, 17 Aug 2004 16:30:00 +0000, Pete E <euphoria at eberlein.org> wrote:
>Perhaps you want:
>
>    found = find (junk, urllist)
>    if not found then
>        urllist = urllist & {junk}
>    end if

Kat, perhaps you could trap your program at some point when it has created a duplicate, and create an ex.err or print() the values so this (the above code misbehaving) can be reproduced?
11. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 18, 2004
- 499 views
Kat wrote:
> I have a list of urls, contained in urllist like:
> urllist[1], urllist[2]..urllist[x]
>
> and i tried eliminating duplicates before they are added, by using find,
> which didn't work. Now i am using
>
> Kat

Compile your complete list, duplicates and all. Don't worry if they are added. Then eliminate all duplicates with:

    function clean_dups(sequence list)
        integer a
        sequence found
        found={}    -- must be initialised before find() can search it
        for x=1 to length(list) do
            a=find(list[x],found)
            if a=0 then
                found=append(found,list[x])
            end if
        end for
        return found
    end function

It might take longer but will work.

don cole
SF
12. Re: find/match not working
- Posted by Derek Parnell <ddparnell at bigpond.com> Aug 19, 2004
- 488 views
Kat wrote:
> found = 0
> for loop = 1 to length(urllist) do
>     if match(junk,urllist[loop]) then
>         found = 1
>         exit
>     end if
> end for
> if not found then
>     urllist = urllist & {junk}
> end if

When doing this sort of thing, I usually do this ...

    urltemp = upper(urllist)
    junk = trim(junk)
    found = find(upper(junk), urltemp)
    if not found then
        urllist = append(urllist, junk)
    end if

In other words, trim both data and do a case-insensitive search.

The find() function looks for a single element that exactly matches the subject argument.

The match() function looks for a set of adjacent elements that exactly matches all the elements in the subject argument.

To use match() in the way you have above might work better as ...

    if match(junk,urllist[loop]) = 1 and
       length(junk) = length(urllist[loop]) then

in other words, ensure that if you get a match it starts with the first character and is the same length. But that's now equivalent to equal()! So it might be better to do ...

    if equal(junk,urllist[loop]) then

but if you are doing that, you may as well use find() as it will work faster.

--
Derek Parnell
Melbourne, Australia
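The distinction Derek draws between find(), match() and equal() can be seen side by side in a short sketch. The urls are made-up sample data; only the three built-in routines come from the thread.

```euphoria
sequence urllist
urllist = {"www.rds.com.au", "www.example.org"}

-- find() compares the needle against each whole element of the list.
? find("www.example.org", urllist)        -- 2
? find("www.example", urllist)            -- 0: no element is exactly this string

-- Applied to a single url, find() compares the needle against each
-- character, so a string needle is never found inside a flat string.
? find("www.example.org", urllist[2])     -- 0

-- match() looks for the needle as a slice (substring) of its second argument.
? match("example", urllist[2])            -- 5
? match("www.example.org", urllist[2])    -- 1

-- equal() compares two whole objects and returns 1 or 0.
? equal("www.example.org", urllist[2])    -- 1
```

The third call is worth noting: a loop that calls find(junk, urllist[loop]) can never report a hit for a string junk, whereas find(junk, urllist) over the whole list can.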
13. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 19, 2004
- 480 views
I am so sorry i brought up this thread too. Like i said days ago, i printed all the additions to a file, and printed the list they were supposedly not found in. The additions list contained the same items over and over again, same case, same length, same everything, identical lines, more than once. And they got added to the list that find() didn't find them in.

Kat

On 19 Aug 2004, at 10:46, Bernard Ryan wrote:
> posted by: Bernard Ryan <xotron at bluefrog.com>
>
> Kat:
>
> include wildcard.e
> --
> procedure add2list( sequence new_url, sequence url_list )
> --
>     for loop = 1 to length(url_list) do
>         if equal(lower(new_url),lower(url_list[loop])) then return end if
>     end for
>     -- note: sequences are passed by value, so this changes only the local copy
>     url_list &= {new_url}
> --
> end procedure
>
> Bernie
>
> My files in archive:
> http://www.rapideuphoria.com/w32engin.zip
> http://www.rapideuphoria.com/mixedlib.zip
> http://www.rapideuphoria.com/eu_engin.zip
> http://www.rapideuphoria.com/win32eru.zip
14. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 19, 2004
- 480 views
On 18 Aug 2004, at 18:05, Derek Parnell wrote:
> posted by: Derek Parnell <ddparnell at bigpond.com>
>
> Kat wrote:
>
> > found = 0
> > for loop = 1 to length(urllist) do
> >     if match(junk,urllist[loop]) then
> >         found = 1
> >         exit
> >     end if
> > end for
> > if not found then
> >     urllist = urllist & {junk}
> > end if
>
> When doing this sort of thing, I usually do this ...
>
> urltemp = upper(urllist)
> junk = trim(junk)
> found = find(upper(junk), urltemp)
> if not found then
>     urllist = append(urllist, junk)
> end if
>
> In other words, trim both data and do an case-insensitive search.
>
> The find() function looks for a single element that exactly
> matches the subject argument.
>
> The match() function looks for a set of adjacent elements that
> exactly matches all the elements in the subject argument.

According to the files i printed out, the items were the same. But find() didn't find them.

Kat

> To use match() in the way you have above might work better as ...
>
> if match(junk,urllist[loop]) = 1 and
>    length(junk) = length(urllist[loop]) then
>
> in other words, ensure that if you get a match it starts
> with the first character and is the same length. But that's
> now equivalent to equal()! So it might be better to do ...
>
> if equal(junk,urllist[loop]) then
>
> but if you are doing that, you may as well use find()
> as it will work faster.
>
> --
> Derek Parnell
> Melbourne, Australia
15. Re: find/match not working
- Posted by cklester <cklester at yahoo.com> Aug 19, 2004
- 494 views
Kat wrote:
> According to the files i printed out, the items were the same.
> But find() didn't find them.

That's like one of my users telling me my software doesn't work, yet it's working on hundreds of other PCs. I'm incredulous at that point, and usually, with a bit of investigation, the situation isn't exactly as they claimed, it turns out to be a UE, and another case of UDRTFM.

I've been in the same situation: claiming that the software was screwing up only to discover it was I who was doing the screwing. Wait, that didn't come out right. :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/
16. Re: find/match not working
- Posted by irv mullins <irvm at ellijay.com> Aug 19, 2004
- 494 views
- Last edited Aug 20, 2004
cklester wrote:
>
> Kat wrote:
>
> > According to the files i printed out, the items were the same.
> > But find() didn't find them.
>
> That's like one of my users telling me my software doesn't work,
> yet it's working on hundreds of other PCs. I'm incredulous at that point,
> and usually, with a bit of investigation, the situation isn't exactly
> as they claimed, it turns out to be a UE, and another case of UDRTFM.

I had a client call once: "Your software quit working!"
Me: "Did it work yesterday?"
Client: "Yes"
Me: "Did you do anything on your computer since yesterday?"
Client: "No. Just cleaned out a bunch of useless files."
Me: "What were their names?"
Client: "I don't know! Just a bunch of .dat and .exe stuff I never use. You need to come fix this!"
Me: "I'll be right over. Get out your checkbook."

Next time they called, I was "out of the country" :)

Irv
17. Re: find/match not working
- Posted by irv mullins <irvm at ellijay.com> Aug 19, 2004
- 480 views
- Last edited Aug 20, 2004
Kat wrote:
> According to the files i printed out, the items were the same. But find()
> didn't find them.

Perhaps there were some control codes which don't get printed.

Irv
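One way to look for the unprinted control codes Irv mentions is to dump the numeric value of each element instead of writing the text itself. This is only a sketch with made-up data: the carriage return (13) stands in for whatever stray code might be present.

```euphoria
sequence a, b
a = "http://example.com/a"
b = "http://example.com/a" & 13    -- same text plus a carriage return

-- Written with puts(), the two usually look alike in a viewer or editor.
puts(1, a & '\n')
puts(1, b & '\n')

-- print() writes every element as a number, so the extra 13 shows up at the end.
print(1, b)
puts(1, '\n')

? length(a)      -- 20
? length(b)      -- 21
? equal(a, b)    -- 0
```

Comparing length() of the two values is a quick first check when two lines look identical in a text file.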
18. Re: find/match not working
- Posted by cklester <cklester at yahoo.com> Aug 19, 2004
- 488 views
- Last edited Aug 20, 2004
irv mullins wrote:
> cklester wrote:
> > That's like one of my users telling me my software doesn't work,
> > yet it's working on hundreds of other PCs. I'm incredulous at that point,
> > and usually, with a bit of investigation, the situation isn't exactly
> > as they claimed, it turns out to be a UE, and another case of UDRTFM.
>
> I had a client call once: "Your software quit working!"
> Me: "Did it work yesterday?"
> Client: "Yes"
> Me: "Did you do anything on your computer since yesterday?"
> Client: "No. Just cleaned out a bunch of useless files."

That's exactly how a lot of my conversations go, except they say...

Client: "No. Nothing's changed at all whatsoever."

At this point, I have to tell them:

Me: "Software doesn't just stop working for no reason."

We later find that either 1) their application was upgraded, which requires an update to our software as well, or 2) they upgraded their version of Windows, which sometimes requires an upgrade of our software. So much for "Nothing's changed at all whatsoever."

> Me: "I'll be right over. Get out your checkbook."

I'm memorizing that one! :D

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/
19. Re: find/match not working
- Posted by Derek Parnell <ddparnell at bigpond.com> Aug 19, 2004
- 482 views
- Last edited Aug 20, 2004
Kat wrote:
> According to the files i printed out, the items were the same. But find()
> didn't find them.

Is it possible for you to show us your source code? Not the 'sample' code shown before, but the actual code you are really using.

So far, it sounds like everyone else but you can do this, so I'm guessing you have a mistake in your code somewhere. The only way to prove otherwise is to expose your code to peer review.

Have you actually tried *any* of the code given to you by others? If so, what were the results of that?

--
Derek Parnell
Melbourne, Australia
20. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 20, 2004
- 492 views
On 19 Aug 2004, at 15:04, Derek Parnell wrote:
> posted by: Derek Parnell <ddparnell at bigpond.com>
>
> Kat wrote:
>
> > According to the files i printed out, the items were the same. But find()
> > didn't find them.
>
> Is it possible for you to show us your source code? Not the 'sample'
> code shown before, but the actual code you are really using.

I did, i copy/pasted the code to this listserv days ago.

> So far, it sounds like everyone else but you can do this, so I'm
> guessing you have a mistake in your code somewhere. The only way
> to prove otherwise is to expose your code to peer review.

Which is why i posted it here.

> Have you actually tried *any* of the code given to you by others? If
> so, what was the results of that?

Like i said, no. The code is running as is *now*, looping thru the list and using match on each item in the list. I don't want to interrupt it and waste internet bandwidth on a test, in case someone is waiting on the results of the code run.

Sorry i sound so unreasonable, but i gave you the code i was using that didn't run, said i do not feel i can stop the code for tests, and it's running with match(). In the files i printed out of the repeats, Textpad's search function found the items find() didn't.

Kat
21. Re: find/match not working
- Posted by Bob Elia <bobelia200 at netzero.net> Aug 20, 2004
- 469 views
At 02:28 PM 8/19/04 -0500, you wrote:
>I am so sorry i brought up this thread too. Like i said days ago, i
>printed all the additions to a file, and printed the list they were
>supposedly not found in. The additions list contained the same items over
>and over again, same case, same length, same everything, identical lines,
>more than once. And they got added to the list that find() didn't find
>them in.
>
>Kat
<snip>

Kat,

Have you looked at the data with a hex editor? I'm sure that you're aware that large datasets often have unexpected "junk" in them. Maybe a fresh pair of eyes would help? I'd be more than happy to look at it, if you want to send me a sample that screws up. Maybe 1 Meg's worth. Send it privately.

Bob
22. Re: find/match not working
- Posted by Derek Parnell <ddparnell at bigpond.com> Aug 20, 2004
- 472 views
Kat wrote:
>
> On 19 Aug 2004, at 15:04, Derek Parnell wrote:
>
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> >
> > Kat wrote:
> >
> > > According to the files i printed out, the items were the same. But find()
> > > didn't find them.
> >
> > Is it possible for you to show us your source code? Not the 'sample'
> > code shown before, but the actual code you are really using.
>
> I did, i copy/pasted the code to this listserv days ago.

The code you posted was this...

    found = 0
    for loop = 1 to length(urllist) do
        if match(junk,urllist[loop]) then
            found = 1
            exit
        end if
    end for
    if not found then
        urllist = urllist & {junk}
    end if

--------------
Try this instead...
-------------

    if not find(junk, urllist) then
        urllist = urllist & {junk}
    end if

------------

--
Derek Parnell
Melbourne, Australia
23. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 20, 2004
- 472 views
On 17 Aug 2004, at 9:42, don cole wrote:
> Ok try
>
> global function my_find(sequence a,sequence b)
>     integer an
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=find(a,b)
>     return an
> end function
>
> global function my_match(sequence a,sequence b)
>     integer an  --matches 2 sequences of any length
>     a=trim(lower(a))
>     b=trim(lower(b))
>     an=match(a,b)
>     return an
> end function

Kat wrote:
> The code runs with match() and i am not going to take it down to try new
> find() code. I have 5280 urls to mine by tonite. Due to the telco here cutting
> me offline this morning to verify i can get online, and other assorted
> madness, i hope i make it.
> Kat

On 19 Aug 2004, at 15:04, Derek Parnell wrote:
> Have you actually tried *any* of the code given to you by others? If
> so, what was the results of that?

Kat wrote:
> Like i said, no. The code is running as is *now*, looping thru the list and
> using match on each item in the list. I don't want to interrupt it and waste
> internet bandwidth on a test, in case someone is waiting on the results of the
> code run.

I'm fairly new to this board and might be missing something here, but I don't see how you expect to resolve this issue if you are unwilling to CHANGE YOUR CODE.

don cole
SF
24. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 20, 2004
- 489 views
On 20 Aug 2004, at 2:14, don cole wrote:
> Kat wrote:
> > Like i said, no. The code is running as is *now*, looping thru the list and
> > using match on each item in the list. I don't want to interrupt it and waste
> > internet bandwidth on a test, in case someone is waiting on the results of
> > the code run.
>
> I'm a fairly newbbie to this board and might be missing something here,
> but I don't see how do you expect to resolve this issue if you are
> unwilling to CHANGE YOUR CODE.

I DID bloody change the code, i am not using find() anymore, i am using match(). I have said that repeatedly, what's the problem?

Kat
25. Re: find/match not working
- Posted by Derek Parnell <ddparnell at bigpond.com> Aug 20, 2004
- 491 views
- Last edited Aug 21, 2004
Kat wrote:
>
> On 20 Aug 2004, at 2:14, don cole wrote:
> > Kat wrote:
> > > Like i said, no. The code is running as is *now*, looping thru the list
> > > and using match on each item in the list. I don't want to interrupt it and
> > > waste internet bandwidth on a test, in case someone is waiting on the
> > > results of the code run.
> >
> > I'm a fairly newbbie to this board and might be missing something here,
> > but I don't see how do you expect to resolve this issue if you are
> > unwilling to CHANGE YOUR CODE.
>
> I DID bloody change the code, i am not using find() anymore, i am using
> match(). I have said that repeatedly, what's the problem?

Problem: Kat says find() doesn't work. Gives example using match(), which also is faulty.

Solution: Multiple examples in which find() is shown to work.

Result: Kat refuses to try find() again. That is, Kat refuses to change her code (yet again) to use find() as per any of the solutions.

Consequence: Confusion as to whether or not Kat really wants help.

Kat, if you don't want help with find(), why bring it up? If you do want help with find(), why don't you accept it?

--
Derek Parnell
Melbourne, Australia
26. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 21, 2004
- 477 views
On 20 Aug 2004, at 16:10, Derek Parnell wrote:
> posted by: Derek Parnell <ddparnell at bigpond.com>
>
> Kat wrote:
> >
> > On 20 Aug 2004, at 2:14, don cole wrote:
> > > Kat wrote:
> > > > Like i said, no. The code is running as is *now*, looping thru the list
> > > > and using match on each item in the list. I don't want to interrupt it
> > > > and waste internet bandwidth on a test, in case someone is waiting on the
> > > > results of the code run.
> > >
> > > I'm a fairly newbbie to this board and might be missing something here,
> > > but I don't see how do you expect to resolve this issue if you are
> > > unwilling to CHANGE YOUR CODE.
> >
> > I DID bloody change the code, i am not using find() anymore, i am using
> > match(). I have said that repeatedly, what's the problem?
>
> Problem: Kat says find() doesn't work. Gives example using match(),
> which also is faulty.

How is that faulty?? For the strings i am using, it is working.

> Solution: Multiple examples in which find() is shown to work.
>
> Result: Kat refuses to try find() again. That is Kat refuses to change her
> code (yet again) to use find() as per any of the solutions.

See below.

> Consequence: Confusion as to whether or not Kat really wants help.
>
> Kat, if you don't want help with find(), why bring it up? If you do want
> help with find(), why don't you accept it?

I had a problem with find(). I reported it. I recoded the apps to use match(). Programs are now running fine, as of when i gave up on find() and began using match() and reported the problem here.

My mistakes were reporting the problem here, and not saving the code and data that didn't run; but i was in a hurry (i noticed the dupes after 4 hrs of it running, which put me 4 hrs behind), and gave up on find() after 30 min or so, and simply over-wrote the bad code with something which works.

As an aside, the person i was counting on to have working demo code for the data i obtained, hasn't written a line of code YET. So much for a tight schedule and being reliable. And i am pretty sure i won't report any problems here again if i am in a hurry. This thread really takes the cake.

I just sent Derek a screen shot of how busy the computer is. At this time, i don't have any free cpu clocks, memory, or bandwidth to test anything. Unless i shut things down, and become the person who isn't getting things done.

Kat
27. Re: find/match not working
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Aug 21, 2004
- 471 views
On Thu, 19 Aug 2004 14:28:02 -0500, Kat <gertie at visionsix.com> wrote:
>I am so sorry i brought up this thread too. Like i said days ago, i printed
>all the additions to a file, and printed the list they were supposedly not
>found in. The additions list contained the same items over and over again,
>same case, same length, same everything, identical lines, more than once.
>And they got added to the list that find() didn't find them in.

Saints preserve us. Kat, if there is a problem in find(), like you *CLAIM*, then surely to god you would like it resolved?

I am aware that in your personal life, you fervently believe that everyone just shits on you for fun. But that is not in general the case on this list, is it? That is not the case, and you know that. Fine, you want to claim *I* hate you or unfairly pick on you, no problem - if you feel so, I can assure you it is not intentional - but if you claim that everyone on this list just hates you, you need help.

If we can help, we will, but if you refuse to let us help, we can't. (in various technical Euphoria-related problems, I mean)

I am aware that is really really patronising and rather shitty of me, but I just don't know what else to say. It *is* meant to trigger a reaction, but not meant to be offensive, my apologies for being rather bad at this sort of thing...

Pete
28. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 21, 2004
- 468 views
Listfilter blocked my reply to Pete Lomax. I'd prefer my answer be posted for equal time on the webpage, and sent in the listserv. I don't see him going after Chris, who reported he also has to change code to make find() work. Neither of us asked for any help. I have said for the entire duration of this thread that i was not going to make changes in code that was running fine. So think i am paranoid, or delusional, or anything you want, but get off my case. This week-long running idiotic thread that i didn't ask for, but everyone has to get in on, leading up to Lomax's personal flaming at me, is ample evidence that i don't imagine the attacks.

Kat
29. Re: find/match not working
- Posted by Derek Parnell <ddparnell at bigpond.com> Aug 21, 2004
- 490 views
Kat wrote:
>
> On 20 Aug 2004, at 16:10, Derek Parnell wrote:
> >
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> >
> > Kat wrote:
> > >
> > > On 20 Aug 2004, at 2:14, don cole wrote:
> > > >
> > > > Kat wrote:
> > > > > Like i said, no. The code is running as is *now*, looping thru the list
> > > > > and using match on each item in the list. I don't want to interrupt it and
> > > > > waste internet bandwidth on a test, in case someone is waiting on the
> > > > > results of the code run.
> > > >
> > > > I'm a fairly newbbie to this board and might be missing something here, but
> > > > I don't see how do you expect to resolve this issue if you are unwilling to
> > > > CHANGE YOUR CODE.
> > >
> > > I DID bloody change the code, i am not using find() anymore, i am using
> > > match(). I have said that repeatedly, what's the problem?
> >
> > Problem: Kat says find() doesn't work. Gives example using match(),
> > which also is faulty.
>
> How is that faulty?? For the strings i am using, it is working.

Because it is possible to exclude some urls which should be added.
eg. If "www.rds.com.au" is already in the list then "www.rds.com" will be
excluded, even though it is a different url.

--
Derek Parnell
Melbourne, Australia
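Derek's point can be checked with a few lines. This is only a sketch: `seen_by_match` is a hypothetical helper that mirrors the match() loop quoted earlier in the thread, and the one-element list is made up for the demonstration. match() looks for its first argument as a substring of the second, while find() compares the whole element.

```euphoria
-- Sketch: why a match() loop is too loose for a duplicate check.
sequence urllist
urllist = {"www.rds.com.au"}

-- Hypothetical helper: same shape as the match() loop quoted in the thread.
function seen_by_match(sequence url, sequence list)
    for i = 1 to length(list) do
        if match(url, list[i]) then
            return 1    -- url occurs somewhere inside list[i]
        end if
    end for
    return 0
end function

? seen_by_match("www.rds.com", urllist)  -- 1: substring hit, url would be skipped
? find("www.rds.com", urllist)           -- 0: no element is exactly this string
? find("www.rds.com.au", urllist)        -- 1: exact duplicate found at index 1
```

With find() (or equal() inside the loop) a shorter url is only rejected when an identical element is already present.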
30. Re: find/match not working
- Posted by "Kat" <gertie at visionsix.com> Aug 21, 2004
- 478 views
On 21 Aug 2004, at 2:07, Pete Lomax wrote:

> On Thu, 19 Aug 2004 14:28:02 -0500, Kat <gertie at visionsix.com> wrote:
>
> >I am so sorry i brought up this thread too. Like i said days ago, i printed all
> >the additions to a file, and printed the list they were supposedly not found in.
> >The additions list contained the same items over and over again,same case,
> >same length, same everything,,, identical lines, more than once. And they got
> >added to the list that find() didn't find them in.
>
> Saints preserve us. Kat, if there is a problem in find(), like you
> *CLAIM*, then surely to god you would like it resolved?

I don't care anymore. I only reported it, in case someone else was logging problems and cared. I moved on before i wrote the first email. I didn't ask for any help.

> I am aware that in your personal life, you fervently believe that
> everyone just shits on you for fun. But that is not in general the
> case on this list is it? That is not the case, and you know that.
>
> Fine, you want to claim *I* hate you or unfairly pick on you, no
> problem - if you feel so, I can assure you it is not intentional - but
> if you claim that everyone on this list just hates you, you need help.
> If we can help, we will, but if you refuse to let us help, we can't.
> (in various technical Euphoria-related problems, I mean)
>
> I am aware that is really really patronising and rather shitty of me,
> but I just don't know what else to say. It *is* meant to trigger a
> reaction, but not meant to be offensive, my apologies for being rather
> bad at this sort of thing...

My reaction is: answering this thread has taken too much time from productive coding. I didn't ask for help. I'm sorry i reported a damned thing, think as badly of me as you want, no argument from me. I've tried being nice, saying i am now using match() in that code, and it's working, and it was working BEFORE i reported the bug (as i CLAIM). But why, after all the emails, you decided to waste 3 paragraphs on making this personal, i dunno.

Kat, getting as pissed as you are now.
31. Re: find/match not working
- Posted by don cole <doncole at pacbell.net> Aug 21, 2004
- 488 views
Well I feel that this has been a productive and enlightening discussion that future Euphorians and EUforum users will look back on and achieve a thorough understanding of the find() and match() functions. don cole SF
32. Re: find/match not working
- Posted by Bob Elia <bobelia200 at netzero.net> Aug 21, 2004
- 520 views
- Last edited Aug 22, 2004
>On 20 Aug 2004, at 16:10, Derek Parnell wrote:
> >
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> >
> > Kat wrote:
> > >
> > > On 20 Aug 2004, at 2:14, don cole wrote:
> > > >
> > > > Kat wrote:
> > > > > Like i said, no. The code is running as is *now*, looping thru the list
> > > > > and using match on each item in the list. I don't want to interrupt it and
> > > > > waste internet bandwidth on a test, in case someone is waiting on the
> > > > > results of the code run.
> > > >
> > > > I'm a fairly newbbie to this board and might be missing something here, but
> > > > I don't see how do you expect to resolve this issue if you are unwilling to
> > > > CHANGE YOUR CODE.
> > >
> > > I DID bloody change the code, i am not using find() anymore, i am using
> > > match(). I have said that repeatedly, what's the problem?
> >
> > Problem: Kat says find() doesn't work. Gives example using match(),
> > which also is faulty.
>
>How is that faulty?? For the strings i am using, it is working.
>
> > Solution: Multiple examples in which find() is shown to work.
> >
> > Result: Kat refuses to try find() again. That is Kat refuses to change her
> > code (yet again) to use find() as per any of the solutions.
>
>See below.
>
> > Consequence: Confusion as to whether or not Kat really wants help.
> >
> > Kat, if don't want help with find(), why bring it up? If you do want
> > help with find(), why don't do accept it?
>
>I had a problem with find(). I reported it. I recoded the apps to use match().
>Programs are now running fine, as of when i gave up on find() and began
>using match() and reported the problem here. My mistakes were reporting
>the problem here, and not saving the code and data that didn't run; but i was
>in a hurry (i noticed the dupes after 4 hrs of it running, which put me 4 hrs
>behind), and gave up on find() after 30 min or so, and simply over-wrote the
>bad code with something which works. As an aside, the person i was
>counting on to have working demo code for the data i obtained, hasn't written
>a line of code YET. So much for a tight schedule and being reliable. And i
>am am pretty sure i won't report any problems here again if i am in a hurry.
>This thread really takes the cake.
>
>I just sent Derek a screen shot of how busy the computer is. At this time, i
>don't have any free cpu clocks, memory, or bandwidth to test anything.
>Unless i shut things down,, and become the person who isn't getting things
>done.
>
>Kat

-- NODUPES.EX
include sort.e
include misc.e

sequence text
atom t
integer fn, l
sequence fname
object line

--fname = "findmatch not working.txt"
--fname = "DUPES.TXT"
fname = "FINDMACH.TXT" --file of URLs Kat posted
--fname = "randdata.txt"

fn = open(fname, "r")
if fn = -1 then
    printf(1, "Unable to open %s\n", {fname})
    abort(0)
end if

-- with trace
-- trace(1)

text = {}
-- remove new-line
while 1 do
    line = gets(fn)
    if atom(line) then
        exit
    end if
    l = length(line)
    if equal(line[l], '\n') then
        line = line[1..l - 1]
    end if
    -- putting all the data into this is, of course, unnecessary.
    -- I just did it to aid development.
    text = append(text, line)
end while

printf(1, "length(text) = %d\n", {length(text)})

-- I dont know the significance of "<done>" in the data, so I'm removing it.
for i = 1 to length(text) do
    line = text[i]
    l = length(line) - 7
    if match(" <done>", line) = l + 1 then
        text[i] = line[1..l]
    end if
end for

sequence uniqueUrls
uniqueUrls = {}

printf(1, "collecting uniqueUrls...\n", {})
t = time()
for i = 1 to length(text) do
    if not find(text[i], uniqueUrls) then
        uniqueUrls = append(uniqueUrls, text[i])
    else
        printf(1, "DUP FOUND: %s\n", {text[i]})
    end if
end for
t = time() - t

printf(1, "elapsed time is %f seconds.\n", {t})
printf(1, "length(text) = %d\n", {length(text)})
printf(1, "length(uniqueUrls) = %d\n", {length(uniqueUrls)})

uniqueUrls = sort(uniqueUrls)
pretty_print(1, uniqueUrls, {2})
puts(1, "\n\n")

This spits out one duplicate:

DUP FOUND: http://www.ed.gov

Also, I noticed such things as:

"http:// www.buydirectory.com" has an embedded space;
"http://www.hub..terc.edu" 2 periods in a row;
"http://www.iee org.uk" has an embedded space;
"http://www.ipl,irg/reading/books" has an embedded comma

Sorry for the delay. I was trying to come up with a solution that wouldn't take hours to run. Please tell me if you want me to continue.

Bob
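The oddities Bob lists could be flagged automatically before the duplicate check runs. The following is a rough sketch, not part of his program: `looks_malformed` is a hypothetical helper, and it only tests for the three symptoms he mentions (an embedded space, an embedded comma, two periods in a row).

```euphoria
-- Hypothetical helper (not from the thread): returns 1 when url contains
-- a space, a comma, or two periods in a row, otherwise 0.
function looks_malformed(sequence url)
    if find(' ', url) or find(',', url) or match("..", url) then
        return 1
    end if
    return 0
end function

sequence samples
samples = {
    "http:// www.buydirectory.com",
    "http://www.hub..terc.edu",
    "http://www.iee org.uk",
    "http://www.ipl,irg/reading/books",
    "http://www.ed.gov"
}

-- Print a 1/0 flag next to each sample url.
for i = 1 to length(samples) do
    printf(1, "%d  %s\n", {looks_malformed(samples[i]), samples[i]})
end for
```

The first four samples are the ones quoted above and are flagged; the last one is the well-formed url from the DUP FOUND line and is not.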
33. Re: find/match not working
- Posted by Matt Lewis <matthewwalkerlewis at yahoo.com> Aug 24, 2004
- 481 views
It's of course, very late in the discussion (been out of town), but here is my contribution to the thread. I couldn't find Kat's original code (just her re-code using match), so like everyone else, I can't comment on what the problem is with find. I've never had a problem with it, except for the slowness when a sequence gets too big. Here's a routine (with some modifications) that I use to maintain a non-duplicated list. It's very fast, because it keeps the items sorted, and uses a binary search to find duplicates. I had some data that contained about 70,000 unique entries over 1,300,000 lines. By switching from find() to this, I went from a load time of ~1.5hr to ~1min. The routine itself does two things. It either finds the item in the list, or adds it to the list in the correct spot.

constant blank_add = repeat( 0, 1024 )
sequence entry_list
integer entry_last

function no_dup( sequence entry )
    integer lo, hi, mid, c, first, last

    lo = 1
    hi = entry_last - 1
    mid = floor( (hi+lo)/2 )
    while lo < hi do
        c = compare( entry, entry_list[mid] )
        if c > 0 then
            if lo = mid then
                lo = hi
            else
                lo = mid
            end if
        elsif c < 0 then
            hi = mid
        else
            exit
        end if
        mid = floor( (lo + hi) / 2 )
    end while

    if mid then
        c = compare( entry, entry_list[mid] )
    else
        entry_list[1] = entry
        entry_last += 1
        return 1
    end if

    if c and entry_last > length( entry_list ) then
        entry_list &= blank_add
    end if

    if c then
        if c > 0 then
            -- needs to be added after current
            mid += 1
        end if
        if mid = entry_last then
            entry_list &= blank_add
        end if
        entry_list[mid+1..entry_last+1] = entry_list[mid..entry_last]
        entry_list[mid] = entry
        entry_last += 1
    end if
    return mid
end function

entry_list = {}
entry_list &= blank_add
entry_last = 0

? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )
? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )
? no_dup( "one" )
? no_dup( "two" )
? no_dup( "three" )

include misc.e
pretty_print( 1, entry_list[1..entry_last], {})

include get.e
abort(wait_key())

Matt Lewis
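
The approach Matt describes (keep the list sorted, binary-search it, and insert a new item at its sorted position) can also be sketched in a more compact form. This is an independent illustration of the technique under simple assumptions, not the posted routine: `entries` and `add_unique` are hypothetical names, and the list is grown by slicing and concatenation rather than by preallocated blocks, so each insertion still copies the sequence even though the search takes a logarithmic number of comparisons.

```euphoria
-- Sketch only: an independent take on the sorted-list idea.
sequence entries    -- kept in ascending order, no duplicates
entries = {}

-- Hypothetical helper: returns the index of item in entries,
-- inserting it at its sorted position first if it is not already there.
function add_unique(sequence item)
    integer lo, hi, mid, c
    lo = 1
    hi = length(entries)
    while lo <= hi do
        mid = floor((lo + hi) / 2)
        c = compare(item, entries[mid])
        if c = 0 then
            return mid      -- already present, nothing added
        elsif c < 0 then
            hi = mid - 1    -- continue in the lower half
        else
            lo = mid + 1    -- continue in the upper half
        end if
    end while
    -- not found: lo is now the position where item belongs
    entries = entries[1..lo-1] & {item} & entries[lo..length(entries)]
    return lo
end function

? add_unique("one")
? add_unique("two")
? add_unique("three")
? add_unique("one")     -- repeats are found, not added again
? add_unique("two")
? add_unique("three")
? length(entries)       -- 3
```

After these calls `entries` holds the three strings in sorted order: "one", "three", "two".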