1. Re[4]: exists of file on www-server?

> On 8 Apr 2002, at 21:53, Thomas Parslow (PatRat) wrote:

>> 
>> > This might be difficult, as some web servers don't return 404 when a
>> > page is not found, but instead create an HTML page (with graphics and
>> > ads) telling you so...
>> 
>> >     Martin
>> 
>> They should still send a 404 status code in the headers though...
>
> Or the redirect to the page with the not-found explained in English... or
> both.
> Or see the post I just sent on how to play it smart, no matter what they
> return. Been there, done that. :-)
>
> Kat

I've never seen a web server that doesn't return a status code of 404
with a not found page, not that I've looked very hard. Do you have an
example we could look at to see if there would be another way of
determining it?

Thomas Parslow (PatRat)
E-Mail/Jabber: tom at almostobsolete.net
ICQ: 26359483


2. Re: Re[4]: exists of file on www-server?

On 9 Apr 2002, at 10:21, Thomas Parslow (PatRat) wrote:

> 
> > On 8 Apr 2002, at 21:53, Thomas Parslow (PatRat) wrote:
> 
> >> 
> >> > This might be difficult, as some web servers don't return 404 when a
> >> > page is not found, but instead create an HTML page (with graphics and
> >> > ads) telling you so...
> >> 
> >> >     Martin
> >> 
> >> They should still send a 404 status code in the headers though...
> >
> > Or the redirect to the page with the not-found explained in English... or
> > both. Or see the post I just sent on how to play it smart, no matter what
> > they return. Been there, done that. :-)
> >
> > Kat
> 
> I've never seen a web server that doesn't return a status code of 404
> with a not found page, not that I've looked very hard. Do you have an
> example we could look at to see if there would be another way of
> determining it?

I remember that CompuServe and Xoom wouldn't return a 404 at one time; they
returned an HTML page in the frame below their nav bar to the average user.
But I didn't bother to remember the URLs. Xoom doesn't exist anymore, and
CompuServe might as well not exist. But, like I said, if you deliberately
send a bad URL as the first request, you will know what they return, and can
use equal() or a couple of match() calls against it to see whether the
subsequent fetches are bad. If the 404 page includes the bad URL, using
equal() on the whole page won't work.

In my clumsily written example attached to another email, sent a few
minutes ago, I use these tests, since I couldn't get the real code from
Ieulibnet:

  if length(webpage) < 300 then
      -- cannot use equal() because the bad url is in the page,
      -- and the bad url changes for each bad url,
      -- but it's a *small* page
      puts(1, "bad url\n")
      goodword = "no"
      exit
  end if
  if match("<TITLE>", webpage) then
      if match("Service Temporarily Unavailable", webpage) then
          -- their blanket "we are busy, come back later";
          -- it may not be a 404, so just try again
          mineword = "BAD"
          exit
      end if
      goodword = ""
      exit
  end if
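
The probe idea described above (fetch a deliberately bad URL first, then
compare later fetches against what came back) could be sketched roughly as
follows. This is only an illustration, assuming the pages have already been
fetched into sequences by whatever HTTP library is in use. The names
looks_like_not_found, probe_page and marker, and the sample page text, are
made up for this sketch and do not come from any library; only equal(),
match() and length() are Euphoria built-ins.

```euphoria
-- Hypothetical helper (not from any library): decide whether a fetched
-- page is the server's "not found" page by comparing it with probe_page,
-- the page the same server returned for a deliberately bad URL.
-- marker is a phrase picked by hand from the probe page (an assumption).
function looks_like_not_found(sequence page, sequence probe_page, sequence marker)
    -- identical pages: only works if the bad URL is not echoed in the page
    if equal(page, probe_page) then
        return 1
    end if
    -- otherwise look for the marker phrase, but only trust it if the
    -- probe page itself contains it; the length() check avoids calling
    -- match() with an empty first argument
    if length(marker) > 0 and match(marker, probe_page) and match(marker, page) then
        return 1
    end if
    return 0
end function

-- made-up sample pages, standing in for real fetches
sequence probe, bad, good
probe = "<TITLE>Oops</TITLE> Sorry, /nosuch-1.html was not found"
bad   = "<TITLE>Oops</TITLE> Sorry, /other.html was not found"
good  = "<TITLE>Welcome</TITLE> Here is the file you asked for"

? looks_like_not_found(bad, probe, "was not found")   -- prints 1
? looks_like_not_found(good, probe, "was not found")  -- prints 0
```

The marker check is what covers the case where the not-found page echoes
the bad URL, so that equal() on the whole page can never succeed.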

I'd prefer people used your asynchttp or tcp4u or some other version of 
srvsckip.ew. 
:-)

Kat

