Re: needed: webpage getter!

Kat wrote:

> On 25 Jun 2004, at 13:32, Juergen Luethje wrote:
>
>> Kat wrote:
>>
>>> Greets again all, webpages bugging me again. The url in question is
>>>
>>> wwws.sheetmusicplus.com
>>>
>>> and almost every page after that. The only page getter that i have tried
>>> (written in Eu) is Webshepard, and it pesters me for cookies 5 times for
>>> each page.
>
> Actually, that should have said "The only page getter that gets the page
> is..."
>
>> I'd like to have a look at 'Webshepard'. Where can I get it?
>
> It's in the user archives (http://www.rapideuphoria.com/webshep.zip).

Thanks! I didn't find it when searching for 'Webshepard' on the User
Contributions page. Now I know that I should have looked for
'Web Shepherd'. :-)

> Beware of 3 things with it:
>
> 1) it apparently saves the file on C: before moving it to the file you
> specify, so there's a lot of fileswapping going on after the file is
> retrieved, which is significant with 50meg files.
>
> 2) the Webshepherd window cannot be moved or minimized after you click
> on "Download"
>
> 3) If you have previously allowed cookies in IE (or netscrape, or etc), but
> now use a proxy to stop or fake cookies, WebShepherd will accept cookies
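
On point 1, a getter can avoid the temporary copy by streaming the response
straight into the target file. A minimal sketch in Python (not Eu, and not a
description of how Web Shepherd works internally); the name download_to and
the 64 KB chunk size are hypothetical choices for illustration:

```python
import os
import shutil
import urllib.request


def download_to(url, dest_path, chunk_size=64 * 1024):
    """Stream `url` directly into `dest_path` and return the bytes written.

    The response is copied chunk by chunk, so no second copy of the file
    is created elsewhere and then moved. `download_to` and `chunk_size`
    are illustrative names, not part of any existing getter.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return os.path.getsize(dest_path)
```

With a 50meg file this writes the data once, to the place you asked for,
instead of writing it to C: and then copying it again.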

Thanks for the hints!
But when I tell Web Shepherd to get the page
"http://www.rapideuphoria.com/index.html",
it just says "A parameter in the canonicalize URL function is bad."

"spider" by Daniel Kluss and "eulibcURL" by Ray Smith look promising in
general, but unfortunately can't get the page that you mentioned either.

>>> In all the other webgetters, no cookie is bothered with, but the urls on the
>>> pages are munged to fit their search engine, and are essentially useless.
>>> The urls are ok in IE5.0, and much much shorter too. I am looking witless, can
>>> anyone tell me why Eu apps cannot get this page properly? Internet Explorer
>>> gets it fine, with and without the http proxy. Tcp4u won't get it with the
>>> tcp4u_ calls, nor the http4u_ calls.
>>
>> I just tried to get that page, using the demo program that ships with
>> 'http.zip' by PatRat --> No reaction at all. :-(
>>
>> WinHTTrack <http://www.httrack.com> says:
>> "File has moved from wwws.sheetmusicplus.com/
>>  to http://www.sheetmusicplus.com/?r=emdb2"
>>
>> This is consistent with the behaviour of IE 5.5.
>
> Yes, if cookies are refused, the domain makes up new urls. I can't find a way
> yet that will work on all webpages. And this dialup is so slow, fetching an
> email can take 5 minutes, and i have had webpages take 10 minutes to
> open. Prefetching with Eu to cache locally is almost a necessity.
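
Given the behaviour described above (the "File has moved" response, and new
urls being made up when cookies are refused), a getter presumably has to keep
the cookies it is handed and follow the redirect the way IE does. A minimal
sketch in Python rather than Eu, standard library only; make_opener and fetch
are hypothetical names, and this is untested against sheetmusicplus.com:

```python
import http.cookiejar
import urllib.request


def make_opener():
    """Return (opener, jar): an opener that stores and resends cookies.

    Openers built by urllib follow HTTP redirects by default; adding the
    cookie processor makes cookies set along the way get sent back on the
    follow-up requests.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch(opener, url, timeout=60):
    """Fetch `url`; return (final_url, body_bytes) after any redirects."""
    with opener.open(url, timeout=timeout) as response:
        return response.geturl(), response.read()


# Usage (the "http://" scheme is an assumption; the post gives only the host):
#   opener, jar = make_opener()
#   final_url, body = fetch(opener, "http://wwws.sheetmusicplus.com/")
#   print(final_url, len(body), [cookie.name for cookie in jar])
```

Comparing final_url with the url that was requested shows whether the server
moved the request, and the jar shows which cookies it insisted on.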

If you want, I can get the page for you using WinHTTrack (or at least try
to), and then send it to you as a ZIP file by mail, or offer the ZIP
file for download on my website.

Regards,
   Juergen
