Re: http_get does not retrieve page content

DerekParnell said...
Kat said...


My recent post on this topic in the ticket section was deleted. At issue is how Eu parses an HTTP header when retrieving web pages.

Not all HTTP servers adhere exactly to the "standard". By strictly following the "standard", Eu "breaks" when attempting to fetch a page with any deviation from what Eu accepts. This procedure is now broken.

It's not Eu's place to enforce the "standards", and it's unacceptable that Eu is voluntarily broken and intolerant of even slight deviations from the "standard". Eu now refuses to get those web pages when it is quite possible to get them gracefully.

This sounds very reasonable to me.

What exactly is it in the webpage data that's causing the Eu library routines to reject those webpages?


The decision, as I understand it, is which characters count as line terminators in the header, and in what order and quantity they appear. The RFC specifies CRLF ("\r\n"), but many servers emit a bare LF (or other variations), and the "standard" isn't always followed to the letter. This is a common situation online (and a frequent cause of browser wars).
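To make the distinction concrete, here is a minimal sketch (in Python, since the Eu library source isn't quoted here; the function names are my own, not Eu's) of strict CRLF-only splitting versus lenient splitting that tolerates bare LF or CR:

```python
def split_header_strict(raw):
    """Split a raw header block on CRLF only, as a strict reading
    of the HTTP RFC requires. A server sending bare LF breaks this."""
    return raw.split("\r\n")

def split_header_lenient(raw):
    """Normalize CRLF and bare CR to LF first, then split.
    Standard and non-standard terminators both parse correctly."""
    return raw.replace("\r\n", "\n").replace("\r", "\n").split("\n")

# A (non-standard) server that ends header lines with bare LF:
raw = "HTTP/1.1 200 OK\nContent-Type: text/html\n\n"
print(split_header_strict(raw))   # one unsplit blob -> header parsing fails
print(split_header_lenient(raw))  # each header line recovered cleanly
```

The strict version returns the whole block as a single element, which is exactly the failure mode described above: the page is fetched but the header never parses, so the content is effectively lost.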

It's my contention that trim() will handle all "standard" situations as well as the non-standard ones.

useless
