Euphoria Ticket #831: http_get does not retrieve page content

Hello,

object r= http_get("http://docs.openstack.org/essex/openstack-compute/admin/content/creating-a-windows-image.html") 

returns the following header, which indicates non-empty content, but does not retrieve the corresponding content:

 HTTP/1.1 200 OK 
 server Apache/2.2 
 content-type text/html; charset=UTF-8 
 date Fri, 04 Jan 2013 06:43:58 GMT 
 accept-ranges bytes 
 connection close 
 set-cookie X-Mapping-jebomepa=59D88FA58BBDA5FE9C10CF182D6499F6; path=/ 
 set-cookie X-Mapping-jebomepa=59D88FA58BBDA5FE9C10CF182D6499F6; path=/ 
 last-modified Sat, 15 Dec 2012 04:08:48 GMT 
 content-length 59109 
 
r[1][1][1] = 'HTTP/1.1' 
r[1][1][2] = '200' 
r[1][1][3] = 'OK' 
r[1][2][1] = 'server' 
r[1][2][2] = 'Apache/2.2' 
r[1][3][1] = 'content-type' 
r[1][3][2] = 'text/html; charset=UTF-8' 
r[1][4][1] = 'date' 
r[1][4][2] = 'Fri, 04 Jan 2013 06:43:58 GMT' 
r[1][5][1] = 'accept-ranges' 
r[1][5][2] = 'bytes' 
r[1][6][1] = 'connection' 
r[1][6][2] = 'close' 
r[1][7][1] = 'set-cookie' 
r[1][7][2] = 'X-Mapping-jebomepa=59D88FA58BBDA5FE9C10CF182D6499F6; path=/' 
r[1][8][1] = 'set-cookie' 
r[1][8][2] = 'X-Mapping-jebomepa=59D88FA58BBDA5FE9C10CF182D6499F6; path=/' 
r[1][9][1] = 'last-modified' 
r[1][9][2] = 'Sat, 15 Dec 2012 04:08:48 GMT' 
r[1][10][1] = 'content-length' 
r[1][10][2] = '59109\r' 
 
r[2] = {} 

Regards

Jean-Marc

Details

Type: Bug Report
Severity: Major
Category: Interpreter
Assigned To: jimcbrown
Status: Fixed
Reported Release: v4.1.0 development, Revision D
Fixed in SVN #:
View VCS: none
Milestone: 4.0.6

1. Comment by jimcbrown Jan 04, 2013

See: hg:euphoria/rev/fd5e5231784c

changeset: 5899:fd5e5231784c
branch: 4.0
tag: tip
user: Jim C. Brown
date: Fri Jan 04 07:13:42 2013 -0500
files: include/std/net/http.e
description:

  • Trim out whitespace in content-length header
  • Fixes ticket:831

2. Comment by jimcbrown Jan 04, 2013

See: hg:euphoria/rev/7728398269c3

changeset: 5900:7728398269c3
parent: 5702:f269927c332b
user: Jim C. Brown
date: Fri Jan 04 07:10:05 2013 -0500
files: include/std/net/http.e
description:

  • Trim out whitespace in content-length header
  • Fixes ticket:831

3. Comment by jmduro Jan 04, 2013

OK Jim,

I replaced line 287

content_length = to_number(this_header[2]) 

with this:

content_length = to_number(trim(this_header[2])) 

It does the job. Thanks, Jean-Marc
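The one-line change above can be tried in isolation. The following is only a minimal sketch, not the code from http.e: the variable names raw_value and clean_value are made up for illustration, and the input is the content-length value exactly as it appears in the report, with its trailing '\r'.

```euphoria
include std/text.e     -- trim()
include std/convert.e  -- to_number()

-- Header value as stored in the report: digits followed by a stray '\r'.
-- (raw_value and clean_value are hypothetical names, not from http.e.)
sequence raw_value = "59109\r"

-- With its default arguments, trim() strips leading and trailing
-- spaces, tabs, '\r' and '\n'.
sequence clean_value = trim(raw_value)

-- Only digits remain, so the conversion has nothing left to trip over.
atom content_length = to_number(clean_value)
printf(1, "content_length = %d\n", content_length)
```

Note that this cleans the value after the fact; it does not explain how the '\r' got into the header value in the first place.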

4. Comment by useless_ Jan 04, 2013

It is best to apply trim(line,"\n\r ") to every header line. In the 16 years I have been fetching webpages, "\n\r" ended all lines and "\n\r\n\r" (or vice versa) ended the header. But I have seen every mix of '\n' and '\r', however "illegal". The one fairly constant feature (though not always present) is the two '\r' and two '\n' at the end of the header.

I took a shortcut using strtok: I parsed the entire response on "\r\n\r" (not the same as {10,13,10}) and then parsed the header on {10,13}. That gave me the parsed header, with the body starting at the first '<', although many page bodies are sent starting with ' ' or '\t' or some combination of more {10,13}s.

The point is that if you follow the spec exactly, any deviation by the server makes Eu look "broken".
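A rough sketch of the tolerant approach described in this comment, using the standard library's split() and trim() rather than strtok. The header text and the name header_lines are made up for illustration; the sample deliberately mixes "\r\n", a bare '\n' and a trailing '\r'.

```euphoria
include std/sequence.e  -- split()
include std/text.e      -- trim()

-- Hypothetical header block with inconsistent line endings.
sequence raw_header = "HTTP/1.1 200 OK\r\nserver: Apache/2.2\ncontent-length: 59109\r"

-- Split on '\n' only, so both "\r\n" and bare '\n' act as line breaks.
sequence header_lines = split(raw_header, '\n')

-- Strip any leftover '\n', '\r' or spaces from both ends of each line,
-- as suggested in the comment above.
for i = 1 to length(header_lines) do
    header_lines[i] = trim(header_lines[i], "\n\r ")
end for

for i = 1 to length(header_lines) do
    printf(1, "[%s]\n", {header_lines[i]})
end for
```

The trade-off is the one raised later in the thread: this accepts input that a strict parser would reject.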

useless

5. Comment by CoJaBo2 Jan 09, 2013

The fix checked in is completely wrong.

All it does is mask the actual bug a few lines up:
sequence raw_header = content[1..header_end_pos]
should be:
sequence raw_header = content[1..header_end_pos-1]

There is no need to trim extra whitespace in header values; servers that add any would not function in modern browsers. Indeed, doing so is a bad idea per the above.
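The off-by-one can be seen on a small made-up response. This sketch assumes that header_end_pos is the index where the blank-line terminator "\r\n\r\n" begins, as match() would return it; the thread does not show how http.e actually computes it, so treat that as an assumption.

```euphoria
-- Hypothetical response text; only the layout matters here.
sequence content = "HTTP/1.1 200 OK\r\ncontent-length: 5\r\n\r\nhello"

-- Assumption: header_end_pos points at the first '\r' of "\r\n\r\n".
integer header_end_pos = match("\r\n\r\n", content)

-- Including the character at header_end_pos keeps that '\r',
-- so the last header value ends with a carriage return.
sequence with_cr = content[1..header_end_pos]

-- Stopping one character earlier ends on the last header character.
sequence without_cr = content[1..header_end_pos-1]

printf(1, "last char, original slice:  %d\n", with_cr[$])
printf(1, "last char, corrected slice: %d\n", without_cr[$])
```

Under that assumption, the original slice ends in character code 13 ('\r'), which matches the '59109\r' seen in the report, while the corrected slice ends on the final digit of the header value.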

6. Comment by jimcbrown Jan 09, 2013

See: hg:euphoria/rev/10f7a86e5fd1

changeset: 5905:10f7a86e5fd1
tag: tip
parent: 5903:e146b072d53e
user: Jim C Brown
date: Wed Jan 09 04:44:25 2013 -0500
files: include/std/net/http.e
description:

  • Fixes ticket:831
  • Use better method suggested by CoJaBo

7. Comment by jimcbrown Jan 09, 2013

See: hg:euphoria/rev/570575b971cc

changeset: 5906:570575b971cc
branch: 4.0
tag: tip
parent: 5899:fd5e5231784c
user: Jim C Brown
date: Wed Jan 09 04:46:14 2013 -0500
files: include/std/net/http.e
description:

  • Fixes ticket:831
  • Use better method suggested by CoJaBo

8. Comment by SDPringle May 04, 2018

See: hg:euphoria/rev/5b8912a4a2cf

changeset: 6468:5b8912a4a2cf
branch: 4.0
user: Shawn David Pringle B.Sc. <shawn.pringle@gmail.com>
date: Fri May 04 10:00:59 2018 -0300
files: docs/release/4.0.6.txt
description:

  • updated the release notes. ticket 831, ticket 907, ticket 803, ticket 853, ticket 928, ticket 938, ticket 752, ticket 915, ticket 948, ticket 921
