1. RE: Getting web pages via Linux [and etc]

jbrown105 at speedymail.org wrote:
> I know that a lot of eu programs can access the net, and that
> i can use linux progs (like lynx) to emulate access to the net,
> but is there any way to access stuff such as web pages via
> an Eu lib? if not, any way to write one or wrap a C lib?

Sure, it just takes a little bit of sockets code. 
You could get a start by downloading my EuMail code, 
and modifying it to use the correct protocol and port.
I wrote a demo which could download and save web pages,
but I think I must have deleted it sometime back.
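The approach Irv describes (opening a socket and speaking HTTP by hand) can be sketched in Python. This is a hypothetical illustration, not the EuMail code; the function names are my own:

```python
# Minimal sketch of fetching a web page over a raw TCP socket.
import socket

def build_request(host, path="/"):
    # A minimal HTTP/1.0 request; real clients send more headers.
    return ("GET %s HTTP/1.0\r\n"
            "Host: %s\r\n"
            "\r\n" % (path, host)).encode("ascii")

def fetch(host, path="/", port=80):
    # Connect, send the request, then read until the server closes
    # the connection (HTTP/1.0 servers close after the response).
    with socket.create_connection((host, port), timeout=30) as s:
        s.sendall(build_request(host, path))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

if __name__ == "__main__":
    print(build_request("example.com").decode("ascii"))
```

The response arrives with status line and headers prepended, so a real client would still need to split those off before saving the body.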

See the downloads page at http://take.maxleft.com

Regards,
Irv


2. RE: Getting web pages via Linux [and etc]

> jbrown105 at speedymail.org wrote:
> > I know that a lot of eu programs can access the net, and that
> > i can use linux progs (like lynx) to emulate access to the net,
> > but is there any way to access stuff such as web pages via
> > an Eu lib? if not, any way to write one or wrap a C lib?

A little more searching turned up the sockets code to 
download a web page. I've forwarded it to jbrown.
Anyone else who wants it, please let me know.

BTW: it can download most webpages, except for RDS. 
Instead, I get the main addr.com page.
It may have something to do with RDS being on a 
virtual server.

Irv


3. RE: Getting web pages via Linux [and etc]

irv at take.maxleft.com wrote:
> BTW: it can download most webpages, except for RDS. 
> Instead, I get the main addr.com page.
> It may have something to do with RDS being on a 
> virtual server.

Hi Irv,

I was having the same problem with my euTCP4u library.
I haven't looked into it much yet, but I suspect it's because euTCP4u 
only handles the HTTP/1.0 protocol and these sites use the HTTP/1.1
protocol.
I'm probably completely wrong :( ... but that's what I was going to
look at.
 
Regards,

Ray Smith
http://rays-web.com


4. RE: Getting web pages via Linux [and etc]


On 3 Jun 2002, at 4:54, Ray Smith wrote:

> irv at take.maxleft.com wrote:
> > BTW: it can download most webpages, except for RDS. 
> > Instead, I get the main addr.com page.
> > It may have something to do with RDS being on a 
> > virtual server.
> 
> Hi Irv,
> 
> I was having the same problem with my euTCP4u library.
> I haven't looked into it much yet, but I suspect it's because euTCP4u 
> only handles the HTTP/1.0 protocol and these sites use the HTTP/1.1
> protocol.

I suspect if you use myHttpGetFileEx and specify the rest of the header, you 
can get the site. I used the attached file to get the RDS site. Probably some 
errors in it; Ray can fix them.

Also, in tcp4u.ew there is an error on line 505: connect_socket 
should be an atom.

Kat


[Attached file: get_file2.exw]

--
-- Downloads a file from the web
-- Ray Smith
-- 22/8/2000
--

include tcp4u.ew
with trace

integer ret, server_port, sock
sequence proxy, TheWebPage
sequence remote_file
sequence local_file
sequence sock_receive, server_ip
atom connected, writefile

-- setup some defaults
TheWebPage = ""
server_port = 80

-- Resolved addr.com to 209.249.147.252
-- Resolved www.rapideuphoria.com to 209.249.147.13
server_ip = "209.249.147.13"
proxy = ""
remote_file = "http://www.rapideuphoria.com/"
local_file = "rds.txt"

-----------------------------------------------------------------------------------

global function ServerNeedsAttention()
object ret

ret = tcp4u_is_data_avail(sock)
return ret

end function -- ServerNeedsAttention()

-----------------------------------------------------------------------------------

global function ReadServer()
sequence databuffer
databuffer = ""
sock_receive = ""
if tcp4u_is_data_avail(sock) then
   sock_receive = tcp4u_receive(sock,1000,0)
   if sock_receive[1] > 0 then
     databuffer = databuffer & sock_receive[2][1..sock_receive[1]]
   end if
end if

return databuffer

end function -- readserver

-----------------------------------------------------------------------------------

global procedure SendToServer(sequence data)
atom ret

  ret = tcp4u_send(sock, data, length(data))

end procedure -- SendToServer(sequence data)

-----------------------------------------------------------------------------------

global procedure make_connection()
sequence sock_connect

if tcp4u_init() != TCP4U_SUCCESS then
   wait_abort("tcp4u_init error")
   connected = 0
end if

sock_connect = tcp4u_connect(server_ip, NULL, server_port)
if sock_connect[tcp4u_ret] != TCP4U_SUCCESS then
    printf(1, "tcp4u_connect error '%s'\n",
           {tcp4u_error_string(sock_connect[tcp4u_ret])} )
    puts(1, "\naborting on any keypress")
    connected = 0
else
    connected = 1
    sock = sock_connect[2]
end if

end procedure -- make_connection()

-----------------------------------------------------------------------------------



-- show a little intro
puts(1, "Download Web File Demo\n")

http4u_set_timeout(60)

-- get the file
printf(1, "...downloading file \nfrom '%s', \nto '%s' \nusing proxy '%s'\n",
		{remote_file, local_file, proxy} )
--ret = http4u_get_file(remote_file, proxy, local_file)

-- knock on the door
make_connection()

-- tell it what we want
SendToServer("GET / HTTP/1.0\r\n"&
             "Referer: http://www.rapideuphoria.com/\r\n"&
             "Accept: */*\r\n"&
             "Accept-Language: en-us\r\n"&
           --  "Accept-Encoding: gzip, deflate\r\n"&
             "User-Agent: <sigh>\r\n"&
             "Host: rapideuphoria.com\r\n"&
             "Forwarded: rapideuphoria.com\r\n"&
             "\r\n")


-- give the server time to think about it..
while not ServerNeedsAttention() do sleep(2) end while


-- ok, data there, get it!
while ServerNeedsAttention() do
   TheWebPage &= ReadServer()
   sleep(1)
end while


--close tcp4u
ret = tcp4u_cleanup()
if ret != TCP4U_SUCCESS then
   printf(1, "Error on tcp4u_cleanup '%s'\n", {tcp4u_error_string(ret)} )
end if

writefile = open(local_file,"w")
puts(writefile,TheWebPage)
close(writefile)

-- finished
puts(1, "\n\npress any key to abort.")
ret = wait_key()




5. RE: Getting web pages via Linux [and etc]

Ray Smith wrote:
> 
> irv at take.maxleft.com wrote:
> > BTW: it can download most webpages, except for RDS. 
> > Instead, I get the main addr.com page.
> > It may have something to do with RDS being on a 
> > virtual server.
> 
> Hi Irv,
> 
> I was having the same problem with my euTCP4u library.
> I haven't looked into it much yet, but I suspect it's because euTCP4u 
> only handles the HTTP/1.0 protocol and these sites use the HTTP/1.1
> protocol.
> I'm probably completely wrong :( ... but that's what I was going to
> look at.

Yep; It's due to the 'virtual hosting'. See 
http://www.webcom.com/glossary/http1.1.shtml for a clear 
explanation.
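The linked explanation boils down to this: a name-based virtual host serves several sites from a single IP address, and the server picks the site by reading the Host header from the request. A toy Python simulation of that dispatch (the site names and page contents here are invented for illustration):

```python
# Toy model of name-based virtual hosting: one server, many sites,
# selected by the Host request header.
DEFAULT_SITE = "addr.com"

VIRTUAL_HOSTS = {
    "www.rapideuphoria.com": "<html>RDS page</html>",
    "addr.com": "<html>addr.com main page</html>",
}

def route(request_headers):
    # Without a Host header the server cannot tell which site was
    # meant, so it falls back to the default site -- exactly the
    # symptom described above: asking for RDS but getting the main
    # addr.com page.
    host = request_headers.get("Host", DEFAULT_SITE)
    return VIRTUAL_HOSTS.get(host, VIRTUAL_HOSTS[DEFAULT_SITE])
```

With a Host header the right site comes back; without one, the default does.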

Irv


6. RE: Getting web pages via Linux [and etc]

The solution is simple:

I changed the 'GET' request to:

 write(SOCKET, sprintf("GET /%s HTTP/1.1\nHost: %s\n\n",
                       {filename,hostname}))

It now works fine with virtual domains.
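The fix above can be sketched as a request builder in Python. Two hedged additions beyond the version shown: the HTTP spec calls for CRLF ("\r\n") line endings, though many servers tolerate bare "\n" (which is why the version above works), and HTTP/1.1 connections are persistent by default, so a "Connection: close" header lets a read-until-EOF loop terminate:

```python
# Build an HTTP/1.1 GET request with the Host header that
# name-based virtual hosts require.
def build_http11_request(hostname, filename=""):
    return ("GET /%s HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Connection: close\r\n"
            "\r\n" % (filename, hostname)).encode("ascii")
```

The Host line is the part that makes virtual domains resolve correctly; the rest is plumbing.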

Irv


7. RE: Getting web pages via Linux [and etc]

On 3 Jun 2002, at 13:43, irv at take.maxleft.com wrote:

> 
> The solution is simple:
> 
> I changed the 'GET' request to:
> 
>  write(SOCKET, sprintf("GET /%s HTTP/1.1\nHost: %s\n\n",
>                        {filename,hostname}))
> 
> It now works fine with virtual domains.

Yes, but that doesn't work in the http functions in tcp4u.

Kat

