OpenEuphoria: Forum: Web scraping

1. Web scraping

Posted by Craig Welch <craig at singmail.com> Oct 17, 2005
560 views

Has anyone written a web scraper in any form?


2. Re: Web scraping

Posted by Michael J. Sabal <m_sabal at yahoo.com> Oct 17, 2005
548 views

Craig Welch wrote:
> Has anyone written a web scraper in any form?

I'm not sure what you mean by a "web scraper", but I have a primitive version
of a library I started for use with my natural language project.  

http://cvs.sourceforge.net/viewcvs.py/teknik/components-english/webtools.eu

The project is GPL, but the routines in this library are so basic I don't 
think it applies.  I'm sure others in the community have done things that
are a little closer to what you're looking for, but hopefully this will
give you a push in the right direction.  

Oh, did I mention it was written for Linux?

Michael J. Sabal

Project page:
https://sourceforge.net/projects/teknik


3. Re: Web scraping

Posted by Craig Welch <craig at singmail.com> Oct 18, 2005
564 views

Michael J. Sabal wrote:

> I'm not sure what you mean by a "web scraper", but I have a primitive version
> of a library I started for use with my natural language project.
>
> http://cvs.sourceforge.net/viewcvs.py/teknik/components-english/webtools.eu

Thanks, those routines are very useful to me.

From Wikipedia: "Screen scraping is the act of capturing data from a
system or program by capturing and interpreting the contents of some
display that is not actually intended for data transport or inspection
by programs".

From the website of a commercial web-scraping product: "You might use
our technology and services to:

    * Extract product information from an e-commerce web site, then download it to a spreadsheet
    * Build a meta-search engine that queries multiple search engines simultaneously in real-time
    * Generate an RSS feed from a company intranet announcements page
    * Integrate multiple web-based applications into a single interface"
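
To make the first use case above concrete, here is a minimal sketch of pulling product names and prices out of a page and writing them as CSV for a spreadsheet. It is in Python rather than Euphoria and uses only the standard library. The HTML fragment, the class names ("product", "name", "price") and the names ProductParser and scrape_to_csv are all assumptions made up for illustration; a real scraper would fetch the page over HTTP and match whatever markup the target site actually uses.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this instead.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span> <span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span> <span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect [name, price] rows from spans with class "name" or "price"."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished [name, price] rows
        self._field = None   # field held by the <span> being read, if any
        self._current = {}   # fields gathered for the current product

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "product":
            self._current = {}
        elif tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside a name or price span.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append([self._current.get("name", ""),
                              self._current.get("price", "")])
            self._current = {}


def scrape_to_csv(html):
    """Return the products found in html as CSV text with a header row."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "price"])
    writer.writerows(parser.rows)
    return out.getvalue()


if __name__ == "__main__":
    print(scrape_to_csv(SAMPLE_HTML), end="")
```

The parsing is deliberately naive: it relies on the page keeping the same markup, which is the usual weakness of scraping a display that was never meant for programs to read.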


