Re: Need Code for This [Cklester and/or Pete]


> 
> 
> posted by: don cole <doncole at pacbell.net>
> 
> Kat wrote:
> > 
> > 
> > <snip>
> > 
> > That is a task getxml() was written for around 1999 (I wrote it in
> > Turbo Pascal way before that, in the early 1990s), in the first strtok.e.
> > You can parse the page by <table> and the data lines on the webpage
> > by <tr>, then each <td> inside is a data item for that <tr>. You can
> > ask for the <td> #1, or #2, etc in each <tr>. 
> > 
> > I still can't help CK tho.
> > 
> > Kat
> > 
> > 
>   I don't do it that way. But I'll look into parsing with strtok.e. I
> have that in my include files, so I must be using it in something.
> 
>   Does the WebMaster or Mistress always use the same <table>, <tr>,
>   <td> scheme? And do all WebMasters and Mistresses use the same
>   scheme?

No, you can count on webpages being different from one domain to 
the next. There is no convention about using <TR> or <Tr> or 
<tr>. Even "<font face" can be "<FONT ecaf". No table on any 
webpage is like any table on another domain's pages. And 
some domains like to add new "features" and delete others 
occasionally. Comments will be changed, and so will "class" names. 
Some sites change the web address of the pictures on a page 
every 5 minutes (so people cannot link to them). Advertising is 
inserted into or deleted from the html, and the domain of the 
adserver will be changed. Javascript or style sheets will be edited 
as authors find that whatever they write won't work the same on all 
browsers. You may notice that on some pages html tags are nested 
inside other html tags, while another site will use separate tags 
for the same thing.

It's a good idea to verify periodically that the page format you 
wrote code for hasn't changed. Put in some code so that if the 
page's format has changed, or a data item in a table is not what 
you expect, it alerts you. Perhaps you search for 
"<!-- begin data -->", and someone changes that to 
"<!-- Data Goes Below -->". It's also a good idea to make sure your 
automatic "browser" doesn't register as a spammer or as a feeble 
denial of service attack by accessing the site too often.
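
The advice above can be sketched roughly as follows. This is Python 
rather than Euphoria, and it is not how getxml() or strtok.e work; it 
only illustrates the three ideas: match tags regardless of case, raise 
an alert when an expected marker is missing, and space out requests. 
The names TableRows, extract_rows and Throttle are made up for this 
example, and the marker string is simply the one mentioned above.

```python
import time
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def _close_cell(self):
        # Finish the open cell, if any, and add it to the current row.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR>, <Tr> and <tr> all match.
        if tag == "tr":
            self._close_cell()
            self._row = []
        elif tag == "td" and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def extract_rows(page, marker="<!-- begin data -->"):
    """Return the table rows, or raise if the expected marker is gone."""
    if marker.lower() not in page.lower():
        # The "alert": fail loudly instead of returning wrong data.
        raise ValueError(f"page format changed: marker {marker!r} not found")
    parser = TableRows()
    parser.feed(page)
    parser.close()
    return parser.rows


class Throttle:
    """Enforce a minimum number of seconds between requests."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last = None  # time of the previous request, or None

    def wait(self):
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
```

You would call wait() on a Throttle before each fetch of the page, 
then pass the page text to extract_rows() and treat a ValueError as 
the signal to go look at what the site changed. Checking the contents 
of each cell (is it a number, is it a date) would be a further step 
that depends on the particular site.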

Kat
