Re: Parsing
- Posted by Lucius L Hilley III <luciuslhilleyiii at JUNO.COM> Apr 07, 1997
I lost the original message, along with a couple of others. Someone asked about parsing and even mentioned <HTML>, so here is a thrown-together HTML parser. It strips out MOST <tags>, replaces linefeeds with spaces, and replaces <BR>, <HR>, </H????>, and </TITLE> with a linefeed.

----------Parses and displays file.htm---------
--Few comments involved
include wildcard.e  --used for changing some text to upper case

sequence buffer, tag
object line
integer handle
integer l, g  --l is lessthan, g is greaterthan
integer lf    --lf is linefeed
lf = 10

handle = open("file.htm", "r")
if handle = -1 then
    puts(1, "Can't open file.htm\n")
    abort(1)
end if

buffer = {}
while 1 do
    line = gets(handle)
    if atom(line) then
        exit  --end of file
    end if
    if line[length(line)] = lf then
        line[length(line)] = 32  --replace the linefeed with a space
    end if
    buffer = buffer & line
end while
close(handle)

l = find('<', buffer)
while l do
    --search for '>' only from the '<' onward, so a stray '>' earlier in the text is ignored
    g = find('>', buffer[l..length(buffer)])
    if g = 0 then
        exit  --unclosed tag, leave the rest of the text as it is
    end if
    g = g + l - 1
    tag = upper(buffer[l..g])
    if compare(tag, "<BR>") = 0 then
        buffer = buffer[1..l - 1] & lf & buffer[g + 1..length(buffer)]
    elsif compare(tag, "<HR>") = 0 then
        buffer = buffer[1..l - 1] & lf & buffer[g + 1..length(buffer)]
    elsif compare(tag, "</TITLE>") = 0 then
        buffer = buffer[1..l - 1] & lf & buffer[g + 1..length(buffer)]
    elsif match("</H", tag) = 1 then  --any closing tag that starts with </H
        buffer = buffer[1..l - 1] & lf & buffer[g + 1..length(buffer)]
    else
        buffer = buffer[1..l - 1] & buffer[g + 1..length(buffer)]
    end if
    l = find('<', buffer)
end while

puts(1, buffer)
------------------End file------------

--Lucius Lamar Hilley III
-- E-mail at luciuslhilleyiii at juno.com
-- I support transferring of files less than 60K.
-- I can Decode both UU and Base64 format.