1. .doc file to .txt ?
- Posted by "Kat" <gertie at visionsix.com> Aug 04, 2005
- 494 views
Does anyone know how Eu can transform a .doc file (like http://www.nwo.usace.army.mil/html/od-tl/pn/200380625t.doc) into a plain ASCII .txt file? Will different .doc versions require different decoders, and will the proper version be part of the .doc file? My win95b box won't read some .doc files that WinXP will read, but I prefer them to be in .txt anyhow, for various reasons.

Kat
2. Re: .doc file to .txt ?
- Posted by Ward Turner <captaincorc at isp.com> Aug 05, 2005
- 508 views
Forgive me if this is a very stupid way to do it. I know it's not clean and pretty, but it will leave you with plain text, some of which will literally be garbage, but at least you can open it in notepad and just delete the stuff that's garbage, leaving the body of the document wrappable and readable.

    atom char
    integer fn
    integer of
    sequence buffer

    fn = open("c:\\my download files\\200380625t.doc","rb")
    char = getc(fn)
    of = open("c:\\luxor options\\doc.txt","w")
    buffer = {}

    while char > -1 do -- -1 is EOF
        if char > 31 and char < 127 then
            buffer = append(buffer,char)
        elsif char = 10 or char = 13 then
            buffer = append(buffer,char)
            if char = 13 then
                buffer = append(buffer,'\n')
            end if
        end if
        char = getc(fn)
    end while
    puts(of,buffer)
    close(fn)
    close(of)

I know you probably will find this really dumb, but I didn't want you to think nobody was looking :)

Ward
3. Re: .doc file to .txt ?
- Posted by DB James <larch at adelphia.net> Aug 05, 2005
- 524 views
Ward Turner wrote:
> Forgive me if this is a very stupid way to do it. I know it's not clean and
> pretty, but it will leave you with plain text, some of which will literally be
> garbage, but at least you can open it in notepad and just delete the stuff
> that's garbage, leaving the body of the document wrappable and readable.
>
> atom char
> integer fn
> integer of
> sequence buffer
>
> fn = open("c:\\my download files\\200380625t.doc","rb")
> char = getc(fn)
> of = open("c:\\luxor options\\doc.txt","w")
> buffer = {}
>
> while char > -1 do -- -1 is EOF
>     if char > 31 and char < 127 then
>         buffer = append(buffer,char)
>     elsif char = 10 or char = 13 then
>         buffer = append(buffer,char)
>         if char = 13 then
>             buffer = append(buffer,'\n')
>         end if
>     end if
>     char = getc(fn)
> end while
> puts(of,buffer)
> close(fn)
> close(of)
>
> I know you probably will find this really dumb, but I didn't want you to think
> nobody was looking :)
>
> Ward

Hi Ward,

I tried this and it worked pretty well. I'm sure Kat can tweak your code to get a good "returns" response, something like ignoring 10's and substituting '\n's for 13's or something. But basically it does the trick. Nice of you to take the time amid all the sound and fury of Linux vs. M$, etc.

Now, if you could just figure a way to extract the maps too..:^D

--Quark
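
The "returns" tweak mentioned above (ignore 10's, substitute '\n' for 13's) might look roughly like the sketch below. It is only an illustration: normalize_text is a made-up helper name, and it assumes the bytes of the .doc file have already been collected into a sequence.

```euphoria
-- Sketch only: normalize_text() is a hypothetical helper, not a library routine.
-- Keeps printable ASCII, turns each CR (13) into '\n', and drops everything
-- else, including LF (10).
function normalize_text(sequence raw)
    sequence out
    integer c
    out = {}
    for i = 1 to length(raw) do
        c = raw[i]
        if c = 13 then
            out = append(out, '\n')   -- carriage return becomes a newline
        elsif c > 31 and c < 127 then
            out = append(out, c)      -- printable ASCII is kept as-is
        end if                        -- anything else (10 included) is skipped
    end for
    return out
end function
```

In Ward's loop this would amount to collecting every byte into buffer and calling puts(of, normalize_text(buffer)) at the end. Writing through a file opened with "w" lets Euphoria translate each '\n' to the platform's line ending.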
4. Re: .doc file to .txt ?
- Posted by ags <eu at 531pi.co.nz> Aug 06, 2005
- 501 views
DB James wrote:
> I tried this and it worked pretty well. I'm sure Kat can tweak your code to
> get a good "returns" response, something like ignoring 10's and substituting
> '\n's for 13's or something. But basically it does the trick. Nice of you to
> take the time amid all the sound and fury of Linux vs. M$, etc.
>
> Now, if you could just figure a way to extract the maps too..:^D

Speaking of Linux vs M$, have you looked at antiword?

http://www.winfield.demon.nl/

I'm not sure if you want to be dependent on an external program (nor even if antiword will do what you want, re images), but it is available on many, many platforms, even 16-bit MS-DOS.

Gary
5. Re: .doc file to .txt ?
- Posted by "Kat" <gertie at visionsix.com> Aug 06, 2005
- 498 views
On 6 Aug 2005, at 6:24, ags wrote:
> posted by: ags <eu at 531pi.co.nz>
>
> DB James wrote:
> >
> > I tried this and it worked pretty well. I'm sure Kat can tweak your code to
> > get a good "returns" response, something like ignoring 10's and substituting
> > '\n's for 13's or something. But basically it does the trick. Nice of you to
> > take the time amid all the sound and fury of Linux vs. M$, etc.
> >
> > Now, if you could just figure a way to extract the maps too..:^D
>
> Speaking of Linux vs M$, have you looked at antiword?
>
> http://www.winfield.demon.nl/
>
> I'm not sure if you want to be dependent on an external program (nor even if
> antiword will do what you want, re images), but it is available on many, many
> platforms, even 16-bit MS-DOS.

Kool:

(2) save the text version of the Word document in Latin2, in a file
    antiword -m cp852.txt filename.doc > filename.txt

(1) save the PostScript version of the Word document in Latin1, in a file
    generate PostScript for printing on European A4 size paper
    antiword -p a4 -m 8859-1.txt filename.doc > filename.ps

(2) save the PostScript version of the Word document in Latin2, in a file
    generate PostScript for printing on American letter size paper
    antiword -p letter -m 8859-2.txt filename.doc > filename.ps

Thanks, ags!

Kat
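
For anyone wanting to drive the first of those commands from Eu, a minimal sketch follows. It assumes antiword is installed and on the PATH, and that the shell handles the ">" redirection. antiword_command and doc_to_txt are made-up names for illustration. The cp852.txt mapping file is the one from the usage example above.

```euphoria
-- Sketch only: antiword_command() and doc_to_txt() are hypothetical helpers.
-- Assumes antiword is installed and reachable through the PATH.

-- Build the command line; the file names are quoted so that paths
-- containing spaces survive the trip through the shell.
function antiword_command(sequence doc_name, sequence txt_name)
    return "antiword -m cp852.txt \"" & doc_name & "\" > \"" & txt_name & "\""
end function

-- Hand the command to the shell. Mode 2 tells system() not to clear
-- the screen or restore the graphics mode afterwards.
procedure doc_to_txt(sequence doc_name, sequence txt_name)
    system(antiword_command(doc_name, txt_name), 2)
end procedure
```

A call such as doc_to_txt("filename.doc", "filename.txt") should then leave the converted text in filename.txt. system() does not report whether antiword succeeded, so checking that the output file exists and is non-empty is left to the caller.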
6. Re: .doc file to .txt ?
- Posted by "Kat" <gertie at visionsix.com> Aug 06, 2005
- 482 views
On 4 Aug 2005, at 17:13, Ward Turner wrote:
> posted by: Ward Turner <captaincorc at isp.com>
>
> Forgive me if this is a very stupid way to do it. I know it's not clean and
> pretty, but it will leave you with plain text, some of which will literally be
> garbage, but at least you can open it in notepad and just delete the stuff
> that's garbage, leaving the body of the document wrappable and readable.
>
> atom char
> integer fn
> integer of
> sequence buffer
>
> fn = open("c:\\my download files\\200380625t.doc","rb")
> char = getc(fn)
> of = open("c:\\luxor options\\doc.txt","w")
> buffer = {}
>
> while char > -1 do -- -1 is EOF
>     if char > 31 and char < 127 then
>         buffer = append(buffer,char)
>     elsif char = 10 or char = 13 then
>         buffer = append(buffer,char)
>         if char = 13 then
>             buffer = append(buffer,'\n')
>         end if
>     end if
>     char = getc(fn)
> end while
> puts(of,buffer)
> close(fn)
> close(of)
>
> I know you probably will find this really dumb, but I didn't want you to think
> nobody was looking :)

Thanks, I wondered if there was some undocumented winapi call in someone's library to do this all pretty, with pics. Well,, hmm,, yes,, ok.

Kat
7. Re: .doc file to .txt ?
- Posted by ags <eu at 531pi.co.nz> Aug 07, 2005
- 510 views
Kat wrote:
> > http://www.winfield.demon.nl/
> >
> > I'm not sure if you want to be dependent on an external program (nor even if
> > antiword will do what you want, re images), but it is available on many, many
> > platforms, even 16-bit MS-DOS.
>
> Kool:
>
> (2) save the text version of the Word document in Latin2, in a file
>     antiword -m cp852.txt filename.doc > filename.txt
>
> (1) save the PostScript version of the Word document in Latin1, in a file
>     generate PostScript for printing on European A4 size paper
>     antiword -p a4 -m 8859-1.txt filename.doc > filename.ps
>
> (2) save the PostScript version of the Word document in Latin2, in a file
>     generate PostScript for printing on American letter size paper
>     antiword -p letter -m 8859-2.txt filename.doc > filename.ps
>
> Thanks, ags!

De nada, nichts zu danken.

Openwebmail uses antiword to give fast previews of Word docs, which is what impressed me about it. I don't think it handles tables too well, though.

Gary