Stripping HTML Tags from a Text File
- Posted by tbohon May 28, 2009
I've been asked to look into a tool that will strip out all HTML tags, leaving only the plain text, which will then be loaded into a new system. Apparently the old system only produces HTML output, and the vendor isn't interested in helping us move off their software.
Does anyone have such a routine? I can use a regex in several other languages, but it leaves a lot of 'stuff' that has to be cleaned up manually, and there are going to be a lot of very large files to process.
Thanks in advance.
Tom
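
One possible approach, sketched in Python since the post doesn't name a target language: a regex tends to leave 'stuff' behind because it doesn't decode entities such as &amp;amp; and doesn't know that the contents of script and style elements aren't text. A real HTML parser handles both. The sketch below uses only the standard library's html.parser module. The names TextExtractor and strip_tags, and the choice of which tags count as line breaks, are assumptions made for illustration and would need adjusting to whatever the old system actually emits.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping script/style bodies (hypothetical helper)."""

    # Elements whose contents are never visible text.
    SKIP = {"script", "style"}
    # Elements treated as line breaks; this list is an assumption, extend as needed.
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        # convert_charrefs=True decodes entities like &amp; before handle_data sees them.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style element.
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_tags(markup):
    """Return the plain text of an HTML string, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>Hello &amp; <b>world</b></p>"
    print(strip_tags(sample))
```

For very large files, feed() can be called repeatedly with successive chunks read from disk instead of loading the whole file at once; the parser buffers incomplete tags between calls. The collected text still accumulates in memory in this sketch, so writing it out periodically would be a sensible change if the files are truly huge.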