Re: Stripping HTML Tags from a Text File
- Posted by euler May 28, 2009
- 1418 views
This is very easy to handle with a regex-capable text editor, or you can use a dedicated tool for it. Some are freely available and others are very cheap: V-Grep comes to mind for the former and RegexBuddy for the latter.
Both can use a regex like
<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>
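with case-insensitive matching turned on (the pattern is written with A-Z). The "cleaned" text ends up in the second capturing group, so replace each match with that backreference (\2 or $2, depending on the tool). Nested tags are only peeled off one level per pass, so you may need to run this several times, until nothing more matches. Note that it only catches tags that come in open/close pairs.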
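If you would rather script it than use an editor, here is a rough Python sketch of the same idea. It is only an illustration: the function name strip_paired_tags is made up for this example, and the IGNORECASE and DOTALL flags are my assumption about how you would want the pattern to behave (lowercase tags, and elements that span several lines).

```python
import re

# Same pattern as above. IGNORECASE because it is written with A-Z,
# DOTALL so that "." also matches line breaks inside an element.
# Both flags are assumptions for this sketch.
TAG_PAIR = re.compile(r"<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>",
                      re.IGNORECASE | re.DOTALL)


def strip_paired_tags(text: str) -> str:
    """Replace each matched element with its inner text (group 2),
    repeating until a pass makes no substitutions."""
    while True:
        new_text, count = TAG_PAIR.subn(r"\2", text)
        if count == 0:
            return new_text
        text = new_text


if __name__ == "__main__":
    print(strip_paired_tags("<p>Hello <b>world</b></p>"))
```

The loop is the "run this several times" part: each pass removes the outermost pair it finds, and the next pass picks up whatever was nested inside. Tags without a closing partner, such as a lone <br>, are left alone by this pattern.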
HTH