OpenEuphoria: Forum: Re: PDF reader

Posted by jimcbrown (admin) Nov 24, 2013

EUWX said...

jimcbrown said...

EUWX said...

You can take a shortcut by using the Euphoria system command to invoke third-party software, take the result into Euphoria, do the extraction in Euphoria, and then reconvert using third-party software. One to two months is what you will need working single-handed or with one collaborator.

Err - by using pdftohtml, it's a few minutes, not one or two months. Single-handedly.

Anybody can use preexisting software to convert in "a few minutes", "single-handedly".

Agreed.

EUWX said...

When I talked about "one or two months", I was talking about programmatically ... to correct the mistakes,

You did not mention this in your original quote, reproduced below. If this is what you meant, then you should have said so.

EUWX said...

But I would agree - errors in the text of a PDF are often hidden by the font being used, and can be a real pain to fix by hand after the text is extracted. If you use OCR to pull text out of an embedded image, you more or less have to deal with the same issue. Dealing with this without human intervention is not an easy task. It is probably not something gwalters needs to do either.
