Howto create a conversion from PDF to any other format
The PDF format is not meant for editing or for simple text extraction. For some PDF files it can be impossible to build a word/line/column representation. Despite these limitations, most PDF files are "sane" enough that we can manage to extract the text and to build words from letters, lines from words and columns from lines.
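The bottom-up grouping described above can be sketched roughly as follows. This is only an illustration in Python, not the pdfedit implementation: the `Glyph` record, the function names and the tolerance values `y_tol` and `gap_tol` are all assumptions made for the example, and column detection is left out.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical record for one extracted letter (not a pdfedit type)."""
    char: str
    x: float      # left edge of the letter
    y: float      # baseline; in PDF, y grows towards the top of the page
    width: float


def group_lines(glyphs, y_tol=2.0):
    """Build lines: letters whose baselines differ by at most y_tol share a line."""
    lines = []
    # Visit letters from the top of the page downwards.
    for g in sorted(glyphs, key=lambda g: (-g.y, g.x)):
        if lines and abs(lines[-1][0].y - g.y) <= y_tol:
            lines[-1].append(g)
        else:
            lines.append([g])
    # Within each line, order the letters from left to right.
    return [sorted(line, key=lambda g: g.x) for line in lines]


def build_words(line, gap_tol=1.5):
    """Build words: start a new word when the horizontal gap exceeds gap_tol."""
    words, current, prev_end = [], "", None
    for g in line:
        if prev_end is not None and g.x - prev_end > gap_tol:
            words.append(current)
            current = ""
        current += g.char
        prev_end = g.x + g.width
    if current:
        words.append(current)
    return words


def extract_text(glyphs):
    """Return the page as a list of lines, each line a list of words."""
    return [build_words(line) for line in group_lines(glyphs)]
```

Fixed tolerances like these only work for "sane" files; a real converter would probably derive them from the font size of the letters being compared.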
The text output design in pdfedit makes it very easy to add arbitrary output formats.
See example pdftoxml: conversion from PDF to XML
See howto pdftoxml: howto convert from PDF to XML
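To illustrate why a separate output layer makes new formats cheap to add, here is a minimal sketch in Python. It is not the pdfedit API: the class names, the `render` method and the XML element names are hypothetical, and the input is assumed to be a page already grouped into lines of words.

```python
from xml.sax.saxutils import escape


class PlainTextOutput:
    """Hypothetical output format: one text line per extracted line."""

    def render(self, lines):
        return "\n".join(" ".join(words) for words in lines)


class XmlOutput:
    """Hypothetical output format: nested <page>/<line>/<word> elements."""

    def render(self, lines):
        parts = ["<page>"]
        for words in lines:
            parts.append("  <line>")
            # Escape &, < and > so the word text stays valid XML.
            parts.extend(f"    <word>{escape(w)}</word>" for w in words)
            parts.append("  </line>")
        parts.append("</page>")
        return "\n".join(parts)


def convert(lines, output):
    """The extraction result is format-neutral; only `output` knows the format."""
    return output.render(lines)
```

For example, `convert([["Hello", "world"]], PlainTextOutput())` returns the string `Hello world`. Adding another format then means writing one more class with a `render` method, without touching the extraction code.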