Accessible Document Converter Solution

VerseOne's Accessible Document Converter (ADC) is designed to make it easier to convert PDF and Word documents into accessible HTML web pages. ADC is not designed to reproduce heavily designed documents as faithfully as a graphic design application, or even a PDF viewer, would display them; this is why the module includes a link to download the original document.

Although it is continually being improved, HTML is fundamentally more limited than even most word processors. For example, there is no concept of "tiered numbering" in lists: we can make it look as though there is by using CSS, but this will not help those using screen readers.

How the converter works

Our conversions are provided by two services: PDF to Word conversion uses Adobe’s professional PDF Services API, and Word to HTML conversion is handled by a library called Pandoc. So a PDF goes through Adobe first and then through Pandoc, whereas a Word document goes through Pandoc only.
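The routing described above can be sketched as a small function that decides which stages a file passes through. This is only an illustration: the stage names and the `conversion_stages` function are hypothetical, the real service calls are not shown, and treating ".docx" as the only Word extension is an assumption.

```python
from pathlib import Path

# Hypothetical stage labels; they stand in for the real service calls.
ADOBE_PDF_TO_WORD = "adobe_pdf_to_word"
WORD_TO_HTML = "word_to_html"


def conversion_stages(filename: str) -> list[str]:
    """Return the ordered conversion stages for an uploaded document."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        # A PDF is converted to Word first, then to HTML.
        return [ADOBE_PDF_TO_WORD, WORD_TO_HTML]
    if suffix == ".docx":  # assumption: only .docx counts as "Word" here
        # A Word document skips the Adobe stage.
        return [WORD_TO_HTML]
    raise ValueError(f"Unsupported document type: {suffix or filename}")
```

One consequence of this two-stage route is that a PDF passes through two translations, so it has more opportunities to pick up conversion errors than a Word original.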

Once we receive the HTML, we can apply a number of transforms ourselves, to make up for some translation errors where possible and to ensure that we don’t have multiple identical images, etc. This allows us to finesse some elements of the HTML, provided we have some way of determining the original data.
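As an illustration of the kind of transform mentioned above, the following minimal sketch removes duplicate images by hashing their contents and pointing every `<img>` tag at the first copy. It is not the ADC's actual implementation: the `dedupe_images` function, the filename-to-bytes mapping, and the simple regular expression (which assumes double-quoted `src` attributes) are all assumptions made for the example.

```python
import hashlib
import re


def dedupe_images(html: str, images: dict[str, bytes]) -> tuple[str, dict[str, bytes]]:
    """Point identical images at one file and drop the redundant copies.

    `images` maps each filename referenced in the HTML to its raw bytes
    (a hypothetical input shape chosen for this sketch).
    """
    canonical: dict[str, str] = {}  # content hash -> first filename seen
    rename: dict[str, str] = {}     # every filename -> filename to keep
    for name, data in images.items():
        digest = hashlib.sha256(data).hexdigest()
        rename[name] = canonical.setdefault(digest, name)

    def swap(match: re.Match) -> str:
        src = match.group(2)
        # Unknown sources (e.g. external URLs) are left untouched.
        return match.group(1) + rename.get(src, src) + match.group(3)

    # Simplistic pattern: handles double-quoted src attributes only.
    new_html = re.sub(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', swap, html)
    kept = {name: data for name, data in images.items() if rename[name] == name}
    return new_html, kept
```

Comparing by content hash rather than by filename matters here because a converter may export the same picture several times under different names.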

Although we are constantly trying to improve the module, some elements are beyond our control. The list below outlines known issues with the ADC, short-term workarounds, and any development progress that we have made or are researching.

  1. Graphs

    Graphs from Word / PDF documents do not render properly: this is unlikely to be fixed. This describes the root issue and a workaround.

  2. Tables

    Blank table cells are not rendered in the conversion, which can lead to incorrectly offset headings, etc.

  3. Tiered Numbered Lists

    HTML has no concept of "tiered numbering" in lists, e.g. 1.1, 1.1.1: we can make it look as though there is by using CSS, but this will not help those using screen readers.

  4. Vector Images

    Resolution-independent vector graphics are a useful way of rendering images, but are a relatively new technology for browsers.
