Convert PDF to XML in C# and .NET


PDF to XML
  • Supports all PDF versions (1.0 - 2.0, PDF/A)
  • Handles password-protected documents.
  • Lets you convert selected PDF pages.
  • Produces a well-formed XML document.
  • Offers two conversion modes: convert all text or only tabular data.

Let's see how to add a PDF-to-XML feature to any .NET application. First, add a reference to the "SautinSoft.PdfFocus.dll" assembly, which gives your .NET application the ability to convert PDF documents to XML. You may download it here (63.3 MB).
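A minimal sketch of the conversion follows. It assumes the project already references "SautinSoft.PdfFocus.dll" and that Table.pdf sits in the working directory (the path is an assumption). Apart from "ConvertNonTabularDataToSpreadsheet", the member names used here (XmlOptions, OpenPdf, PageCount, ToXml) and the convention that ToXml returns 0 on success are recalled from the vendor's published samples, so check them against the version you install.

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Assumed input location; point this at your own copy of Table.pdf.
        string pathToPdf = "Table.pdf";
        string pathToXml = Path.ChangeExtension(pathToPdf, ".xml");

        SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

        // Convert only tables and skip all other textual data.
        f.XmlOptions.ConvertNonTabularDataToSpreadsheet = false;

        f.OpenPdf(pathToPdf);

        if (f.PageCount > 0)
        {
            // Assumption: a return value of 0 signals success.
            int result = f.ToXml(pathToXml);
            Console.WriteLine(result == 0
                ? "XML written to " + pathToXml
                : "Conversion failed with code " + result);
        }
        else
        {
            Console.WriteLine("The PDF could not be opened or has no pages.");
        }
    }
}
```

Setting "ConvertNonTabularDataToSpreadsheet" to true instead should select the other conversion mode, in which all text is converted rather than tables only.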

After running this code, you will get an XML document produced from Table.pdf. Because the property "ConvertNonTabularDataToSpreadsheet" is set to false, all textual data is skipped. In other words, only tables are converted to XML.