Transform Scanned PDF to Word in C# and .NET

Converting scanned PDF documents to Word files can be difficult, especially when the text is in a language other than English: each scanned page is an image, so the text must first be recognized with OCR. In this article, we will look at how to use SautinSoft.Pdf .NET to perform this task in C# and .NET.

Step-by-step guide:

  1. Load the required language file.
  2. Add SautinSoft.Pdf from NuGet.
  3. Load a PDF document.
  4. Extract the images from the first page.
  5. Perform OCR on the first page.
  6. Save the document in DOCX format.
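
Step 2 can be done from the command line. The following is a minimal sketch, assuming the .NET SDK is installed, the command is run from the folder that contains the project file, and the NuGet package ID is SautinSoft.Pdf (NuGet package IDs are not case-sensitive):

```shell
# Assumption: run from the directory that contains your .csproj file.
# Adds the latest published version of the package to the project.
dotnet add package SautinSoft.Pdf
```

The same package can also be added through the NuGet Package Manager in Visual Studio.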

Input file: simple text.pdf

[Image: scanned input PDF with text in another language]

Output result:

[Image: resulting Word document]

Complete code