How to convert PDF to Word (DOCX) in C#?

PDF Focus .Net

A .NET assembly that provides an API to convert PDF to DOCX, RTF, HTML, XML, text, Excel, and images in .NET and C#.

Introduction

Another interesting feature of "PDF Focus .Net" is its API to convert PDF to DOCX. The beauty of this approach is that you only need to add a reference to SautinSoft.PdfFocus.dll and write four lines of code. For example, to convert a PDF to DOCX in C#:

        SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
        f.OpenPdf(@"d:\Invoice for a Pastry Shop.pdf");
        if (f.PageCount > 0)
            f.ToWord(@"Invoice for a Pastry Shop.docx");
          

"PDF Focus .Net" has its own PDF reader and DOCX renderer, built according to the PDF 1.7 and Office Open XML (ECMA-376, 4th edition) specifications. Your .NET application can therefore convert PDF documents to DOCX on the fly, with no dependency on MS Office or Adobe Acrobat.

The output DOCX document preserves paragraphs, columns, tables, hyperlinks, images, page breaks, and so forth.

Another point of interest is that PDF Focus .Net can recognize and recreate real tables with rows and cells. This may sound obvious, but PDF documents don't contain real tables: what looks like a table inside a PDF is in fact just a set of drawn lines. To see this for yourself, download PDF Focus .Net (69.3 MB) and evaluate it.
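
To make the "set of lines" point concrete, here is a minimal sketch of how a grid could be inferred from ruling lines alone. It is only an illustration and not the algorithm PDF Focus .Net uses. The `Segment`, `GridSize`, and `TableGridSketch` types are hypothetical names invented for this example, and the sketch assumes a simple, fully ruled table with no merged cells.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: one straight line as a PDF page might draw it.
public class Segment
{
    public double X1, Y1, X2, Y2;

    public Segment(double x1, double y1, double x2, double y2)
    {
        X1 = x1; Y1 = y1; X2 = x2; Y2 = y2;
    }

    public bool IsHorizontal { get { return Math.Abs(Y1 - Y2) < 0.5; } }
    public bool IsVertical { get { return Math.Abs(X1 - X2) < 0.5; } }
}

// Hypothetical result type: the inferred table dimensions.
public struct GridSize
{
    public int Rows;
    public int Columns;
}

public static class TableGridSketch
{
    // Sort the coordinates and merge values closer than the tolerance.
    private static List<double> Merge(IEnumerable<double> values, double tolerance)
    {
        var result = new List<double>();
        foreach (double v in values.OrderBy(x => x))
        {
            if (result.Count == 0 || v - result[result.Count - 1] > tolerance)
                result.Add(v);
        }
        return result;
    }

    // N distinct horizontal rules bound N-1 rows; same for vertical rules and columns.
    public static GridSize InferGrid(IEnumerable<Segment> segments, double tolerance = 1.0)
    {
        List<Segment> list = segments.ToList();
        List<double> ys = Merge(list.Where(s => s.IsHorizontal).Select(s => s.Y1), tolerance);
        List<double> xs = Merge(list.Where(s => s.IsVertical).Select(s => s.X1), tolerance);

        GridSize size;
        size.Rows = Math.Max(ys.Count - 1, 0);
        size.Columns = Math.Max(xs.Count - 1, 0);
        return size;
    }
}

public static class Program
{
    public static void Main()
    {
        var segments = new List<Segment>
        {
            // Three horizontal rules.
            new Segment(0, 0, 200, 0),
            new Segment(0, 20, 200, 20),
            new Segment(0, 40, 200, 40),
            // Four vertical rules.
            new Segment(0, 0, 0, 40),
            new Segment(50, 0, 50, 40),
            new Segment(120, 0, 120, 40),
            new Segment(200, 0, 200, 40),
        };

        GridSize grid = TableGridSketch.InferGrid(segments);
        Console.WriteLine("rows=" + grid.Rows + ", columns=" + grid.Columns); // prints "rows=2, columns=3"
    }
}
```

A real converter also has to handle merged cells, tables without visible borders, and text that must be assigned to each cell, which is why table reconstruction is harder than it first appears.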


Download

To see this functionality firsthand, download the latest «PDF Focus .Net» with code examples (69.3 MB).

Limitations

The free version of PDF Focus .Net has two limitations: it adds the trial notice "Created by unlicensed version of PDF Focus .Net" and randomly inserts the word "TRIAL".


Examples of converting PDF to DOCX in C# and VB.Net

Want to adjust the result of a PDF to DOCX conversion? See our tips ...

1. Convert PDF file to DOCX file in C#:

        SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
        f.OpenPdf(@"d:\History.pdf");

        if (f.PageCount > 0)
        {
            // A result of 0 is treated as success.
            int result = f.ToWord(@"d:\History.docx");

            // Open the resulting Word document.
            if (result == 0)
            {
                System.Diagnostics.Process.Start(@"d:\History.docx");
            }
        }
      

2. Convert PDF to DOCX in memory using C#:

        // Requires: using System.IO;
        string pdfFile = @"c:\book.pdf";

        SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

        // Assume that we already have a PDF document as a stream.
        using (FileStream pdfStream = new FileStream(pdfFile, FileMode.Open, FileAccess.Read))
        using (MemoryStream docxStream = new MemoryStream())
        {
            f.OpenPdf(pdfStream);

            if (f.PageCount > 0)
            {
                // Convert PDF to Word in memory.
                int res = f.ToWord(docxStream);

                // Save docxStream to a file for demonstration purposes.
                if (res == 0)
                {
                    string docxFile = Path.ChangeExtension(pdfFile, ".docx");
                    File.WriteAllBytes(docxFile, docxStream.ToArray());
                    System.Diagnostics.Process.Start(docxFile);
                }
            }
        }

3. Convert pages 2-3 of a PDF document to Word in VB.Net:

        Dim f As New SautinSoft.PdfFocus()
        f.OpenPdf("http://somesite.com/forprint.pdf")

        If f.PageCount > 2 Then
            'Convert only pages 2 - 3 to Word
            Dim result As Integer = f.ToWord("f:\foredit.docx", 2, 3)

            'Show Word document
            If result = 0 Then
                System.Diagnostics.Process.Start("f:\foredit.docx")
            End If
        End If

      

4. Export PDF to Word in ASP.Net/C#:


        SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
        f.OpenPdf(FileUpload1.FileBytes);

        byte[] docx = null;

        if (f.PageCount > 0)
        {
            // Convert the whole PDF document to DOCX.
            docx = f.ToWord();
        }

        // Send the result to the browser as a download.
        if (docx != null && docx.Length > 0)
        {
            Response.Buffer = true;
            Response.Clear();
            Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
            // The header name must not include a trailing colon.
            Response.AddHeader("Content-Disposition", "attachment; filename=Result.docx");
            // BinaryWrite sends the raw bytes; Write would send the array's type name as text.
            Response.BinaryWrite(docx);
            Response.Flush();
            Response.End();
        }
      

5. Convert PDF file to Word file in VB.Net:

        Dim f As New SautinSoft.PdfFocus()
        f.OpenPdf("c:\Simple Text.pdf")

        If f.PageCount > 0 Then
            Dim result As Integer = f.ToWord("c:\Result.docx")

            'Show Word document
            If result = 0 Then
                System.Diagnostics.Process.Start("c:\Result.docx")
            End If
        End If

Requirements and Technical Information

Requires .NET Framework 4.0 or higher. The product is compatible with all .NET languages and runs on every operating system where .NET Framework or .NET Core is available. PDF Focus .Net is written entirely in managed C#, which makes it a standalone, independent library.

.Net Framework 4.0 and higher and .Net Core 2.0 and higher

.NET Framework 4.5, 4.6.1 and higher. An older version for .NET 2.0 is also available.

.NET Standard 2.0

.NET Core, .NET 5.0 and higher.


Multi-platform component, runs on:


Our component has proven itself on cloud platforms and services:

  • Microsoft Azure
  • Amazon Web Services (AWS)
  • Google Cloud Platform
  • SharePoint
  • Docker
  • Xamarin Forms
  • etc.