Convert each PDF page to a separate DOCX document in C# and .NET


PDF processing has become an integral part of modern document automation solutions. One common task is splitting PDFs into individual pages and converting them to a more editable format, such as DOCX. In this guide, we'll detail how to automatically convert each PDF page into a separate DOCX document using the PDF Focus .NET component of the popular SautinSoft library.

Using this method offers several advantages:

  • Automation: quickly split large PDFs into smaller documents without manual intervention.
  • Editability: the resulting DOCX documents are easy to refine and edit.
  • Compatibility: the DOCX format is widely supported by many editors.
  • Efficiency: reduces the time spent preparing documents from PDFs.

The mechanism for splitting and converting PDF pages into individual DOCX documents is especially useful for:

  • Automatic processing of legal and financial documents.
  • Creating reports where each page is a separate document.
  • Importing PDF reports into Word for further editing.
  • Processing large PDF batches for page splitting.

This simple console app shows how to convert each page of a PDF document into a separate DOCX file named "{filename} - page {number}.docx".
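The naming pattern can be isolated into a small helper and checked without running the converter. This is only a sketch: `PageFileName.Build` is a hypothetical helper introduced for illustration, not part of PDF Focus .NET, and it relies only on `System.IO.Path`.

```csharp
using System;
using System.IO;

static class PageFileName
{
    // Hypothetical helper (not part of PDF Focus .NET): builds
    // "{filename} - page {number}.docx" inside the given directory.
    public static string Build(string pdfPath, int page, string outputDir)
    {
        if (page < 1)
            throw new ArgumentOutOfRangeException(nameof(page), "Page numbers start at 1.");

        string baseName = Path.GetFileNameWithoutExtension(pdfPath);
        return Path.Combine(outputDir, $"{baseName} - page {page}.docx");
    }
}

class Demo
{
    static void Main()
    {
        // Build the output path for page 3 of "simple text.pdf" in the "out" folder.
        string docxPath = PageFileName.Build("simple text.pdf", 3, "out");

        // Prints: simple text - page 3.docx
        Console.WriteLine(Path.GetFileName(docxPath));
    }
}
```

Keeping the name logic in one place makes it easier to change the pattern later, for example to zero-pad page numbers so the files sort correctly in a file manager.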

Complete code (C#)

using System;
using System.IO;

namespace Sample
{
    class Sample
    {
        static void Main(string[] args)
        {
            // Before starting, we recommend getting a free key:
            // https://sautinsoft.com/start-for-free/
            
            // Apply the key here:
            // SautinSoft.PdfFocus.SetLicense("...");
			
            // Convert whole PDF document to separate Word documents.
            // Each PDF page will be converted to a single Word document.

            // Path to a PDF file.
            string pdfPath = Path.GetFullPath(@"..\..\..\simple text.pdf");

            // Directory to store Word documents.
            string docxDir = Directory.GetCurrentDirectory();
			
            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

            f.OpenPdf(pdfPath);

            // Convert each PDF page to a separate Word document:
            // simple text - page 1.docx, simple text - page 2.docx ... simple text - page N.docx.
            for (int page = 1; page <= f.PageCount; page++)
            {
                // You may select between Docx and Rtf formats.
                f.WordOptions.Format = SautinSoft.PdfFocus.CWordOptions.eWordDocument.Docx;

                byte[] docxBytes = f.ToWord(page, page);

                string tempName = Path.GetFileNameWithoutExtension(pdfPath) + String.Format(" - page {0}.docx", page);
                string docxPath = Path.Combine(docxDir, tempName);
                File.WriteAllBytes(docxPath, docxBytes);

                // Open the Word documents created from the first and last pages.
                if (page == 1 || page == f.PageCount)
                    System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(docxPath) { UseShellExecute = true });
            }
        }
    }
}
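For batch processing, the same loop can be wrapped in a pass over a folder of PDFs. The sketch below reuses only the PDF Focus calls shown in the sample above (`OpenPdf`, `PageCount`, `WordOptions.Format`, `ToWord`). The folder arguments, the `docx` output subfolder, and the two guard checks are assumptions made for illustration: it treats a `PageCount` of 0 as "the PDF could not be opened" and a null or empty byte array as "the page was not converted", which you should verify against the library's documentation.

```csharp
using System;
using System.IO;

class BatchSplit
{
    static void Main(string[] args)
    {
        // Assumed layout: input folder as the first argument, output folder as the second.
        string inputDir = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        string outputDir = args.Length > 1 ? args[1] : Path.Combine(inputDir, "docx");
        Directory.CreateDirectory(outputDir);

        foreach (string pdfPath in Directory.GetFiles(inputDir, "*.pdf"))
        {
            // One converter instance per input file, as in the sample above.
            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
            f.OpenPdf(pdfPath);

            // Assumption: a page count of 0 means the PDF could not be opened.
            if (f.PageCount == 0)
            {
                Console.WriteLine("Skipped: " + pdfPath);
                continue;
            }

            f.WordOptions.Format = SautinSoft.PdfFocus.CWordOptions.eWordDocument.Docx;

            for (int page = 1; page <= f.PageCount; page++)
            {
                byte[] docxBytes = f.ToWord(page, page);

                // Assumption: a null or empty result means the page was not converted.
                if (docxBytes == null || docxBytes.Length == 0)
                {
                    Console.WriteLine("Page " + page + " not converted: " + pdfPath);
                    continue;
                }

                string name = Path.GetFileNameWithoutExtension(pdfPath)
                              + String.Format(" - page {0}.docx", page);
                File.WriteAllBytes(Path.Combine(outputDir, name), docxBytes);
            }
        }
    }
}
```

Writing to a dedicated output folder keeps the generated DOCX files apart from the source PDFs, which makes reruns and cleanup simpler.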

Complete code (VB.NET)

Imports System.IO
Imports SautinSoft

Module Sample

    Sub Main()
        ' Before starting, we recommend getting a free key:
        ' https://sautinsoft.com/start-for-free/

        ' Apply the key here:
        ' SautinSoft.PdfFocus.SetLicense("...")

        ' Convert whole PDF document to separate Word documents.
        ' Each PDF page will be converted to a single Word document.

        ' Path to a PDF file.
        Dim pdfPath As String = Path.GetFullPath("..\..\..\simple text.pdf")

        ' Directory to store Word documents.
        Dim docxDir As String = Directory.GetCurrentDirectory()
		
        Dim f As New SautinSoft.PdfFocus()

        f.OpenPdf(pdfPath)

        ' Convert each PDF page to a separate Word document:
        ' simple text - page 1.docx, simple text - page 2.docx ... simple text - page N.docx.
        For page As Integer = 1 To f.PageCount

            ' You may select between Docx and Rtf formats.
            f.WordOptions.Format = SautinSoft.PdfFocus.CWordOptions.eWordDocument.Docx

            Dim docxBytes() As Byte = f.ToWord(page, page)

            Dim tempName As String = Path.GetFileNameWithoutExtension(pdfPath) & String.Format(" - page {0}.docx", page)
            Dim docxPath As String = Path.Combine(docxDir, tempName)
            File.WriteAllBytes(docxPath, docxBytes)

            ' Open the Word documents created from the first and last pages.
            If page = 1 OrElse page = f.PageCount Then
                System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(docxPath) With {.UseShellExecute = True})
            End If
        Next page
    End Sub
End Module



If you need a new code example or have a question, email us at support@sautinsoft.com.


Questions and suggestions from you are always welcome!

We have been developing .NET components since 2002 and work with the PDF, DOCX, RTF, HTML, XLSX, and image formats. If you need help creating, modifying, or converting documents in these formats, we can write a code example for you free of charge.