Handling images in PDF documents with C# and .NET

PDF .Net can export images from PDF files in JPEG, BMP, PNG, and TIFF formats. Extracting images from PDF documents is a common requirement in applications such as data analysis, digital archiving, and content repurposing. This article shows how to extract images from a PDF in C# and .NET using the SautinSoft.PDF library.

Extracting images from PDFs can be useful for:

  • Reusing images in other documents or presentations.
  • Analyzing visual data.
  • Archiving images separately for better organization.
  • Enhancing content management systems.

The following example shows how to export a single image from a PDF file:

  1. Add SautinSoft.PDF from NuGet.
  2. Load a PDF document.
  3. Iterate through PDF pages.
  4. Get all image content elements on the page.
  5. Export the first image element by saving it to an image file.
  6. Open the saved image in the default viewer.

Complete code

using System;
using System.IO;
using System.Linq;
using SautinSoft;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace Sample
{
    class Sample
    {
        /// <summary>
        /// Export the first image found in a PDF file.
        /// </summary>
        /// <remarks>
        /// Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/extract-images-from-pdf.php
        /// </remarks>
        static void Main(string[] args)
        {
            // Before starting this example, please get a free 100-day trial key:
            // https://sautinsoft.com/start-for-free/

            // Apply the key here:
            // PdfDocument.SetLicense("...");

            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            using (var document = PdfDocument.Load(pdfFile))
            {
                // Iterate through PDF pages.
                foreach (var page in document.Pages)
                {
                    // Get all image content elements on the page.
                    var imageElements = page.Content.Elements.All().OfType<PdfImageContent>().ToList();

                    // Export the first image element to an image file.
                    if (imageElements.Count > 0)
                    {
                        imageElements[0].Save("Export Images.jpeg");
                        System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo("Export Images.jpeg") { UseShellExecute = true });
                        break;
                    }
                }
            }
        }
    }
}

Option Infer On

Imports System
Imports System.IO
Imports System.Linq
Imports SautinSoft
Imports SautinSoft.Pdf
Imports SautinSoft.Pdf.Content

Namespace Sample
	Friend Class Sample
		''' <summary>
		''' Export the first image found in a PDF file.
		''' </summary>
		''' <remarks>
		''' Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/extract-images-from-pdf.php
		''' </remarks>
		Shared Sub Main(ByVal args() As String)
			' Before starting this example, please get a free license:
			' https://sautinsoft.com/start-for-free/

			' Apply the key here:
			' PdfDocument.SetLicense("...")

			Dim pdfFile As String = Path.GetFullPath("..\..\..\simple text.pdf")

			Using document = PdfDocument.Load(pdfFile)
				' Iterate through PDF pages.
				For Each page In document.Pages
					' Get all image content elements on the page.
					Dim imageElements = page.Content.Elements.All().OfType(Of PdfImageContent)().ToList()

					' Export the first image element to an image file.
					If imageElements.Count > 0 Then
						imageElements(0).Save("Export Images.jpeg")
						System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo("Export Images.jpeg") With {.UseShellExecute = True})
						Exit For
					End If
				Next page
			End Using
		End Sub
	End Class
End Namespace
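
Both listings above stop at the first image they find. As a hedged C# sketch, the same calls (PdfDocument.Load, page.Content.Elements.All(), and PdfImageContent.Save) could be used to export every image in the document instead. The output folder name "ExportedImages" and the helper BuildImageFileName are assumptions introduced for illustration and are not part of the SautinSoft.PDF API. The sketch also assumes that Save accepts a full path in the same way it accepts the relative file name used above, and it keeps the .jpeg extension from the original sample.

```csharp
using System;
using System.IO;
using System.Linq;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace Sample
{
    static class ExportAllImages
    {
        // Hypothetical helper (not part of SautinSoft.PDF): builds a unique
        // file name from 1-based page and image numbers.
        public static string BuildImageFileName(int pageNumber, int imageNumber)
        {
            return $"page-{pageNumber:D3}-image-{imageNumber:D3}.jpeg";
        }

        static void Main(string[] args)
        {
            // Apply the key here, as in the sample above:
            // PdfDocument.SetLicense("...");

            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            // Assumed output folder; change it to suit your project.
            string outputDir = Path.GetFullPath("ExportedImages");
            Directory.CreateDirectory(outputDir);

            using (var document = PdfDocument.Load(pdfFile))
            {
                int pageNumber = 0;

                // Iterate through PDF pages.
                foreach (var page in document.Pages)
                {
                    pageNumber++;

                    // Get all image content elements on the page.
                    var imageElements = page.Content.Elements.All().OfType<PdfImageContent>().ToList();

                    // Save every image on the page, not only the first one.
                    for (int i = 0; i < imageElements.Count; i++)
                    {
                        string target = Path.Combine(outputDir, BuildImageFileName(pageNumber, i + 1));
                        imageElements[i].Save(target);
                        Console.WriteLine($"Saved {target}");
                    }
                }
            }
        }
    }
}
```

Numbering the files by page and by position on the page keeps names unique, so a later image never overwrites an earlier one. If this class is added to the same project as the sample above, only one of the two Main methods can serve as the entry point.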

If you need a new code example or have a question, email us at support@sautinsoft.com.



Questions and suggestions from you are always welcome!

We have been developing .NET components since 2002. We know the PDF, DOCX, RTF, HTML, XLSX, and image formats. If you need help creating, modifying, or converting documents in any of these formats, we can help, and we will write any code example for you free of charge.