Extract all images from 1st PDF page in C# and .NET


PDF file processing is a common task for developers, whether it involves integrating with document management systems, generating automated reports, or editing content. One frequent scenario is extracting images from specific pages of a document. In this article, we'll look at how to use the SautinSoft PdfFocus .NET library to extract all images from the first page of a PDF file.

Extracting images from a PDF is useful in a variety of cases:

  • Recovering illustrations from old documents.
  • Analyzing graphic content for learning or machine vision.
  • Creating galleries or collections of images for further processing.
  • Integrating images into other systems or applications.

How often you need this depends on the project, but such scenarios come up in business, in research, and in document processing automation alike.
The code in this article can serve as a foundation for automated document processing:

  • In the automatic collection of illustrative materials.
  • In archiving and cataloging systems.
  • In the preparation of data for training machine vision models.
  • In digital content management solutions.

The advantage is that the code integrates easily into larger systems, where it can automate image extraction and make PDF processing more efficient.

Step-by-Step:

  1. Add SautinSoft.PdfFocus from NuGet.
  2. Load a PDF document.
  3. Extract all images from the 1st PDF page.
  4. Save all extracted images as PNG files.

Complete code

using System;
using System.IO;
using System.Collections.Generic;
using SautinSoft;

namespace Sample
{
    class Sample
    {
        static void Main(string[] args)
        {
            // Before starting, we recommend getting a free key:
            // https://sautinsoft.com/start-for-free/

            // Apply the key here:
            // SautinSoft.PdfFocus.SetLicense("...");

            // Extract all images from the 1st PDF page.
            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");
            string imageDir = new DirectoryInfo(Directory.GetCurrentDirectory()).CreateSubdirectory("images").FullName;

            List<PdfFocus.PdfImage> pdfImages = null;

            f.OpenPdf(pdfFile);

            if (f.PageCount > 0)
            {
                // Limit extraction to the first page.
                f.ImageOptions.SelectedPages = new int[] { 0 };

                pdfImages = f.ExtractImages();

                // Save all extracted images as PNG files.
                if (pdfImages != null && pdfImages.Count > 0)
                {
                    for (int i = 0; i < pdfImages.Count; i++)
                    {
                        string imageFile = Path.Combine(imageDir, String.Format("img{0}.png", i + 1));

                        // Dispose the stream so each file is flushed and closed.
                        using (FileStream stream = new FileStream(imageFile, FileMode.Create))
                        {
                            pdfImages[i].Picture.Encode(stream, SkiaSharp.SKEncodedImageFormat.Png, 100);
                        }
                    }

                    // Uncomment to open the output folder:
                    // System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(imageDir) { UseShellExecute = true });
                }
            }
        }
    }
}


Imports System
Imports System.IO
Imports System.Collections.Generic
Imports SautinSoft

Namespace Sample
	Friend Class Sample
		Shared Sub Main(ByVal args() As String)
			' Before starting, we recommend getting a free key:
			' https://sautinsoft.com/start-for-free/

			' Apply the key here:
			' SautinSoft.PdfFocus.SetLicense("...")

			' Extract all images from 1st PDF page
			Dim f As New SautinSoft.PdfFocus()

			Dim pdfFile As String = Path.GetFullPath("..\..\..\simple text.pdf")
			Dim imageDir As String = (New DirectoryInfo(Directory.GetCurrentDirectory())).CreateSubdirectory("images").FullName

			Dim pdfImages As List(Of PdfFocus.PdfImage) = Nothing

			f.OpenPdf(pdfFile)

			If f.PageCount > 0 Then
				' Limit extraction to the first page, as in the C# version.
				f.ImageOptions.SelectedPages = New Integer() {0}

				pdfImages = f.ExtractImages()

				' Save all extracted images as PNG files.
				If pdfImages IsNot Nothing AndAlso pdfImages.Count > 0 Then
					For i As Integer = 0 To pdfImages.Count - 1
						Dim imageFile As String = Path.Combine(imageDir, String.Format("img{0}.png", i + 1))

						' Dispose the stream so each file is flushed and closed.
						Using stream As New FileStream(imageFile, FileMode.Create)
							pdfImages(i).Picture.Encode(stream, SkiaSharp.SKEncodedImageFormat.Png, 100)
						End Using
					Next i

					' Open the output folder to show the images.
					System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(imageDir) With {.UseShellExecute = True})
				End If
			End If
		End Sub
	End Class
End Namespace
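
The file-naming and saving step used in the listings can also be factored into a small helper. The following is only a minimal sketch and does not use PdfFocus at all: it assumes the images have already been encoded to byte arrays. The names ImageSaver, BuildImagePath and SaveAll are hypothetical and are not part of the SautinSoft API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of SautinSoft.PdfFocus.
public static class ImageSaver
{
    // Builds "img1.png", "img2.png", ... inside outputDir
    // (file numbers are 1-based, matching the listings above).
    public static string BuildImagePath(string outputDir, int index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        return Path.Combine(outputDir, String.Format("img{0}.png", index + 1));
    }

    // Writes each already-encoded image to its own file
    // and returns the list of paths that were written.
    public static List<string> SaveAll(string outputDir, IReadOnlyList<byte[]> encodedImages)
    {
        Directory.CreateDirectory(outputDir); // does nothing if the folder exists
        var written = new List<string>();

        for (int i = 0; i < encodedImages.Count; i++)
        {
            string path = BuildImagePath(outputDir, i);

            // 'using' closes the stream even if the write throws.
            using (var stream = new FileStream(path, FileMode.Create))
            {
                stream.Write(encodedImages[i], 0, encodedImages[i].Length);
            }

            written.Add(path);
        }

        return written;
    }
}
```

Keeping the naming rule in one method makes it easier to change the pattern later, for example to include the page number in the file name.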



If you need a new code example or have a question, email us at support@sautinsoft.com.



Questions and suggestions from you are always welcome!

We have been developing .NET components since 2002. We know the PDF, DOCX, RTF, HTML, XLSX and image formats. If you need any assistance with creating, modifying or converting documents in various formats, we can help you. We will write any code example for you absolutely free.