Extract all images from 1st PDF page in C# and .NET
PDF file processing is a common task for many developers, whether it involves integrating with document management systems, creating automated reports, or editing content. One frequent scenario is extracting images from specific pages of a document. In this article, we'll look at how to use the SautinSoft PdfFocus .NET library to extract all images from the first page of a PDF file.
Extracting images from a PDF is useful in a variety of cases:
- Restoring illustrations from old documents.
- Analyzing graphic content for learning or machine vision.
- Creating galleries or collections of images for further processing.
- Integrating images into other systems or applications.
How often you need this depends on the project: business workflows, research, and document processing automation all include such scenarios.
This code can serve as a foundation for automated document processing, for example:
- Automatic collection of illustrative materials.
- Archiving and cataloging systems.
- Preparation of data for training machine vision models.
- Digital content management solutions.
Its advantage is that it can be easily integrated into larger systems to automate processes and make PDF handling more efficient.
Step-by-Step:
- Add SautinSoft.PdfFocus from NuGet.
- Load a PDF document.
- Extract all images from the 1st PDF page.
- Save all extracted images as PNG files.
Complete code
using System;
using System.IO;
using System.Collections.Generic;
using SautinSoft;

namespace Sample
{
    class Sample
    {
        static void Main(string[] args)
        {
            // Before starting, we recommend getting a free key:
            // https://sautinsoft.com/start-for-free/

            // Apply the key here:
            // SautinSoft.PdfFocus.SetLicense("...");

            // Extract all images from 1st PDF page
            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");
            // Create an "images" subfolder in the current directory for the output.
            string imageDir = new DirectoryInfo(Directory.GetCurrentDirectory()).CreateSubdirectory("images").FullName;
            List<PdfFocus.PdfImage> pdfImages = null;

            f.OpenPdf(pdfFile);

            if (f.PageCount > 0)
            {
                // Restrict extraction to the first page.
                f.ImageOptions.SelectedPages = new int[] {0};
                pdfImages = f.ExtractImages();

                // Save all extracted images as PNG files.
                if (pdfImages != null && pdfImages.Count > 0)
                {
                    for (int i = 0; i < pdfImages.Count; i++)
                    {
                        string imageFile = Path.Combine(imageDir, String.Format("img{0}.png", i + 1));
                        // The using block closes the file once the image is written.
                        using (FileStream stream = new FileStream(imageFile, FileMode.Create))
                        {
                            pdfImages[i].Picture.Encode(stream, SkiaSharp.SKEncodedImageFormat.Png, 100);
                        }
                    }
                    // Uncomment to open the folder with the extracted images:
                    //System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(imageDir) { UseShellExecute = true });
                }
            }
        }
    }
}

Imports System
Imports System.IO
Imports System.Collections.Generic
Imports SautinSoft

Namespace Sample
    Friend Class Sample
        Shared Sub Main(ByVal args() As String)
            ' Before starting, we recommend getting a free key:
            ' https://sautinsoft.com/start-for-free/

            ' Apply the key here:
            ' SautinSoft.PdfFocus.SetLicense("...")

            ' Extract all images from 1st PDF page
            Dim f As New SautinSoft.PdfFocus()
            Dim pdfFile As String = Path.GetFullPath("..\..\..\simple text.pdf")
            ' Create an "images" subfolder in the current directory for the output.
            Dim imageDir As String = (New DirectoryInfo(Directory.GetCurrentDirectory())).CreateSubdirectory("images").FullName
            Dim pdfImages As List(Of PdfFocus.PdfImage) = Nothing

            f.OpenPdf(pdfFile)

            If f.PageCount > 0 Then
                ' Restrict extraction to the first page (same setting as in the C# sample).
                f.ImageOptions.SelectedPages = New Integer() {0}
                pdfImages = f.ExtractImages()

                ' Save all extracted images as PNG files.
                If pdfImages IsNot Nothing AndAlso pdfImages.Count > 0 Then
                    For i As Integer = 0 To pdfImages.Count - 1
                        Dim imageFile As String = Path.Combine(imageDir, String.Format("img{0}.png", i + 1))
                        ' The Using block closes the file once the image is written.
                        Using stream As New FileStream(imageFile, FileMode.Create)
                            pdfImages(i).Picture.Encode(stream, SkiaSharp.SKEncodedImageFormat.Png, 100)
                        End Using
                    Next i
                    ' Open the folder with the extracted images.
                    System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(imageDir) With {.UseShellExecute = True})
                End If
            End If
        End Sub
    End Class
End Namespace
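
When integrating the sample into a larger system, it may be convenient to wrap the same steps in a reusable method. The following is only a minimal sketch under stated assumptions: the class name `PdfImageExtractor` and the methods `BuildImagePath` and `SaveFirstPageImages` are hypothetical names introduced for illustration, not part of the SautinSoft library. The only library calls used are the ones that already appear in the sample (`OpenPdf`, `PageCount`, `ImageOptions.SelectedPages`, `ExtractImages`, `Picture.Encode`), and the page selection value `{ 0 }` is copied from the sample as-is.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using SautinSoft;

public static class PdfImageExtractor
{
    // Hypothetical helper: builds "img1.png", "img2.png", ... inside outputDir.
    public static string BuildImagePath(string outputDir, int zeroBasedIndex)
    {
        return Path.Combine(outputDir, string.Format("img{0}.png", zeroBasedIndex + 1));
    }

    // Hypothetical wrapper around the calls used in the sample.
    // Returns the paths of the PNG files written (empty if no images were found).
    public static List<string> SaveFirstPageImages(string pdfFile, string outputDir)
    {
        var savedFiles = new List<string>();
        Directory.CreateDirectory(outputDir); // no-op if the folder already exists

        var f = new PdfFocus();
        f.OpenPdf(pdfFile);
        if (f.PageCount <= 0)
            return savedFiles;

        // Same page selection as in the sample.
        f.ImageOptions.SelectedPages = new int[] { 0 };
        List<PdfFocus.PdfImage> pdfImages = f.ExtractImages();
        if (pdfImages == null)
            return savedFiles;

        for (int i = 0; i < pdfImages.Count; i++)
        {
            string imageFile = BuildImagePath(outputDir, i);
            // Dispose the stream so each file is flushed and closed before moving on.
            using (var stream = new FileStream(imageFile, FileMode.Create))
            {
                pdfImages[i].Picture.Encode(stream, SkiaSharp.SKEncodedImageFormat.Png, 100);
            }
            savedFiles.Add(imageFile);
        }
        return savedFiles;
    }
}
```

A caller could then write `List<string> files = PdfImageExtractor.SaveFirstPageImages(@"..\..\..\simple text.pdf", "images");` and pass the returned paths on to the next processing step. Returning the saved paths instead of opening the folder keeps the method usable in services and batch jobs that have no desktop session.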
If you need a new code example or have a question, email us at support@sautinsoft.com.