PdfFocus.ToHtml(Int32, Int32, List<SKBitmap>) Method
Saves a specific PDF page or a range of pages to an HTML document and returns it as a string.
Namespace: SautinSoft
Assembly: SautinSoft.PdfFocus (in SautinSoft.PdfFocus.dll) Version: 2024.9.26
Syntax

    public string ToHtml(
        int fromPage,
        int toPage,
        List<SKBitmap> extractedImages
    )

    Public Function ToHtml (
        fromPage As Integer,
        toPage As Integer,
        extractedImages As List(Of SKBitmap)
    ) As String

Parameters
- fromPage (Int32): The first page of the range to export to HTML.
- toPage (Int32): The last page of the range to export to HTML.
- extractedImages (List<SKBitmap>): The list that receives the extracted images; must not be null.
Return Value
String
A string containing the HTML document if the conversion succeeds.
null if the conversion fails.
Example
How to convert PDF to HTML in memory and get a list of all images in C#
    using System;
    using System.IO;
    using System.Collections.Generic;
    using SkiaSharp;

    namespace Sample
    {
        class Sample
        {
            static void Main(string[] args)
            {
                ConvertPdfBytesToHtml();
            }

            private static void ConvertPdfBytesToHtml()
            {
                string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");
                string htmlFile = "Result.htm";

                // ToHtml fills this list with the extracted images.
                // This overload takes a List<SKBitmap> (SkiaSharp).
                List<SKBitmap> imgCollection = new List<SKBitmap>();

                SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
                f.HtmlOptions.IncludeImageInHtml = true;
                f.HtmlOptions.Title = "Simple text";

                // Load the PDF from a byte array instead of a file path.
                byte[] pdf = File.ReadAllBytes(pdfFile);
                f.OpenPdf(pdf);

                if (f.PageCount > 0)
                {
                    // Convert every page; the result is null if the conversion fails.
                    string htmlString = f.ToHtml(1, f.PageCount, imgCollection);
                    if (htmlString != null)
                    {
                        Console.WriteLine("After converting we've got {0} image(s):", imgCollection.Count);

                        // Save the extracted images to the "Extracted Images" folder.
                        DirectoryInfo imgDir = new DirectoryInfo("Extracted Images");
                        if (!imgDir.Exists)
                            imgDir.Create();

                        int count = 1;
                        foreach (SKBitmap img in imgCollection)
                        {
                            Console.WriteLine("\t {0,4} x {1,4} px", img.Width, img.Height);
                            string imageFileName = Path.Combine(imgDir.FullName, $"pict{count}.jpg");

                            // Encode the bitmap as JPEG at quality 100.
                            using (FileStream stream = File.Create(imageFileName))
                            {
                                img.Encode(stream, SKEncodedImageFormat.Jpeg, 100);
                            }
                            count++;
                        }

                        // Write the HTML to disk, then open the result and the image folder.
                        File.WriteAllText(htmlFile, htmlString);
                        System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(htmlFile) { UseShellExecute = true });
                        System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(imgDir.FullName) { UseShellExecute = true });
                    }
                }
            }
        }
    }

How to convert PDF to HTML in memory and get a list of all images in VB.NET
    Imports Microsoft.VisualBasic
    Imports System
    Imports System.IO
    Imports System.Collections.Generic
    Imports SkiaSharp

    Namespace Sample
        Friend Class Sample
            Shared Sub Main(ByVal args() As String)
                ConvertPdfBytesToHtml()
            End Sub

            Private Shared Sub ConvertPdfBytesToHtml()
                Dim pdfFile As String = Path.GetFullPath("..\..\..\simple text.pdf")
                Dim htmlFile As String = "Result.htm"

                ' ToHtml fills this list with the extracted images.
                ' This overload takes a List(Of SKBitmap) (SkiaSharp).
                Dim imgCollection As New List(Of SKBitmap)()

                Dim f As New SautinSoft.PdfFocus()
                f.HtmlOptions.IncludeImageInHtml = True
                f.HtmlOptions.Title = "Simple text"

                ' Load the PDF from a byte array instead of a file path.
                Dim pdf() As Byte = File.ReadAllBytes(pdfFile)
                f.OpenPdf(pdf)

                If f.PageCount > 0 Then
                    ' Convert every page; the result is Nothing if the conversion fails.
                    Dim htmlString As String = f.ToHtml(1, f.PageCount, imgCollection)
                    If htmlString IsNot Nothing Then
                        Console.WriteLine("After converting we've got {0} image(s):", imgCollection.Count)

                        ' Save the extracted images to the "Extracted Images" folder.
                        Dim imgDir As New DirectoryInfo("Extracted Images")
                        If Not imgDir.Exists Then
                            imgDir.Create()
                        End If

                        Dim count As Integer = 1
                        For Each img As SKBitmap In imgCollection
                            Console.WriteLine(vbTab & " {0,4} x {1,4} px", img.Width, img.Height)
                            Dim imageFileName As String = Path.Combine(imgDir.FullName, $"pict{count}.jpg")

                            ' Encode the bitmap as JPEG at quality 100.
                            Using stream As FileStream = File.Create(imageFileName)
                                img.Encode(stream, SKEncodedImageFormat.Jpeg, 100)
                            End Using
                            count += 1
                        Next img

                        ' Write the HTML to disk, then open the result and the image folder.
                        File.WriteAllText(htmlFile, htmlString)
                        System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(htmlFile) With {.UseShellExecute = True})
                        System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(imgDir.FullName) With {.UseShellExecute = True})
                    End If
                End If
            End Sub
        End Class
    End Namespace