Load a document


Document .Net supports these formats:

To load a document from a file, a single line is enough:


            //It's easy to load any document.
            DocumentCore dc = DocumentCore.Load(@"d:\Book.pdf");
DocumentCore is root class, it represents a document itself.

In this example, the method Load() detects that a loadable document is PDF from the extension ".pdf".

You can also explicitly set the type of loadable document as second parameter. For example, PdfLoadOptions or DocxLoadOptions or another:


DocumentCore dc = DocumentCore.Load(@"d:\Book.pdf", new PdfLoadOptions()            
{
	// 'false' - means to load vector graphics as is. Don't transform it to raster images.
    RasterizeVectorGraphics = false,

	// The PDF format doesn't have real tables, in fact it's a set of orthogonal graphic lines.
	// In case of 'true' the component will detect and recreate tables from graphic lines.
	DetectTables = false,

	// 'Disabled' - Never load embedded fonts in PDF. Use the fonts with the same name installed at the system or similar by font metrics.
	// 'Enabled' - Always load embedded fonts in PDF.
	// 'Auto' - Load only embedded fonts missing in the system. In other case, use the system fonts.
    PreserveEmbeddedFonts = PropertyState.Auto
});

All load options are derived from the base abstract class LoadOptions.

After loading you get the full Tree Of Objects and can do anything you want: Find, Replace, Remove, Insert, Modify, Save to another format.

Document .Net provides you by the full tree of its objects after the loading document.
 

Load from a Stream is also straightforward:


            // Let us say we already have a DOCX document as array of bytes.        
            DocumentCore dc = null;
            using (MemoryStream docxStream = new MemoryStream(docxBytes))
            {
                dc = DocumentCore.Load(docxStream, new DocxLoadOptions());
            }
            // Here we can do with our document 'dc' anything we need.
 

Complete code

using System.IO;
using SautinSoft.Document;
using System;

namespace Example
{
    class Program
    {
        static void Main(string[] args)
        {
            LoadFromFile();
            //LoadFromStream();
            //LoadFromBytes()
        }

        /// <summary>
        /// Loads a document into DocumentCore (dc) from a file.
        /// </summary>
        /// <remarks>
        /// Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
        /// </remarks>
        static void LoadFromFile()
        {
            string filePath = @"..\..\example.docx";
            // The file format is detected automatically from the file extension: ".docx".
            // But as shown in the example below, we can specify DocxLoadOptions as 2nd parameter
            // to explicitly set that a loadable document has Docx format.
            DocumentCore dc = DocumentCore.Load(filePath);

            if (dc!=null)
                Console.WriteLine("Loaded successfully!");

            Console.ReadKey();
        }

        /// <summary>
        /// Loads a document into DocumentCore (dc) from a Stream.
        /// </summary>
        /// <remarks>
        /// Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
        /// </remarks>
        static void LoadFromStream()
        {
            // We've knowingly created an empty DocumentCore instance before "Using {}"
            // to continue work with it after stream will be closed.
            DocumentCore dc = null;
            using (FileStream fs = new FileStream(@"..\..\example.docx", FileMode.Open))
            {

                // Here we explicitly set that a loadable document is Docx.
                dc = DocumentCore.Load(fs, new DocxLoadOptions());
            }
            if (dc != null)
                Console.WriteLine("Loaded successfully!");

            Console.ReadKey();
        }

        /// <summary>
        /// Loads a document into DocumentCore (dc) from an array of bytes.
        /// </summary>
        /// <remarks>
        /// Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
        /// </remarks>
        static void LoadFromBytes()
        {
            // Get document bytes from a file.
            byte[] fileBytes = File.ReadAllBytes(@"..\..\example.pdf");

            DocumentCore dc = null;
            using (MemoryStream ms = new MemoryStream(fileBytes))
            {

                // With PdfLoadOptions we explicitly set that a loadable document is PDF.
                PdfLoadOptions pdfLO = new PdfLoadOptions()
                {

                    // 'false' - means to load vector graphics as is. Don't transform it to raster images.
                    RasterizeVectorGraphics = false,

                    // The PDF format doesn't have real tables, in fact it's a set of orthogonal graphic lines.
                    // In case of 'true' the component will detect and recreate tables from graphic lines.
                    DetectTables = false,

                    // 'Disabled' - Never load embedded fonts in PDF. Use the fonts with the same name installed at the system or similar by font metrics.
					// 'Enabled' - Always load embedded fonts in PDF.
					// 'Auto' - Load only embedded fonts missing in the system. In other case, use the system fonts.
                    PreserveEmbeddedFonts = PropertyState.Auto,

                    // Load only first 2 pages from the document.
                    PageIndex = 0,
                    PageCount = 2
                };
                dc = DocumentCore.Load(ms, pdfLO);
            }
            if (dc != null)
                Console.WriteLine("Loaded successfully!");

            Console.ReadKey();
        }
    }
}

Download.

        
            Imports System
Imports System.IO
Imports SautinSoft.Document

Module Sample
    Sub Main()
        LoadFromFile()
        'LoadFromStream()
        'LoadFromBytes()
    End Sub

    ''' <summary>
    ''' Loads a document into DocumentCore (dc) from a file.
    ''' </summary>
    ''' <remarks>
    ''' Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
    ''' </remarks>
    Sub LoadFromFile()
        Dim filePath As String = "..\example.docx"
        ' The file format is detected automatically from the file extension: ".docx".
        ' But as shown in the example below, we can specify DocxLoadOptions as 2nd parameter
        ' to explicitly set that a loadable document has Docx format.
        Dim dc As DocumentCore = DocumentCore.Load(filePath)

        If dc IsNot Nothing Then
            Console.WriteLine("Loaded successfully!")
            Console.ReadKey()
        End If
    End Sub

    ''' <summary>
    ''' Loads a document into DocumentCore (dc) from a Stream.
    ''' </summary>
    ''' <remarks>
    ''' Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
    ''' </remarks>
    Sub LoadFromStream()
        ' We've knowingly created an empty DocumentCore instance before "Using {}"
        ' to continue work with it after stream will be closed.
        Dim dc As DocumentCore = Nothing
        Using fs As New FileStream("..\example.docx", FileMode.Open)

            ' Here we explicitly set that a loadable document is Docx.
            dc = DocumentCore.Load(fs, New DocxLoadOptions())
        End Using
        If dc IsNot Nothing Then
            Console.WriteLine("Loaded successfully!")
            Console.ReadKey()
        End If
    End Sub

    ''' <summary>
    ''' Loads a document into DocumentCore (dc) from an array of bytes.
    ''' </summary>
    ''' <remarks>
    ''' Details: https://www.sautinsoft.com/products/document/help/net/developer-guide/load-document.php
    ''' </remarks>
    Sub LoadFromBytes()
        ' Get document bytes from a file.
        Dim fileBytes() As Byte = File.ReadAllBytes("..\example.pdf")

        Dim dc As DocumentCore = Nothing
        Using ms As New MemoryStream(fileBytes)

            ' With PdfLoadOptions we explicitly set that a loadable document is PDF.
            Dim pdfLO As New PdfLoadOptions() With {
            .RasterizeVectorGraphics = False,
            .DetectTables = False,
            .PreserveEmbeddedFonts = PropertyState.Auto,
            .PageIndex = 0,
            .PageCount = 2
        }

        ' RasterizeVectorGraphics = False
        ' This means to load vector graphics as is. Don't transform it to raster images.

        ' DetectTables = False
        ' This means don't detect tables.
        ' The PDF format doesn't have real tables, in fact it's a set of orthogonal graphic lines.
        ' Set it to 'True' and the component will detect and recreate tables from graphic lines.

        'PreserveEmbeddedFonts = PropertyState.Auto
        ' True - Means to load embedded fonts from PDF document, 
        ' even if the font with the same name is installed in your System.
			
            dc = DocumentCore.Load(ms, pdfLO)
        End Using
        If dc IsNot Nothing Then
            Console.WriteLine("Loaded successfully!")
            Console.ReadKey()
        End If
    End Sub
End Module

Download.


If you need a new code example or have a question: email us at support@sautinsoft.com or ask at Online Chat (right-bottom corner of this page) or use the Form below:



Questions and suggestions from you are always welcome!

We are developing .Net components since 2002. We know PDF, DOCX, RTF, HTML, XLSX and Images formats. If you need any assistance with creating, modifying or converting documents in various formats, we can help you. We will write any code example for you absolutely free.