Read text from PDF files in C# and VB.NET

SautinSoft.Pdf reads PDF files from C# and VB.NET applications at high speed: it can read the text of a 1,000-page PDF file (almost 500,000 words) in just 3 seconds.

Text extraction takes only a few lines of code. Using a simple API, you can extract the entire text content of a PDF file into a single string, ready for further processing.

The following example shows how to easily read the text content of each page of a PDF document.

Complete code

using System;
using System.IO;
using SautinSoft;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace Sample
{
    class Sample
    {
        /// <summary>
        /// Read the text of each page of a PDF document.
        /// </summary>
        /// <remarks>
        /// Details:
        /// </remarks>
        static void Main(string[] args)
        {
            // Before starting this example, please get a free 30-day trial key:

            // Apply the key here:
            // PdfDocument.SetLicense("...");
            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            // Load PDF Document.
            using (var document = PdfDocument.Load(pdfFile))
            {
                foreach (var page in document.Pages)
                {
                    // Write text from pdf file to console.
                    // (Assumes page.Content.ToString() returns the text of the page.)
                    Console.WriteLine(page.Content.ToString());
                }
            }
        }
    }
}
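
The page-by-page loop can also be used to collect the whole document into a single string for further processing. The following is only a minimal sketch under stated assumptions: it assumes that page.Content.ToString() returns the text of a page, and the TextStats helper, the ReadAllText class and the timing code are hypothetical additions for illustration, not part of SautinSoft.Pdf. Word counts and timings will vary with the document and the machine.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using SautinSoft.Pdf;

namespace Sample
{
    // Hypothetical helper, not part of SautinSoft.Pdf.
    static class TextStats
    {
        // Counts whitespace-separated tokens as a rough measure of "words".
        public static int CountWords(string text)
        {
            if (string.IsNullOrWhiteSpace(text))
                return 0;

            // A null separator array makes Split break on any whitespace.
            return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
        }
    }

    class ReadAllText
    {
        static void Main(string[] args)
        {
            // Same sample file as in the example above.
            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            var stopwatch = Stopwatch.StartNew();
            var allText = new StringBuilder();

            using (var document = PdfDocument.Load(pdfFile))
            {
                foreach (var page in document.Pages)
                {
                    // Assumption: page.Content.ToString() returns the page text.
                    allText.AppendLine(page.Content.ToString());
                }
            }

            stopwatch.Stop();

            // The entire document text is now available as one string.
            string text = allText.ToString();
            Console.WriteLine($"Words: {TextStats.CountWords(text)}");
            Console.WriteLine($"Elapsed: {stopwatch.ElapsedMilliseconds} ms");
        }
    }
}
```

Appending each page to a StringBuilder avoids repeated string concatenation, which matters for documents with many pages.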



Questions and suggestions from you are always welcome!

We have been developing .NET components since 2002. We know the PDF, DOCX, RTF, HTML, XLSX and image formats. If you need any assistance with creating, modifying or converting documents in various formats, we can help you. We will write any code example for you absolutely free.