Exporting Table Data to TXT Files with C# and .NET

Working with PDF documents often requires extracting data from tables and exporting it to text files for further analysis or processing. In this article, we will look at how to use C# and .NET perform these tasks using the SautinSoft PDF.Net library. Extracting data from tables in PDF documents can be useful for analyzing data, converting it to other formats, or for further processing.

After extracting data from tables, it may be necessary to export them to a text file for further analysis or processing. Let's look at an example code that demonstrates how to find and extract tables from a PDF document, as well as how to export table data to a text file:

  1. Add SautinSoft.PDF from NuGet.
  2. Load a PDF document.
  3. Find tables.
  4. Get content data from tables.
  5. Save the txt document format.

Input file:

Output result:

Complete code

using System;
using System.Globalization;
using System.IO;
using System.Reflection;
using System.Text.Json;
using SautinSoft;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace Sample
{
    class Sample
    {
        /// <summary>
        /// Find Tables
        /// </summary>
        /// <remarks>
        /// Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/export-data-from-table-to-txt.php
        /// </remarks>
        static void Main(string[] args)
        {
            // Before starting this example, please get a free 100-day trial key:
            // https://sautinsoft.com/start-for-free/

            // Apply the key here:
            // PdfDocument.SetLicense("...");

            string pdfFile = Path.GetFullPath(@"..\..\..\tables.pdf");
            var writer = new StringWriter(CultureInfo.InvariantCulture);

            using (var document = PdfDocument.Load(pdfFile))
            {
                // Find Tables.
                var tables = document.Pages[0].Content.FindTables();

                string format = "{0,-20}|{1,-20}", separator = new string('-', 40);

                // Get text from tables.
                foreach (var table in tables) 
                {
                    foreach (var row in table.Rows)
                    {
                        writer.WriteLine(format, row.Cells[0].ToString(), row.Cells[1].ToString());
                        writer.WriteLine(separator);
                    }
                    writer.WriteLine();
                }
            }

            var file = new FileStream("Output.txt", FileMode.Create);
            StreamWriter streamWriter = new StreamWriter(file);
            streamWriter.WriteLine(writer.ToString());
            streamWriter.Close();
            file.Close();

            System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo("Output.txt") { UseShellExecute = true });
        }
    }
}

Download

Option Infer On

Imports System
Imports System.Globalization
Imports System.IO
Imports System.Reflection.Metadata
Imports System.Text.Json
Imports SautinSoft
Imports SautinSoft.Pdf
Imports SautinSoft.Pdf.Content

Namespace Sample
	Friend Class Sample
		''' <summary>
		''' Find Tables
		''' </summary>
		''' <remarks>
		''' Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/export-data-from-table-to-txt.php
		''' </remarks>
		Shared Sub Main(ByVal args() As String)
			' Before starting this example, please get a free 100-day trial key:
			' https://sautinsoft.com/start-for-free/

			' Apply the key here:
			' PdfDocument.SetLicense("...");

			Dim pdfFile As String = Path.GetFullPath("..\..\..\tables.pdf")
			Dim writer = New StringWriter(CultureInfo.InvariantCulture)

			Using document = PdfDocument.Load(pdfFile)
				' Find Tables.
				Dim tables = document.Pages(0).Content.FindTables()

				Dim format As String = "{0,-20}|{1,-20}", separator As New String("-"c, 40)

				' Get text from tables.
				For Each table In tables
					For Each row In table.Rows
						writer.WriteLine(format, row.Cells(0).ToString(), row.Cells(1).ToString())
						writer.WriteLine(separator)
					Next row
					writer.WriteLine()
				Next table
			End Using

			Dim file = New FileStream("Output.txt", FileMode.Create)
			Dim streamWriter As New StreamWriter(file)
			streamWriter.WriteLine(writer.ToString())
			streamWriter.Close()
			file.Close()

			System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo("Output.txt") With {.UseShellExecute = True})
		End Sub
	End Class
End Namespace

Download


If you need a new code example or have a question: email us at support@sautinsoft.com or ask at Online Chat (right-bottom corner of this page) or use the Form below:



Questions and suggestions from you are always welcome!

We are developing .Net components since 2002. We know PDF, DOCX, RTF, HTML, XLSX and Images formats. If you need any assistance with creating, modifying or converting documents in various formats, we can help you. We will write any code example for you absolutely free.