How to convert DOCX to HTML in C# and .NET

RTF to HTML .Net

.NET assembly to convert Text, RTF and DOCX to HTML 3.2, HTML 4.01, XHTML and HTML5 in C# and .NET.

Introduction

With the help of "RTF to HTML .Net", any .NET application can easily convert DOCX documents to HTML and XHTML format. For example, to convert a DOCX to HTML in C# you will only need to add a reference to the .dll and type a few lines of code:

    // Path lives in the System.IO namespace.
    string inpFile = @"..\..\..\..\example.docx";
    string outfile = Path.GetFullPath("Result.html");

    RtfToHtml r = new RtfToHtml();
    r.Convert(inpFile, outfile, new HtmlFixedSaveOptions() { Title = "SautinSoft Example." });

The library provides a complete API for converting DOCX to HTML. During conversion you can also adjust the following:

  • Output format: HTML 3.2, HTML 4.01, HTML5 or XHTML.
  • Generate the output document in plain HTML 3.2 without CSS.
  • Store images on the file system or embed them into the HTML document using base64 encoding.
  • Save CSS either in a <style>...</style> block or as inline styles (<tag style="...">).
  • Specify the encoding of the output HTML.
  • Set the document title; create only the part of the HTML between the <body>...</body> tags.
  • Set a common font, size and color for the whole document.
  • Detect hyperlinks in text and turn them into real hyperlinks.
  • Override the visibility of table borders.
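
The hyperlink detection item in the list above can be illustrated without the library. The following is a minimal sketch, using only the .NET base class library, of what turning plain-text URLs into real hyperlinks involves. The class name LinkSketch, the method Linkify and the deliberately simple URL pattern are assumptions made for illustration; this is not the detection logic of «RTF to HTML .Net».

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

static class LinkSketch
{
    // Simplified pattern (assumption): an http(s) URL runs until the next
    // whitespace, quote or angle bracket. Trailing punctuation is not trimmed.
    private static readonly Regex UrlPattern =
        new Regex(@"https?://[^\s<>""]+", RegexOptions.IgnoreCase);

    public static string Linkify(string plainText)
    {
        var html = new StringBuilder();
        int pos = 0;

        foreach (Match m in UrlPattern.Matches(plainText))
        {
            // HTML-encode the ordinary text that precedes the URL.
            html.Append(WebUtility.HtmlEncode(plainText.Substring(pos, m.Index - pos)));

            // Encode the URL too, so that '&' in a query string stays valid markup.
            string url = WebUtility.HtmlEncode(m.Value);
            html.Append("<a href=\"").Append(url).Append("\">").Append(url).Append("</a>");

            pos = m.Index + m.Length;
        }

        // HTML-encode whatever follows the last URL.
        html.Append(WebUtility.HtmlEncode(plainText.Substring(pos)));
        return html.ToString();
    }
}
```

For example, LinkSketch.Linkify("See https://example.com now") returns: See <a href="https://example.com">https://example.com</a> now. A production-quality detector would also need to handle trailing punctuation, e-mail addresses and URLs without a scheme.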

Download

To see this functionality firsthand, download the latest «RTF to HTML .Net» with code examples (20.4 MB).

Limitations

The free version of «RTF to HTML .Net» has the following limitations: it adds the trial notice "Created by unlicensed version of RTF to HTML .Net" and inserts the word "TRIAL" at random.


Examples of converting DOCX to HTML in C# and VB.NET

1. Convert DOCX file to HTML file in C#:

    static void Main(string[] args)
    {
        ConvertDocxToHtml();
    }

    /// <summary>
    /// Convert DOCX file to HTML file.
    /// </summary>
    static void ConvertDocxToHtml()
    {
        string inpFile = @"..\..\..\..\example.docx";
        string outfile = Path.GetFullPath("Result.html");

        RtfToHtml r = new RtfToHtml();
        r.Convert(inpFile, outfile, new HtmlFixedSaveOptions() { Title = "SautinSoft Example." });

        // Open the result for demonstration purposes.
        System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(outfile) { UseShellExecute = true });
    }

2. Convert DOCX to HTML using MemoryStream in C#:

    // Same input and output paths as in example 1.
    string inpFile = @"..\..\..\..\example.docx";
    string outfile = Path.GetFullPath("Result.html");

    RtfToHtml r = new RtfToHtml();

    using (MemoryStream inpMS = new MemoryStream(File.ReadAllBytes(inpFile)))
    using (MemoryStream outMS = new MemoryStream())
    {
        r.Convert(inpMS, outMS, new HtmlFixedSaveOptions() { Title = "SautinSoft Example." });

        // Write the MemoryStream contents to a file so the result can be inspected.
        File.WriteAllBytes(outfile, outMS.ToArray());
    }
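
When the HTML is needed as a string rather than as a file (for instance, to return it from a web handler), the bytes in the output MemoryStream have to be decoded. The following is a minimal sketch using only the .NET base class library. The names StreamSketch and DecodeHtml are hypothetical, and the fallback encoding is an assumption: pass whichever encoding the conversion was configured to produce.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamSketch
{
    // Decodes the bytes written to a MemoryStream into a string.
    // A byte order mark, if present, selects the encoding and is stripped;
    // otherwise the supplied fallback encoding is used.
    public static string DecodeHtml(MemoryStream output, Encoding fallback)
    {
        // ToArray copies everything written so far, independent of the
        // stream position, and still works after the stream has been closed.
        using (var copy = new MemoryStream(output.ToArray()))
        using (var reader = new StreamReader(copy, fallback, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the example above this would be called as StreamSketch.DecodeHtml(outMS, Encoding.UTF8) after r.Convert(...) returns, assuming UTF-8 output.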

3. Convert DOCX to HTML in C#; Specify CSS Stream:

    // inpFile, outFile and cssFile are paths supplied by the caller.
    RtfToHtml r = new RtfToHtml();

    // Create a separate file to store the CSS.
    FileStream fs = new FileStream(cssFile, FileMode.Create);

    HtmlFlowingSaveOptions opt = new HtmlFlowingSaveOptions()
    {
        CssStream = fs,
        KeepCssStreamOpen = false,
        CssExportMode = CssExportMode.External,
        CssFileName = cssFile,
        Title = "Working with CSS."
    };

    try
    {
        r.Convert(inpFile, outFile, opt);
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Conversion failed! {ex.Message}");
    }
    finally
    {
        // Disposing twice is harmless, so release the file handle even if the conversion failed.
        fs.Dispose();
    }
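
4. Convert DOCX to HTML in C#; Set directory to store images:

    // inpFile, outFile and imgDir are paths supplied by the caller.
    RtfToHtml r = new RtfToHtml();

    // Set the images directory.
    HtmlFixedSaveOptions opt = new HtmlFixedSaveOptions()
    {
        ImagesDirectoryPath = Path.Combine(imgDir, "Result_images"),
        ImagesDirectorySrcPath = "Result_images",
        // Store images as physical files on the local drive instead of embedding them.
        EmbedImages = false
    };

    // Run the conversion with these options, as in the previous examples.
    r.Convert(inpFile, outFile, opt);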
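
For contrast with storing images as files, the sketch below shows what embedding an image with base64 encoding means in HTML terms: the image bytes are placed directly in the src attribute as a data URI. It uses only the .NET base class library. The names DataUriSketch and ToDataUri are hypothetical, and this is an illustration of the general technique rather than the library's own implementation.

```csharp
using System;

static class DataUriSketch
{
    // Builds a data URI such as "data:image/png;base64,...." from raw image bytes.
    // The caller supplies the MIME type, for example "image/png" or "image/jpeg".
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }

    // Wraps the data URI in an <img> tag, so the HTML needs no separate image file.
    public static string ToImgTag(byte[] imageBytes, string mimeType)
    {
        return "<img src=\"" + ToDataUri(imageBytes, mimeType) + "\" />";
    }
}
```

Embedded images keep the HTML self-contained at the cost of a larger file, since base64 encoding expands the data by roughly one third. Separate image files keep the HTML small but must be deployed alongside it.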

Requirements and Technical Information

«RTF to HTML .Net» runs on 32-bit and 64-bit platforms with .NET Framework 4.5 and higher or .NET Core 2.0 and higher. The component doesn't require Internet Explorer, Microsoft Office or any other software; it is a completely standalone library.

DOCX conversion requires .NET Framework 4.5 or higher, or .NET Core 2.0 or higher. If you are looking for a standalone C# library to create and parse Word documents, try our Document .Net.

Our product is compatible with all .NET languages and supports all operating systems where .NET Framework can be used. Note that «RTF to HTML .Net» is written entirely in managed C#.

.NET Framework 4.0 and higher and .NET Core 2.0 and higher

.NET Framework 4.5, 4.6.1 and higher.

.NET Standard 2.0

.NET Core, .NET 5.0 and higher.
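
To check which runtime and bitness an application is actually running on, the standard .NET APIs can be queried at startup. This is a minimal sketch; the names RuntimeSketch and Describe are hypothetical. Note that RuntimeInformation is built into .NET Core and .NET Framework 4.7.1 and later; on older .NET Framework versions it comes from the System.Runtime.InteropServices.RuntimeInformation NuGet package.

```csharp
using System;
using System.Runtime.InteropServices;

static class RuntimeSketch
{
    // Returns a one-line description of the current runtime, process bitness and OS.
    public static string Describe()
    {
        string bitness = Environment.Is64BitProcess ? "64-bit" : "32-bit";
        return RuntimeInformation.FrameworkDescription
            + ", " + bitness + " process, "
            + RuntimeInformation.OSDescription;
    }
}
```

Writing RuntimeSketch.Describe() to the console or a log makes it easy to compare the environment against the requirements listed above.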


The component is multi-platform.


Our component has proven itself on cloud platforms and services:

  • Microsoft Azure
  • Amazon Web Services (AWS)
  • Google Cloud Platform
  • SharePoint
  • Docker
  • Xamarin Forms
  • etc.