The Struggle of Finding a Free Excel to PDF Converter: My Journey and Solution

Published at: 1/12/2025
Categories: go, libreoffice, excel, pdf
Author: wteja

Converting Excel files to PDF is a common task in many projects, whether for generating reports, sharing data, or creating documents. Like many developers, I initially believed this would be an easy task to automate. However, my search for a free, reliable solution turned into a frustrating journey filled with limitations, compatibility issues, and expensive tools.

Here’s how I overcame these challenges, built my own Excel-to-PDF converter, and made it available as an open-source tool for others who may be struggling like I did.


The Frustration

Commercial Tools

My initial search brought me to paid solutions like Aspose.Cells, Syncfusion, and others. While they offered robust features, they came with steep licensing costs—well beyond what I could justify for small or personal projects.

Online Services

Free online converters seemed like a promising alternative, but they were unsuitable for automation. These tools often raised privacy concerns (since files are uploaded to third-party servers), had file size limits, and didn’t provide programmatic APIs.

Open-Source Libraries

I also explored open-source libraries, but most lacked the ability to convert Excel files to PDF. Even those that did were either unreliable or didn’t support modern Microsoft Office formats.


Discovering LibreOffice in Headless Mode

After weeks of searching, I stumbled upon the idea of using LibreOffice in headless mode. LibreOffice is a free, open-source office suite that can convert various file formats, including Excel, to PDF. When run in headless mode, it starts without a graphical interface and is driven entirely from the command line, making it well suited to automation.
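To make the idea concrete, here is a minimal Go sketch of a headless conversion. It is not the project's actual code. It assumes the LibreOffice binary is on the PATH as soffice (some installs name it libreoffice), and the helper names convertArgs, pdfName, and convertToPDF, as well as the two-minute timeout, are made up for illustration.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"time"
)

// convertArgs builds the LibreOffice arguments for a headless
// conversion of input to PDF, with the result written into outDir.
func convertArgs(input, outDir string) []string {
	return []string{"--headless", "--convert-to", "pdf", "--outdir", outDir, input}
}

// pdfName returns the file name LibreOffice is expected to produce:
// the input's base name with its extension replaced by .pdf.
func pdfName(input string) string {
	base := filepath.Base(input)
	return strings.TrimSuffix(base, filepath.Ext(base)) + ".pdf"
}

// convertToPDF runs the conversion and returns the path of the PDF.
// The binary name "soffice" and the timeout are assumptions.
func convertToPDF(input, outDir string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()

	cmd := exec.CommandContext(ctx, "soffice", convertArgs(input, outDir)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return "", fmt.Errorf("libreoffice failed: %w: %s", err, out)
	}
	return filepath.Join(outDir, pdfName(input)), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: convert <file>")
		return
	}
	pdf, err := convertToPDF(os.Args[1], ".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote", pdf)
}
```

The timeout is there because a conversion that hangs would otherwise block the caller indefinitely.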


How My Solution Works

To make this approach developer-friendly, I built a lightweight Go-based HTTP server that acts as a REST API. This server wraps LibreOffice’s functionality and allows any programming language to interact with it via HTTP requests.

Key Features

  1. Multiple File Format Support: Accepts .xlsx, .xls, .csv, .docx, .pptx, and more.
  2. Automatic Cleanup: Temporary files are automatically deleted after one hour to save disk space.
  3. Custom Fonts: You can mount custom fonts by cloning the GitHub repository or using Docker volumes.
  4. Cross-Language Integration: Works with any programming language that supports HTTP.

The Temporary Directory Approach

Instead of relying on the system’s temporary directory, I opted to use a custom ./tmp directory. This ensures consistent behavior, as system temp directories sometimes have unpredictable permissions.


Implementation Details

How It Works

  1. File Upload: Clients send the Excel file to the /convert endpoint in a POST request, as a multipart form field named file.
  2. Temporary Storage: The server saves the file in the ./tmp directory with a timestamp-based filename.
  3. Conversion: LibreOffice is called in headless mode to convert the file to PDF and save the result in the same directory.
  4. File Cleanup: A background goroutine deletes files older than one hour.
  5. Response: The converted PDF is returned as the HTTP response.

Getting Started

GitHub Repository

You can find the source code at https://github.com/wteja/pdf-converter.

Docker Image

The project is also available as a Docker image: wteja/pdf-converter.

Running the Docker Container

docker pull wteja/pdf-converter
docker run -p 5000:5000 wteja/pdf-converter

Examples of Integrating with Other Languages

Since the service is exposed via HTTP, you can use any programming language to interact with it.

C#

using var client = new HttpClient();
var fileContent = new ByteArrayContent(File.ReadAllBytes("example.xlsx"));
using var formData = new MultipartFormDataContent { { fileContent, "file", "example.xlsx" } };

using var response = await client.PostAsync("http://localhost:5000/convert", formData);
// Throw on a non-2xx status so an error body is never saved as output.pdf.
response.EnsureSuccessStatusCode();
var pdfBytes = await response.Content.ReadAsByteArrayAsync();
File.WriteAllBytes("output.pdf", pdfBytes);

Node.js

const axios = require("axios");
const FormData = require("form-data");
const fs = require("fs");

const form = new FormData();
form.append("file", fs.createReadStream("example.xlsx"));

// responseType "arraybuffer" keeps the PDF bytes intact; by default axios
// decodes the body as text, which corrupts binary data.
axios.post("http://localhost:5000/convert", form, {
  headers: form.getHeaders(),
  responseType: "arraybuffer",
})
  .then(response => fs.writeFileSync("output.pdf", response.data))
  .catch(console.error);

Python

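import requests

with open("example.xlsx", "rb") as f:
    response = requests.post("http://localhost:5000/convert", files={"file": f})

# Stop on a 4xx/5xx status so an error body is not saved as output.pdf.
response.raise_for_status()

with open("output.pdf", "wb") as f:
    f.write(response.content)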

Go

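package main

import (
    "bytes"
    "io"
    "log"
    "mime/multipart"
    "net/http"
    "os"
)

func main() {
    file, err := os.Open("example.xlsx")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    // Build the multipart body with the spreadsheet in the "file" field.
    body := &bytes.Buffer{}
    writer := multipart.NewWriter(body)
    part, err := writer.CreateFormFile("file", "example.xlsx")
    if err != nil {
        log.Fatal(err)
    }
    if _, err := io.Copy(part, file); err != nil {
        log.Fatal(err)
    }
    // Close writes the trailing boundary; the body is incomplete without it.
    if err := writer.Close(); err != nil {
        log.Fatal(err)
    }

    resp, err := http.Post("http://localhost:5000/convert", writer.FormDataContentType(), body)
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    // Assumes the service answers 200 on success.
    if resp.StatusCode != http.StatusOK {
        log.Fatalf("conversion failed: %s", resp.Status)
    }

    out, err := os.Create("output.pdf")
    if err != nil {
        log.Fatal(err)
    }
    defer out.Close()
    if _, err := io.Copy(out, resp.Body); err != nil {
        log.Fatal(err)
    }
}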

Challenges and Trade-Offs

Image Size

The Docker image is 2.67 GB because of the dependencies LibreOffice requires. I tested smaller base images such as Alpine, but they shipped with an older version of LibreOffice that wasn’t compatible with modern Microsoft Office formats. Debian offered the latest LibreOffice but resulted in an even larger image (~3 GB).

Why It’s Worth It

The large image size is a reasonable trade-off when compared to the cost of commercial solutions. Once set up, the image can be reused across multiple projects without any additional licensing fees.


Conclusion

The frustration of finding a free Excel-to-PDF converter led me to build my own solution using LibreOffice in headless mode. While it’s not perfect, it’s free, reliable, and flexible. If you’re facing the same challenge, I hope this project saves you time and effort.

Check out the project on GitHub or pull the Docker image from Docker Hub. Let me know how it works for you or if you have suggestions for improvement.
