PDF Scan File Size: What To Do About It.

Published at: 9/14/2024
Categories: productivity, docker, environment, efficiency
Author: realvorl

In today's digital age, we're constantly creating, sharing, and storing documents and media. While the cost of storage has dropped and internet speeds have skyrocketed, there's a hidden cost we often overlook—our environmental impact. A wonderful resource for insights on how our digital activities affect the environment is The Green Web Foundation, particularly their calculators.

The Environmental Cost of Large Files

It might not be immediately obvious, but the size of the files we share and store contributes to global energy consumption. Data centers, which store everything from photos to PDFs, use massive amounts of electricity. Smaller files mean less data transferred, processed, and stored—leading to a direct reduction in energy usage and, ultimately, CO₂ emissions.

The 'Ubuntools' Docker Image

This is where my project, the ubuntools Docker image, comes in handy. I initially created it as a basic toolkit for probing APIs and aggregating data, then expanded it to include tools for compressing PDFs and media files. Why? Because reducing file size doesn't just save space; it also helps reduce the environmental impact of our digital lives.

Why Document and Media Compression Matters

Even though cloud storage seems limitless and fiber internet offers instant downloads, the carbon footprint of transferring large files still matters. Every megabyte of data requires energy to transmit and store. By compressing documents and media, we can make a small but meaningful contribution to minimizing our environmental impact. Here's an example of how I incorporated this idea into ubuntools:

I planned to send an article from a magazine to a contact of mine as an email attachment since there was no online version to simply share the link. So, I fired up my flatbed scanner and scanned the three pages:

[Image: can't believe my eyes]

To my surprise, at 300 dpi the result was a 1.77 MB PDF file. That would not even have fit on a standard 1.44 MB floppy disk back in the day. Unacceptable!

I planned to do some editing of the files anyway, using GIMP (for image corrections, cropping, fine rotation, etc.), so I told myself, "Once the shading is gone and the colors are uniform, the PDF size will surely reduce."

BTW, if you're interested in a tutorial on editing PDFs with GIMP (all free and open source software), leave a comment. If there's enough interest, I'll write up a tutorial on the top 10 things you need and how to accomplish them using GIMP.

But then I got an even bigger surprise. After rotation, color correction, and exporting the layers as pages of the PDF, I felt like this:

[Image: how I felt]

Well... 💩 The file was now 11.4 MB—kind of going in the wrong direction!

So, I taught ubuntools some new tricks. Under the pdf-processing tag, you'll find a base Ubuntu Docker image with the following tools:

  • ghostscript
  • pdftk-java
  • poppler-utils

# start ubuntools in the directory where your big PDF files are
docker run -it --rm -v "$(pwd)":/work --workdir /work viorelpe/ubuntools:pdf-processing /bin/bash

# execute the following command inside the container
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=compressed.pdf original.pdf

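This command rewrites the scanned document with Ghostscript's /ebook preset, which downsamples the embedded images to roughly 150 dpi. That shrinks the file while keeping it comfortably readable on screen, which is perfect for email attachments or archiving. How much did it reduce the size? It came down to 0.71 MB, which is a considerable improvement.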
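
If you compress scans regularly, the command above can be wrapped in a small helper that also reports how much was saved. This is only a sketch: the function names compress_pdf and percent_saved are made up for illustration and are not part of ubuntools or Ghostscript, and the optional third argument assumes you want to try one of Ghostscript's other -dPDFSETTINGS presets (for example screen or printer).

```shell
#!/bin/sh
# Sketch only: compress_pdf and percent_saved are hypothetical helper names,
# not something shipped with ubuntools or Ghostscript.

# Print the percentage saved when going from size $1 to size $2.
# Both sizes must use the same unit (bytes, MB, ...).
percent_saved() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Run the article's Ghostscript command on one file and report the saving.
# $1 = input PDF, $2 = output PDF, $3 = optional preset name (default: ebook).
compress_pdf() {
  src="$1"; dst="$2"; preset="${3:-ebook}"
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 "-dPDFSETTINGS=/$preset" \
     -dNOPAUSE -dQUIET -dBATCH "-sOutputFile=$dst" "$src" || return 1
  before=$(wc -c < "$src")
  after=$(wc -c < "$dst")
  echo "$src: $before -> $after bytes, $(percent_saved "$before" "$after")% saved"
}
```

Inside the container, `compress_pdf original.pdf compressed.pdf` would then do the same job as the one-liner. Plugging in the sizes quoted in this article, `percent_saved 11.4 0.71` prints 93.8 and `percent_saved 1.77 0.71` prints 59.9.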

Here’s the finished product and the original side-by-side:

[Image: compare uncompressed and compressed]

Check the difference for yourself on GitHub.

Expanding with Media Compression

From here, it's easy to integrate other media compression utilities, such as FFmpeg, to reduce the size of videos and images. Combined with ubuntools, these tools make things really easy, because all you have to do is run two commands:

  1. Start ubuntools with the appropriate tools (via tag).
  2. Run a command and feed it your files.
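
For video, the second step could look roughly like the sketch below. Several things here are assumptions rather than facts from this article: the pdf-processing tag listed earlier does not mention FFmpeg, so you would need an image or tag that actually ships it; shrink_video is a made-up helper name; and the CRF value of 28 is just a starting point to tune by eye. The DRY_RUN switch only prints the command, so you can check it before running anything.

```shell
#!/bin/sh
# Sketch only: shrink_video is a hypothetical helper, and it assumes an
# environment where ffmpeg (with libx264 and the aac encoder) is installed.

# Re-encode a video with H.264 at a quality-based rate (CRF).
# $1 = input file, $2 = output file, $3 = optional CRF (higher = smaller file).
# Set DRY_RUN=1 to print the ffmpeg command instead of running it.
shrink_video() {
  in="$1"; out="$2"; crf="${3:-28}"
  set -- ffmpeg -i "$in" -c:v libx264 -crf "$crf" -preset slow \
         -c:a aac -b:a 96k "$out"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

A call such as `shrink_video holiday.mp4 holiday-small.mp4` re-encodes with the defaults, and passing a lower third argument (for example 23) trades a larger file for better quality.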

Conclusion: Think Small, Act Big

File size might seem trivial in an era of "unlimited" storage and bandwidth, but it's the small, cumulative actions that matter. Compress your files, shrink your media, and contribute to a greener future.
