Logo

dev-resources.site

for different kinds of informations.

Django 3.2 - News on compressed fixtures and fixtures compression

Published at
4/7/2021
Categories
benchmark
compression
django
python
Author
pauloxnet
Author
9 person written this
pauloxnet
open
Django 3.2 - News on compressed fixtures and fixtures compression

In the Django 3.2 version just released I contributed with new features related to compressed fixtures and fixtures compression. In this article I have explored into the topic and produced some sample benchmarks.

Management Commands

As reported in the documentation, the changes are related to the scope of the management commands.

  • loaddata now supports fixtures stored in XZ archives (.xz) and LZMA archives (.lzma).

  • dumpdata now can compress data in the bz2, gz, lzma, or xz formats.

loaddata

The loaddata command searches for and loads the contents of the named fixture into the database.

Compressed fixtures

In the Django 3.2 version was addes support for xz archives (.xz) and lzma archives (.lzma).

Fixtures may be compressed in zip, gz, bz2, lzma, or xz format.

For example $ django-admin loaddata mydata.json would look for any of mydata.json, mydata.json.zip, mydata.json.gz, mydata.json.bz2, mydata.json.lzma, or mydata.json.xz.

The first file contained within a compressed archive is used.

dumpdata

The dumpdata outputs all data in the database associated with some or installed applications. The output of dumpdata can be used as input for loaddata.

Fixtures compression

In the Django 3.2 version was addes support to dump data directly to a compressed file.

The output file can be compressed with one of the bz2, gz, lzma, or xz formats by ending the filename with the corresponding extension.

For example, to output the data as a compressed JSON file $ django-admin dumpdata -o mydata.json.gz

Benchmarks

After the development of the new fixtures compression function I carried out benchmarks for all supported file formats starting from different databases, from small projects to larger ones.

The benchmark were performed on my pc and are only examples of the relationship between time, file size, memory and cpu occupation that is needed to export data directly into different type of compressed files.

System info

import os
import platform

print(f"Architecture:\t{platform.architecture()[0]}")
print(f"Machine type:\t{platform.machine()}")
print(f"System glibc:\t{platform.libc_ver()[1]}")
print(f"System memory:\t{os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_PHYS_PAGES')}")
print(f"System release:\t{platform.release()}")
print(f"System type:\t{platform.system()}")
print(f"Python impl.:\t{platform.python_implementation()}")
print(f"Python version:\t{platform.python_version()}")
Enter fullscreen mode Exit fullscreen mode
Architecture:   64bit
Machine type:   x86_64
System glibc:   2.32
System memory:  33402449920
System release: 5.8.0-48-generic
System type:    Linux
Python impl.:   CPython
Python version: 3.8.6
Enter fullscreen mode Exit fullscreen mode

Benchmark 01

type time memory cpu size
txt 0.75 70300 99 826
gz 0.66 70920 99 312
bz2 0.69 70372 99 351
xz 0.67 86832 99 336

Benchmark 01

Benchmark 02

type time memory cpu size
txt 0.67 70260 99 1202
gz 0.66 70868 99 501
bz2 0.66 70560 99 538
xz 0.68 86860 99 532

Benchmark 02

Benchmark 03

type time memory cpu size
txt 1.03 72856 98 872126
gz 1.08 72904 99 30446
bz2 1.21 79024 99 20664
xz 1.14 96608 99 23252

Benchmark 03

Benchmark 04

type time memory cpu size
txt 1.53 71304 98 2138437
gz 1.60 72004 98 257593
bz2 1.71 77732 98 198347
xz 2.42 107252 99 164072

Benchmark 04

Benchmark 05

type time memory cpu size
txt 2.10 74240 98 5252952
gz 2.25 74236 98 405580
bz2 2.69 80592 98 334556
xz 3.22 137004 99 238432

Benchmark 05

Benchmark 06

type time memory cpu size
txt 55.31 87012 73 12092981
gz 71.97 87200 71 845193
bz2 53.74 93372 74 688968
xz 73.19 180936 73 768812

Benchmark 06

Benchmark 07

type time memory cpu size
txt 118.74 86344 74 36035128
gz 183.11 86572 71 3936656
bz2 158.76 93272 73 2719186
xz 220.65 181636 73 2586748

Benchmark 07

Benchmark 08

type time memory cpu size
txt 532.92 89192 79 394846146
gz 711.72 89944 77 94789125
bz2 673.47 96284 78 73823620
xz 1217.50 184724 79 64908128

Benchmark 08

Conclusions

From the benchmarks carried out with various starting data in exporting data directly to compressed files, it is clear that:

  • the xz format almost always produces the smallest files in the face of greater memory and cpu occupation
  • the gz and bz2 formats almost always have execution times comparable to saving on simple and uncompressed text files in the face of a strong reduction in the space occupied
  • the space gain in the generated compressed files compared to the uncompressed text file ranges from 55% to 98%
  • export execution times for compressed files are in the worst case (xz) about double the export in an uncompressed file and in the best case (gz) a tenth faster

The export of fixtures directly to compressed files therefore allows a strong reduction of the space occupied in the face of a small increase in the time and resources required for creation.

In addition there is the possibility for the user to choose the best file type for their use case, opting for maximum compression (xz) or for greater portability (gz).

External links

  • PR #12871: Added tests for loaddata with gzip/bzip2 compressed fixtures.
  • Ticket #31552: Loading lzma compressed fixtures.
  • PR #12879: Fixed #31552 -- Added support for LZMA and XZ fixtures to loaddata.
  • Ticket #32291: Add support for fixtures compression in dumpdata.
  • PR #13797: Fixed #32291 -- Added fixtures compression support to dumpdata.

License

This article and related presentation is released with Creative Commons Attribution ShareAlike license (CC BY-SA)

Original

Originally posted on my blog:

https://www.paulox.net/2021/04/06/django-32-news-on-compressed-fixtures-and-fixtures-compression/

compression Article's
30 articles in total
Favicon
AVIF File Format: The Evolution in Web Image Compression
Favicon
Lightweight, Transparent, Animated: Get All of Them by WebP Format
Favicon
Flutter Image Compression: Ensuring High Quality
Favicon
Compression Is About Knowing Your Data
Favicon
How to use compression in Node.js server for better bandwidth ?
Favicon
Dall.E Image Gen, And Size Comparison Of Image Formats
Favicon
When life gives you lemons ...
Favicon
Image Compression in JavaScript/TypeScript
Favicon
How to Use IMGCentury For Bulk & Unlimited Image Compressions?
Favicon
Created simple phone number compression functions.
Favicon
Exploring the LZ77 Compression Algorithm
Favicon
Ultimate 3D Compression: echo3D Introduces Compression Tools for 3D Content to Accelerate 3D Development
Favicon
Reduce Network Usage - With This 1 Simple Trick!
Favicon
How Finding the Right Compression Level Can Speed Up your Website
Favicon
Client-side image compression with Firebase and Compressor.js
Favicon
Client-side image compression with Supabase Storage
Favicon
COS Cost Optimization Solution
Favicon
Is there any way to compress the data while using mongo persistence with NEventStore?
Favicon
Search safe number compression algorithm(feat. Python)
Favicon
Golang HTTP Handler With Gzip
Favicon
How to use AVIF today!
Favicon
Large documents in redis: is it worth compressing them (Part 1)
Favicon
Open or create TAR files in C# with Aspose.ZIP
Favicon
Supercharge your API with Compression
Favicon
Django 3.2 - News on compressed fixtures and fixtures compression
Favicon
How I made my own file compressor using Node.js
Favicon
Compressed GraalVM Native Images: the best startup for Java apps comes in tiny packages
Favicon
What is lossy and lossless compression
Favicon
did you ever ask yourself how file compression works ? the maths behind greedy algos
Favicon
Huffman coding from scratch with Elixir

Featured ones: