Logo

dev-resources.site

for different kinds of informations.

A Comparison of Database Drivers for MySQL

Published at
10/18/2021
Categories
database
connectivity
integration
data
Author
jerodimusprime
Author
14 person written this
jerodimusprime
open
A Comparison of Database Drivers for MySQL

MySQL is "The world's most popular open source database" and it is widely-used to store and access your data. In fact, there are many cloud and on-premise databases (like MemSQL and Google Cloud SQL) that use the MySQL interface. Several native, open-source drivers are available for connecting to your MySQL data from other applications.

In this article, we will compare the read and write performance of the native MySQL Connectors (JDBC and ODBC)1 and the CData Software JDBC and ODBC Drivers for MySQL2 when working with larger datasets. We show how the CData Driver is able to read large data sets approximately 20% faster than the native driver and write 1 million rows nearly 40% faster than the native drivers.

The Data

In order to provide a reproducible comparison for reading data, we copied the US Amazon book reviews from the Amazon Customer Reviews dataset3 into a MySQL database. The details for the amazon_book_reviews table are below:

Table Size : 9.7 GB

Table Number of Rows : 10,534,179

Number of Columns : 15

# JDBC Driver Read Performance

The main goal of this investigation was to compare the related performance of the JDBC drivers. We did this by running the same set of queries with each JDBC driver. We used LIMIT clauses to change the size of the dataset returned in each query, but we requested the same columns for each query.

Base Query

SELECT

marketplace,

customer_id,

review_id,

product_id,

product_parent,

product_title,

product_category,

star_rating,

helpful_votes,

total_votes,

vine,

verified_purchase,

review_headline,

review_body,

review_date,

FROM

cdata.amazon_book_reviews;

Limit by Query

  1. LIMIT 1000000
  2. Full Dataset

To test the drivers, we connected to MySQL using a basic Java application and executed the above queries repeatedly. The results were read and stored in a new variable (based on the datatype) for each column in each row.

Query Times by Driver (in seconds)

Query | CData JDBC Driver | MySQL Connector/J
1 (1,000,000 rows) 14.02 (+26%) 17.67
2 (~10,000,000 rows) 171.09 (+20.7%) 206.42

As can be seen in the results, the CData Driver regularly outperformed the native MySQL Driver, largely due to the way the CData JDBC Driver uses client-side resources.

JDBC Driver Resource Usage

While testing the read performance of the JDBC drivers, we also measured client-side resource usage, looking specifically at memory. The charts below were found by running a sample Java program and using Java VisualVM to capture the memory usage. We used Java OpenJDK version 12.0.2 with a maximum heap size of 8 Gigabytes.

For this comparison, we ran the query for the full dataset.

CData Driver

Image description

Native Driver*

Image description

  • Note the change in scale for the Heap graph.

The CData Driver averages approximately 150 MB of heap usage for the duration of the run. However, the native driver continues to use more and more client resources for the duration of the run. In fact, we had to set the max Heap size to 8 GB in order for the native MySQL Driver to complete its execution. Despite the fact that the native driver is using significantly more resources, it still takes longer to read the data than it takes the CData JDBC Driver.

ODBC Driver Read Performance

The main goal of this investigation was to compare the related performance of the ODBC Drivers. We did this by running the same set of queries with each ODBC Driver. We used LIMIT clauses to change the size of the dataset returned in each query, but we requested the same columns for each query.

Base Query

SELECT

marketplace,

customer_id,

review_id,

product_id,

product_parent,

product_title,

product_category,

star_rating,

helpful_votes,

total_votes,

vine,

verified_purchase,

review_headline,

review_body,

review_date,

FROM

cdata.amazon_book_reviews;

Limit by Query

  1. LIMIT 1000000
  2. Full Dataset

To test the drivers, we connected to MySQL using a basic C++ application and executed the above queries repeatedly. The results were read and stored in a new variable (based on the datatype) for each column in each row.

Query Times by Driver (in seconds)

Query | CData ODBC Driver | MySQL Connector/ODBC
1 (1,000,000 rows) 14.95 (+96.7%) 29.40
2 (~10,000,000 rows) 177.51 (+72.3%) 305.77

As can be seen in the results, the CData ODBC Driver regularly outperformed the MySQL Connector/ODBC, largely due to better use of available client resources.

ODBC Driver Resource Usage

While testing the read performance of the ODBC drivers, we also measured client-side resource usage, looking specifically at processing capacity and network bandwidth. The charts below were found by executing the simple C++ application and using the Windows Resource Monitor.

For this comparison, we ran the query for the full dataset.

CData Driver

Image description

Native Driver

Image description

Both drivers appear to use 20 - 25% of the available processing capacity, with CData using slightly more. The larger difference is in the network bandwidth used, where the CData Driver uses around 250Mbps compared to the 100Mbps used by the native driver.

Million Row Challenge

In addition to measuring read performance, we also want to compare the write performance of the drivers. In short, the CData MySQL JDBC Driver is able to write 1 million rows nearly 40% faster than the native MySQL Connector.

We used a simple Java program to add the rows to a copy of the amazon_book_reviews table referenced above.3. For our testing, we inserted the data in 100 batches of 10,000 rows.

Sample Code

//one batch

Connection connection = DriverManager.getConnection("jdbc:mysql:server=" + myServer + ";port=" + myPort + ";user=" + myUser + ";password=" + myPassword + ";database=" + myDatabse + ";");

String cmd = "INSERT INTO cdata.amazon_book_reviews_insert (marketplace, customer_id, review_id, product_id, product_parent, product_title, product_category, star_rating, helpful_votes, total_votes, vine, verified_purchase, review_headline, review_body, review_date) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";

PreparedStatement pstmt = connection.prepareStatement(cmd);

for (int row = 0; row < rows.length; row++){

pstmt.setString(1, marketplace);

pstmt.setLong(2, customer_id);

pstmt.setString(3, review_id);

pstmt.setLong(4, product_id);

pstmt.setLong(5, product_parent);

pstmt.setString(6, product_title);

pstmt.setString(7, product_category);

pstmt.setLong(8, star_rating);

pstmt.setLong(9, helpful_votes);

pstmt.setLong(10, total_votes);

pstmt.setString(11, vine);

pstmt.setString(12, verified_purchase);

pstmt.setString(13, review_headline);

pstmt.setString(14, review_body);

pstmt.setDate(15, review_date);

pstmt.addBatch();

}

int[] affected = pstmt.executeBatch();

Results

Time (in Seconds) to Insert 1m Rows

CData JDBC Driver | MySQL Connector/J
77.6 127.9

Conclusion

Most technologies that emerge are all about being more efficient and providing more functionality in a smaller package. If you can meet your data management needs with fewer tools, then it’s a win-win for cost-effectiveness, efficiency, and ease of use.

The CData Driver offers better querying of large datasets over the native connector, processing the largest dataset at least 20% faster by making better use of the available client resources.

When it comes to inserting data, the CData Driver stands apart in its ability to rapidly insert large sets of data. The CData JDBC Driver is able to insert 1 million rows in under 80 seconds, nearly 40% faster than the native connector.

connectivity Article's
30 articles in total
Favicon
Service VS Private Endpoint
Favicon
Load Balancer
Favicon
8 EASY WAYS ON HOW TO CREATE A GOOGLE MAIL (GMAIL) ACCOUNT IN 2024
Favicon
Does Technology Affect Human Communication?
Favicon
Access Points: Enhancing Wireless Connectivity for Businesses and Homes
Favicon
So You Work in Private VPCs and Want CLI Access to Your Linux EC2s?
Favicon
Wireless Network Test Equipment Market to Surpass USD 9.06 Billion by 2030 Owing to Enhanced Connectivity
Favicon
A Comprehensive Comparison of Cisco Routing Protocols: Making the Right Choice for Your Network
Favicon
How Passpoint Delivers a Seamless Wi-Fi Experience at Airports
Favicon
How to connect to MySQL DB
Favicon
The Rise of 5G Technology
Favicon
Easy win WIFI set up on Ubuntu
Favicon
A Technical Sneak Peek into Camunda’s Connector Architecture
Favicon
Access All Your Data in Google Data Studio with CData Connect Cloud
Favicon
Analyze Live Salesforce Data in Infragistics Reveal
Favicon
Editing API Driver API Profiles
Favicon
Connecting to Amazon DocumentDB with MongoDB Drivers
Favicon
SQL Access to Redis Data
Favicon
Perform API Operations Using Stored Procedures in CData SSIS Components
Favicon
A Comparison of Database Drivers for MySQL
Favicon
Creating API Driver API Profiles
Favicon
Run Queries Across Many Data Sources at Once with the CData Query Federation Driver
Favicon
Introducing the new OEE App for Cumulocity IoT
Favicon
CData AWS Glue Connector for Salesforce Deployment Guide
Favicon
Use the CData SSIS Components to Insert New or Update Existing Salesforce Records from SQL Server
Favicon
Easily Integrate with Any RESTful API Using the CData API Driver
Favicon
Leading-Edge NoSQL Drivers for NoSQL Analytics & Integration
Favicon
CData Connect: Derived Views and Query Federation
Favicon
Use SQLAlchemy ORMs to Access MongoDB Data in Python
Favicon
Connect to PostgreSQL as an External Data Source using PolyBase

Featured ones: