Logo

dev-resources.site

for different kinds of informations.

Quick tip: Using R with SingleStore Notebooks

Published at
4/12/2024
Categories
singlestoredb
r
jupyter
Author
veryfatboy
Categories
3 categories in total
singlestoredb
open
r
open
jupyter
open
Author
10 person written this
veryfatboy
open
Quick tip: Using R with SingleStore Notebooks

Update: As of September 2024, running R is no longer possible on the SingleStore Portal. I will find another way to show the integration between SingleStore and R.

Abstract

SingleStore provides a Jupyter-based notebook environment with support for Python, SQL and Markdown. However, we can also install and use the R programming language. In this article, we'll see how.

The notebook file used in this article is available on GitHub.

Create a SingleStore Cloud account

A previous article showed the steps to create a free SingleStore Cloud account. We'll use the following settings:

  • Workspace Group Name: R Demo Group
  • Cloud Provider: AWS
  • Region: US East 1 (N. Virginia)
  • Workspace Name: r-demo
  • Size: S-00

Create a new notebook

From the left navigation pane in the cloud portal, we'll select DEVELOP > Data Studio.

In the top right of the web page, we'll select New Notebook > New Notebook, as shown in Figure 1.

Figure 1. New Notebook.

Figure 1. New Notebook.

We'll call the notebook r_demo, select a Blank notebook template from the available options, and save it in the Personal location.

Create a database

In our SingleStore Cloud account, let's use the SQL Editor to create a new database. Call this iris_db, as follows:

DROP DATABASE IF EXISTS iris_db;
CREATE DATABASE IF NOT EXISTS iris_db;
Enter fullscreen mode Exit fullscreen mode

Fill out the notebook

First, let's install the R kernel and some other packages we need for this article:

!conda install -y --quiet -c conda-forge r-irkernel r-rjava r-rjdbc r-ggplot2
Enter fullscreen mode Exit fullscreen mode

Next, we need to change the kernel. Refreshing the page will help the notebook detect any changes, including the installation of a new kernel.

In the top right, we can see that Python is currently selected, as shown in Figure 2.

Figure 2. Python 3 (ipykernel).

Figure 2. Python 3 (ipykernel).

Selecting Python 3 will present a box with a pull-down as shown in Figure 3.

Figure 3. Select Kernel.

Figure 3. Select Kernel.

Clicking the pull-down will show some options and R should be one of the options. We'll choose R, as shown in Figure 4.

Figure 4. Select Kernel.

Figure 4. Select Kernel.

Next, we'll click the Select button.

To connect to SingleStore, we'll use JDBC, as follows:

library(RJDBC)
Enter fullscreen mode Exit fullscreen mode

This will also load DBI and rJava:

Loading required package: DBI

Loading required package: rJava
Enter fullscreen mode Exit fullscreen mode

Next, we'll download the SingleStore JDBC Client and save it in a jars directory:

# URL of the JDBC driver file
driver_url <- "https://repo1.maven.org/maven2/com/singlestore/singlestore-jdbc-client/1.2.1/singlestore-jdbc-client-1.2.1.jar"

# Set the JDBC driver class name
driver <- "com.singlestore.jdbc.Driver"

# Local directory to save the driver file
local_dir <- "jars"
dir.create(local_dir, showWarnings = FALSE, recursive = TRUE)

# Check if the driver file already exists
driver_file <- file.path(
    local_dir,
    "singlestore-jdbc-client-1.2.1.jar"
)

if (!file.exists(driver_file)) {
    # Download the JDBC driver file if it doesn't exist
    download.file(
        driver_url,
        destfile = driver_file,
        mode = "wb",
        quiet = TRUE
    )
}

# Check if the driver file has been downloaded successfully
if (file.exists(driver_file)) {
    print("Driver file downloaded successfully")
} else {
    print("Failed to download the driver file")
}
Enter fullscreen mode Exit fullscreen mode

Now we'll create the connection details to SingleStore:

host <- "<HOST>"
port <- 3306
database <- "iris_db"
user <- "admin"
password <- "<PASSWORD>"

url <- paste0("jdbc:singlestore://", host, ":", port, "/", database)
Enter fullscreen mode Exit fullscreen mode

Replace <HOST> and <PASSWORD> with the values for your environment. These values can be obtained from the workspace using Connect > SQL IDE.

Next, let's load the iris dataset, make some small adjustments to the column names and show the first few rows:

# Load the iris dataset
data(iris)

# Replace "." with "_" in column names
colnames(iris) <- gsub("\\.", "_", colnames(iris))

# Print the first few rows of the dataset
head(iris)
Enter fullscreen mode Exit fullscreen mode

Example output:

  Sepal_Length Sepal_Width Petal_Length Petal_Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Enter fullscreen mode Exit fullscreen mode

We'll now prepare the connection to SingleStore, write the iris dataset to the database and read it back again.

# Establish the JDBC connection
conn <- dbConnect(
    drv = JDBC(driver, driver_file),
    url = url,
    user = user,
    password = password
)

# Write the iris dataset to the database
dbWriteTable(conn, "iris", iris, overwrite = TRUE)

# Read the iris dataset from the database
iris_from_db <- dbReadTable(conn, "iris")

# Print the first few rows of the dataset read from the database
head(iris_from_db)

# Close the JDBC connection
dbDisconnect(conn)
Enter fullscreen mode Exit fullscreen mode

Example output:

  Sepal_Length Sepal_Width Petal_Length Petal_Width    Species
1          5.7         2.9          4.2         1.3 versicolor
2          5.1         3.8          1.5         0.3     setosa
3          5.4         3.0          4.5         1.5 versicolor
4          4.3         3.0          1.1         0.1     setosa
5          5.5         2.5          4.0         1.3 versicolor
6          6.4         2.9          4.3         1.3 versicolor
Enter fullscreen mode Exit fullscreen mode

Bonus: Create visualisations

We can easily create some plots using ggplot:

library(ggplot2)
Enter fullscreen mode Exit fullscreen mode

First, a scatter plot of Sepal Length vs Sepal Width, as shown in Figure 5.

# Scatter plot of Sepal Length vs Sepal Width
ggplot(iris_from_db, aes(x = Sepal_Length, y = Sepal_Width, color = Species)) +
    geom_point() +
    labs(x = "Sepal Length", y = "Sepal Width", title = "Sepal Length vs Sepal Width")
Enter fullscreen mode Exit fullscreen mode

Figure 5. Sepal Length vs Sepal Width.

Figure 5. Sepal Length vs Sepal Width.

Next, a box plot of Petal Length by Species, as shown in Figure 6.

# Box plot of Petal Length by Species
ggplot(iris_from_db, aes(x = Species, y = Petal_Length, fill = Species)) +
    geom_boxplot() +
    labs(x = "Species", y = "Petal Length", title = "Petal Length by Species")
Enter fullscreen mode Exit fullscreen mode

Figure 6. Petal Length by Species.

Figure 6. Petal Length by Species.

Finally, a histogram of Petal Width, as shown in Figure 7.

# Histogram of Petal Width
ggplot(iris_from_db, aes(x = Petal_Width)) +
    geom_histogram(binwidth = 0.1, fill = "skyblue", color = "black") +
    labs(x = "Petal Width", y = "Frequency", title = "Petal Width")
Enter fullscreen mode Exit fullscreen mode

Figure 7. Petal Width.

Figure 7. Petal Width.

Summary

In this short article, we've seen how to install R, how to connect to SingleStore from R, and how to write and read data using R and SingleStore. We've also quickly and easily created several powerful visualisations.

singlestoredb Article's
30 articles in total
Favicon
Quick tip: Visualising the Air Quality Index (AQI) across Punjab, Pakistan and India
Favicon
Quick tip: Using SingleStore with OpenAI's Swarm
Favicon
Quick tip: Using SingleStore and WebAssembly for Sentiment Analysis of Stack Overflow Comments
Favicon
Quick tip: Building Predictive Analytics for Loan Approvals
Favicon
Quick tip: Build Vector Embeddings for Video via Python Notebook & OpenAI CLIP
Favicon
Quick tip: SingleStore Kai support for MongoDB $vectorSearch
Favicon
Quick tip: Using SingleStore with PyIceberg
Favicon
Quick tip: Using SingleStore for Iceberg Catalog Storage
Favicon
Quick tip: Using picoGPT in the SingleStore portal
Favicon
Quick tip: Ollama + SingleStore - LangChain = :-(
Favicon
Quick tip: How to Build Local LLM Apps with Ollama and SingleStore
Favicon
Quick tip: Using R, OpenAI and SingleStore Notebooks
Favicon
Quick tip: Write numpy arrays directly to the SingleStore VECTOR data type
Favicon
Quick tip: Using R, Rayshader and SingleStore Notebooks
Favicon
Quick tip: Using R with SingleStore Notebooks
Favicon
Quick tip: Using Apache Spark and GraphFrames with SingleStore Notebooks
Favicon
Quick tip: Using Apache Spark Structured Streaming with SingleStore Notebooks
Favicon
Quick tip: Using SingleStore Spark Connector's Query Pushdown with SingleStore Notebooks
Favicon
Quick tip: Using the SingleStore Spark Connector with SingleStore Notebooks
Favicon
Quick tip: Using Apache Spark with SingleStore Notebooks for Fraud Detection
Favicon
Quick tip: Cosine Similarity revisited in SingleStore
Favicon
Quick tip: Using Apache Spark with SingleStore Notebooks
Favicon
Quick tip: Using Approximate Nearest Neighbor (ANN) Search with SingleStoreDB
Favicon
Quick tip: Using the new VECTOR data type and Infix Operators in SingleStoreDB
Favicon
Quick tip: Dot Product, Euclidean Distance and Cosine Similarity in SingleStoreDB
Favicon
Vector Databases & AI Applications for Dummies
Favicon
Quick tip: Analysing Stock Tick Data in SingleStoreDB using LangChain and OpenAI's Whisper
Favicon
Quick tip: Replicating JSON data from MongoDB to SingleStore Kai and creating OpenAI embeddings
Favicon
Quick tip: Streaming data from MongoDB Atlas to SingleStore Kai using Kafka and CDC
Favicon
Quick tip: Using LangChain's SQLDatabaseToolkit with SingleStoreDB

Featured ones: