dev-resources.site
for different kinds of informations.
Predicting & Visualizing Petroleum Production with MindsDB and Tableau
Introduction
MindsDB is an open-source machine-learning tool that brings automated machine learning to your database. MindsDB offers predictive capabilities in your database. Tableau lets you visualize your data easily and intuitively. In this tutorial, we will be using MindsDB to predict the hourly electricity demand in the United States and visualize results in Tableau. To complete this tutorial, you are required to have a working MindsDB connection, either locally or viaĀ cloud.mindsdb.com. You can use thisĀ guideĀ to connect to the MindsDB cloud.
Data Setup
Connecting the data as a file
Follow the steps below to upload a file to MindsDB Cloud.
- Log in to yourĀ MindsDB CloudĀ account to open the MindsDB Editor.
- Navigate toĀ
Add data
Ā the section by clicking theĀAdd data
Ā button located in the top right corner.
- Choose the Files tab.
- Choose theĀ
Import File
Ā option. - Upload a file (
[national_nameplate_capacity.csv](https://www.kaggle.com/datasets/maurosrgiofigueira/fuel-output-capacity-brazil), name a table used to store the file data (here it is
petroleum_productionĀ ), and click theĀ
Save and Continue`Ā button.
Once you are done uploading, you can query the data directly with;
sql
SELECT * FROM files.petroleum_production LIMIT 10;
The output would be:
Understanding the Dataset
This dataset presents Brazilian petroleum products ouput capacity evolution, from 2011 to 2020 in each Brazilian refinery. Data is displayed in barrels. From the Brazilian National Petroleum Agency.
Context
- Refinery - They are 19 refinery in this dataset.
- Year - The year of this dataset ranges from 2011 to 2020.
- Nameplate Capacity - Nameplate capacity, also known as theĀ rated capacity,Ā nominal capacity ,Ā installed capacity, orĀ maximum effect, is the intended full-load sustained output of a facility such as aĀ power station, fuel plant and etc. (Source: Wikipedia)
Creating the Predictor
To being, letās create a predictor that uses the date to predict the Petroleum Production for refineries. You can learn more about creating a predictor by checkingĀ here. You can predict a classification series model using the following syntax
sql
CREATE PREDICTOR mindsdb.[predictor_name]
FROM [integration_name]
(SELECT [sequential_column], [partition_column], [other_column], [target_column]
FROM [table_name])
PREDICT [target_column]
-
CREATE PREDICTOR
: Creates a predictor with the nameĀpredictor_name
Ā in theĀmindsdb
Ā table. -
FROM files
: Points to the table containing the data. -
PREDICT
: Dictates the column to predict.
`sql
CREATE PREDICTOR mindsdb.petroleum_output
FROM files (SELECT refinery, Year, Nameplate_Capacity FROM petroleum_production)
PREDICT Nameplate_Capacity
`
On execution we get:
Query successfully completed
Status of a Predictor
A predictor may take a couple of minutes for the training to complete. You can monitor the status of the predictor by using this SQL command:
sql
SELECT status
FROM mindsdb.predictors
WHERE name='petroleum_output';
Your output should be:
sql
+------------+
| status |
+------------+
| complete |
+------------+
Making Predictions
Now that we have our Prediction Model, we can simply execute some simple SQL query statements to predict the target value based on the feature parameters:
sql
SELECT Year, Nameplate_Capacity
FROM mindsdb.petroleum_output
JOIN files.petroleum_production
LIMIT 100;
Connecting MindsDB to Tableau
Tableau lets you visualize your data easily and intuitively. In this tutorial will use Tableau to create visualizations of our predictions.
How to Connect
- First, create a new workbook in Tableau and open theĀ ConnectorsĀ tab in theĀ Connect to DataĀ window.
- Click on MySQL
ā¢ Input ācloud.mindsdb.comā for Server, ā3306ā for Port, āmindsdbā for Database āyour mindsdb cloud emailā for Username, āyour mindsdb cloud passwordā for Password, and Sign in.
Now you are connected and your page should look like this:
Visualizing our Data
Before you can visualize predictions in Tableau, you must first choose a data source. And because the predictions in this article are generated using a SQL statement, you will need to create a custom SQL query in Tableau to generate the data source. To do this:
- First, select theĀ
New Custom SQL
on the left side of the window and use the query below to generate the petroleum production for each year. You can preview the results or directly load the data into Tableau.
sql
SELECT Year, Nameplate_Capacity
FROM mindsdb.petroleum_output
JOIN files.petroleum_production
- Create an extract of the data under the connection heading at the top right of the window. You do this to facilitate data conversion to the appropriate data type. The extraction should take some time.
- Move to the Sheet tab on the bottom left and right-click on the
Nameplate_Capacity
andYear
to convert their data types to Number(whole) and Date, respectively. Additionally, when right-clicking on theNameplate_Capacity
andDate
, choose the option to convert it to a continuous measure.
- Drag the
Nameplate_Capacity
measure to the row shelf and theYear
dimension to the column shelf. - You can also switch the visualization also to line, area and etc.
Whatās Next?
Have fun while trying it out yourself!
- Star theĀ MindsDB repository on GitHub.
- Sign up for aĀ free MindsDB account
- Engage with the MindsDB community onĀ SlackĀ orĀ GitHubĀ to ask questions and share your ideas and thoughts.
Give a like or a comment if this tutorial was helpful
Featured ones: