dev-resources.site
for different kinds of informations.
Understanding Sequelize's 'escape' function and why ideally we should avoid using it manually
So a disclaimer. As a software engineer, I'm not a security expert. I rely heavily on those who know better, especially OWASP. So all feedback is welcome to help fix any flaws in the article or improve it.
Sequelize is a very robust database ORM for Node.js. Knowing how to use it is one thing. Knowing how it works under the hood is another.
So when I was asked recently, why I had not used the escape function in Sequelize's where clause, I said, properties in the where clause are already escaped. But was I correct? Or was that an assumption? Even more, was I following the best practices for preventing SQL injections.
To answer, first, let's explore how SQL injection attacks happen.
Preparing our database
We want to prepare a database to test how a SQL injection could happen.
You can use any database you like to do this but it'll be easier if you use SQLite or Postgres as they have similar escaping syntaxes whereas MySQL is a law unto itself 😁.
Create a new SQLite database.
sqlite3 database.sqlite
Create a basic users table.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
Insert some users into the database.
INSERT INTO users (name, email) VALUES ('John Doe', '[email protected]');
INSERT INTO users (name, email) VALUES ('Bruch Wayne', '[email protected]');
Select those users from the database.
SELECT * FROM users;
Now you should see two users.
id | name | |
---|---|---|
1 | John Doe | [email protected] |
2 | Bruch Wayne | [email protected] |
Selecting a single user
If you have a multi-tenant application, where users can only see their data, you'll usually do lookups by a single user ID. That way, you don't expose information to the wrong person.
Query a user by their id.
SELECT * FROM users where id = 1;
Now we only see one user.
id | name | |
---|---|---|
1 | John Doe | [email protected] |
Attempting a SQL injection attack
But what if we pretend to we have a SQL injection attack? Add or '1' = '1'
to the end of the SQL query, and now that the last statement is always true, all users are returned.
SELECT * FROM users where id = '1' or '1' = '1';
id | name | |
---|---|---|
1 | John Doe | [email protected] |
2 | Bruch Wayne | [email protected] |
Users shouldn't have direct access to our database, so how would this disastrous situation happen in reality.
Creating a Node.js app as an example
Initalise npm in a new folder.
npm init -y
Install Sequelize and its dependencies.
npm install sequelize sequelize-cli sqlite3
Create a basic index.js file which connects to the sqlite database (or your flavour of it).
const { Sequelize, DataTypes } = require('sequelize');
const sequelize = new Sequelize({
dialect: 'sqlite',
storage: './database.sqlite',
});
(async () => {
try {
// test the connection is Ok
await sequelize.authenticate();
// Get the argument from the input
const id = process.argv[2];
const [results] = await sequelize.query(`SELECT * FROM Users WHERE id = ${id}`);
console.log('Raw query results:', results);
} catch (error) {
console.error('Unable to connect to the database:', error);
} finally {
await sequelize.close();
}
})();
- It works by taking an argument from the input in the terminal.
- It runs the query.
- It outputs the results.
Running a safe query against our script
Running the query produces the following output:
node index.js 1
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT * FROM Users WHERE id = 1
Raw query results: [ { id: 1, name: 'John Doe', email: '[email protected]' } ]
That successfully returned the user.
Running a SQL injection attack against our script
Repeating the SQL injection attack we tried before, the outcome shows the disaster. We can access all users when we shouldn't be able to!
node index.js "1 or '1' = '1'"
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT * FROM Users WHERE id = 1 or '1' = '1'
Raw query results: [
{ id: 1, name: 'John Doe', email: '[email protected]' },
{ id: 2, name: 'Bruch Wayne', email: '[email protected]' }
]
So how can we fix this?
First attempt trying to escape user input
Sequelize wraps the query in quotes. So we'll do just that. Replace the const id
line with this.
const id = "'" + process.argv[2] + "'";
And voila. That's fixed the injection...partly.
node 02_index.js "1 or 1=1"
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT * FROM Users WHERE id = '1 or 1=1'
Raw query results: []
Selecting users by id with 1 or 1=1
doesn't match anything. So nothing is returned.
But what if we were to try and inject some apostrophes to end the first statement and add our own OR
statement.
node 02_index.js "1' or 1=1 or '"
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT * FROM Users WHERE id = '1' or 1=1 or ''
Raw query results: [
{ id: 1, name: 'John Doe', email: '[email protected]' },
{ id: 2, name: 'Bruch Wayne', email: '[email protected]' }
]
Now the disaster returns! We can get all users again!
Escaping user inputted quotes
We need to escape the quotes as well, just like Sequelize does.
const id = "'" + process.argv[2].replace(/'/g, "''") + "'";
node 03_index.js "1' or 1=1 or '"
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT * FROM Users WHERE id = '1'' or 1=1 or '''
Raw query results: []
You can see how the whole paramater is a string in the last query.
And this is what the Sequelize escape function does but is far more comprehensive, covering different sql dialects.
Sequelize escape function
Instead of doing our own escaping, it's best to leave it to libraries that are battle-tested and proven.
It is safer just to do this.
const id = sequelize.escape(process.argv[2]);
Escaping is not the best option against SQL injection attacks
OWASP puts escaping user input as last on its list of acceptable ways of preventing sql injections. It's strongly discouraged!
Why?
Potentially the responsibility is on the developer to remember to escape each and every single user input. Even using Sequelize here, if the developer forgot, then this would be a security breach waiting to happen.
What's better than escaping
Using Sequelize's query builder is better than escaping manually.
As an example, create a user model.
npx sequelize-cli model:generate --name User --attributes name:string,email:string
Add the user model to the main script and replace the raw query with the model-based one.
const { Sequelize, DataTypes } = require('sequelize');
const sequelize = new Sequelize({
dialect: 'sqlite',
storage: './database.sqlite',
});
const User = require('./models').User;
(async () => {
try {
await sequelize.authenticate();
const [results] = await User.findAll({
where: {
id: process.argv[2]
}
});
console.log('Raw query results:', results);
} catch (error) {
console.error('Unable to connect to the database:', error);
} finally {
await sequelize.close();
}
})();
When we run our SQL injection attack again, we now get undefined. No need to use Sequelize escape, the where clause does it for us.
node 05_index.js "1' or 1=1 or '"
Executing (default): SELECT 1+1 AS result
Executing (default): SELECT `id`, `name`, `email` FROM `Users` AS `User` WHERE `User`.`id` = '1'' or 1=1 or ''';
Raw query results: undefined
And what about if we use Op.like in a where statement?
We would never do this on an ID but the principle is the same.
const [results] = await User.findAll({
where: {
id: {
[Sequelize.Op.like]: '%' + process.argv[2] + '%'
}
}
});
The final query shows the entire statement is escaped.
Bind parameter. Better than escaping.
Parameterised queries are great because we send the sql query separately from the data.
Note that was use a bind parameter, '%:id%'
.
const [results] = await User.findAll({
where: {
id: {
[Sequelize.Op.like]: '%:id%'
}
}
},
{
bind: { id: process.argv[2]}
});
Notice what is sent to the database.
Executing (default): SELECT `id`, `name`, `email` FROM `Users` AS `User` WHERE `User`.`id` LIKE '%:id%';
Raw query results: undefined
The database engine knows exactly what to do with the data, and how to escape it. If we send the query syntax plus the data all as one, then we have to escape it before it gets to the database.
Parameterised queries are the recommended way (according to OWASP) of making queries of any kind with user input, in the database.
Conclusion
It seems using the where statement without bind parameters doesn't follow best practices from OWASP. So that's something I need to improve there.
And thank you for reading. If I've made any mistakes, do let me know.
Reading material
Featured ones: