Advanced Filtering in REST API
In this article we will look at how to implement advanced filtering in REST API when a GET request a list of results. In our API project that will be GET /v1/travels
and GET /v1/tours
endpoints. Although setup for filtering is similar to sorting, the implementation is a bit more complex.
First we are going to use filter_by
parameter that will look something like ?filter_by=name:"Cool Tour" AND price:[10 TO 30]
This is Liqe Query Language (LQL), which is heavily inspired by Lucene.
Next we need to convert Liqe expression into an abstract syntax tree (AST). Liqe npm package will do that for us.
Finally we create a Sequelize Where clause object from the AST. Although our implementation of AST to Where clause object translator won’t be complete, it will give you a good starting point so you can customize it and add to it to fit your project.
Safety and Performance Concerns
While we implement filtering, we have to keep in mind the following safety and performance concerns:
- Since we are using user input, there’s a chance of SQL injection. Because we are converting the AST into Sequelize Where clause object, Sequelize will sanitize user input for us. If you are planning to convert AST to plain SQL, please use SQL builder that will allow you compose SQL fragments safely by using tagged template literals.
- Our search interface can be used as a vector for DDoS attacks. Someone, intentionally or accidentally, can put something like this
name:A OR name:B OR name:C or name:D...
Such query will be computationally expensive. Since we have AST of the query, we can calculate complexity of the query and set an upper limit. In our case we will implement it by calculating how many levels deep AST object is and set an upper limit of 5. - Someone can run queries against the database fields that are not optimized for filtering, for example
description
field. This will be also computationally expensive. We will prevent that by usingallowedFilterByFields
array that will include only fields we want the user to run filtering against.
How to Implement Filtering in REST API
Now, with the theory out of the way, let’s get started. If you prefer video tutorials, please check out our YouTube video on how to implement advanced filtering in REST API.
First let’s install Liqe library with the command:
yarn add liqe
As usual we are going to update update OpenApi document to include filter_by
query parameter in GET /v1/travels
and GET /v1/tours
routes.
- in: query name: filter_by schema: type: string required: false description: Filter by query parameter. F.e. *name:"My Cool Travel"*
Similar to query_by
filter_by
parameter is also in query, type of string and is not required.
Let’s add the same parameter to GET /v1/tours
- in: query name: filter_by schema: type: string required: false description: Filter by query parameter. F.e. *price:<30*
Let’s go ahead an update the code to handle filter_by
query parameter.
We update listTravels
method in travels controller to include filterBy
option in options passed to repository.getAll()
method. filterBy
is assigned the value of req.query.filer_by
const travels = TravelResource.collection( await repository.getAll({ sortBy: req.query.sort_by, filterBy: req.query.filter_by, }) );
We update listTours
method in tours controller to include the same option filterBy
const tours = TourResource.collection( await repository.getAll({ sortBy: req.query.sort_by, filterBy: req.query.filter_by, }) );
Now, let’s update BaseRepository
class to handle filterBy
property in the options
parameter.
First, in BaseRepository.ts
file add protected class property called allowedFilterByFields
protected allowedFilterByFields: Array<string> = [];
A default value of an empty array is assigned to the property. As we mentioned before, we can’t allow users to run filters on all the fields in the table of the resource. This may result in computationally expensive queries, high database load, and, as a result, slow API. We want to allow a user filter only by a limited number of fields that are usually indexed.
Let’s put the same property in child classes TravelRepository
and TourRepository
We will allow filter travels by name and number of days.
protected allowedFilterByFields = ["name", "number_of_days"];
We will allow filter tours by name, price, starting date, ending date
protected allowedFilterByFields = [ "name", "price", "starting_date", "ending_date", ];
Let’s go back to BaseRepository
and update getAll
method to handle filterBy
option.
if (options.filterBy) { const filterBy = this.getFilterBy(options.filterBy); delete options.filterBy; options.where = filterBy; }
We will check if filter by exists on options
passed to getAll
method. If it does, we call getFilterBy
method on BaseRepository
and pass it filter by options. Then we will remove filterBy
property on options object and add a where
property and assign it the value returned from getFilterBy
method. where
property is used Sequelize to query data.
Now let’s create getFilterBy
method in BaseRepository
protected getFilterBy(filterBy: string): Record<string, any> { try { return getSequelizeWhereClause( parse(filterBy), this.allowedFilterByFields ); } catch (error: any) { throw new ApiError({ name: "FILTER_BY_ERROR", message: error.message, status: 400, code: "ERR_FTB", }); } }
getFilterBy
receives filterBy
string and returns an object. The method attempts to call getSequelizeWhereClause
function. As a first argument, it passes filterBy
string parsed by parse
method from Liqe library. The second argument to getSequelizeWhereClause
function is allowedFilterByFields
If the call is successful, result of the function is returned. If there is an error, it is handled by the catch
statement and gets re-thrown as an ApiError
with name of FILTER_BY_ERROR
and code ERR_FTB
The ErrorHandler
will catch it and return it back to the API user as an error response.
Finally, in utils
folder, let’s create getSequelizeWhereClause
function. This function we turn abstract syntax tree (AST) parsed by Liqe library into Sequelize where clause object.
import { Op } from "sequelize"; import { type LiqeQuery } from "liqe"; import { depthOfObject } from "./functions"; const maxSearchQueryDepth = 5; const operatorsMap: Record<string, string> = { ":": "eq", ":=": "eq", ":>": "gt", ":>=": "gte", ":<": "lt", ":<=": "lte", }; const getSequelizeWhereClause = ( searchQuery: LiqeQuery, allowedFilterByFields: Array<string> ) => { if (depthOfObject(searchQuery) > maxSearchQueryDepth) { throw new Error("Filter by has too many expressions"); } const translateTagExpression = (ast: LiqeQuery) => { if (ast.type !== "Tag") { throw new Error("Expected a tag expression."); } const { field, expression, operator, }: { field: Record<string, any>; expression: Record<string, any>; operator: Record<string, any>; } = ast; if (field.type === "ImplicitField") { throw new Error("Implicit fields are not supported"); } if (!allowedFilterByFields.includes(field.name)) { throw new Error(`Field ${field.name} is not allowed in filter by`); } if (expression.type === "RangeExpression") { return { [field.name]: { [Op.between]: [expression.range.min, expression.range.max], }, }; } if (expression.type !== "LiteralExpression") { throw new Error("Expected a literal expression."); } if (!operatorsMap[operator.operator]) { throw new Error("Operator is not supported."); } return { [field.name]: { // @ts-ignore [Op[operatorsMap[operator.operator]]]: expression.value, }, }; }; const translateExpression = (ast: LiqeQuery): Record<string, any> => { if (ast.type === "Tag") { return translateTagExpression(ast); } if (ast.type === "LogicalExpression") { if (!operatorsMap[ast.operator.operator]) { throw new Error("Logical operator is not supported."); } return { // @ts-ignore [Op[operatorsMap[ast.operator.operator]]]: [ translateExpression(ast.left), translateExpression(ast.right), ], }; } throw new Error("AST type is missing or not supported"); }; return translateExpression(searchQuery); }; export default getSequelizeWhereClause;
First we import Op
from Sequelize, type LiqeQuery
from Liqe and depthOfObject
function from utils functions. depthOfObject
will tell us how many levels deep is the AST object. Limiting depth of AST object will prevent running too expensive queries against the database and protect from possibility of DDoS attacks.
export function depthOfObject(object: Record<string, any>) { let level = 1; for (const key in object) { if (!object.hasOwnProperty(key)) continue; if (typeof object[key] == "object") { const depth = depthOfObject(object[key]) + 1; level = Math.max(depth, level); } } return level; }
Next we will define maxSearchQueryDepth
to be 5. This will limit filter by query to 3 expressions connected by two ANDs or ORs, or combination of them. For example: price:<30 AND starting_date:>2023-11-01 AND ending_date:<2023-11-30
We will also define operatorsMap
that will map AST operators to Sequelize Op
operators.
Finally we will defined getSequelizeWhereClause
function. It first makes sure AST query object doesn’t exceed the max search query depth. Then, it defines two functions translateTagExperssion
and translateExpression
. Finally it calls translateExpression
function with searchQuery
AST object as an argument and returns the result of that call.
Let’s look at translateExperssion
function. It accepts AST object that is type of LiqeQuery
as an argument. It first checks if the expression is a Tag expression. If so, it returns a call to translateTagExpressionFunction
. Next it checks if ast.type
is LogicalExpression
. If it is, it first makes sure that logical operator exists in the operators. Map. Then the function returns Sequelize Where clause object with the mapped operator and calls itself on the right and left sides of the expression. If AST type is not a Tag or Logical expression and error is thrown. Other AST types you can implement are UnaryOperator
(NOT expression) and ParenthesizedExpression
. Please see an article by Gajus Kuizinas, the creator of Liqe, in the description of the video.
Now let’s look at translateTagEpression
function. It has an AST object as a parameter. The function checks if AST type is Tag
If not, an exception is thrown. It destructs field
, expression
and operator
from the AST object. It checks if field type is ImplicitField
. For our purposes we don’t want to support implicit fields. If it makes sense for your application to have an implicit field, you can assign whatever field you want from your database if the field is not specified in the filter by expression.
Next we check if field name is included in allowed filter by fields. For safety reasons mentioned before, we don’t want let users filter through any field they want. If the field name is not in the allowed filter by array, we through an exception. Next, we check if the expression type is RangeExpression
If so, we return Sequelize Where clause object that runs between
query on expression.range.min
and expression.range.max
After that, we are almost done with the logic. The only thing is left to check if AST expression is LiteralExpression
and if operatorsMap
object has operator.operator
as its key.
If both condition are satisfied, the function returns Sequelize where clause object that runs a query on the specified field name with the operator and expression value.
We are finally done. As I mentioned before, getSequelizeWhereClause
function doesn’t cover every use case, but should give you a good idea of how to parse an AST object so you can customize it to fit your project.
Conclusion
In conclusion, mastering advanced filtering in your REST API is a pivotal step towards creating a resilient and user-friendly web service. By delving into the nuanced strategies, best practices, and practical tips shared in this article, you’ve equipped yourself with the tools to optimize data retrieval and enhance the overall efficiency of your API.