SQLX - WHERE-IN queries. The good, the bad and the ugly.

Hestia Calliari - 2025-12-28 (last changes: 2025-12-30)

I love SQLX for many reasons, but one in particular is the possibility to validate syntax and return types at compile-time with the query! macros. At least that’s what I thought. For a project recently I wanted to perform a query to get some role data from from a OAUTH 2.0 custom role claim. The query in and off itself was quite simple, but it lead me down a rabbit hole of the internals of sqlx and some limitations of MariaDB. Let’s have a look at the query first:

SELECT *
FROM roles
WHERE roles.idp_group IN ?;

Looks simple, right? Testing it out with DataGrip even works, because in the prompt you can just insert the list you want. So now we just pass our slice of strings to the query and:

error: error returned from database: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '?' at line 1
 --> src/main.rs:3:5
  |
3 |     sqlx::query!("SELECT * FROM roles WHERE idp_group IN ?;", l);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in the macro `$crate::sqlx_macros::expand_query` which comes from the expansion of the macro `sqlx::query` (in Nightly builds, run with -Z macro-backtrace for more info)
error: could not compile `sqlx_list` (bin "sqlx_list") due to 1 previous error

Weird. Why is MariaDB suddenly complaining here. The syntax should be right, if it just replaces the ? with the list I just passed. After some experimentation, to make sure I was not missing something trivial, I reached out to the developers of SQLX on their discord. They responded me promptly that MariaDB (and also MySQL) do not support binding arrays like that.

Response from the SQLX developers: Arrays can’t be bound as a parameter in MySQL. Are you trying to use QueryBuilder?

So it doesn’t work. Now we effectively have two option: Run one query for each group and pray to god no user has so many groups that this would cause performance issues, or use the QueryBuilder, but forego the compile-time type and query checking from SQLX.

I quickly learned that the second option wasn’t feasible. Rust type system was super strict and I ran into several problems when testing it. And considering that I wanted to use similar queries in the future (in much less traversed parts than the auth code), this was a no-go.

So it’s the first option then? Well, no. This query will fetch the user auth on every request, any slowness in this part of the code will haunt the user on every single request. I do not want to have any surprises coming my way from there.

So the only solution left: build a solution that does both myself!

The implementation

The solution that I came up with was essentially another macro, that generates the code for the QueryBuilder, while generating a separate “example” query that gets validated against the database at compile-time the way SQLX does it. I’ll quickly guide you through the steps to implement such a macro, but first we need to decide what syntax to use.

The syntax I came up with had the following constraints: You specify a list of either strings or for-loops in the query. All the strings will be appended to the query, while the for-loops will use the QueryBuilder::separated function to append the items. The trick here is, that the body of the for-loop functions as a block, returning the last statement and therefor effectively working like a list-comprehension from python.

let groups = vec![String::new("/root")];
let roles = query_in_list!{
    RoleType,
    "SELECT *"
    "FROM groups"
    "WHERE idp_group IN"
    for group in groups { group }
};

Now we just need our macro to actually parse this and generate new valid Rust source code. The parsing code took huge inspiration from the instrument macro of the tracing crate as I did not know how to do this prior to actually doing it. The logic is actually quite simple, like tracing, I just used the syn crate to parse the actual source-code.

use syn::{
    Block, Ident, LitStr, Stmt, Token, Type,
    parse::{Parse, ParseStream, Result},
};
struct QueryInList {
    query_result_type: Type,
    query: Vec<QueryFragment>,
}
enum QueryFragment {
    Lit(LitStr),
    Loop { name: Ident, list: Ident, expr: Vec<Stmt> },
}
impl Parse for QueryInList {
    fn parse(input: ParseStream) -> Result<Self> {
        let query_result_type = input.parse::<Type>()?;
        input.parse::<Token![,]>()?;
        let mut fragments = vec![];
        while !input.is_empty() {
            if let Ok(literal) = input.parse::<LitStr>() {
                let mut content = literal.value();
                content.push(' ');
                fragments.push(QueryFragment::Lit(LitStr::new(&content, lit.span())));
                continue;
            }
            input.parse::<Token![for]>()?;
            let name = input.parse::<Ident>()?;
            input.parse::<Token![in]>()?;
            let list = input.parse::<Ident>()?;
            let block = input.parse::<Block>()?;
            fragments.push(QueryFragment::Loop {
                name,
                list,
                expr: block.stmts,
            });
        }
        Ok(QueryInList {
            query_result_type,
            query: fragments,
        })
    }
}

First we simply parse a type, followed by a comma from the input stream. Then we move on to the items of the query.

As you can see, in the parse loop at line 24, I try to parse a string literal from the input stream. If that does not work, I just assume it contains the for-loop and parse one after the other: the keyword for, an identifier that is used in the loop, the keyword in, another identifier for the iterable that is used and lastly a block expression that gives us our value for the query. I package everything up into a Vec and with that, we have our data prepared.

(I also push a space after each string literal, so you don’t have to remember to have the trailing whitespace after each like. You’re welcome.)

The next step now is to turn this data into an example query that we use to validate against the database. For that we just take our list of elements. If we encounter a string-literal, we simply concatenate it, if it is a for-loop, we add the literal (?, ?) as an example, and keep track of how many placeholder we inserted:

impl QueryInList {
    fn example_query(&self) -> (usize, String) {
        let mut query = String::new();
        let mut placeholder = 0;
        for fragment in &self.query {
            match fragment {
                QueryFragment::Lit(lit) => query += &lit.value(),
                QueryFragment::Loop { .. } => {
                    query += " ( ?, ? ) ";
                    placeholder += 2;
                }
            }
        }
        query.push(';');
        (placeholder, query)
    }
}

With that done, all that’s left to do is assemble all the pieces into valid Rust code, that we can output from our macro for the compile to process. The generated code should look something like this:

let roles = {
    let _ = query_as!(RoleType, "SELECT * FROM roles WHERE idp_group IN (?, ?)", true, true);
    let mut builder = sqlx::QueryBuilder::new("");
    builder.push("SELECT role_id, BIT_OR(permissions) AS permissions ");
    builder.push("FROM roles ");
    builder.push("WHERE idp_group IN  ");
    builder.push(" ( ");
    let mut separated = builder.separated(", ");
    for group in groups {
        separated.push_bind({ group });
    }
    builder.push(" ) ");
    builder
};

So I used the quote crate to generate the code. The great thing about quote is, that we can do string-interpolation on our source-code, making it much easier to audit and understand later.

As a quick aside: A lot of Rust detractors might say now: “See you need to pull in so many different crates. This is a huge supply-chain risk”. Interestingly enough though, all the crates used (quote, syn and proc-macro2) come from the same person: David Tolnay (I have sung his praises already, so I’ll spare you from it here). So the supply-chain risk is actually just the same as one crate, we would just degrade compile time and code re-usability. So please don’t just look at “number big” and assume that means the supply-chain risk is directly correlated. There goes much more into that.

Back to the topic – let’s generate the source code:

use proc_macro::TokenStream;
use quote::quote;
use syn::parse_macro_input;
#[proc_macro]
pub fn query_in_list(input: TokenStream) -> TokenStream {
    let query_in_list = parse_macro_input!(input as QueryInList);
    let (placeholders, qe) = query_in_list.example_query();
    let ph = std::iter::repeat_n(true, placeholders);
    let query_result_type = query_in_list.query_result_type;
    let fragments = query_in_list.query.iter().map(|f| match f {
        QueryFragment::Lit(lit) => quote! { builder.push(#lit); },
        QueryFragment::Loop { name, list, expr } => quote! {
            builder.push(" ( ");
            let mut separated = builder.separated(", ");
            for #name in #list {
                separated.push_bind({ #(#expr)* });
            }
            builder.push(" ) ");
        },
    });
    quote! {{
        // We are generating an example query here, but discarding the result.
        // This is only used to run at compile-time against the database to
        // ensure type-safety.
        let _ = sqlx::query_as!(#query_result_type, #qe, #(#ph),*);
        let mut builder = sqlx::QueryBuilder::new("");
        #(#fragments)*
        builder
    }}
    .into()
}

So, what’s happening here? First we transform the fragments we collected from the macro into their respective code-pieces: Just pushing the string in case of a literal, otherwise creating the Separated and pushing all the values based on the for-loop. As you can see, I basically just paste the whole code block inside the curly-braces into the Separated::push_bind function, which actually works great.

As for our example query, we actually need to push the same number of arguments into the codes, as we have bindings in the query, for that I just created an iterator of the value true that yields exactly the right amount of times. Surprisingly, this works and MariaDB does not complain about any type miss-matches either (put a pin in that, we’ll need that later again).

The Good

We have our cake and can eat it too (mostly, we’ll come to that). We can validate our query at compile-time, while leveraging the QueryBuilder to actually avoid the limitations of MariaDB. Nominally I have reached my goal, but the solution is still far from perfect, as I will explain in the next sections.

The Bad

While we actually do have the type now, to actually get the data out we still have to derive FromRow for the struct and pass through that code.

Must worse though, this solution is not 100% air-tight. Let me demonstrate this:

let groups = vec![String::new("/root")];
let roles = query_in_list!{
    RoleType,
    "SELECT *"
    "FROM groups"
    "WHERE idp_group IN"
    for group in groups { group }
};
roles.query_as::<OtherType>(&mut db).await

Since the function query_as takes a mutable reference, we cannot return just the query because the builder, to which it holds a reference, would be dropped at the end of the block. So between the macro and the actual use of the query it could still come to some type confusion.

The other alternative is to accept another parameter into the macro: the buffer itself, hoping that the user (or an especially clever LLM) has not had any ideas of pre-filling it to avoid a compiler error for the type confusion.

The second solution could be the prettier one, as we can re-use the specified type twice, but for now, I opted for solution one. This is definitely a point I need to come back on. But for now I don’t have a solution to actually for this problem, other than hoping that the borrow checker will become smarter about this, but I don’t think that can work, as reordering drops like that can break a bunch of Rust code where the block is used to force e.g. a MutexGuard to be dropped early.

Edit (2025-12-30): Fixing the bad.

This was a real thorn to my side, but I couldn’t think of a good solution. Thankfully I was on the Chaos Communication Congress and my friend Isse suggested that I should talk to the people at the Rust assembly, maybe they know something that can be done. And after some discussion we came up with an idea, that while not the prettiest, resolves this problem for good.

For the uninitiated in Rust macros: Rust proc-macros work by being compiled first, and then being called by the compile to generate more code. Right now they have the limitations that they need to live in a separate crate that can only export the macros themselves.

So first we move all the macro code into its separate crate, e.g. query_macros_impl and then create a separate crate query_macros that re-exports the macros alongside the new code.

Basically, what we came up with is the following: Create a wrapper type for the query builder, that takes the row type as a generic parameter and exposes only one function, which is to get the QueryAs type, with the correct lifetime. The actual code is really simple:

use sqlx::{Database, FromRow, QueryBuilder};
use sqlx::query::QueryAs;
pub struct SealedQueryBuilder<'a, DB: Database, T: FromRow<'a, DB::Row>> {
    builder: QueryBuilder<'a, DB>,
    _marker: PhantomData<T>,
}
impl<'a, DB: Database, T: FromRow<'a, DB::Row>> SealedQueryBuilder<'a, DB, T> {
    pub fn new(builder: QueryBuilder<'a, DB>) -> Self {
        Self {
            builder,
            _marker: PhantomData,
        }
    }
    pub fn query_as(&'a mut self) -> QueryAs<'a, DB, T, DB::Arguments<'a>> {
        self.builder.build_query_as()
    }
}

We basically just create the struct, the generics can be copied almost 1:1 from the QueryAs macro and the QueryAs::fetch_all method. Then we create a function that internally calls build_query_as and… That’s it. Now we can bring the type all the way through to the point where it is being used.

The Ugly

I actually lied to you about the implementation earlier to get to not spoiler the surprise in this section.

In the code that inserts the binding values we have a edge-case we did not cover. If the iterable is empty, we generate an invalid query and have a runtime error and we’d have to implement custom logic before every use of the macro. This is quite counter-intuitive and easy to forget. So we have two options again: Add a custom error that the user is required to handle, or make sure the query can run, even if one of the lists is empty.

I decided to go for the second option and that’s where we’ll reach back to our pin from earlier: remember how I said MariaDB will not complain, even if we have a type-missmatch in our query? That’s were it comes back useful: If we just push a true for each for-loop, we make sure the actual list is always (syntactically) valid and you’ll get an empty list back, which is what you’d probably expect anyways.

I want to also take this moment to formally apologize to any potential future co-workers that look at the queries executed in the database and then have to go into a rabbit-hole on why every list contains a true at the beginning.

While that “feature” of MariaDB to silently ignore type miss-matches is a blessing for fixing that edge-case, it’s also a curse – if you accidentally pass in a list of, say, ids instead of strings, the validation will work just-fine™ while the query can never return an actual row. But since this this is not something I can influence in the moment, I have to (begrudgingly) accept this as part of the deal, not just for my macro, but for all of SQLX. Nevertheless, I’m happy I discovered it, so I can be aware of this behavior going forward.

Like this blogpost? Consider sharing it:

Copied!