tutoriales.com

Macros in Rust: Automating Code with Declarative and Procedural Macros 🛠️

This tutorial will immerse you in the fascinating world of Rust macros. You'll learn to use declarative macros (`macro_rules!`) to eliminate code duplication and explore powerful procedural macros to generate code at compile time and create your own DSLs. With practical examples, you'll uncover how macros can take your productivity and the expressiveness of your Rust code to the next level.


Rust is a language renowned for its performance, safety, and control. One of its most powerful and often underestimated features is its macro system. Macros allow us to write code that writes code, a concept known as metaprogramming. This helps us avoid duplication (the DRY principle - Don't Repeat Yourself), automatically generate repetitive code, and create high-level abstractions that integrate seamlessly with the language's syntax.

In this tutorial, we'll explore two main types of macros in Rust:

  1. Declarative Macros (macro_rules!): The most common and easiest to use, ideal for simplifying repetitive code patterns.
  2. Procedural Macros: More advanced, these allow us to manipulate the Abstract Syntax Tree (AST) of Rust code to generate complex logic or define new attributes and derives.

Get ready to take your Rust skills to the next level. Let's dive in!


What Are Macros and Why Are They Useful? 🤔

Imagine you have a code pattern that repeats several times in your project, perhaps with slight variations. Without macros, you'd have to copy and paste the code, which is error-prone and difficult to maintain. Functions help, but they only operate on values, not on the structure of the code.

Macros, on the other hand, operate at compile time. They take input code, transform it, and expand it into other Rust code before the compiler type-checks and compiles the result. Think of them as "functions that operate on syntax."

💡 Tip: Macros are like coding assistants that help you write less repetitive code and keep your codebase cleaner and more organized.

Advantages of Using Macros ✨

  • Reduced Duplication (DRY): Avoid writing the same code repeatedly.
  • Increased Expressiveness: Allows for creating DSLs (Domain Specific Languages) or custom syntax that feels native to Rust.
  • Boilerplate Code Generation: Useful for generating trait implementations, constructors, or complex matching patterns.
  • Improved Ergonomics: Makes complex APIs easier to use.
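
To make the expressiveness point concrete, here is a minimal sketch of a map-literal macro. The name hashmap! and the key => value syntax are illustrative choices made for this example; they are not part of the standard library.

```rust
// Hypothetical map-literal macro: hashmap! { key => value, ... }
macro_rules! hashmap {
    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        // The full path is used so the macro works without a `use` at the call site
        let mut map = std::collections::HashMap::new();
        $( map.insert($key, $value); )*
        map
    }};
}

fn main() {
    let ages = hashmap! {
        "alice" => 30,
        "bob" => 25,
    };
    println!("{:?}", ages.get("alice")); // prints Some(30)
}
```

The call site reads like a built-in literal, yet it expands to ordinary HashMap::new() and insert calls.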

Differences Between Functions and Macros 🆚

It's crucial to understand that macros and functions are different tools with distinct purposes:

Feature        | Rust Functions              | Rust Macros
Execution Time | Runtime                     | Compile time (expansion)
Arguments      | Defined values and types    | Code fragments (tokens)
Return         | A specific value            | A code fragment
Operation Type | Data logic                  | Syntax logic
Overloading    | Not directly (using traits) | Yes (based on matching patterns)
Recursion      | Yes                         | Yes (but with depth limits)
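
The contrast can be sketched in a few lines. The names square_fn and describe! are made up for this illustration: the function receives an already evaluated value, while the macro receives the expression itself and picks a rule by the shape of its input.

```rust
// A function: fixed number of arguments, receives evaluated values
fn square_fn(x: i32) -> i32 {
    x * x
}

// A macro: receives code fragments, and the rule is chosen by pattern
macro_rules! describe {
    // One expression: returns its source text together with its value
    ( $e:expr ) => {
        format!("{} = {:?}", stringify!($e), $e)
    };
    // Two expressions: a different "overload", matched by shape
    ( $a:expr, $b:expr ) => {
        format!("{} and {}", describe!($a), describe!($b))
    };
}

fn main() {
    let x = 3;
    println!("{}", square_fn(x));       // the function only sees the value 3
    println!("{}", describe!(x));       // the macro also sees the name `x`
    println!("{}", describe!(x, x * 2)); // second rule, which recurses into the first
}
```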

Declarative Macros (macro_rules!) 📖

Declarative macros are the most common way to write macros in Rust. They are defined with macro_rules! and are based on a pattern matching system similar to match expressions. Essentially, you define patterns for what the input code should look like and what Rust code should be generated as output for that pattern.

Basic Structure of macro_rules!

A declarative macro is defined like this:

macro_rules! my_macro {
    // Rule 1: input pattern => output code
    // Listed first because `x = 5` also parses as an expression and would
    // otherwise be captured by the more general rule below.
    ( $name:ident = $value:expr ) => {
        let $name = $value;
        println!("Variable {} set to {}", stringify!($name), $name);
    };

    // Rule 2: another pattern => another output code
    ( $( $arg:expr ),* ) => {
        // Code that is generated when the pattern matches:
        // one println! per captured expression
        $( println!("Received expression: {}", $arg); )*
    };
}

Each rule consists of a pattern, the => arrow, and an expansion body. The compiler tries the rules in order and expands the first one whose pattern matches the input, so more specific patterns should come before more general ones.

Code Fragments (Metavariables) 🧩

Within patterns, we use $identifier:fragment_specifier to capture code fragments. Here are some common specifiers:

  • expr: A Rust expression (e.g., 1 + 2, foo(), bar.baz).
  • ident: An identifier (e.g., variable_name, function_name).
  • ty: A type (e.g., i32, Vec<String>).
  • path: A path (e.g., std::collections::HashMap).
  • stmt: A statement (e.g., let x = 5;).
  • block: A code block (e.g., { let x = 5; x + 1 }).
  • pat: A pattern (e.g., Some(x), _).
  • item: An item (e.g., a function, a struct, an enum).
  • tt: A token tree (any sequence of tokens). The most general and least restrictive.
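
As an illustration of several specifiers working together, the following sketch uses ident, ty, and expr in one pattern. The macro name make_wrapper! and the generated types are hypothetical, chosen only for this example.

```rust
// Hypothetical helper macro: generates a struct with one field, a constructor, and a getter.
// $name and $field are identifiers, $ty is a type, $default is an expression.
macro_rules! make_wrapper {
    ( $name:ident, $field:ident : $ty:ty = $default:expr ) => {
        struct $name {
            $field: $ty,
        }

        impl $name {
            fn new() -> Self {
                Self { $field: $default }
            }

            // The getter is named after the field
            fn $field(&self) -> &$ty {
                &self.$field
            }
        }
    };
}

make_wrapper!(Counter, count: u32 = 0);
make_wrapper!(Label, text: String = String::from("untitled"));

fn main() {
    let c = Counter::new();
    let l = Label::new();
    println!("{} {}", c.count(), l.text()); // prints 0 untitled
}
```

A function could not do this, because struct names, field names, and types are not values that can be passed as arguments.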

Repetitions with $(...),* or $(...),+ 🔁

We can capture multiple code fragments using repetitions, similar to quantifiers in regular expressions:

  • $( $fragment:specifier ),*: Zero or more repetitions, separated by commas.
  • $( $fragment:specifier ),+: One or more repetitions, separated by commas.

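You can also specify other separators (e.g., ; or =>), or omit the separator entirely. A third quantifier, $( ... )?, matches zero or one occurrence and takes no separator.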
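
A short sketch of these repetition forms follows. The macros sum! and join_all! are illustrative names, not standard ones: the first requires at least one comma-separated expression, the second accepts semicolon-separated expressions with an optional trailing semicolon.

```rust
// Comma-separated, one or more: expands to 0 + a + b + ...
macro_rules! sum {
    ( $( $x:expr ),+ ) => {
        0 $( + $x )+
    };
}

// Semicolon-separated, zero or more, with an optional trailing `;`
macro_rules! join_all {
    ( $( $s:expr );* $(;)? ) => {{
        let mut out = String::new();
        // One push_str call is emitted per captured expression
        $( out.push_str($s); )*
        out
    }};
}

fn main() {
    println!("{}", sum!(1, 2, 3));             // prints 6
    println!("{}", join_all!("a"; "b"; "c";)); // prints abc
}
```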

Practical Example: A Simplified vec! Macro 🚀

Let's create a simplified version of the vec! macro, which constructs a Vec from a list of elements.

macro_rules! my_vec {
    // Case for an empty vector
    () => {
        Vec::new()
    };
    // Case for comma-separated elements
    ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
            $( // Repeated expansion
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}

fn main() {
    let v1: Vec<i32> = my_vec!();
    println!("v1: {:?}", v1); // v1: []

    let v2 = my_vec!(1, 2, 3);
    println!("v2: {:?}", v2); // v2: [1, 2, 3]

    let v3 = my_vec!("hello", "world");
    println!("v3: {:?}", v3); // v3: ["hello", "world"]
}
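
📌 Note: The extra inner `{}` block makes the expansion a single expression, so `my_vec!(...)` can be used wherever a value is expected (for example, on the right-hand side of `let`). Thanks to macro hygiene, `temp_vec` cannot clash with a variable of the same name at the call site.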
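
One common refinement is to accept a trailing comma, as the real vec! does. The variant below, named my_vec2! here purely for illustration, adds $(,)? to the pattern. It also shows that a caller's variable called temp_vec does not collide with the macro's internal one.

```rust
macro_rules! my_vec2 {
    // `$(,)?` allows an optional trailing comma after the last element
    ( $( $x:expr ),* $(,)? ) => {{
        let mut temp_vec = Vec::new();
        $( temp_vec.push($x); )*
        temp_vec
    }};
}

fn main() {
    // Same name as the macro's internal variable; hygiene keeps the two apart
    let temp_vec = 99;
    let v = my_vec2![
        temp_vec,
        temp_vec + 1,
    ];
    println!("{:?}", v); // prints [99, 100]
}
```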

Macro for Simplified Debugging 🐛

Macros are excellent for debugging tools. Let's create a debug_print! macro that prints a variable's name and its value.

macro_rules! debug_print {
    ( $( $arg:expr ),* ) => {
        $( // For each argument
            println!("{}: {:?}", stringify!($arg), $arg);
        )*
    };
}

fn main() {
    let x = 10;
    let y = "Rust";
    let z = vec![1, 2, 3];

    debug_print!(x, y, z);
    // Expected Output:
    // x: 10
    // y: "Rust"
    // z: [1, 2, 3]
}

Here, stringify!($arg) is another built-in macro that converts the code fragment $arg into a string literal with its textual representation. It's very useful for debugging and for generating names.

Why `macro_rules!` and not `fn` for this? A function such as `fn debug_print<T: Debug>(val: T)` couldn't get the *name* of the variable `val`, only its *value*. Macros operate at the syntax level, so `stringify!($arg)` can see `x` as text before `x` evaluates to `10`.

Considerations When Using macro_rules! ⚠️

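  • Scope: A macro_rules! macro is textually scoped: it is usable only after its definition in the source file, including in child modules declared after it. To use it from other crates, annotate it with #[macro_export], which places it at the crate root.
  • Recursion: Macros can call themselves (recursion), allowing for more complex patterns like building syntax trees. The compiler stops expansion at a recursion limit (128 by default, adjustable with #![recursion_limit = "..."]) to prevent infinite loops.
  • Debugging: Debugging macros can be challenging. The third-party cargo expand subcommand prints the code a macro generates, which is invaluable. Depending on its version, it may rely on a nightly toolchain or on rustfmt for formatting; check its README.
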
# To install cargo expand
cargo install cargo-expand

# To see macro expansion in your code
cargo expand
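
To illustrate the recursion point from the list above, here is a small sketch of a macro that calls itself. The name max_of! is made up for this example; each step peels off the first argument and recurses on the rest until a single expression remains.

```rust
macro_rules! max_of {
    // Base case: a single expression is its own maximum
    ( $x:expr ) => { $x };
    // Recursive case: compare the first expression with the maximum of the rest
    ( $x:expr, $( $rest:expr ),+ ) => {
        std::cmp::max($x, max_of!( $( $rest ),+ ))
    };
}

fn main() {
    println!("{}", max_of!(3, 9, 4)); // prints 9
    println!("{}", max_of!(7));       // prints 7
}
```

Each extra argument adds one level of expansion, so very long argument lists are what eventually run into the recursion limit.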

Procedural Macros (Custom Derive, Attributes, Functions) 🧠

Procedural macros are much more powerful and flexible than declarative macros, but also more complex to write. A procedural macro is a Rust function that runs at compile time: it receives a stream of tokens and returns a new one. Crates such as syn parse those tokens into an AST (Abstract Syntax Tree), so you can analyze and manipulate the code structure at a much deeper level. The token types come from the compiler-provided proc_macro crate.

There are three types of procedural macros:

  1. Custom Derive Macros: Used with the #[derive(MyMacro)] attribute and generate trait implementations for structs and enums.
  2. Attribute Macros: Applied to items (functions, structs, modules) like #[my_attribute(key = "value")] and can modify the item they are applied to or generate additional items.
  3. Function-like Macros: Used like declarative macros my_macro!(...), but the macro body is Rust code that manipulates tokens.

🔥 Important: Procedural macros must reside in a crate of type `proc-macro`. Such a crate can export only procedural macros, so the traits and helper types they rely on have to live in a separate, regular crate.

Setting the Stage: proc-macro Crate 📁

First, create a new proc-macro project:

cargo new my_macro_crate --lib
cd my_macro_crate

Modify your Cargo.toml to be a proc-macro crate:

[lib]
proc-macro = true

[dependencies]
syn = { version = "1.0", features = ["full"] }
quote = "1.0"
proc-macro2 = "1.0"
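
  • syn: A robust library for parsing Rust tokens into an AST structure. Essential for reading macro input. The "full" feature enables parsing of complete items such as functions.
  • quote: A library for turning Rust-like syntax back into tokens through the quote! macro. Very useful for building the output code.
  • proc-macro2: A wrapper around the compiler's proc_macro types that also works outside of macro expansion (for example, in unit tests). It is the foundation of syn and quote.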

Example: Custom Derive for Debug Printing 🎯

Let's create a #[derive(PrintDebug)] macro that automatically implements a print_debug method for a struct, printing all its fields.

First, define a simple trait that our macro will implement:

// In your main application or library crate (NOT in the proc-macro crate)
pub trait PrintDebug {
    fn print_debug(&self);
}

Now, in my_macro_crate/src/lib.rs:

extern crate proc_macro;

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};

#[proc_macro_derive(PrintDebug)]
pub fn print_debug_derive(input: TokenStream) -> TokenStream {
    // 1. Parse the input TokenStream into a syn DeriveInput
    //    (the type that exposes `data` for structs, enums, and unions)
    let input = parse_macro_input!(input as DeriveInput);

    // 2. Get the struct name
    let struct_name = &input.ident;

    // 3. Generate the code to print each field
    let field_printers = match &input.data {
        Data::Struct(data_struct) => {
            match &data_struct.fields {
                Fields::Named(fields) => {
                    let recurse = fields.named.iter().map(|f| {
                        let field_name = f.ident.as_ref().unwrap();
                        quote! {
                            println!("    {}: {:?}", stringify!(#field_name), &self.#field_name);
                        }
                    });
                    quote! {
                        #(#recurse)*
                    }
                }
                Fields::Unnamed(fields) => {
                    // For tuples, we'll use indices
                    let recurse = fields.unnamed.iter().enumerate().map(|(i, _f)| {
                        let index = syn::Index::from(i);
                        quote! {
                            println!("    {}: {:?}", #i, &self.#index);
                        }
                    });
                    quote! {
                        #(#recurse)*
                    }
                }
                Fields::Unit => {
                    quote! { println!("    <unit>"); }
                }
            }
        }
        _ => {
            // This is a compile error, because it's only allowed on structs
            // or enums with specific variants, etc.
            // Here we simplify for structs only.
            return TokenStream::from(quote! { compile_error!("PrintDebug can only be used on structs"); });
        }
    };

    // 4. Generate the implementation of the PrintDebug trait
    let expanded = quote! {
        impl PrintDebug for #struct_name {
            fn print_debug(&self) {
                println!("Debugging struct {}", stringify!(#struct_name));
                #field_printers
            }
        }
    };

    // 5. Return the generated TokenStream
    TokenStream::from(expanded)
}

This code is denser. Here's a breakdown of the flow:

  1. parse_macro_input!: Converts the input TokenStream (what the macro receives) into a syn syntax-tree type. For a derive macro this should be syn::DeriveInput, whose data field distinguishes structs, enums, and unions (syn::ItemStruct has no data field).
  2. input.ident: Accesses the name of the struct (e.g., Person).
  3. match &input.data: Inspects the struct's fields. It could have named fields ({ x: i32, y: i32 }), unnamed fields ((i32, String)) or be a unit (struct MyUnit;).
  4. fields.named.iter().map(...): Iterates over the named fields. For each field, quote! is generated to print "field: value".
    • stringify!(#field_name): Gets the field name as &str.
    • &self.#field_name: Accesses the field's value. The # in quote! is important; it interpolates the Rust variable field_name (here a syn identifier) into the generated code instead of emitting the literal text field_name.
  5. quote! { ... }: This is the magic of quote. It allows you to write Rust code almost as you normally would, and quote! converts it into a proc_macro2::TokenStream. Variables prefixed with # (#struct_name, #field_printers) are "injected" into the generated code.
  6. #(#recurse)*: This is a repetition pattern within quote!. It means "take each element from recurse (which is an iterator of quote!s) and expand them here, with no separator."
  7. TokenStream::from(expanded): Converts the TokenStream generated by quote! into the proc_macro::TokenStream that the macro must return.
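
It can help to see what the derive is meant to produce. The following is a hand-written sketch of the expansion for a two-field struct, assuming the macro behaves as described in the breakdown; the real output can be inspected with cargo expand.

```rust
trait PrintDebug {
    fn print_debug(&self);
}

struct Person {
    name: String,
    age: u32,
}

// Roughly what #[derive(PrintDebug)] would generate for Person:
// one println! per named field, in declaration order
impl PrintDebug for Person {
    fn print_debug(&self) {
        println!("Debugging struct {}", stringify!(Person));
        println!("    {}: {:?}", stringify!(name), &self.name);
        println!("    {}: {:?}", stringify!(age), &self.age);
    }
}

fn main() {
    let p = Person { name: "Alice".to_string(), age: 30 };
    p.print_debug();
}
```

Note that every field type must implement Debug, because the generated code formats each field with {:?}.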

Using the Procedural Macro 💡

In your main application's Cargo.toml, add your macro crate as a dependency:

[dependencies]
my_macro_crate = { path = "../my_macro_crate" }

Then, in your main.rs or lib.rs:

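// Imports the derive macro (it lives in the macro namespace)
use my_macro_crate::PrintDebug;

// The trait that the derive implements (type namespace, so the two names do not clash)
pub trait PrintDebug {
    fn print_debug(&self);
}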

#[derive(PrintDebug)]
struct Person {
    name: String,
    age: u32,
    is_student: bool,
}

#[derive(PrintDebug)]
struct Point(i32, i32);

#[derive(PrintDebug)]
struct Unit;

fn main() {
    let p = Person {
        name: "Alice".to_string(),
        age: 30,
        is_student: false,
    };
    p.print_debug();
    /*
    Output:
    Debugging struct Person
        name: "Alice"
        age: 30
        is_student: false
    */

    let pt = Point(10, 20);
    pt.print_debug();
    /*
    Output:
    Debugging struct Point
        0: 10
        1: 20
    */

    let u = Unit;
    u.print_debug();
    /*
    Output:
    Debugging struct Unit
        <unit>
    */
}

Congratulations! You've just written your first procedural macro.

⚠️ Warning: Procedural macros are powerful, but they can also be difficult to debug and may increase compile times. Use them judiciously.

Procedural macro flow: Start → Input TokenStream → syn::parse_macro_input! → AST Manipulation → quote! (Generate Tokens) → Return TokenStream → End


Attribute Macros and Function-like Procedural Macros 💡

In addition to #[derive], we can create attribute macros and function-like macros that operate with proc_macro::TokenStream.

Attribute Macro: #[log_calls]

Imagine you want to print a function's name and its arguments every time it's called. This is perfect for an attribute macro.

my_macro_crate/src/lib.rs

// ... (dependencies syn, quote, proc-macro2; same imports as above)

#[proc_macro_attribute]
pub fn log_calls(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // We ignore `_attr` for now (whatever is inside #[log_calls(...)])
    let func = parse_macro_input!(item as syn::ItemFn);

    // Split the function into its parts so we can rebuild it
    let attrs = &func.attrs;
    let vis = &func.vis;
    let sig = &func.sig;
    let original_block = &func.block;
    let func_name = &sig.ident;

    // Collect the argument patterns (e.g. `a`, `b`); a `self` receiver is skipped.
    // Simplification: each argument is assumed to be a plain identifier
    // whose type implements Debug.
    let arg_values = sig.inputs.iter().filter_map(|input| {
        match input {
            syn::FnArg::Receiver(_) => None,
            syn::FnArg::Typed(pat_type) => Some(&pat_type.pat),
        }
    }).collect::<Vec<_>>();

    // Re-emit the function with the same signature and a logging prologue
    let output = quote! {
        #(#attrs)*
        #vis #sig {
            // Format every argument with {:?} and join the results with ", "
            let args: Vec<String> = vec![#(format!("{:?}", #arg_values)),*];
            println!("Calling {} with args: {}", stringify!(#func_name), args.join(", "));
            #original_block
        }
    };

    TokenStream::from(output)
}

main.rs

use my_macro_crate::log_calls;

#[log_calls]
fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[log_calls]
fn greet(name: &str) {
    println!("Hello, {}!", name);
}

fn main() {
    let sum = add(5, 3);
    println!("Sum: {}", sum);
    // Output:
    // Calling add with args: 5, 3
    // Sum: 8

    greet("World");
    // Output:
    // Calling greet with args: "World"
    // Hello, World!
}

This log_calls example demonstrates how an attribute can inject logging code around a function's existing logic. The complexity comes from extracting argument names and their values for printing.
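
As a reference point, this is a hand-written sketch of what the attribute is intended to turn add into, assuming the macro keeps the signature and wraps the original body with a logging prologue. It is an illustration of the target expansion, not output captured from the macro.

```rust
// Roughly what #[log_calls] is meant to produce for `fn add(a: i32, b: i32) -> i32`
fn add(a: i32, b: i32) -> i32 {
    // Injected prologue: format each argument with {:?} and join with ", "
    let args: Vec<String> = vec![format!("{:?}", a), format!("{:?}", b)];
    println!("Calling {} with args: {}", stringify!(add), args.join(", "));
    // Original body, kept as a nested block so its value is still returned
    {
        a + b
    }
}

fn main() {
    println!("Sum: {}", add(5, 3));
}
```

Keeping the original body as the final nested block means the function's return value and type are unchanged.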

Function-like Procedural Macro: sql!

These macros are like macro_rules!, but their body is Rust code. They are perfect for validating custom syntax at compile time. For example, we could have an sql! macro that validates an SQL query.

my_macro_crate/src/lib.rs

// ... (dependencies syn, quote, proc-macro2)

#[proc_macro]
pub fn sql(input: TokenStream) -> TokenStream {
    let input_str = input.to_string();

    // Here we would do actual SQL validation. For simplicity, we only check 'SELECT'
    if !input_str.to_uppercase().starts_with("SELECT") {
        return TokenStream::from(quote! { compile_error!("SQL macro expects a SELECT query"); });
    }

    // For this example, we simply return the SQL string as a String
    // In a real case, you might generate code to build a safe query, etc.
    let output = quote! {
        #input_str.to_string()
    };
    TokenStream::from(output)
}

main.rs

use my_macro_crate::sql;

fn main() {
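    let query = sql! { SELECT * FROM users WHERE id = 1 };
    println!("Query: {}", query);
    // Output (token spacing comes from TokenStream::to_string and may vary):
    // Query: SELECT * FROM users WHERE id = 1

    // This would cause a compile error with our simplified macro.
    // The input must still be valid Rust tokens, so the SQL string is written
    // with double quotes here; 'Bob' would already be rejected by the lexer.
    // let invalid_query = sql! { INSERT INTO users (name) VALUES ("Bob") };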
}

These macros are incredibly useful for creating embedded DSLs in Rust, where validation and transformation happen before final compilation.
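
One way to keep such a macro manageable is to put the validation in a plain function that works on strings, so it can be exercised without any macro machinery. The helper below, validate_select, is a hypothetical name introduced for this sketch; it performs the same SELECT check as the simplified sql! macro.

```rust
// Hypothetical helper: the same check the simplified sql! macro performs, on a plain &str
fn validate_select(query: &str) -> Result<&str, String> {
    let trimmed = query.trim();
    if trimmed.to_uppercase().starts_with("SELECT") {
        Ok(trimmed)
    } else {
        Err(format!("expected a SELECT query, got: {}", trimmed))
    }
}

fn main() {
    println!("{:?}", validate_select("SELECT * FROM users WHERE id = 1"));
    println!("{:?}", validate_select("INSERT INTO users (name) VALUES (1)"));
}
```

Inside a procedural macro, the Err branch would be turned into a compile_error! invocation, as in the example above.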


Tools for Working with Macros 🛠️

Working with macros, especially procedural ones, can be complex. Fortunately, there are tools that facilitate the process:

  • cargo expand: Already mentioned, it's your best friend for seeing the code expanded by any macro. Essential for debugging.
  • rust-analyzer: The language server (LSP implementation) for Rust. It offers autocompletion and syntactic analysis that understands macro expansion, although it can sometimes struggle with complex macros.
  • syn and quote documentation: These crates are very well documented. Familiarizing yourself with their APIs will save you a lot of time.

Best Practices and Final Considerations ✅

  • When to use macros?: Use them when functions aren't enough: when you need to manipulate syntax, generate repetitive code, automatically implement traits, or create DSLs. Avoid over-engineering; sometimes, a function or a closure is sufficient.
  • Clarity over cleverness: Macros can be cryptic. A slightly more verbose but clear macro is better than a clever but indecipherable one.
  • Meaningful errors: When writing procedural macros, ensure that the compile errors you generate are helpful and clearly point to the problem.
  • Compilation impact: Macros (especially procedural ones) can significantly increase compile times. Keep this in mind for large projects.
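  • Testing: Write tests for your macros. A macro_rules! macro can be invoked directly in ordinary unit tests. For procedural macros, use integration tests (the proc-macro crate's tests/ directory or a separate test crate) that apply the macro and check the generated behavior.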

Conclusion 🎯

Macros are a distinctive feature of Rust that allows you to go beyond function-level abstraction, manipulating the code itself. Whether you're eliminating duplication with macro_rules! or building complex DSLs with procedural macros, mastering macros will open new doors to writing more expressive, efficient, and maintainable Rust code.

I hope this tutorial has provided you with a solid foundation to start exploring and using macros in your Rust projects. Now go forth and metaprogram with confidence!
