Files
windmill/backend/rust-best-practices.mdc
centdix 9462d56be7 internal: better instructions for claude (#5996)
* better instructions for claude

* remove file

* better rules

* better claude action

* add api routes prefixes

* typo

* typo

* fix

* fix

* add typegen explanations

* remove npm run format
2025-06-19 15:36:10 +02:00

110 lines
7.0 KiB
Plaintext

---
description:
globs: backend/**/*.rs
alwaysApply: false
---
# Windmill Backend - Rust Best Practices
## Project Structure
Windmill uses a workspace-based architecture with multiple crates:
- **windmill-api**: API server functionality
- **windmill-worker**: Job execution
- **windmill-common**: Shared code used by all crates
- **windmill-queue**: Job & flow queuing
- **windmill-audit**: Audit logging
- Other specialized crates (git-sync, autoscaling, etc.)
## Adding New Code
### Module Organization
- Place new code in the appropriate crate based on functionality
- For API endpoints, create or modify files in `windmill-api/src/` organized by domain
- For shared functionality, use `windmill-common/src/`
- Use the `_ee.rs` suffix for enterprise-only modules
- Follow existing patterns for file structure and organization
### Error Handling
- Use the custom `Error` enum from `windmill-common::error`
- Return `Result<T, Error>` or `JsonResult<T>` for functions that can fail
- Use the `?` operator for error propagation
- Add location tracking to errors using `#[track_caller]`
### Database Operations
- Use `sqlx` for database operations with prepared statements
- Leverage existing database helper functions in `db.rs` modules
- Use transactions for multi-step operations
- Handle database errors properly
### API Endpoints
- Follow existing patterns in the `windmill-api` crate
- Use axum's routing system and extractors
- Group related routes together
- Use consistent response formats (JSON)
- Follow proper authentication and authorization patterns
- Do not forget to update backend/windmill-api/openapi.yaml after modifying an api endpoint
## Performance Optimizations
When generating code, especially involving `serde`, `sqlx`, and `tokio`, prioritize performance by applying the following principles:
### Serde Optimizations (Serialization & Deserialization)
- **Specify Structure Explicitly:** When defining structs for Serde (`#[derive(Serialize, Deserialize)]`), use `#[serde(...` attributes extensively. This includes:
* `#[serde(rename = "...")]` or `#[serde(alias = "...")]` to map external names precisely, avoiding dynamic lookups.
* `#[serde(default)]` for optional fields with default values, reducing parsing complexity.
* `#[serde(skip_serializing_if = "...")]` to avoid writing fields that meet a certain condition (e.g., `Option::is_none()`, `Vec::is_empty()`, or a custom function), reducing output size and serialization work.
* `#[serde(skip_serializing)]` or `#[serde(skip_deserializing)]` for fields that should *not* be included.
- **Prefer Borrowing:** Where possible and safe (data lifetime allows), use `Cow<'a, str>` or `&'a str` (with `#[serde(borrow)]`) instead of `String` for string fields during deserialization. This avoids allocating new strings, enabling zero-copy reading from the input buffer. Apply this principle to byte slices (`&'a [u8]` / `Cow<'a, [u8]>`) and potentially borrowed vectors as well.
- **Avoid Intermediate `Value`:** Unless the data structure is truly dynamic or unknown at compile time, deserialize directly into a well-defined struct or enum rather than into `serde_json::Value` (or equivalent for other formats). This avoids unnecessary heap allocations and type switching.
### SQLx Optimizations (Database Interaction)
- **Select Only Necessary Columns:** In `SELECT` queries, list specific column names rather than using `SELECT *`. This reduces data transferred from the database and the work needed for hydration/deserialization.
- **Batch Operations:** For multiple `INSERT`, `UPDATE`, or `DELETE` statements, prefer executing them in a single query if the database and driver support it efficiently (e.g., `INSERT INTO ... VALUES (...), (...), ...`). This minimizes round trips to the database.
- **Avoid N+1 Queries:** Do not loop through results of one query and execute a separate query for each item (e.g., fetching users, then querying for each user's profile in a loop). Instead, use JOINs or a single query with an `IN` clause to fetch related data efficiently.
- **Deserialize Directly:** Use `#[derive(FromRow)]` on structs and ensure the struct fields match the selected columns in the query. This allows SQLx to hydrate objects directly, avoiding intermediate data structures.
- **Parameterize Queries:** Always use SQLx's query methods (`.bind(...)`) to pass values as parameters rather than string formatting. This prevents SQL injection and allows the database to cache query plans, improving performance on repeated executions.
### Tokio Optimizations (Asynchronous Runtime)
- **Avoid Blocking Operations:** **Crucially**, never perform blocking operations (synchronous file I/O, `std::thread::sleep`, CPU-bound loops, `std::sync::Mutex::lock`, blocking network calls without `tokio::net`) directly within an `async fn` or a standard `tokio::spawn` task. Blocking pauses the entire worker thread, potentially starving other tasks. Use `tokio::task::spawn_blocking` for CPU-intensive work or blocking I/O.
- **Use Tokio's Async Primitives:** Prefer `tokio::sync` (channels, mutexes, semaphores), `tokio::io`, `tokio::net`, and `tokio::time` over their `std` counterparts in asynchronous contexts. These are designed to yield control back to the scheduler.
- **Manage Concurrency:** Be mindful of how many tasks are spawned. Creating a new task for every tiny piece of work can introduce overhead. Group related asynchronous operations where appropriate.
- **Handle Shared State Efficiently:** Use `Arc` for shared ownership in concurrent tasks. When shared state needs mutation, prefer `tokio::sync::Mutex` over `std::sync::Mutex` in `async` code. Consider `tokio::sync::RwLock` if reads significantly outnumber writes. Minimize the duration for which locks are held.
- **Understand `.await`:** Place `.await` strategically to allow the runtime to switch to other ready tasks. Ensure that `.await` points to genuinely asynchronous operations.
- **Backpressure:** If dealing with data streams or queues between tasks, implement backpressure mechanisms (e.g., bounded channels like `tokio::sync::mpsc::channel`) to prevent one component from overwhelming another or critical resources like the database.
## Enterprise Features
- Use feature flags for enterprise functionality
- Conditionally compile with `#[cfg(feature = "enterprise")]`
- Isolate enterprise code in separate modules
## Code Style
- Group imports by external and internal crates
- Place struct/enum definitions before implementations
- Group similar functionality together
- Use descriptive naming consistent with the codebase
- Follow existing patterns for async code using tokio
## Testing
- Write unit tests for core functionality
- Use the `#[cfg(test)]` module for test code
- For database tests, use the existing test utilities
## Common Crates Used
- **tokio**: For async runtime
- **axum**: For web server and routing
- **sqlx**: For database operations
- **serde**: For serialization/deserialization
- **tracing**: For logging and diagnostics
- **reqwest**: For HTTP client functionality