Writing Effective Rules
Learn how to write rules that maximize AI review quality while minimizing token usage.
Why Structured Rules Matter
Traditional AI code review sends your entire PR diff to an LLM with a generic prompt like "review this code." This approach has serious problems:
| Problem | Impact |
|---|---|
| Token waste | Reviewing 500 files when only 10 are relevant |
| Unfocused analysis | AI tries to check everything, catches nothing well |
| Inconsistent results | Different reviews for the same patterns |
| High latency | Large prompts = slow responses |
diffray solves this with structured rules and pattern matching.
How Pattern Matching Saves Tokens
Consider a PR that changes 50 files across your codebase. Without pattern matching, the AI would analyze all 50 files for every possible issue — security, performance, testing, etc.
With diffray's pattern matching:
rules:
  - id: react_useeffect_cleanup
    match:
      file_glob:
        - "**/*.tsx"
        - "**/*.jsx"
      content_regex:
        - "useEffect"
This rule only runs when:
- The changed file is a React component (`.tsx` or `.jsx`)
- The file contains `useEffect`
If your 50-file PR has 5 React components with useEffect, only those 5 files are sent for this specific check — 90% token savings.
Token Efficiency Example
Without pattern matching:
┌─────────────────────────────────────────────────────┐
│ 50 files × all rules = massive prompt │
│ ~500,000 tokens per review │
│ Cost: $$$$ | Time: 10+ minutes │
└─────────────────────────────────────────────────────┘
With pattern matching:
┌─────────────────────────────────────────────────────┐
│ 5 React files × React rules │
│ 10 Python files × Python rules │
│ 3 API files × API rules │
│ = targeted, efficient prompts │
│ ~50,000 tokens per review │
│ Cost: $ | Time: 2-3 minutes │
└─────────────────────────────────────────────────────┘
How Focused Context Improves Quality
When an AI agent receives a focused prompt with specific rules, it produces better results:
Unfocused Prompt (Traditional)
Review this code for any issues:
[500 lines of mixed code]
The AI tries to check security, performance, style, testing... everything. Result: shallow analysis, missed issues, false positives.
Focused Prompt (diffray)
You are a security expert. Check this React component for XSS vulnerabilities.
Rule: dangerouslySetInnerHTML without sanitization
Checklist:
- Find all dangerouslySetInnerHTML usage
- Check if DOMPurify.sanitize() is called
- Verify user input is never passed directly
[50 lines of relevant code]
The AI knows exactly what to look for, has clear success criteria, and sees only relevant code. Result: precise, actionable findings.
Rule Anatomy
Every rule has a specific structure that guides the AI:
rules:
  - id: unique_rule_identifier          # Required: unique ID
    agent: security                     # Required: which agent processes this
    title: "Short descriptive title"    # Required: shown in review comments
    description: "Detailed explanation" # Required: context for the AI
    importance: 8                       # Required: 1-10 priority
    match:                              # Required: when to apply this rule
      file_glob:                        # File patterns (glob syntax)
        - "**/*.ts"
        - "!**/*.test.ts"               # Exclude patterns with !
      content_regex:                    # Optional: content patterns (regex)
        - "dangerouslySetInnerHTML"
    checklist:                          # Required: what to verify
      - "Step 1: Find X"
      - "Step 2: Check Y"
      - "Step 3: Verify Z"
    examples:                           # Recommended: show good vs bad
      bad: |
        // Code that violates the rule
      good: |
        // Code that follows the rule
    tags:                               # Optional: categorization
      - security
      - react
    why_important: "Business impact"    # Optional: explains severity
    link: "https://..."                 # Optional: reference documentation
Field-by-Field Guide
id — Unique Identifier
id: sec_xss_dangerously_set_html
- Must be unique across all rules
- Use prefixes for organization: sec_, perf_, bug_, arch_, test_
- Use snake_case
- Keep it descriptive but concise
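A few ids following these conventions might look like this (the names are purely illustrative):

# Hypothetical ids showing the prefix + snake_case convention
- id: sec_sql_injection_string_concat
- id: perf_n_plus_one_query
- id: test_missing_edge_case_coverage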
agent — Specialized Reviewer
agent: security
diffray has 31 specialized agents organized into categories:
| Category | Agents | Examples |
|---|---|---|
| Core | security, performance, bugs, quality, architecture, consistency, testing, documentation, general | XSS detection, N+1 queries, null checks |
| Languages | typescript, python, go, rust, kotlin, csharp, ruby, php | Type safety, idioms, language-specific patterns |
| Frameworks | react, vue, angular, nextjs, nestjs, nodejs, spring, flutter | Hooks patterns, SSR, dependency injection |
| Domains | graphql, microservices, dependencies, accessibility, compliance, refactoring | Schema design, CVE detection, WCAG |
See Review Agents for the complete list with detailed descriptions.
Why it matters: The agent's system prompt includes domain expertise. A security agent knows OWASP Top 10, injection patterns, and auth best practices. A React agent understands hooks rules and component lifecycle. Assigning rules to the right agent improves detection quality.
title and description
title: "SQL injection via string concatenation"
description: "User input concatenated into SQL queries allows attackers to execute arbitrary SQL"
- title: Short (5-10 words), shown in review comments
- description: Detailed context for the AI to understand the issue
importance — Priority Level
importance: 9
Scale of 1-10:
- 9-10: Critical — security vulnerabilities, data loss risks
- 7-8: High — bugs, performance issues
- 5-6: Medium — code quality, maintainability
- 3-4: Low — style, minor improvements
- 1-2: Informational — suggestions, nice-to-haves
Higher importance rules are prioritized when there are many findings.
match — Pattern Matching
This is where token savings happen. The more specific your match, the less code the AI processes.
file_glob — File Patterns
match:
  file_glob:
    - "**/*.ts"              # All TypeScript files
    - "**/*.tsx"             # All React TypeScript files
    - "src/api/**/*.ts"      # Only API folder
    - "!**/*.test.ts"        # Exclude test files
    - "!**/*.spec.ts"        # Exclude spec files
    - "!**/node_modules/**"  # Exclude node_modules
Glob patterns:
- `*` — matches any characters except /
- `**` — matches any characters including /
- `!` prefix — excludes matching files
- `{a,b}` — matches a or b
- `+(pattern)` — matches one or more of pattern
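The brace and extended-glob forms do not appear in the other examples; here is a small sketch of how they can tighten a match, assuming the matcher supports both forms listed above (the paths are hypothetical):

match:
  file_glob:
    - "src/**/*.{ts,tsx}"        # brace expansion: .ts or .tsx anywhere under src/
    - "!**/*.+(test|spec).ts"    # extended glob: excludes *.test.ts and *.spec.ts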
content_regex — Content Patterns
match:
  file_glob:
    - "**/*.ts"
  content_regex:
    - "eval\\s*\\("            # Files containing eval()
    - "new\\s+Function\\s*\\(" # Files containing new Function()
Important: Files must match both file_glob AND content_regex (if specified). This dramatically reduces the files sent to AI.
checklist — Verification Steps
checklist:
  - "Find all eval() calls in the changed code"
  - "Check if the argument comes from user input"
  - "Verify there's no validation or sanitization"
  - "Suggest using safer alternatives like JSON.parse()"
The checklist guides the AI step-by-step:
- What to look for
- What conditions make it a problem
- What to recommend
Best practices:
- Start with "Find", "Check", "Verify", "Ensure"
- Be specific about what constitutes a violation
- Include the fix suggestion
examples — Good vs Bad Code
examples:
  bad: |
    // Vulnerable to SQL injection
    const query = `SELECT * FROM users WHERE id = ${userId}`;
    await db.execute(query);
  good: |
    // Safe parameterized query
    const query = 'SELECT * FROM users WHERE id = ?';
    await db.execute(query, [userId]);
Examples help the AI:
- Recognize the exact pattern to flag
- Understand what the fix should look like
- Distinguish false positives from real issues
tags — Categorization
tags:
  - security
  - sql
  - owasp
Tags are used for:
- Filtering rules in reports
- Grouping related rules
- Documentation
See Available Tags for the complete list of 1,000+ tags organized by category.
why_important — Business Context
why_important: "SQL injection is the #1 web vulnerability. Attackers can read, modify, or delete all database data."
Explains the real-world impact. Helps both the AI (context) and developers (understanding severity).
link — Reference Documentation
link: https://owasp.org/www-community/attacks/SQL_Injection
Points to detailed documentation for developers who want to learn more.
Matching Strategies
Strategy 1: Language-Specific Rules
# Python-specific
match:
  file_glob:
    - "**/*.py"
  content_regex:
    - "subprocess\\."
Strategy 2: Framework-Specific Rules
# React-only
match:
  file_glob:
    - "**/*.tsx"
    - "**/*.jsx"
  content_regex:
    - "useState|useEffect|useCallback"
Strategy 3: Path-Based Rules
# API endpoints only
match:
  file_glob:
    - "src/api/**/*.ts"
    - "src/routes/**/*.ts"
Strategy 4: Config Files
# Docker and CI configs
match:
  file_glob:
    - "**/Dockerfile"
    - "**/.github/workflows/*.yml"
    - "**/docker-compose*.yml"
Strategy 5: Exclude Patterns
# All code except tests and generated files
match:
  file_glob:
    - "**/*.ts"
    - "!**/*.test.ts"
    - "!**/*.spec.ts"
    - "!**/*.generated.ts"
    - "!**/node_modules/**"
Writing Effective Checklists
Bad Checklist
checklist:
  - "Check the code"
  - "Look for problems"
  - "Suggest fixes"
Too vague. The AI doesn't know what to look for.
Good Checklist
checklist:
  - "Find all places where user input is used in SQL queries"
  - "Check if parameterized queries or prepared statements are used"
  - "Verify that string concatenation or template literals are NOT used for query building"
  - "If vulnerable, suggest using db.execute(query, [params]) pattern"
Specific, actionable, with clear success criteria.
Common Patterns
Pattern: Deprecated API Usage
rules:
  - id: deprecated_moment_js
    agent: quality
    title: "Migrate from Moment.js to date-fns"
    description: "Moment.js is deprecated. Use date-fns for smaller bundle and better tree-shaking."
    importance: 5
    match:
      file_glob:
        - "**/*.ts"
        - "**/*.js"
      content_regex:
        - "import.*from ['\"]moment['\"]"
        - "require\\(['\"]moment['\"]\\)"
    checklist:
      - "Find Moment.js imports"
      - "Identify the date operations being used"
      - "Suggest equivalent date-fns functions"
    examples:
      bad: |
        import moment from 'moment';
        const formatted = moment(date).format('YYYY-MM-DD');
      good: |
        import { format } from 'date-fns';
        const formatted = format(date, 'yyyy-MM-dd');
Pattern: Missing Error Handling
rules:
  - id: async_try_catch
    agent: bugs
    title: "Async function missing error handling"
    description: "Async functions should handle errors to prevent unhandled rejections"
    importance: 7
    match:
      file_glob:
        - "**/*.ts"
      content_regex:
        - "async\\s+function"
        - "async\\s*\\("
    checklist:
      - "Find async functions in changed code"
      - "Check if they have try-catch blocks"
      - "Verify errors are logged or propagated properly"
      - "Check for .catch() on Promise chains"
Pattern: Security Headers
rules:
  - id: missing_security_headers
    agent: security
    title: "API response missing security headers"
    description: "API responses should include security headers like CORS, CSP, X-Frame-Options"
    importance: 8
    match:
      file_glob:
        - "src/api/**/*.ts"
        - "src/routes/**/*.ts"
    checklist:
      - "Find API route handlers"
      - "Check if security headers are set"
      - "Verify CORS is configured properly"
      - "Check for helmet middleware usage"
What NOT to Use diffray For
Don't Use AI for Style Checks
diffray is powerful but expensive. Each review costs tokens, and AI is overkill for issues that deterministic tools handle better.
Leave these to linters:
| Issue Type | Use Instead of diffray |
|---|---|
| Formatting (indentation, spacing) | Prettier, Biome |
| Import order | ESLint, Biome |
| Naming conventions | ESLint rules |
| Unused variables | TypeScript, ESLint |
| Missing semicolons | Prettier, Biome |
| Trailing commas | Prettier |
| Quote style | Prettier, Biome |
Why linters are better for style:
Linter check:
┌─────────────────────────────────────────────────────┐
│ Time: ~100ms │
│ Cost: $0 │
│ Accuracy: 100% (deterministic) │
│ Runs on: every save, commit, CI │
└─────────────────────────────────────────────────────┘
AI check for same issue:
┌─────────────────────────────────────────────────────┐
│ Time: ~5 seconds │
│ Cost: ~$0.01 per check │
│ Accuracy: 95% (may miss edge cases) │
│ Runs on: PR only │
└─────────────────────────────────────────────────────┘
Focus diffray on What Linters Can't Do
diffray excels at semantic analysis — understanding code meaning, not just syntax:
| diffray Strengths | Linter Limitations |
|---|---|
| "This function does too many things" | Linters can't understand responsibilities |
| "This API call lacks error handling" | Linters can't trace data flow |
| "This validation is incomplete" | Linters can't understand business logic |
| "This is a potential race condition" | Linters can't reason about concurrency |
| "This pattern exists elsewhere — reuse it" | Linters can't find semantic duplicates |
The Right Tool Stack
┌─────────────────────────────────────────────────────┐
│ 1. Prettier/Biome — formatting (on save) │
│ 2. ESLint/Biome — static analysis (on commit) │
│ 3. TypeScript — type checking (on commit) │
│ 4. diffray — semantic review (on PR) │
└─────────────────────────────────────────────────────┘
Each layer catches different issues:
- Prettier: Makes code look consistent
- ESLint: Catches common mistakes and enforces patterns
- TypeScript: Prevents type-related bugs
- diffray: Reviews logic, architecture, security, and catches issues that require understanding code context
Example: Wrong vs Right Use of diffray
Wrong — style checking (use ESLint instead):
# DON'T DO THIS
rules:
  - id: no_var_keyword
    agent: quality
    title: "Don't use var"
    checklist:
      - "Find var declarations"
      - "Suggest using let or const"
ESLint's no-var rule does this instantly, for free, with 100% accuracy.
Right — semantic analysis (perfect for diffray):
# DO THIS
rules:
  - id: missing_error_boundary
    agent: bugs
    title: "React component tree lacks error boundary"
    description: "Components with async data fetching should have error boundaries to prevent white screens"
    checklist:
      - "Find components with useQuery, useSWR, or fetch calls"
      - "Check if there's an ErrorBoundary wrapper in the component tree"
      - "Verify error states are handled gracefully"
No linter can understand component hierarchy and data flow patterns.
Best Practices
1. Be Specific, Not Generic
# Too generic
checklist:
  - "Check for security issues"

# Specific
checklist:
  - "Find all innerHTML assignments"
  - "Check if the value comes from user input"
  - "Verify DOMPurify.sanitize() is used"
2. Use Content Regex to Reduce Scope
Without content_regex, the rule runs on ALL matching files:
# Runs on every .ts file
match:
  file_glob:
    - "**/*.ts"
With content_regex, only relevant files are processed:
# Only files that actually use eval
match:
  file_glob:
    - "**/*.ts"
  content_regex:
    - "\\beval\\s*\\("
3. Exclude Tests When Appropriate
Test files often violate rules intentionally:
match:
  file_glob:
    - "**/*.ts"
    - "!**/*.test.ts"
    - "!**/*.spec.ts"
    - "!**/__tests__/**"
4. Provide Clear Examples
Examples should be:
- Minimal — show only the relevant code
- Realistic — look like real code, not pseudocode
- Contrasting — bad and good should be obviously different
5. Start Simple, Iterate
- Create a simple rule with just the basics (see the sketch after this list)
- Test it on a real PR
- Add content_regex if too many false positives
- Refine checklist based on AI output
- Add examples if AI misses patterns
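A minimal first pass for step 1 could be as small as this (the id and patterns are illustrative), with content_regex, examples, and tags layered on in later iterations:

rules:
  - id: bug_unhandled_promise
    agent: bugs
    title: "Promise result is not awaited or handled"
    description: "Calling an async function without await or .catch() can hide failures"
    importance: 6
    match:
      file_glob:
        - "**/*.ts"
    checklist:
      - "Find calls to async functions in the changed code"
      - "Check that each call is awaited or has a .catch() handler"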
6. Prefer Language-Specific Rules Over Universal Ones
When the same problem can occur across multiple programming languages, create separate rules for each language instead of one generic rule.
Why this matters:
| Universal Rule | Language-Specific Rules |
|---|---|
| Vague matching, runs on many irrelevant files | Precise matching, runs only on relevant files |
| Generic examples that don't match real code | Examples in the actual language syntax |
| AI must infer context from file extension | AI has explicit language context |
| Weaker pattern matching | Exact regex for language idioms |
Example: Detecting hardcoded secrets
A universal approach might seem efficient:
# ❌ Avoid: Universal rule for all languages
rules:
  - id: hardcoded_secrets
    agent: security
    title: "Hardcoded secrets detected"
    match:
      file_glob:
        - "**/*.{ts,js,py,go,java,rb}"
      content_regex:
        - "password\\s*=|api_key\\s*=|secret\\s*="
    checklist:
      - "Find hardcoded credentials"
      - "Suggest using environment variables"
But language-specific rules are more effective:
# ✅ Better: Separate rule for Python
rules:
  - id: py_hardcoded_secrets
    agent: security
    title: "Hardcoded secrets in Python code"
    match:
      file_glob:
        - "**/*.py"
      content_regex:
        - "password\\s*=\\s*[\"']"
        - "api_key\\s*=\\s*[\"']"
        - "AWS_SECRET"
    checklist:
      - "Find string literals assigned to password, api_key, secret, or token variables"
      - "Check if values come from os.environ or config files"
      - "Suggest using python-dotenv or environment variables"
    examples:
      bad: |
        # Hardcoded secret
        password = "super_secret_123"
        api_key = "sk-1234567890"
      good: |
        # From environment
        import os
        password = os.environ.get("DB_PASSWORD")
        api_key = os.getenv("API_KEY")
# ✅ Better: Separate rule for TypeScript/JavaScript
rules:
  - id: ts_hardcoded_secrets
    agent: security
    title: "Hardcoded secrets in TypeScript/JavaScript code"
    match:
      file_glob:
        - "**/*.ts"
        - "**/*.js"
        - "!**/*.test.ts"
        - "!**/*.spec.ts"
      content_regex:
        - "password\\s*[=:]\\s*[\"'`]"
        - "apiKey\\s*[=:]\\s*[\"'`]"
        - "AWS_SECRET"
    checklist:
      - "Find string literals assigned to password, apiKey, secret, or token properties"
      - "Check if values come from process.env or config modules"
      - "Suggest using environment variables with dotenv"
    examples:
      bad: |
        // Hardcoded secret
        const password = "super_secret_123";
        const config = { apiKey: "sk-1234567890" };
      good: |
        // From environment
        const password = process.env.DB_PASSWORD;
        const config = { apiKey: process.env.API_KEY };
Benefits of language-specific rules:
- Better matching — Python uses `=` for assignment, JS/TS uses both `=` and `:` for object properties
- Relevant examples — `os.environ` vs `process.env` gives the AI exact patterns to suggest
- Accurate checklists — language-specific idioms and libraries
- Reduced false positives — regex patterns tuned for each language syntax
- Better AI focus — the model knows exactly which language it's reviewing
When to split rules:
- The same vulnerability/problem exists in multiple languages you use
- The fix or best practice differs between languages
- The syntax patterns are different (assignment, imports, function calls)
- You want language-specific examples and suggestions
When a universal rule is OK:
- Config files with the same format across the project (YAML, JSON)
- Documentation files (Markdown)
- Simple text patterns that don't depend on language syntax
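For example, a single rule for YAML config files can apply across the whole repository, because the file format rather than the programming language drives the check (the id and regex below are illustrative):

# A universal rule is fine here: the file format is the same everywhere
rules:
  - id: config_plaintext_secret_in_yaml
    agent: security
    title: "Plaintext secret in YAML config"
    importance: 8
    match:
      file_glob:
        - "**/*.yml"
        - "**/*.yaml"
      content_regex:
        - "(password|secret|token)\\s*:\\s*\\S"
    checklist:
      - "Find keys named password, secret, or token with literal values"
      - "Check whether the value is an environment or secret-manager reference instead of a literal"
      - "Suggest moving literals to CI/CD secrets or a secret manager"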
PR-Level and Git History Rules
Some rules need to analyze the entire Pull Request holistically rather than individual files. diffray supports special tags for these use cases.
The pr-level Tag
Use pr-level for rules that analyze:
- PR description quality (motivation, context)
- PR scope (single responsibility)
- Breaking changes documentation
- Test coverage for new features
rules:
  - id: pr_description_explain_why
    agent: general
    title: PR description should explain motivation
    description: |
      PR descriptions should focus on WHY changes were made, not just WHAT.
      Reviewers can see the "what" in the diff.
    importance: 7
    always_run: true
    match:
      file_glob:
        - '**/*'
    checklist:
      - Check if PR description explains the business reason
      - Verify the description answers "Why is this change necessary?"
      - Look for context about the problem being solved
    tags:
      - pr-level
      - quality
The git-history Tag
Use git-history for rules that need to analyze commit messages and history:
- Conventional Commits format
- Atomic commits
- Ticket/issue references
rules:
  - id: commit_message_format
    agent: general
    title: Commit messages should follow conventional format
    description: |
      Commit messages should follow type(scope): description format
      for automated changelogs and semantic versioning.
    importance: 5
    always_run: true
    match:
      file_glob:
        - '**/*'
    checklist:
      - Use `git log --oneline` to check recent commit messages
      - Verify commits follow type(scope) format
      - Check for clear messages (not "fix stuff", "WIP")
      - Look for imperative mood in subject lines
    examples:
      bad: |
        fix stuff
        WIP
        update code
      good: |
        feat(auth): add OAuth2 login with Google
        fix(api): handle null response from payment service
    tags:
      - git-history
      - quality
      - automation
How These Tags Work
The general agent is configured to handle these special tags:
- pr-level: the agent analyzes the overall PR context — files changed, scope, description quality
- git-history: the agent uses Bash to run git commands:
  - `git log --oneline -20` — recent commit messages
  - `git log --stat -5` — commits with file changes
  - `git log --format="%s%n%b" -10` — full messages with body
Key Differences from File-Based Rules
| Aspect | File-Based Rules | PR-Level Rules |
|---|---|---|
| Scope | Individual files | Entire PR |
| Matching | file_glob + content_regex | always_run: true |
| Agent | Specialized (security, bugs, etc.) | Usually general |
| Data source | File content | Git metadata, PR context |
Example: Complete PR Quality Ruleset
rules/
├── pr-quality/
│   ├── description-explain-why.yaml   # PR explains "why"
│   ├── single-responsibility.yaml     # PR has one focus
│   └── ticket-reference.yaml          # Links to Jira/Linear
├── git-commits/
│   ├── message-format.yaml            # Conventional Commits
│   └── atomic-changes.yaml            # One change per commit
└── architecture-docs/
    ├── breaking-changes.yaml          # BREAKING CHANGE notice
    └── adr-reference.yaml             # Link to ADR
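One of these files, ticket-reference.yaml, might look roughly like this (the PROJ-123 key format is an assumption; adjust it to your tracker):

rules:
  - id: pr_ticket_reference
    agent: general
    title: PR should reference a tracker ticket
    importance: 5
    always_run: true
    match:
      file_glob:
        - '**/*'
    checklist:
      - Check if the PR title or description contains a ticket key (e.g. PROJ-123)
      - Verify the reference points to a real Jira/Linear issue, not just a bare number
    tags:
      - pr-level
      - quality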
When to Use PR-Level Rules
Good use cases:
- Enforcing PR description templates
- Checking for ticket references
- Validating commit message conventions
- Ensuring breaking changes are documented
- Verifying test coverage for new features
Avoid for:
- Code-level checks (use file-based rules)
- Style issues (use linters)
- Anything that can be checked per-file
See also:
- Available Tags — complete tag reference with 1,000+ tags
- Project-Specific Rules — examples of custom rules
- Agents — learn about specialized review agents
- Configuration Overview — global settings and rule exclusions