
Tutorial 04: Code Execution

Learn how to safely execute Python code generated by AI agents.

What You'll Learn

  • How to use the execute_code agent
  • Security constraints and sandboxing
  • Working with execution results
  • Building code-generating workflows
  • Best practices for code execution

Prerequisites

Why Code Execution?

Sometimes the best way to solve a problem is to write and execute code. Kagura provides a safe way to let AI agents generate and run Python code:

from kagura.agents import execute_code

result = await execute_code("Calculate the factorial of 10")

if result["success"]:
    print(result["result"])  # 3628800
    print(result["code"])    # Shows the generated code
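The snippets in this tutorial use top-level `await`, which works in a notebook or in the `python -m asyncio` REPL but is a syntax error in a regular script. A minimal sketch of the script form follows. `execute_code_stub` is a hypothetical stand-in that returns a canned dictionary so the example runs without Kagura or an LLM; in real code you would import and await `execute_code` instead.

```python
import asyncio


async def execute_code_stub(task: str) -> dict:
    """Hypothetical stand-in for execute_code: returns a canned result."""
    return {
        "success": True,
        "result": 3628800,
        "code": "import math\nresult = math.factorial(10)",
        "error": None,
    }


async def main() -> dict:
    # Inside an async function, `await` is always valid
    result = await execute_code_stub("Calculate the factorial of 10")
    if result["success"]:
        print(result["result"])
    return result


# asyncio.run() starts an event loop, runs main(), and closes the loop
outcome = asyncio.run(main())
```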

Basic Usage

Simple Calculations

from kagura.agents import execute_code

# Mathematical operations
result = await execute_code("What is 2^10?")
print(result["result"])  # 1024

# Data processing
result = await execute_code("Sum the numbers from 1 to 100")
print(result["result"])  # 5050

# String operations
result = await execute_code("Reverse the string 'hello'")
print(result["result"])  # "olleh"

Understanding the Result

The execute_code function returns a dictionary:

result = {
    "success": True,         # Whether execution succeeded
    "result": 3628800,       # The value of the `result` variable
    "code": "...",          # The generated Python code
    "error": None           # Error message if failed
}

Important: The executed code must set a variable named result:

# ✅ Good: Sets result variable
result = await execute_code("Calculate 5 * 5")
# Generated code: result = 5 * 5

# ❌ Bad: Doesn't set result
result = await execute_code("Print hello world")
# No result variable → result["result"] is None
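To see why the `result` variable matters, here is a minimal sketch of the convention, assuming the runner executes the code in a fresh namespace and then looks up `result`. `run_snippet` is a hypothetical helper, not Kagura's implementation, and it does no sandboxing at all.

```python
def run_snippet(code: str) -> dict:
    """Run code in an empty namespace and report its `result` variable."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # Illustration only: no sandboxing here
    except Exception as exc:
        return {"success": False, "result": None, "code": code, "error": str(exc)}
    # .get() yields None when the code never assigned `result`
    return {"success": True, "result": namespace.get("result"), "code": code, "error": None}


with_result = run_snippet("result = 5 * 5")
without_result = run_snippet("print('hello world')")
```

Under this assumption, code that only prints still succeeds, but its `result` entry is `None`.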

Data Processing

Working with Lists

# Filter data
result = await execute_code("""
Find all even numbers in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
""")
print(result["result"])  # [2, 4, 6, 8, 10]

# Transform data
result = await execute_code("""
Square each number in [1, 2, 3, 4, 5]
""")
print(result["result"])  # [1, 4, 9, 16, 25]

# Aggregate data
result = await execute_code("""
Calculate the average of [10, 20, 30, 40, 50]
""")
print(result["result"])  # 30.0

Working with Dictionaries

# Extract data
result = await execute_code("""
From this data: {'name': 'Alice', 'age': 25, 'city': 'NYC'}
Extract the age
""")
print(result["result"])  # 25

# Transform data
result = await execute_code("""
Convert this data to uppercase keys:
{'name': 'Alice', 'role': 'engineer'}
""")
print(result["result"])  # {'NAME': 'Alice', 'ROLE': 'engineer'}

JSON Processing

import json

# Parse and analyze JSON
json_data = json.dumps({
    "users": [
        {"name": "Alice", "score": 95},
        {"name": "Bob", "score": 87},
        {"name": "Charlie", "score": 92}
    ]
})

result = await execute_code(f"""
Parse this JSON and find the average score:
{json_data}
""")
print(result["result"])  # 91.33...

Advanced Features

Multi-Step Calculations

result = await execute_code("""
1. Create a list of numbers from 1 to 20
2. Filter only prime numbers
3. Calculate their sum
""")

print(result["code"])    # See the generated algorithm
print(result["result"])  # Sum of primes
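Generated code can vary between runs, so it helps to know the expected answer. The three steps above can be written by hand as the following plain-Python sketch, which is a reference for checking the result and not the code the agent will necessarily produce.

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime if no d in 2..sqrt(n) divides it."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


numbers = list(range(1, 21))                  # Step 1: numbers from 1 to 20
primes = [n for n in numbers if is_prime(n)]  # Step 2: keep only primes
total = sum(primes)                           # Step 3: sum them

print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
print(total)   # 77
```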

Custom Algorithms

result = await execute_code("""
Implement the Fibonacci sequence up to the 10th number
""")

print(result["result"])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Data Analysis

result = await execute_code("""
Given these test scores: [78, 92, 85, 88, 95, 72, 90]
Calculate:
- Mean
- Median
- Mode (if exists)
Return as a dictionary
""")

print(result["result"])
# {'mean': 85.71, 'median': 88, 'mode': None}
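The expected dictionary above can be reproduced with the standard `statistics` module, which is on the allowed list. This is a hand-written sketch for checking the output; treating "mode" as `None` when no single value is most frequent is an assumption about how the request is interpreted.

```python
import statistics

scores = [78, 92, 85, 88, 95, 72, 90]

# multimode() returns every most-frequent value; with all-unique data
# that is the whole list, so there is no single mode to report
modes = statistics.multimode(scores)

summary = {
    "mean": round(statistics.mean(scores), 2),
    "median": statistics.median(scores),
    "mode": modes[0] if len(modes) == 1 else None,
}
print(summary)  # {'mean': 85.71, 'median': 88, 'mode': None}
```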

Security

Kagura executes generated code in a sandbox that restricts imports to an allowlist, blocks dangerous built-ins, and enforces a timeout.

Allowed Modules

# ✅ Allowed: Safe standard library modules
result = await execute_code("""
import math
result = math.sqrt(16)
""")
# Success: 4.0

result = await execute_code("""
import json
result = json.dumps({'key': 'value'})
""")
# Success: '{"key": "value"}'

result = await execute_code("""
from datetime import datetime
result = datetime.now().year
""")
# Success: the current year, e.g. 2025

Allowed modules:

  • math, random, statistics
  • json, re, string
  • datetime, collections, itertools
  • functools, operator, copy

Forbidden Operations

# ❌ File system access
result = await execute_code("Read file config.txt")
# Error: Forbidden import: os

# ❌ Network access
result = await execute_code("Fetch data from https://api.example.com")
# Error: Forbidden import: requests

# ❌ System commands
result = await execute_code("Run shell command ls")
# Error: Forbidden import: subprocess

# ❌ Code execution
result = await execute_code("Execute eval('1+1')")
# Error: Forbidden operation: eval

Forbidden modules/operations:

  • File I/O: os, sys, io, pathlib, open()
  • Network: socket, urllib, requests
  • Execution: eval, exec, compile, __import__
  • System: subprocess, multiprocessing
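One common way to enforce lists like these is to inspect the code's syntax tree before running it. The sketch below assumes that approach; `find_violations` and both constants are hypothetical names, and Kagura's actual checks may differ. A static check of this kind is only one layer and is not a complete sandbox on its own.

```python
import ast

# Copied from the lists above; the names are illustrative
ALLOWED_MODULES = {
    "math", "random", "statistics", "json", "re", "string",
    "datetime", "collections", "itertools", "functools", "operator", "copy",
}
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def find_violations(code: str) -> list[str]:
    """Return a message for each disallowed import or call found in code."""
    violations = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or "(relative import)"]
        else:
            names = []
        for name in names:
            root = name.split(".")[0]  # "os.path" -> "os"
            if root not in ALLOWED_MODULES:
                violations.append(f"Forbidden import: {root}")
        # Direct calls such as eval(...) or open(...)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FORBIDDEN_CALLS
        ):
            violations.append(f"Forbidden operation: {node.func.id}")
    return violations


print(find_violations("import math\nresult = math.sqrt(16)"))  # []
print(find_violations("import os"))  # ['Forbidden import: os']
```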

Timeout Protection

from kagura.core.executor import CodeExecutor

# Default timeout: 5 seconds
executor = CodeExecutor()

# Custom timeout
executor = CodeExecutor(timeout=10.0)

result = await executor.execute("""
while True:  # Never finishes; stopped by the 10 second timeout
    pass
result = "done"
""")

print(result.success)  # False
print(result.error)    # "Execution timeout"
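The general pattern behind a timeout like this can be sketched with `asyncio.wait_for` from the standard library. This is an illustration of the idea, not how `CodeExecutor` is implemented; `run_with_timeout` is a hypothetical helper. Note that `wait_for` can only cancel a task at an `await` point, so CPU-bound code that never yields would need a separate process to be stopped.

```python
import asyncio


async def run_with_timeout(coro, timeout: float) -> dict:
    """Await coro, but give up after `timeout` seconds."""
    try:
        value = await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        return {"success": False, "result": None, "error": "Execution timeout"}
    return {"success": True, "result": value, "error": None}


async def slow_task() -> str:
    await asyncio.sleep(1.0)  # Longer than the timeout used below
    return "done"


async def fast_task() -> str:
    return "done"


timed_out = asyncio.run(run_with_timeout(slow_task(), timeout=0.05))
finished = asyncio.run(run_with_timeout(fast_task(), timeout=0.05))

print(timed_out["error"])   # Execution timeout
print(finished["result"])   # done
```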

Custom CodeExecutor

For advanced use cases, use CodeExecutor directly:

from kagura.core.executor import CodeExecutor

# Create executor with custom settings
executor = CodeExecutor(
    timeout=10.0,           # 10 second timeout
    max_output_size=1000    # Limit output size
)

# Execute code
result = await executor.execute("""
result = sum(range(1, 1001))
""")

print(result.success)    # True
print(result.result)     # 500500
print(result.code)       # The code that was executed
print(result.error)      # None

Execution Result Object

from kagura.core.executor import ExecutionResult

result = await executor.execute("result = 42")

# Result attributes
print(result.success)    # bool: True/False
print(result.result)     # Any: The result value
print(result.code)       # str: Executed code
print(result.error)      # Optional[str]: Error message
print(result.stdout)     # str: Standard output
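Note that `execute_code` returns a dictionary while `CodeExecutor.execute` returns an object with attributes. The sketch below mirrors the attributes listed above in a dataclass to show how the two shapes relate. `ResultSketch` is a hypothetical name and the default values are assumptions; the real `ExecutionResult` class may be defined differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ResultSketch:
    """Illustrative mirror of the attributes listed above."""
    success: bool
    result: Any = None
    code: str = ""
    error: Optional[str] = None
    stdout: str = ""


ok = ResultSketch(success=True, result=42, code="result = 42")

# Attribute access, as with CodeExecutor.execute()
print(ok.result)  # 42

# Dictionary access, as with execute_code()
as_dict = asdict(ok)
print(as_dict["result"])  # 42
```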

Building Code Workflows

Plan-Code-Execute Pattern

from kagura import agent
from kagura.agents import execute_code

@agent
async def plan_solution(problem: str) -> str:
    """
    Analyze this problem and describe the algorithm:
    {{ problem }}

    Provide step-by-step approach.
    """
    pass

async def solve_with_code(problem: str):
    # Step 1: Plan
    plan = await plan_solution(problem)
    print(f"Plan: {plan}")

    # Step 2: Execute
    result = await execute_code(problem)

    # Step 3: Verify
    if result["success"]:
        print(f"Result: {result['result']}")
        print(f"Code:\n{result['code']}")
    else:
        print(f"Error: {result['error']}")

# Use it
await solve_with_code("Find all prime numbers between 1 and 50")

Iterative Refinement

async def solve_with_retry(problem: str, max_retries: int = 3):
    for attempt in range(max_retries):
        result = await execute_code(problem)

        if result["success"]:
            return result["result"]

        # If failed, try with more specific instructions
        problem = f"{problem}\n\nPrevious error: {result['error']}\nPlease fix and try again."

    raise RuntimeError(f"Failed after {max_retries} attempts")
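One refinement worth considering: the loop above appends each error to an ever-growing prompt. The variant sketched here rebuilds the prompt from the original problem on every attempt, so only the latest error is included. To keep it runnable without Kagura, the executor is passed in as a parameter, and `flaky_execute` is a hypothetical stand-in that fails once and then succeeds.

```python
import asyncio


async def solve_with_retry(execute, problem: str, max_retries: int = 3):
    """Retry with error feedback, keeping the original problem intact."""
    prompt = problem
    for _ in range(max_retries):
        result = await execute(prompt)
        if result["success"]:
            return result["result"]
        # Rebuild from the original problem so old errors do not pile up
        prompt = (
            f"{problem}\n\nPrevious error: {result['error']}\n"
            "Please fix and try again."
        )
    raise RuntimeError(f"Failed after {max_retries} attempts")


calls = []


async def flaky_execute(prompt: str) -> dict:
    """Hypothetical stand-in for execute_code: fails on the first call."""
    calls.append(prompt)
    if len(calls) == 1:
        return {"success": False, "result": None, "code": "", "error": "NameError"}
    return {"success": True, "result": 42, "code": "result = 42", "error": None}


answer = asyncio.run(solve_with_retry(flaky_execute, "Compute the answer"))
print(answer)  # 42
```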

Code Review Agent

@agent
async def review_code(code: str) -> str:
    """
    Review this Python code for:
    - Correctness
    - Efficiency
    - Best practices

    Code:
    ```python
    {{ code }}
    ```
    """
    pass

async def code_and_review(problem: str):
    # Generate code
    result = await execute_code(problem)

    if result["success"]:
        # Review the code
        review = await review_code(result["code"])
        print(f"Review: {review}")

        return result["result"]

Best Practices

1. Clear Specifications

# ✅ Good: Clear requirements
result = await execute_code("""
Calculate the factorial of 10.
Store the result in a variable named 'result'.
""")

# ❌ Bad: Vague request
result = await execute_code("Do factorial stuff")

2. Handle Errors

# ✅ Good: Error handling
result = await execute_code(problem)

if result["success"]:
    process_result(result["result"])
else:
    logger.error(f"Code execution failed: {result['error']}")
    fallback_solution()

# ❌ Bad: No error handling
result = await execute_code(problem)
process_result(result["result"])  # May fail!

3. Validate Results

# ✅ Good: Validate output
result = await execute_code("Calculate sum of [1,2,3]")

if result["success"]:
    value = result["result"]
    if isinstance(value, (int, float)) and value > 0:
        use_result(value)
    else:
        raise ValueError(f"Unexpected result: {value}")

4. Provide Context

# ✅ Good: Context and examples
result = await execute_code(f"""
Given this data: {json.dumps(data)}
Extract all items where status is 'active'
Return as a list

Example output: [item1, item2, ...]
""")

5. Use Appropriate Timeout

# ✅ Good: Adjust timeout based on task
executor = CodeExecutor(timeout=1.0)   # Quick tasks
result = await executor.execute("result = 2 + 2")

executor = CodeExecutor(timeout=30.0)  # Complex tasks
# CodeExecutor.execute() takes Python code, not a natural-language prompt
result = await executor.execute("result = sum(i * i for i in range(10_000_000))")

Common Patterns

Data Transformation Pipeline

async def transform_data(data: list, operations: list[str]):
    """Apply multiple transformations to data"""
    current_data = data

    for operation in operations:
        result = await execute_code(f"""
Apply this operation to the data: {operation}
Data: {current_data}
""")
        if result["success"]:
            current_data = result["result"]
        else:
            raise RuntimeError(f"Failed: {result['error']}")

    return current_data

# Use it
data = [1, 2, 3, 4, 5]
operations = [
    "Multiply each by 2",
    "Filter numbers > 5",
    "Sum all numbers"
]
result = await transform_data(data, operations)
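For reference, the three operations in this example correspond to the following plain Python, which gives the value the pipeline should end with. This is a hand-written check, not the code the agent will necessarily generate.

```python
data = [1, 2, 3, 4, 5]

doubled = [x * 2 for x in data]           # "Multiply each by 2"
filtered = [x for x in doubled if x > 5]  # "Filter numbers > 5"
total = sum(filtered)                     # "Sum all numbers"

print(doubled)   # [2, 4, 6, 8, 10]
print(filtered)  # [6, 8, 10]
print(total)     # 24
```

Because each step feeds the previous result back into a new prompt, a wrong intermediate value propagates; comparing against a reference like this is one way to catch that.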

Calculator Agent

# Plain async function (no @agent decorator): it calls execute_code itself
async def calculate(expression: str) -> float:
    """A calculator that evaluates expressions via execute_code"""
    result = await execute_code(f"Calculate: {expression}")

    if result["success"]:
        return result["result"]
    else:
        raise ValueError(f"Calculation failed: {result['error']}")

# Use it
answer = await calculate("(5 + 3) * 2 - 10")
print(answer)  # 6.0

Data Analysis Agent

async def analyze_dataset(data: list[dict], query: str):
    """Analyze structured data with natural language"""
    data_str = json.dumps(data)

    result = await execute_code(f"""
Dataset: {data_str}
Query: {query}

Analyze the dataset and answer the query.
""")

    return result["result"] if result["success"] else None

# Use it
sales_data = [
    {"product": "A", "revenue": 1000},
    {"product": "B", "revenue": 1500},
    {"product": "C", "revenue": 800}
]

result = await analyze_dataset(
    sales_data,
    "What is the total revenue?"
)
print(result)  # 3300

Practice Exercises

Exercise 1: Prime Number Finder

# Worked example: a function that finds prime numbers
async def find_primes(n: int):
    """Find all prime numbers up to n"""
    result = await execute_code(f"""
    Find all prime numbers up to {n}
    Return as a list
    """)
    return result["result"] if result["success"] else []

Exercise 2: Data Aggregator

# TODO: Create a function that aggregates data
async def aggregate_sales(sales: list[dict]) -> dict:
    """Calculate total, average, min, max from sales data"""
    # Use execute_code to analyze the sales list
    pass

Exercise 3: Text Analyzer

# TODO: Create a function that analyzes text
async def analyze_text(text: str) -> dict:
    """
    Analyze text and return:
    - word_count
    - unique_words
    - most_common_word
    - average_word_length
    """
    pass

Troubleshooting

Result is None

# Problem: result["result"] is None

# Cause: Code doesn't set 'result' variable
result = await execute_code("print(42)")  # Only prints

# Solution: Ask for explicit result
result = await execute_code("Calculate 42 and store in result variable")

Timeout Errors

# Problem: Execution timeout

# Cause: Complex operation or infinite loop
result = await execute_code("Calculate factorial of 100000")

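# Solution: Increase timeout or simplify
# Note: CodeExecutor.execute() takes Python code, not a natural-language prompt
executor = CodeExecutor(timeout=30.0)
result = await executor.execute("""
import math
result = math.factorial(100)
""")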

Security Errors

# Problem: Forbidden import error

# Cause: Trying to use restricted module
result = await execute_code("Read file data.txt")

# Solution: Use allowed modules or provide data
result = await execute_code(f"Process this data: {data}")

Next Steps

Additional Resources