Tools

Tools let agents access external data and perform specialized tasks.

Most users don't need custom tools. Built-in tools handle common needs.

See also: Agents — Agents invoke tools when needed. For more complex logic, you may want to create a custom agent instead.

Built-in tools

Lumen includes tools automatically:

  • TableLookup - Finds relevant tables in your data (see Vector Stores)
  • DocumentLookup - Searches uploaded documents (see Vector Stores)
  • DbtslLookup - Queries dbt Semantic Layer metrics

You don't need to configure these. Agents use them when needed.
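
If you prefer to include a built-in tool explicitly, you can pass an instance in tools just like a custom tool. A minimal sketch using DocumentLookup (imported from lumen.ai.tools, as in the later examples):

Built-in tool passed explicitly
import lumen.ai as lmai
from lumen.ai.tools import DocumentLookup

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[DocumentLookup()]  # agents invoke it when a query needs document context
)
ui.servable()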

Create a simple tool

If you require a custom tool, e.g. to provide additional context, render some output, or perform some action, simply provide a function with type annotations and a docstring:

Simple function tool
import lumen.ai as lmai

def calculate_average(numbers: list[float]) -> float:
    """
    Calculate the average of a list of numbers.

    Parameters
    ----------
    numbers : list[float]
        Numbers to average

    Returns
    -------
    float
        The average
    """
    return sum(numbers) / len(numbers)

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[calculate_average]  # (1)!
)
ui.servable()
  1. Function automatically becomes a tool - the LLM uses your docstring and type hints

Lumen can now call this function by filling in the arguments. The return value is surfaced to the model and is only added to the shared context if provides is set.
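
For example, a sketch of exposing the result to context (assuming define_tool, described in the next section, accepts provides; with a single key the plain return value is wrapped):

Expose the result via provides
import lumen.ai as lmai
from lumen.ai.tools import define_tool

@define_tool(provides=["average"])  # assumption: single provides key wraps the float return value
def calculate_average(numbers: list[float]) -> float:
    """Calculate the average of a list of numbers."""
    return sum(numbers) / len(numbers)

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[calculate_average]
)
ui.servable()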

Define a tool with metadata

Some tools need access to the current context, e.g. the active data pipeline. To declare that a particular argument should be looked up in the context, annotate the function with the define_tool decorator; this ensures the underlying FunctionTool can populate requires, provides, and purpose.

As an example, we can define a function that accepts the pipeline and counts the number of rows in the table:

Tool annotations
import lumen.ai as lmai
from lumen.ai.tools import define_tool

@define_tool(
    requires=["pipeline"],
    purpose="Count rows in the active table"
)
def count_rows(pipeline) -> int:
    """Count total rows in the current table."""
    return len(pipeline.data)

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[count_rows]
)
ui.servable()

Render tool output

If your tool returns a value you want to render directly, set render_output=True:

Render tool output
import lumen.ai as lmai
from panel_material_ui import Card
from lumen.ai.tools import define_tool

@define_tool(render_output=True, purpose="Show a greeting card")
def greeting() -> Card:
    return Card(
        "Hello from Lumen tools!",
        title="Greeting",
        collapsed=True
    )

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[greeting]
)
ui.servable()

Explicit FunctionTool definition

You may also explicitly define a FunctionTool instance:

Tool with context access
import lumen.ai as lmai
from lumen.ai.tools import FunctionTool

def filter_penguins(table) -> dict:
    """
    Filter penguins by bill length.

    Parameters
    ----------
    table : pd.DataFrame
        The penguin data

    Returns
    -------
    dict
        Filtered data and summary
    """
    filtered = table[table['bill_length_mm'] > 40]
    return {
        "filtered_table": filtered,
        "summary": f"Found {len(filtered)} penguins with bill length > 40mm"
    }

tool = FunctionTool(
    function=filter_penguins,
    requires=["table"],              # (1)!
    provides=["filtered_table", "summary"],  # (2)!
    purpose="Filter penguins by bill length"
)

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[tool]
)
ui.servable()
  1. Tool reads table from context
  2. Tool adds filtered_table and summary to context (function must return a dict with those keys)

Tool that calls an API

Wrap external services:

API tool
import lumen.ai as lmai

def fetch_weather(location: str) -> str:
    """
    Get current weather for a location.

    Parameters
    ----------
    location : str
        City name

    Returns
    -------
    str
        Weather description
    """
    import requests
    response = requests.get(f"https://api.weather.gov/...")
    return f"Weather: {response.json()['temp']}°F"

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[fetch_weather]
)
ui.servable()

Complete example: Data validation

Data quality tool
import lumen.ai as lmai
import pandas as pd
from lumen.ai.tools import FunctionTool

def validate_quality(table: pd.DataFrame) -> dict:
    """
    Check data quality and report issues.

    Parameters
    ----------
    table : pd.DataFrame
        Data to validate

    Returns
    -------
    dict
        Validation report
    """
    missing = table.isnull().sum().sum()
    duplicates = table.duplicated().sum()

    issues = []
    if missing > 0:
        issues.append(f"{missing} missing values")
    if duplicates > 0:
        issues.append(f"{duplicates} duplicate rows")

    return {
        "total_rows": len(table),
        "issues": issues,
        "status": "✓ Clean" if not issues else "⚠️ Issues found"
    }

tool = FunctionTool(
    function=validate_quality,
    requires=["table"],
    provides=["data_quality_report"],
    purpose="Validate data quality and report issues"
)

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[tool]
)
ui.servable()

Tool components

requires - Context keys the tool needs:

requires=["table", "sql"]  # Tool receives these from context

provides - Context keys the tool creates (for a single key, a non-dict return value is wrapped):

provides=["summary", "report"]  # Tool adds these to context

purpose - Description for the LLM:

purpose="Validates data quality and finds issues"

Multiple tools

Combine tools for complex workflows:

Mix custom and built-in tools
import lumen.ai as lmai
from lumen.ai.tools import DocumentLookup

def get_stats(table) -> dict:
    """Calculate summary statistics."""
    return {
        "min": table['bill_length_mm'].min(),
        "max": table['bill_length_mm'].max(),
        "mean": table['bill_length_mm'].mean(),
    }

def filter_species(table, species: str) -> dict:
    """Filter by species name."""
    filtered = table[table['species'] == species]
    return {
        "filtered": filtered,
        "count": len(filtered)
    }

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[get_stats, filter_species, DocumentLookup()]
)
ui.servable()
Multiple custom tools
import lumen.ai as lmai

def tool_a(data: list) -> dict:
    """Process data."""
    processed = [x * 2 for x in data]  # placeholder processing step
    return {"result_a": processed}

def tool_b(data: list) -> dict:
    """Analyze data."""
    analyzed = sum(data) / len(data)  # placeholder analysis step
    return {"result_b": analyzed}

ui = lmai.ExplorerUI(
    data='penguins.csv',
    tools=[tool_a, tool_b]
)
ui.servable()

Best practices

Write clear docstrings

Good docstring format
def my_tool(data: list) -> str:
    """
    One-line summary of what it does.

    Detailed explanation if needed.

    Parameters
    ----------
    data : list
        What the data represents

    Returns
    -------
    str
        What gets returned
    """

Use type hints

Type hints help the LLM
def process(numbers: list[float], threshold: int) -> dict:
    """Type hints help the LLM call correctly."""

Name parameters clearly

# Good
def calculate_average(numbers: list[float]) -> float:

# Bad
def calculate(x: list[float]) -> float:

Keep tools focused

# Good - one task
def validate_email(email: str) -> bool:

# Bad - too many tasks
def validate_and_process_user_data(data: dict):

Return structured data when using provides

Match provides with return keys

When using provides, your function must return a dict with those keys:

def my_function(data):
    processed_data = sorted(data)  # placeholder processing step
    return {
        "result": processed_data,
        "metadata": {"count": 10}
    }  # ✅ Has both "result" and "metadata"

tool = FunctionTool(
    function=my_function,
    provides=["result", "metadata"]  # These keys must be in return dict
)

Handle errors gracefully

Structured error handling
def process(data: list) -> dict:
    if not data:
        return {"error": "No data provided"}

    try:
        result = sum(data) / len(data)
        return {"average": result}
    except Exception as e:
        return {"error": str(e)}
Simple error handling
def process(data: list) -> str:
    if not data:
        return "Error: No data provided"

    try:
        result = sum(data) / len(data)
        return f"Average: {result:.2f}"
    except Exception as e:
        return f"Error: {e}"

When to use tools vs agents

Use tools when:

  • Simple function call
  • No async/await needed
  • Wrapping external API
  • Straightforward logic

Use agents when:

  • Complex prompting needed
  • Multiple LLM calls required
  • Multi-step reasoning needed
  • Sophisticated error handling

Troubleshooting

Tool never gets called

The coordinator doesn't think the tool is relevant. Make the purpose clear and specific:

# Bad
purpose = "Does stuff with data"

# Good
purpose = "Validates email addresses and returns True if valid"

Missing required argument

Tool expects a context key that doesn't exist. Ensure requires lists correct keys:

tool = FunctionTool(
    function=my_function,
    requires=["table"],  # Must exist in context
)

Tool fails silently

Add error handling and return error messages instead of raising exceptions:

def my_tool(data):
    try:
        # Your logic
        return result
    except Exception as e:
        return {"error": str(e)}

KeyError when using provides

Common mistake with provides

Your function must return a dict with all keys listed in provides:

# Wrong - returns a string but provides expects dict keys
provides=["summary", "count"]

def bad_tool(data):
    return "Summary text"  # ❌

# Correct - returns dict with expected keys
provides=["summary", "count"]

def good_tool(data):
    return {
        "summary": "Summary text",
        "count": len(data)
    }  # ✅

See also

  • Vector Stores - Configure document search and table discovery tools
  • Embeddings - Configure semantic search for tools
  • Agents - When to use agents instead of tools