> ## Documentation Index
> Fetch the complete documentation index at: https://docs.plexe.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# API Reference

> Complete reference documentation for the Plexe Python library API.

This page provides comprehensive reference documentation for the core classes and functions in the Plexe Python library.

## Core Classes

### `Model`

The primary class in the Plexe library, representing a machine learning model.

```python theme={null}
class Model:
    def __init__(
        self,
        intent: str,
        input_schema: Type[BaseModel] | Dict[str, type] = None,
        output_schema: Type[BaseModel] | Dict[str, type] = None,
        constraints: List[Constraint] = None,
        distributed: bool = False
    )
```

**Parameters:**

| Parameter       | Type               | Description                                                                 |                                                                                                |
| --------------- | ------------------ | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| `intent`        | `str`              | Natural language description of what the model should do.                   |                                                                                                |
| `input_schema`  | \`Type\[BaseModel] | Dict\[str, type]\`                                                          | Schema defining input data structure. Can be a dictionary or Pydantic model. Default: `None`.  |
| `output_schema` | \`Type\[BaseModel] | Dict\[str, type]\`                                                          | Schema defining output data structure. Can be a dictionary or Pydantic model. Default: `None`. |
| `constraints`   | `List[Constraint]` | List of constraints the model should adhere to. Default: `None`.            |                                                                                                |
| `distributed`   | `bool`             | Whether to use distributed execution (Ray) when building. Default: `False`. |                                                                                                |

**Methods:**

#### `build`

```python theme={null}
def build(
    self,
    datasets: List[Union[pd.DataFrame, DatasetGenerator]],
    provider: Union[str, ProviderConfig] = "openai/gpt-4o-mini",
    timeout: Optional[int] = None,
    max_iterations: Optional[int] = None,
    run_timeout: int = 1800,
    callbacks: Optional[List[Callback]] = None,
    verbose: bool = False,
    chain_of_thought: bool = True
) -> None
```

Builds the model using the provided datasets and configuration.

**Parameters:**

| Parameter          | Type                                          | Description                                                                                                             |
| ------------------ | --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `datasets`         | `List[Union[pd.DataFrame, DatasetGenerator]]` | List of pandas DataFrames or DatasetGenerator objects for training data.                                                |
| `provider`         | `Union[str, ProviderConfig]`                  | LLM provider to use, either as a string ("openai/gpt-4o-mini") or ProviderConfig object. Default: "openai/gpt-4o-mini". |
| `timeout`          | `Optional[int]`                               | Maximum total time (in seconds) for the entire build process (all iterations).                                          |
| `max_iterations`   | `Optional[int]`                               | Maximum number of iterations to attempt.                                                                                |
| `run_timeout`      | `int`                                         | Maximum time (in seconds) for each individual training run. Default: 1800 (30 minutes).                                 |
| `callbacks`        | `Optional[List[Callback]]`                    | List of callback objects for monitoring the build process.                                                              |
| `verbose`          | `bool`                                        | Whether to display detailed agent logs. Default: `False`.                                                               |
| `chain_of_thought` | `bool`                                        | Whether to enable verbose output of agent reasoning. Default: `True`.                                                   |

**Returns:** `None`

**Note:** At least one of `timeout` or `max_iterations` must be provided.

#### `predict`

```python theme={null}
def predict(
    self,
    x: Dict[str, Any],
    validate_input: bool = False,
    validate_output: bool = False
) -> Dict[str, Any]
```

Makes a prediction using the trained model.

**Parameters:**

| Parameter         | Type             | Description                                                  |
| ----------------- | ---------------- | ------------------------------------------------------------ |
| `x`               | `Dict[str, Any]` | Input data for prediction.                                   |
| `validate_input`  | `bool`           | Whether to validate input against schema. Default: `False`.  |
| `validate_output` | `bool`           | Whether to validate output against schema. Default: `False`. |

**Returns:** `Dict[str, Any]` - Prediction result

#### `get_state`

```python theme={null}
def get_state(self) -> str
```

Returns the current state of the model.

**Returns:** `str` representing model state: `"draft"`, `"building"`, `"ready"`, or `"error"`

#### `get_metadata`

```python theme={null}
def get_metadata(self) -> Dict[str, Any]
```

Returns metadata about the model.

**Returns:** Dictionary containing metadata

#### `get_metrics`

```python theme={null}
def get_metrics(self) -> Optional[Dict[str, Any]]
```

Returns metrics for the trained model if available.

**Returns:** Dictionary containing metrics or `None`

#### `describe`

```python theme={null}
def describe(self) -> ModelDescription
```

Returns a detailed description of the model.

**Returns:** `ModelDescription` object

### `DatasetGenerator`

Class for generating synthetic data or augmenting existing data.

```python theme={null}
class DatasetGenerator:
    def __init__(
        self,
        description: str,
        provider: str,
        schema: Type[BaseModel] | Dict[str, type] = None,
        data: pd.DataFrame = None
    ) -> None
    
    def generate(self, num_samples: int):
        """Generates synthetic data if a provider is available."""
```

**Constructor Parameters:**

| Parameter     | Type               | Description                                                                            |                                                            |
| ------------- | ------------------ | -------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
| `description` | `str`              | Human-readable description of the dataset.                                             |                                                            |
| `provider`    | `str`              | LLM provider used for synthetic data generation.                                       |                                                            |
| `schema`      | \`Type\[BaseModel] | Dict\[str, type]\`                                                                     | The schema the data should match, if any. Default: `None`. |
| `data`        | `pd.DataFrame`     | A dataset of real data on which to base the generation, if available. Default: `None`. |                                                            |

**Methods:**

| Method                       | Description                                               |
| ---------------------------- | --------------------------------------------------------- |
| `generate(num_samples: int)` | Generates the specified number of synthetic data samples. |
| `data` property              | Returns the dataset as a pandas DataFrame.                |

### `Callback`

Base class for callbacks that monitor the build process.

```python theme={null}
class Callback:
    def on_build_start(self, info: BuildStateInfo) -> None:
        pass

    def on_iteration_start(self, info: BuildStateInfo) -> None:
        pass

    def on_iteration_end(self, info: BuildStateInfo) -> None:
        pass

    def on_build_end(self, info: BuildStateInfo) -> None:
        pass
```

See [Callbacks Reference](/library/reference/callbacks) for more details.

### `Constraint`

Represents rules or conditions that the model should satisfy.

```python theme={null}
class Constraint:
    def __init__(self, condition: Callable[[Any, Any], bool], description: str)
```

**Parameters:**

| Parameter     | Type                         | Description                                   |
| ------------- | ---------------------------- | --------------------------------------------- |
| `condition`   | `Callable[[Any, Any], bool]` | Function that evaluates the constraint.       |
| `description` | `str`                        | Human-readable description of the constraint. |

## Core Functions

### `save_model`

```python theme={null}
def save_model(model: Model, path: str | Path) -> str
```

Saves a trained model to a tar archive.

**Parameters:**

| Parameter | Type    | Description        |                                                              |
| --------- | ------- | ------------------ | ------------------------------------------------------------ |
| `model`   | `Model` | The model to save. |                                                              |
| `path`    | \`str   | Path\`             | Path where the model should be saved. Must end with .tar.gz. |

**Returns:** `str` - Path where the model was saved

### `load_model`

```python theme={null}
def load_model(path: str | Path) -> Model
```

Instantiate a model from a tar archive.

**Parameters:**

| Parameter | Type  | Description |                                                 |
| --------- | ----- | ----------- | ----------------------------------------------- |
| `path`    | \`str | Path\`      | Path to the saved model archive (.tar.gz file). |

**Returns:** `Model` - The loaded model

### `configure_logging`

```python theme={null}
def configure_logging(
    level: Union[str, int] = logging.INFO,
    file: Optional[str] = None
) -> None
```

Configures logging for the Plexe library.

**Parameters:**

| Parameter | Type              | Description                                                                               |
| --------- | ----------------- | ----------------------------------------------------------------------------------------- |
| `level`   | `Union[str, int]` | Logging level (from `logging` module or as string).                                       |
| `file`    | `Optional[str]`   | Path to a log file. If provided, logs will be written to this file in addition to stdout. |

**Returns:** `None`

## Enums and Constants

### `ModelState`

Enum representing the possible states of a model. The `get_state()` method returns the string value.

```python theme={null}
class ModelState(Enum):
    DRAFT = "draft"        # Initial state, before building
    BUILDING = "building"  # During the build process
    READY = "ready"        # Build complete, model ready for use
    ERROR = "error"        # Build failed
```

Note: When checking model state, compare against the string values:

```python theme={null}
if model.get_state() == "ready":
    # Ready to make predictions
```

## Provider Configuration

### `ProviderConfig`

Class for configuring different LLM providers for different agent roles (imported from `plexe.internal.common.provider`).

```python theme={null}
from plexe.internal.common.provider import ProviderConfig

class ProviderConfig:
    def __init__(
        self,
        default_provider: str,
        orchestrator_provider: Optional[str] = None,
        research_provider: Optional[str] = None,
        engineer_provider: Optional[str] = None,
        ops_provider: Optional[str] = None,
        tool_provider: Optional[str] = None
    )
```

**Parameters:**

| Parameter               | Type            | Description                                                              |
| ----------------------- | --------------- | ------------------------------------------------------------------------ |
| `default_provider`      | `str`           | Default provider for all roles (e.g., "openai/gpt-4o-mini").             |
| `orchestrator_provider` | `Optional[str]` | Provider for the orchestrator agent (coordinates the overall process).   |
| `research_provider`     | `Optional[str]` | Provider for the research agent (analyzes problems and plans solutions). |
| `engineer_provider`     | `Optional[str]` | Provider for the engineer agent (generates training code).               |
| `ops_provider`          | `Optional[str]` | Provider for the ops agent (generates inference code).                   |
| `tool_provider`         | `Optional[str]` | Provider for tool agents (performs specialized tasks).                   |

## Performance Metrics

### `Metric`

Class representing a performance metric for a model.

```python theme={null}
class Metric:
    def __init__(
        self, 
        name: str, 
        value: float = None, 
        comparator: MetricComparator = None,
        is_worst: bool = False
    )
```

**Parameters:**

| Parameter    | Type               | Description                                                                       |
| ------------ | ------------------ | --------------------------------------------------------------------------------- |
| `name`       | `str`              | Name of the metric (e.g., "accuracy", "rmse").                                    |
| `value`      | `float`            | Numeric value of the metric. Default: `None`.                                     |
| `comparator` | `MetricComparator` | Comparison logic for determining which metric values are better. Default: `None`. |
| `is_worst`   | `bool`             | Whether this is the worst possible value for the metric. Default: `False`.        |

### `MetricComparator`

Encapsulates comparison logic for metrics.

```python theme={null}
class MetricComparator:
    def __init__(
        self, 
        comparison_method: ComparisonMethod, 
        target: float = None, 
        epsilon: float = 1e-9
    )
```

**Parameters:**

| Parameter           | Type               | Description                                                                       |
| ------------------- | ------------------ | --------------------------------------------------------------------------------- |
| `comparison_method` | `ComparisonMethod` | The method used to compare metrics (HIGHER\_IS\_BETTER, LOWER\_IS\_BETTER, etc.). |
| `target`            | `float`            | The target value for TARGET\_IS\_BETTER comparisons. Default: `None`.             |
| `epsilon`           | `float`            | Small value used for floating point comparisons. Default: `1e-9`.                 |

## Type Hints

The library uses the following type hints:

```python theme={null}
SchemaType = Union[Dict[str, Any], Type[BaseModel]]
DatasetType = Union[pd.DataFrame, DatasetGenerator]
ProviderType = Union[str, ProviderConfig]
```

## Usage Example

```python theme={null}
import plexe
import pandas as pd

# Load data
df = pd.read_csv("housing.csv")

# Create model
model = plexe.Model(
    intent="Predict house prices based on features",
    input_schema={"square_footage": float, "bedrooms": int, "bathrooms": float},
    output_schema={"price": float}
)

# Build model
model.build(
    datasets=[df],
    provider="openai/gpt-4o-mini",
    max_iterations=3,
    timeout=600,
    run_timeout=180,
    chain_of_thought=True,
    verbose=False
)

# Make prediction
prediction = model.predict({"square_footage": 2000, "bedrooms": 3, "bathrooms": 2})
print(f"Predicted price: {prediction}")

# Save model
save_path = plexe.save_model(model, "housing_model.tar.gz")

# Load model
loaded_model = plexe.load_model(save_path)
```

For more details on specific components, see the other reference sections:

* [Callbacks Reference](/library/reference/callbacks)
* [Datasets Reference](/library/reference/datasets)
* [Exceptions Reference](/library/reference/exceptions)
