API Reference

This page provides comprehensive reference documentation for the core classes and functions in the Plexe Python library.

Core Classes

`Model`

The primary class in the Plexe library, representing a machine learning model.

class Model:
    def __init__(
        self,
        intent: str,
        input_schema: Type[BaseModel] | Dict[str, type] = None,
        output_schema: Type[BaseModel] | Dict[str, type] = None,
        constraints: List[Constraint] = None,
        distributed: bool = False
    )

Parameters:

Parameter	Type	Description
`intent`	`str`	Natural language description of what the model should do.
`input_schema`	`Type[BaseModel]	Dict[str, type]`	Schema defining input data structure. Can be a dictionary or Pydantic model. Default: `None`.
`output_schema`	`Type[BaseModel]	Dict[str, type]`	Schema defining output data structure. Can be a dictionary or Pydantic model. Default: `None`.
`constraints`	`List[Constraint]`	List of constraints the model should adhere to. Default: `None`.
`distributed`	`bool`	Whether to use distributed execution (Ray) when building. Default: `False`.

Methods:

`build`

def build(
    self,
    datasets: List[Union[pd.DataFrame, DatasetGenerator]],
    provider: Union[str, ProviderConfig] = "openai/gpt-4o-mini",
    timeout: Optional[int] = None,
    max_iterations: Optional[int] = None,
    run_timeout: int = 1800,
    callbacks: Optional[List[Callback]] = None,
    verbose: bool = False,
    chain_of_thought: bool = True
) -> None

Builds the model using the provided datasets and configuration. Parameters:

Parameter	Type	Description
`datasets`	`List[Union[pd.DataFrame, DatasetGenerator]]`	List of pandas DataFrames or DatasetGenerator objects for training data.
`provider`	`Union[str, ProviderConfig]`	LLM provider to use, either as a string (“openai/gpt-4o-mini”) or ProviderConfig object. Default: “openai/gpt-4o-mini”.
`timeout`	`Optional[int]`	Maximum total time (in seconds) for the entire build process (all iterations).
`max_iterations`	`Optional[int]`	Maximum number of iterations to attempt.
`run_timeout`	`int`	Maximum time (in seconds) for each individual training run. Default: 1800 (30 minutes).
`callbacks`	`Optional[List[Callback]]`	List of callback objects for monitoring the build process.
`verbose`	`bool`	Whether to display detailed agent logs. Default: `False`.
`chain_of_thought`	`bool`	Whether to enable verbose output of agent reasoning. Default: `True`.

Returns: None Note: At least one of timeout or max_iterations must be provided.

`predict`

def predict(
    self,
    x: Dict[str, Any],
    validate_input: bool = False,
    validate_output: bool = False
) -> Dict[str, Any]

Makes a prediction using the trained model. Parameters:

Parameter	Type	Description
`x`	`Dict[str, Any]`	Input data for prediction.
`validate_input`	`bool`	Whether to validate input against schema. Default: `False`.
`validate_output`	`bool`	Whether to validate output against schema. Default: `False`.

Returns: Dict[str, Any] - Prediction result

`get_state`

def get_state(self) -> str

Returns the current state of the model. Returns: str representing model state: "draft", "building", "ready", or "error"

`get_metadata`

def get_metadata(self) -> Dict[str, Any]

Returns metadata about the model. Returns: Dictionary containing metadata

`get_metrics`

def get_metrics(self) -> Optional[Dict[str, Any]]

Returns metrics for the trained model if available. Returns: Dictionary containing metrics or None

`describe`

def describe(self) -> ModelDescription

Returns a detailed description of the model. Returns: ModelDescription object

`DatasetGenerator`

Class for generating synthetic data or augmenting existing data.

class DatasetGenerator:
    def __init__(
        self,
        description: str,
        provider: str,
        schema: Type[BaseModel] | Dict[str, type] = None,
        data: pd.DataFrame = None
    ) -> None
    
    def generate(self, num_samples: int):
        """Generates synthetic data if a provider is available."""

Constructor Parameters:

Parameter	Type	Description
`description`	`str`	Human-readable description of the dataset.
`provider`	`str`	LLM provider used for synthetic data generation.
`schema`	`Type[BaseModel]	Dict[str, type]`	The schema the data should match, if any. Default: `None`.
`data`	`pd.DataFrame`	A dataset of real data on which to base the generation, if available. Default: `None`.

Methods:

Method	Description
`generate(num_samples: int)`	Generates the specified number of synthetic data samples.
`data` property	Returns the dataset as a pandas DataFrame.

`Callback`

Base class for callbacks that monitor the build process.

class Callback:
    def on_build_start(self, info: BuildStateInfo) -> None:
        pass

    def on_iteration_start(self, info: BuildStateInfo) -> None:
        pass

    def on_iteration_end(self, info: BuildStateInfo) -> None:
        pass

    def on_build_end(self, info: BuildStateInfo) -> None:
        pass

See Callbacks Reference for more details.

`Constraint`

Represents rules or conditions that the model should satisfy.

class Constraint:
    def __init__(self, condition: Callable[[Any, Any], bool], description: str)

Parameters:

Parameter	Type	Description
`condition`	`Callable[[Any, Any], bool]`	Function that evaluates the constraint.
`description`	`str`	Human-readable description of the constraint.

Core Functions

`save_model`

def save_model(model: Model, path: str | Path) -> str

Saves a trained model to a tar archive. Parameters:

Parameter	Type	Description
`model`	`Model`	The model to save.
`path`	`str	Path`	Path where the model should be saved. Must end with .tar.gz.

Returns: str - Path where the model was saved

`load_model`

def load_model(path: str | Path) -> Model

Instantiate a model from a tar archive. Parameters:

Parameter	Type	Description
`path`	`str	Path`	Path to the saved model archive (.tar.gz file).

Returns: Model - The loaded model

`configure_logging`

def configure_logging(
    level: Union[str, int] = logging.INFO,
    file: Optional[str] = None
) -> None

Configures logging for the Plexe library. Parameters:

Parameter	Type	Description
`level`	`Union[str, int]`	Logging level (from `logging` module or as string).
`file`	`Optional[str]`	Path to a log file. If provided, logs will be written to this file in addition to stdout.

Returns: None

Enums and Constants

`ModelState`

Enum representing the possible states of a model. The get_state() method returns the string value.

class ModelState(Enum):
    DRAFT = "draft"        # Initial state, before building
    BUILDING = "building"  # During the build process
    READY = "ready"        # Build complete, model ready for use
    ERROR = "error"        # Build failed

Note: When checking model state, compare against the string values:

if model.get_state() == "ready":
    # Ready to make predictions

Provider Configuration

`ProviderConfig`

Class for configuring different LLM providers for different agent roles (imported from plexe.internal.common.provider).

from plexe.internal.common.provider import ProviderConfig

class ProviderConfig:
    def __init__(
        self,
        default_provider: str,
        orchestrator_provider: Optional[str] = None,
        research_provider: Optional[str] = None,
        engineer_provider: Optional[str] = None,
        ops_provider: Optional[str] = None,
        tool_provider: Optional[str] = None
    )

Parameters:

Parameter	Type	Description
`default_provider`	`str`	Default provider for all roles (e.g., “openai/gpt-4o-mini”).
`orchestrator_provider`	`Optional[str]`	Provider for the orchestrator agent (coordinates the overall process).
`research_provider`	`Optional[str]`	Provider for the research agent (analyzes problems and plans solutions).
`engineer_provider`	`Optional[str]`	Provider for the engineer agent (generates training code).
`ops_provider`	`Optional[str]`	Provider for the ops agent (generates inference code).
`tool_provider`	`Optional[str]`	Provider for tool agents (performs specialized tasks).

Performance Metrics

`Metric`

Class representing a performance metric for a model.

class Metric:
    def __init__(
        self, 
        name: str, 
        value: float = None, 
        comparator: MetricComparator = None,
        is_worst: bool = False
    )

Parameters:

Parameter	Type	Description
`name`	`str`	Name of the metric (e.g., “accuracy”, “rmse”).
`value`	`float`	Numeric value of the metric. Default: `None`.
`comparator`	`MetricComparator`	Comparison logic for determining which metric values are better. Default: `None`.
`is_worst`	`bool`	Whether this is the worst possible value for the metric. Default: `False`.

`MetricComparator`

Encapsulates comparison logic for metrics.

class MetricComparator:
    def __init__(
        self, 
        comparison_method: ComparisonMethod, 
        target: float = None, 
        epsilon: float = 1e-9
    )

Parameters:

Parameter	Type	Description
`comparison_method`	`ComparisonMethod`	The method used to compare metrics (HIGHER_IS_BETTER, LOWER_IS_BETTER, etc.).
`target`	`float`	The target value for TARGET_IS_BETTER comparisons. Default: `None`.
`epsilon`	`float`	Small value used for floating point comparisons. Default: `1e-9`.

Type Hints

The library uses the following type hints:

SchemaType = Union[Dict[str, Any], Type[BaseModel]]
DatasetType = Union[pd.DataFrame, DatasetGenerator]
ProviderType = Union[str, ProviderConfig]

Usage Example

import plexe
import pandas as pd

# Load data
df = pd.read_csv("housing.csv")

# Create model
model = plexe.Model(
    intent="Predict house prices based on features",
    input_schema={"square_footage": float, "bedrooms": int, "bathrooms": float},
    output_schema={"price": float}
)

# Build model
model.build(
    datasets=[df],
    provider="openai/gpt-4o-mini",
    max_iterations=3,
    timeout=600,
    run_timeout=180,
    chain_of_thought=True,
    verbose=False
)

# Make prediction
prediction = model.predict({"square_footage": 2000, "bedrooms": 3, "bathrooms": 2})
print(f"Predicted price: {prediction}")

# Save model
save_path = plexe.save_model(model, "housing_model.tar.gz")

# Load model
loaded_model = plexe.load_model(save_path)

For more details on specific components, see the other reference sections:

Introduction

Plexe Python Library

Plexe Platform

Core Classes

`Model`

`build`

`predict`

`get_state`

`get_metadata`

`get_metrics`

`describe`

`DatasetGenerator`

`Callback`

`Constraint`

Core Functions

`save_model`

`load_model`

`configure_logging`

Enums and Constants

`ModelState`

Provider Configuration

`ProviderConfig`

Performance Metrics

`Metric`

`MetricComparator`

Type Hints

Usage Example

Introduction

Plexe Python Library

Plexe Platform

​Core Classes

​Model

​build

​predict

​get_state

​get_metadata

​get_metrics

​describe

​DatasetGenerator

​Callback

​Constraint

​Core Functions

​save_model

​load_model

​configure_logging

​Enums and Constants

​ModelState

​Provider Configuration

​ProviderConfig

​Performance Metrics

​Metric

​MetricComparator

​Type Hints

​Usage Example

Core Classes

`Model`

`build`

`predict`

`get_state`

`get_metadata`

`get_metrics`

`describe`

`DatasetGenerator`

`Callback`

`Constraint`

Core Functions

`save_model`

`load_model`

`configure_logging`

Enums and Constants

`ModelState`

Provider Configuration

`ProviderConfig`

Performance Metrics

`Metric`

`MetricComparator`

Type Hints

Usage Example