Introduction to Sessions

Model Builders on the integrate.ai platform use sessions to train models, generate predictions, and perform other data science work.

The platform supports multiple session types. Regardless of the type, there are several building blocks required to create and run a session in a notebook.

Running a session

To run a session, follow the steps below:

  1. Import your token and instantiate the client (task runner).

from integrate_ai_sdk.api import connect
import os
import json
import pandas as pd

IAI_TOKEN = ""
client = connect(token=IAI_TOKEN)
  2. Specify the model_config and data_config. The parameters in these two configurations are specific to the type of session you are running. For details, see the example for the HFL or VFL session type.

  3. Create and start (load) the session.

#Set up the task builder that contains the tasks

from integrate_ai_sdk.taskgroup.taskbuilder.integrate_ai import IntegrateAiTaskBuilder
from integrate_ai_sdk.taskgroup.base import SessionTaskGroup

iai_tb_aws = IntegrateAiTaskBuilder(client=client,task_runner_id="")

#Create the session

hfl_session = client.create_fl_session(         #This is an HFL session example. Specify the name of the session (eg hfl_session)
    name="Testing notebook - HFL",              #Type a name to appear in the workspace UI to help you locate your session
    description="I am testing HFL FFNet session creation with a task runner through a notebook",        #This is a long description that can contain details about your session
    min_num_clients=1,                          #The minimum number of clients that must join the session
    num_rounds=2,                               #The number of rounds over which to train the model
    package_name="iai_ffnet",                   #The model package name
    model_config=model_config,                  #The model configuration specified in step 2
    data_config=data_schema                     #The data schema/config specified in step 2
).start()

hfl_session.id                  # Prints the training session ID for reference
  4. Create a task group and specify the dataset names. Each dataset, or train/test dataset pair, is used in a single task.

task_group = (
    SessionTaskGroup(hfl_session)
    .add_task(iai_tb_aws.hfl(
        train_dataset_name="dataset1",  #required
        test_dataset_name="test_set1",  #required
        batch_size=batch_size,          #optional - specify the batch size
        vcpus=vcpus,                    #optional - specify to override the default vcpus for the server for this task
        memory=memory,                  #optional - specify to override the default memory for the server for this task
    ))
    .add_task(iai_tb_aws.hfl(
        train_dataset_name="dataset2",  #required
        test_dataset_name="test_set2",  #required
        batch_size=batch_size,          #optional - specify the batch size
        vcpus=vcpus,                    #optional - specify to override the default vcpus for the server for this task
        memory=memory,                  #optional - specify to override the default memory for the server for this task
    ))
)

task_group_context = task_group.start()             #Starting the task group starts the training session on the task runner

Each session type has a specific client task configuration. The example here shows only the minimum requirements for an HFL session. Review the details for the session type you are running when you configure your task group.

  5. Wait for the session to run and monitor its progress either through the UI or by using task_group_context.monitor_task_logs() in the notebook.
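Step 1 leaves IAI_TOKEN as an empty string for you to fill in. One common way to avoid pasting a token into a notebook is to read it from an environment variable. The sketch below is only an illustration: the load_token helper and the IAI_TOKEN variable name are assumptions, not part of the SDK.

```python
import os


def load_token(env_var="IAI_TOKEN"):
    """Read the access token from an environment variable.

    Both this helper and the default variable name are hypothetical;
    use whatever name your environment actually defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before connecting")
    return token


# Usage in the notebook (requires the SDK to be installed):
# client = connect(token=load_token())
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the client.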

Reviewing session results

At times, you may want to review the results of a previously run session, for example to inspect the metrics.

To load a previously run session:

client.session(session.id)

where session.id is the ID of the previous session.

For more information about session types, see the following sections:

Running VFL Batch Sessions

You can run VFL sessions in batches to quickly iterate over parameters. The integrate.ai client has a VFL batch method that runs sessions in batches, based on the parameters provided in the model_config for the session.

Model config iterations allow you to create a batch by iterating over model configuration parameters.

For example:

model_config_params = {
    "optimizer.params.learning_rate": [0.1, 0.2, 0.3],
}

This code launches three sessions, one for each listed value of the optimizer.params.learning_rate parameter. The batch method executes only the configured number of parallel sessions at a time.

If you specify several parameters, the number of sessions is equal to the number of possible combinations, that is, the product of the sizes of the value lists.
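To see how the session count grows, the following sketch enumerates the combinations with itertools.product. It is an illustration of the counting rule only, not the SDK's implementation; expand_param_grid is a hypothetical helper, and the second parameter is added here purely as an example.

```python
from itertools import product


def expand_param_grid(params):
    """Return one dict per combination of the listed parameter values."""
    keys = list(params)
    value_lists = [params[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


# Illustrative grid: three learning rates and two momentum values.
model_config_params = {
    "optimizer.params.learning_rate": [0.1, 0.2, 0.3],
    "optimizer.params.momentum": [0.0, 0.9],
}

combinations = expand_param_grid(model_config_params)
print(len(combinations))  # 3 * 2 = 6 combinations, so 6 sessions
```

Because the count is a product, adding even one more parameter with a few values can multiply the number of sessions quickly.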

Specify the VFL data configuration

To run batch sessions, specify the data_config for the VFL session as usual.

data_config = {
        "passive_client": {
            "label_client": False,
            "predictors": ["x1", "x3", "x5", "x7"],
            "target": None,
        },
        "active_client": {
            "label_client": True,
            "predictors": ["x0", "x2", "x4", "x6"],
            "target": "y",
        },
    }
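Before launching a batch, it can help to sanity-check the data_config once, because the same configuration is reused for every session. The checks below are assumptions inferred from the example above (exactly one label client, which is the only client with a target); check_vfl_data_config is a hypothetical helper, not an SDK function.

```python
def check_vfl_data_config(data_config):
    """Return a list of problems found in a VFL data_config (empty if none).

    The rules are assumptions drawn from the documented example,
    not validation performed by the platform.
    """
    problems = []
    label_clients = [name for name, cfg in data_config.items() if cfg.get("label_client")]
    if len(label_clients) != 1:
        problems.append(f"expected exactly one label client, found {len(label_clients)}")
    for name, cfg in data_config.items():
        is_label_client = bool(cfg.get("label_client"))
        if not cfg.get("predictors"):
            problems.append(f"{name}: no predictors listed")
        if is_label_client and cfg.get("target") is None:
            problems.append(f"{name}: label client has no target")
        if not is_label_client and cfg.get("target") is not None:
            problems.append(f"{name}: non-label client should not set a target")
    return problems


example_config = {
    "passive_client": {"label_client": False, "predictors": ["x1", "x3"], "target": None},
    "active_client": {"label_client": True, "predictors": ["x0", "x2"], "target": "y"},
}
print(check_vfl_data_config(example_config))  # an empty list means no problems were found
```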

Specify the VFL batch model configuration

Specify the model_config as usual, and specify the parameters to iterate over in the model_config_params.

model_config = {
        "strategy": {"name": "VflGlm", "params": {"expand_duplicates": False}},
        "model": {
            "passive_client": {"params": {}},
            "active_client": {"params": {}},
        },
        "ml_task": {
            "type": "regression",
            "loss_function": "mse",
            "params": {},
        },
        "optimizer": {"name": "SGD", "params": {"learning_rate": 0.01, "momentum": 0.0}},   # specifies the optimizer and parameters
        "influence_score": {"enable": False, "params": {}},                         # enables/disables influence score calculation
        "feature_importance_score": {"enable": False, "params": {}},                # enables/disables feature importance score calculation
        "seed": 23,  # for reproducibility
        }

# Iterate over all possible combinations of model config parameters:
model_config_params = {
    "optimizer.params.learning_rate": [0.1, 0.2, 0.3],
}
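The dotted keys in model_config_params read as paths into the nested model_config. The sketch below shows that interpretation by applying one override to a copy of a configuration. It is a minimal illustration under that assumption; apply_override is a hypothetical helper, and the SDK may resolve the keys differently.

```python
import copy


def apply_override(config, dotted_key, value):
    """Return a deep copy of config with the entry at dotted_key set to value."""
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        node = node[part]  # raises KeyError if the path does not exist
    node[leaf] = value
    return updated


base_config = {
    "optimizer": {"name": "SGD", "params": {"learning_rate": 0.01, "momentum": 0.0}},
}
overridden = apply_override(base_config, "optimizer.params.learning_rate", 0.2)
print(overridden["optimizer"]["params"])
```

Working on a deep copy keeps the base configuration unchanged, so each session in a batch can start from the same defaults.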

Create and start session batches

batch_id, batch_sessions = client.run_vfl_batch(
    name="Testing VFL batch",
    description="batching VFL",
    prl_session_id=prl_session.id,
    vfl_mode='train',
    min_num_clients=2,
    num_rounds=2,
    package_name="iai_glm",
    data_config=data_config,
    model_config=model_config,

    # Batch params:
    model_config_params=model_config_params,

    # How many sessions to run in parallel
    capacity=4,

    # Add one task for each client, as usual for a VFL session
    tasks=[
        iai_tb_aws.vfl_train(
            train_dataset_name=active_train_dataset, test_dataset_name=active_test_dataset, 
            batch_size=1024,  
            client_name="active_client"),
        iai_tb_aws.vfl_train(
            train_dataset_name=passive_train_dataset, test_dataset_name=passive_test_dataset, 
            batch_size=1024,  
            client_name="passive_client"),
    ],
)

batch_id

Batch complete

Now you can view the VFL training metrics and start making predictions.

batch_sessions is a dict keyed by session objects that stores task group context, data config, and model config for each session.

#print(batch_sessions)
print("batch_id", batch_id)
for s in batch_sessions.keys():
    print(s.id, s.name, s.status)
    #s.metrics().plot()

You can also query the API for a previously executed batch. Specify the batch_id for the batch you want to review.

for s in client.get_batch_sessions(batch_id):
    print(s.id, s.name, s.status)

VFL Checkpoints

When creating a new VFL training session, you can pass a previous training session ID and round ID as a checkpoint. The model will be loaded from this checkpoint for the new training session.

You can also download round-specific model coefficients through the SDK interface. Coefficients are saved to the federated server after each round as well as at the end of a session.

Run a session to determine a checkpoint of interest

checkpoint_vfl_train_session = client.create_vfl_session(
            name="Testing checkpointing in VFL",
            description="Running a VFL train session to find a checkpoint",
            prl_session_id=prl_session.id,              # must be a successful PRL session ID that matches the datasets used in this session
            vfl_mode="train",
            min_num_clients=2,
            num_rounds=num_rounds,
            package_name="VflGlm",
            data_config=data_config,
            model_config=model_config,
        ).start()

checkpoint_vfl_train_session.id

Get metrics for the session

metrics = checkpoint_vfl_train_session.metrics().as_dict()
metrics

Start a new session from a checkpoint

To start a new session from a previous checkpoint, specify the training_session_id and the training_round_id from the previous session of interest.

Example:

vfl_train_session = client.create_vfl_session(
            name="VFL train session with checkpoint",
            description="vfl train session with checkpoint",
            prl_session_id=prl_session.id,
            training_session_id=checkpoint_vfl_train_session.id,
            training_round_id=2,
            vfl_mode="train",
            min_num_clients=2,
            num_rounds=num_rounds,
            package_name="VflGlm",
            data_config=data_config,
            model_config=model_config,
        ).start()

vfl_train_session.id

Using best training model for prediction

You can pass a desired model round ID from a successful VFL train session to be used as a starting point in a VFL predict session. This enables you to run a predict session with the best model from training.

Example:

vfl_predict_session = client.create_vfl_session(
            name="Prediction from best trained model",
            description="Using a specific model round from training to perform a prediction.",
            prl_session_id=prl_session.id,
            training_session_id=vfl_train_session.id,           #specify the training session ID
            training_round_id=2,                                #specify the round ID of the best training model
            vfl_mode="predict",
            data_config=data_config,
        ).start()

vfl_predict_session.id
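The value passed as training_round_id has to be chosen from the training metrics. The sketch below shows one simple way to pick it once you have a per-round loss for each round. It makes several assumptions that you should verify for your session: that you have already extracted a plain list of per-round losses from the metrics, that lower is better, and that round IDs start at 1. The best_round helper and the sample values are hypothetical.

```python
def best_round(losses, first_round_id=1):
    """Return the round ID with the lowest loss.

    Assumes losses[i] belongs to round first_round_id + i; the starting
    round ID is an assumption, so check it against your session's metrics.
    """
    if not losses:
        raise ValueError("no per-round losses supplied")
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return first_round_id + best_index


round_losses = [0.92, 0.61, 0.58, 0.64]  # made-up values for illustration only
best = best_round(round_losses)
print(best)
```

If two rounds tie, this sketch returns the earlier one.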