Introduction to Sessions¶
Model Builders on the integrate.ai platform use sessions to train models, generate predictions, and perform other data science work.
The platform supports multiple session types. Regardless of the type, there are several building blocks required to create and run a session in a notebook.
Running a session¶
To run a session, follow the examples below:
Import your token and instantiate the client (task runner).
from integrate_ai_sdk.api import connect
import os
import json
import pandas as pd
IAI_TOKEN = ""
client = connect(token=IAI_TOKEN)
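Pasting a token directly into a notebook makes it easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the token has been exported in an environment variable. The variable name `IAI_TOKEN` and the `load_token` helper are illustrative choices, not part of the SDK.

```python
import os


def load_token(var_name="IAI_TOKEN"):
    """Return the token stored in an environment variable.

    Hypothetical helper: the variable name is an assumption, not an SDK convention.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable before connecting")
    return token


# Usage in the notebook (requires the SDK import shown above):
# IAI_TOKEN = load_token()
# client = connect(token=IAI_TOKEN)
```

Failing early with a clear message avoids a less obvious authentication error later in the notebook.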
Specify the model_config and data_config. The parameters in these two configurations are specific to the type of session you are running. For details, see the example for the HFL or VFL session type.
Create and start (load) the session.
#Set up the task builder that contains the tasks
from integrate_ai_sdk.taskgroup.taskbuilder.integrate_ai import IntegrateAiTaskBuilder
from integrate_ai_sdk.taskgroup.base import SessionTaskGroup
iai_tb_aws = IntegrateAiTaskBuilder(client=client,task_runner_id="")
#Create the session
hfl_session = client.create_fl_session( #This is an HFL session example. Specify the name of the session (eg hfl_session)
name="Testing notebook - HFL", #Type a name to appear in the workspace UI to help you locate your session
description="I am testing HFL FFNet session creation with a task runner through a notebook", #This is a long description that can contain details about your session
min_num_clients=1, #The minimum number of clients that must join the session
num_rounds=2, #The number of rounds over which to train the model
package_name="iai_ffnet", #The model package name
model_config=model_config, #The model configuration specified in step 2
data_config=data_schema #The data schema/config specified in step 2
).start()
hfl_session.id # Prints the training session ID for reference
Create a task group and specify the dataset names. Each dataset, or train/test dataset pair, is used in a single task.
task_group = (
    SessionTaskGroup(hfl_session)
    .add_task(iai_tb_aws.hfl(
        train_dataset_name="dataset1",  # required
        test_dataset_name="test_set1",  # required
        batch_size=batch_size,  # optional - specify the batch size
        vcpus=vcpus,  # optional - specify to override the default vcpus for the server for this task
        memory=memory,  # optional - specify to override the default memory for the server for this task
    ))
    .add_task(iai_tb_aws.hfl(
        train_dataset_name="dataset2",  # required
        test_dataset_name="test_set2",  # required
        batch_size=batch_size,  # optional - specify the batch size
        vcpus=vcpus,  # optional - specify to override the default vcpus for the server for this task
        memory=memory,  # optional - specify to override the default memory for the server for this task
    ))
)
task_group_context = task_group.start()  # Starting the task group starts the training session on the task runner
Each session type has a specific client task configuration. The example here shows only the minimum requirements for an HFL session. Review the details for the session type you are running when you configure your task group.
Wait for the session to run and monitor its progress either through the UI or by using task_group_context.monitor_task_logs() in the notebook.
Reviewing session results¶
At times, you may want to review the results of a previously run session, for example to inspect the metrics.
To load a previously run session:
client.session(session.id)
where session.id is the ID of the previous session.
For more information about session types, see the following sections:
Running VFL Batch Sessions¶
You can run VFL sessions in batches to quickly iterate over parameters. The integrate.ai client has a VFL batch method that runs sessions in batches, based on the parameters provided in the model_config for the session.
Model config iterations allow you to create a batch by iterating over model configuration parameters.
For example:
model_config_params = {
"optimizer.params.learning_rate": [0.1, 0.2, 0.3],
}
This code launches 3 sessions, each with the optimizer.params.learning_rate parameter set to one of the specified values. The batch method runs at most the configured number of sessions in parallel at a time (see the capacity parameter).
If you specify several parameters, the number of sessions equals the number of possible combinations of values, that is, the product of the value list sizes.
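To see how the session count grows, the combinations can be enumerated locally before launching a batch. This is a minimal sketch using only the standard library. The `expand_params` helper is hypothetical, and using `optimizer.params.momentum` as a second batch parameter is an assumption made for illustration; the SDK performs its own expansion.

```python
from itertools import product


def expand_params(params):
    """Return one dict per combination of the listed parameter values."""
    keys = list(params)
    value_lists = [params[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


model_config_params = {
    "optimizer.params.learning_rate": [0.1, 0.2, 0.3],
    "optimizer.params.momentum": [0.0, 0.9],  # assumed second parameter
}

combinations = expand_params(model_config_params)
print(len(combinations))  # 3 learning rates x 2 momentum values = 6
```

Checking the count this way helps you pick value lists that keep the batch to a manageable number of sessions.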
Specify the VFL data configuration¶
To run batch sessions, specify the data_config for the VFL session as usual.
data_config = {
"passive_client": {
"label_client": False,
"predictors": ["x1", "x3", "x5", "x7"],
"target": None,
},
"active_client": {
"label_client": True,
"predictors": ["x0", "x2", "x4", "x6"],
"target": "y",
},
}
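A quick sanity check on the data configuration can catch mistakes before any session is launched. The sketch below encodes two rules inferred from the example above, not from a stated SDK requirement: exactly one client is the label client, and only that client has a target. The `check_vfl_data_config` helper is hypothetical.

```python
def check_vfl_data_config(data_config):
    """Validate a VFL data_config and return the name of the label client.

    The rules checked here are assumptions drawn from the example configuration.
    """
    label_clients = [
        name for name, cfg in data_config.items() if cfg.get("label_client")
    ]
    if len(label_clients) != 1:
        raise ValueError(f"Expected exactly one label client, found {len(label_clients)}")
    for name, cfg in data_config.items():
        has_target = cfg.get("target") is not None
        if has_target != bool(cfg.get("label_client")):
            raise ValueError(f"Client {name!r}: only the label client should set a target")
    return label_clients[0]


example_config = {
    "passive_client": {"label_client": False, "predictors": ["x1", "x3"], "target": None},
    "active_client": {"label_client": True, "predictors": ["x0", "x2"], "target": "y"},
}
print(check_vfl_data_config(example_config))
```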
Specify the VFL batch model configuration¶
Specify the model_config as usual, and specify the parameters to iterate over in model_config_params.
model_config = {
"strategy": {"name": "VflGlm", "params": {"expand_duplicates": False}},
"model": {
"passive_client": {"params": {}},
"active_client": {"params": {}},
},
"ml_task": {
"type": "regression",
"loss_function": "mse",
"params": {},
},
"optimizer": {"name": "SGD", "params": {"learning_rate": 0.01, "momentum": 0.0}}, # specifies the optimizer and parameters
"influence_score": {"enable": False, "params": {}}, # enables/disables influence score calculation
"feature_importance_score": {"enable": False, "params": {}}, # enables/disables feature importance score calculation
"seed": 23, # for reproducibility
}
# Iterate over all possible combinations of model config parameters:
model_config_params = {
"optimizer.params.learning_rate": [0.1, 0.2, 0.3],
}
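The dotted key in model_config_params reads as a path into the nested model_config. How the SDK resolves the key internally is not described here, so the following is only an illustration of that reading. The `apply_override` helper is hypothetical.

```python
import copy


def apply_override(config, dotted_key, value):
    """Return a copy of config with the entry at dotted_key replaced by value."""
    updated = copy.deepcopy(config)  # leave the base configuration untouched
    *parents, leaf = dotted_key.split(".")
    node = updated
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    node[leaf] = value
    return updated


base_config = {
    "optimizer": {"name": "SGD", "params": {"learning_rate": 0.01, "momentum": 0.0}},
}
variant = apply_override(base_config, "optimizer.params.learning_rate", 0.1)
print(variant["optimizer"]["params"])
```

Under this reading, each session in the batch uses the base model_config with one value from the list substituted at that path.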
Create and start session batches¶
batch_id, batch_sessions = vfl_batch = client.run_vfl_batch(
name="Testing VFL batch",
description="batching VFL",
prl_session_id=prl_session.id,
vfl_mode='train',
min_num_clients=2,
num_rounds=2,
package_name="iai_glm",
data_config=data_config,
model_config=model_config,
# Batch params:
model_config_params=model_config_params,
# How many sessions to run in parallel
capacity=4,
# Add one task for each client, as usual for a VFL session
tasks=[
iai_tb_aws.vfl_train(
train_dataset_name=active_train_dataset, test_dataset_name=active_test_dataset,
batch_size=1024,
client_name="active_client"),
iai_tb_aws.vfl_train(
train_dataset_name=passive_train_dataset, test_dataset_name=passive_test_dataset,
batch_size=1024,
client_name="passive_client"),
],
)
batch_id
Batch complete¶
Now you can view the VFL training metrics and start making predictions.
batch_sessions is a dict keyed by session objects that stores the task group context, data config, and model config for each session.
#print(batch_sessions)
print("batch_id", batch_id)
for s in batch_sessions.keys():
    print(s.id, s.name, s.status)
    #s.metrics().plot()
You can also query the API for a previously executed batch. Specify the batch_id for the batch you want to review.
for s in client.get_batch_sessions(batch_id):
    print(s.id, s.name, s.status)
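For larger batches, a summary by status is easier to scan than one line per session. The sketch below relies only on the id and status attributes used in the loop above. The stand-in objects and the status strings are made up for illustration, and `group_by_status` is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_status(sessions):
    """Map each status value to the list of session IDs that have it."""
    groups = defaultdict(list)
    for s in sessions:
        groups[s.status].append(s.id)
    return dict(groups)


# Stand-ins for the objects returned by client.get_batch_sessions(batch_id);
# the status strings are placeholders, not documented values.
fake_sessions = [
    SimpleNamespace(id="s1", name="batch run 1", status="completed"),
    SimpleNamespace(id="s2", name="batch run 2", status="running"),
    SimpleNamespace(id="s3", name="batch run 3", status="completed"),
]
summary = group_by_status(fake_sessions)
print(summary)
```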
VFL Checkpoints¶
When creating a new VFL training session, you can pass a previous training session ID and round ID as a checkpoint. The model will be loaded from this checkpoint for the new training session.
You can also download round-specific model coefficients through the SDK interface. Coefficients are saved to the federated server after each round as well as at the end of a session.
Run a session to determine a checkpoint of interest¶
checkpoint_vfl_train_session = client.create_vfl_session(
name="Testing checkpointing in VFL",
description="Running a VFL train session to find a checkpoint",
prl_session_id=prl_session.id, # must be a successful PRL session ID that matches the datasets used in this session
vfl_mode="train",
min_num_clients=2,
num_rounds=num_rounds,
package_name="VflGlm",
data_config=data_config,
model_config=model_config,
).start()
checkpoint_vfl_train_session.id
Get metrics for the session
metrics = checkpoint_vfl_train_session.metrics().as_dict()
metrics
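Once the metrics are available, the checkpoint of interest is often the round with the lowest loss. The layout of the dictionary returned by metrics().as_dict() is not shown here, so this sketch assumes you have already extracted a mapping from round ID to a loss value. The `best_round` helper and the numbers are illustrative only.

```python
def best_round(loss_by_round):
    """Return the round ID with the smallest loss value."""
    if not loss_by_round:
        raise ValueError("No rounds to choose from")
    return min(loss_by_round, key=loss_by_round.get)


# Made-up values; build this mapping from the session metrics yourself.
loss_by_round = {1: 0.82, 2: 0.47, 3: 0.51}
checkpoint_round_id = best_round(loss_by_round)
print(checkpoint_round_id)
```

The selected round ID is the kind of value you would pass as training_round_id when starting a session from a checkpoint.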
Start a new session from a checkpoint¶
To start a new session from a previous checkpoint, specify the training_session_id and the training_round_id from the previous session of interest.
Example:
vfl_train_session = client.create_vfl_session(
name="VFL train session with checkpoint",
description="vfl train session with checkpoint",
prl_session_id=prl_session.id,
training_session_id=checkpoint_vfl_train_session.id,
training_round_id=2,
vfl_mode="train",
min_num_clients=2,
num_rounds=num_rounds,
package_name="VflGlm",
data_config=data_config,
model_config=model_config,
).start()
vfl_train_session.id
Using best training model for prediction¶
You can pass a desired model round ID from a successful VFL train session to be used as a starting point in a VFL predict session. This enables you to run a predict session with the best model from training.
Example:
vfl_predict_session = client.create_vfl_session(
name="Prediction from best trained model",
description="Using a specific model round from training to perform a prediction.",
prl_session_id=prl_session.id,
training_session_id=vfl_train_session.id, #specify the training session ID
training_round_id=2, #specify the round ID of the best training model
vfl_mode="predict",
data_config=data_config,
).start()
vfl_predict_session.id