Task Runner Test Script

This tool runs a sanity test against an existing task runner.

Copy the test data

For AWS:

Copy the example data to the task runner bucket. For example:

$ AWS_PROFILE=<profile to access bucket for taskrunner> aws s3 sync s3://public.s3.integrate.ai/integrate_ai_examples/synthetic/ s3://<the taskrunner bucket>/example_data/

For Azure:

The following dataset paths are expected:

azure://<storage>/example_data/train_silo0.parquet
azure://<storage>/example_data/train_silo1.parquet

If these datasets do not exist yet, copy them from the examples:

$ azcopy sync --recursive=true <local example data path> https://<storage>.blob.core.windows.net/nasronplsessionstorage/example_data/

Task runner test script usage

  1. Create a token in your integrate.ai workspace. Set the IAI_TOKEN environment variable to the token value.

  2. Create a virtual environment and install the integrate-ai CLI tool and SDK. Note: Python 3.9 or 3.10 is required.

  3. Specify the name of the task runner to test and run the test script.

$ python -m venv iai-test-venv
$ source iai-test-venv/bin/activate
$ export IAI_TOKEN=<your token>
$ pip install integrate-ai
$ iai sdk install
$ iai_test_taskrunner <the taskrunner name to test>
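
Before running the commands above, it can help to confirm the two prerequisites from the steps: a token in IAI_TOKEN and a Python 3.9 or 3.10 interpreter. The following is a minimal sketch, not part of the integrate.ai tooling. The helper names python_version_ok and check_prereqs are hypothetical, and the sketch assumes that the `python` on your PATH is the interpreter used for the virtual environment.

```shell
# Hypothetical helper: succeed only for Python 3.9.x or 3.10.x,
# the versions this document states are required.
python_version_ok() {
  case "$1" in
    3.9|3.9.*|3.10|3.10.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: check that the token is set and that the
# interpreter version is supported. Assumes `python` is on the PATH.
check_prereqs() {
  if [ -z "${IAI_TOKEN:-}" ]; then
    echo "IAI_TOKEN is not set" >&2
    return 1
  fi
  py_version="$(python -c 'import platform; print(platform.python_version())')" || return 1
  if ! python_version_ok "$py_version"; then
    echo "Python $py_version found; 3.9 or 3.10 is required" >&2
    return 1
  fi
}
```

With these functions defined, `check_prereqs && iai_test_taskrunner <the taskrunner name to test>` would start the test only when both checks pass.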

Example AWS output for a workspace “mine” with task runner “slim”

Expecting data sets:
s3://mine-slim.integrate.ai/example_data/train_silo0.parquet
s3://mine-slim.integrate.ai/example_data/train_silo1.parquet
If these datasets do not exist yet, copy them from the examples:
$ AWS_PROFILE=<profile to access bucket for taskrunnner slim> aws s3 sync s3://public.s3.integrate.ai/integrate_ai_examples/synthetic/ s3://mine-slim.integrate.ai/example_data/

Initializing EDA session
session_id=$9d85ca4369
Waiting for completion
Check https://login-mine.integrateai.net/sessions/9d85ca4369/details for details

Session finished with status: Completed
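
The dataset paths in the example output above appear to follow a pattern built from the workspace and task runner names. The sketch below prints those paths for a given pair. It assumes the bucket is named <workspace>-<task runner>.integrate.ai, which is inferred from this one example and may not hold for every deployment; the function name expected_aws_datasets is hypothetical.

```shell
# Hypothetical helper: print the dataset paths the test expects on AWS.
# Assumption: the bucket is named <workspace>-<task runner>.integrate.ai,
# as in the example output; confirm against the output of your own run.
expected_aws_datasets() {
  workspace="$1"
  taskrunner="$2"
  for silo in 0 1; do
    echo "s3://${workspace}-${taskrunner}.integrate.ai/example_data/train_silo${silo}.parquet"
  done
}

# Example: workspace "mine" with task runner "slim".
expected_aws_datasets mine slim
```

Each printed path could then be passed to `aws s3 ls` to check that the object exists before starting the test.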