Deployment¶
integrate.ai provides a safe data collaboration platform that enables data science leaders to activate data silos by moving models, not data. It democratizes usage of the cloud by taking care of all the computational aspects of deployment — from container orchestration and deployment, to security and failovers. By harnessing the power of cloud computing and AI we enable data scientists and R&D professionals, with limited-to-no computational and machine learning training, to analyse and make sense of their data in the fastest and safest way possible.
The image below illustrates the high-level system architecture and how the managed environments interact.
The task runners use the serverless capabilities of cloud environments (such as AWS Batch and Fargate). This greatly reduces the administration burden and ensures that resource cost is only incurred while task runners are in use.
There are two deployment patterns available for task runners:
Hosted task runner (Compute hosted by integrate.ai)
Edge task runner (Compute hosted by customer)
Hosted Task runner¶
The customer maintains control and hosting of all their data. integrate.ai performs limited compute functions on the data with read-only access. This pattern does not require hosting any software in the customer cloud infrastructure. Third parties cannot access raw data.
AWS Bucket Permission¶
Create a new bucket or use an existing one that contains the data on which compute is being performed.
Apply the permissions below to the bucket to give integrate.ai read and list access.
Note: Your integrate.ai Customer Success Manager provides the value to use for <accountid>. Replace BUCKETNAME with the name of the bucket from Step 1.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::<accountid>:role/iai-taskrunner-provisioner",
"arn:aws:iam::<accountid>:role/iai-taskrunner-runtime"
]
},
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::BUCKETNAME"
},
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::<accountid>:role/iai-taskrunner-provisioner",
"arn:aws:iam::<accountid>:role/iai-taskrunner-runtime"
]
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::BUCKETNAME/*"
}
]
}
Upload data to the bucket and register it by following the steps in Quickstart - Register your dataset.
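The bucket policy above can also be generated rather than edited by hand, which avoids mistakes when substituting the placeholders. The following is a minimal sketch: the function name and the example values are hypothetical, and only the role names and policy structure come from the policy shown above.

```python
import json

# Role names taken from the bucket policy above; the account ID comes from integrate.ai.
ROLE_NAMES = ("iai-taskrunner-provisioner", "iai-taskrunner-runtime")


def build_bucket_policy(account_id: str, bucket_name: str) -> dict:
    """Return the read/list bucket policy for the given account and bucket."""
    principals = [f"arn:aws:iam::{account_id}:role/{name}" for name in ROLE_NAMES]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": list(principals)},
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Effect": "Allow",
                "Principal": {"AWS": list(principals)},
                "Action": "s3:GetObject",
                # GetObject applies to the objects inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    print(json.dumps(build_bucket_policy("123456789012", "my-data-bucket"), indent=2))
```

The printed JSON can then be pasted into the bucket policy editor.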
Edge Task runner¶
The customer maintains maximum control of their data and compute. This pattern requires hosting software in the customer cloud infrastructure. Third parties cannot access raw data.
IT Administrator Workflow¶
The initial setup for your workspace must be performed by an AWS or Microsoft Azure Administrator. This Administrator must have the rights necessary to configure the permissions and access needed by the cloud-based task runners.
Follow the steps for either AWS configuration or Azure configuration to generate the roles and policies or service principal information.
Provide the cloud environment details to the integrate.ai Workspace Administrator.
a. For AWS, provide the provisioner and runtime role ARNs, and any advanced configuration information (such as VPC information and KMS keys).
b. For Azure, provide the Resource group, Service principal ID, Service principal secret, Subscription ID, and Tenant ID.
Alternatively, the IT Admin may choose to take on the Workspace Administrator role in the workspace and configure one or more task runners for their users.
AWS configuration for task runners¶
Before you get started using integrate.ai in your cloud for training sessions, there are a few configuration steps that must be completed. You must grant integrate.ai permission to deploy task runner infrastructure in your cloud, by creating a limited permission Role in AWS for the provisioner and for the runtime agent. This is a one-time process - once created, you can use these roles for any task runners in your environment.
This section walks you through the required configuration.
Create a provisioner policy and provisioner role.
Create a runtime policy and runtime role.
Create AWS Provisioner Policy¶
This policy lists all of the required permissions for integrate.ai to create the necessary infrastructure.
The provisioner creates the following components and performs the required related tasks:
Configures AWS Batch infrastructure, including creating roles and policies
Configures AWS Fargate infrastructure, including creating roles and policies
Creates a VPC and compute environment - all compute runs in the VPC created for the task runner
Creates an S3 bucket that is encrypted with a customer key created by the provisioner
Pulls the required client and server container images from an integrate.ai ECR (Elastic Container Registry) repository
Note: AWS imposes default maximums for some infrastructure components, as noted below. If you exceed these limits, provisioning a new task runner may fail. Request an increase to the limit through your AWS console.
VPCs - maximum 5 by default
EC2 NAT gateways - maximum 5 by default
To create the provisioner policy:
In the AWS Console, go to IAM, select Policies, and click Create policy.
On the Specify permissions page, click the JSON tab.
Paste in the Provisioner JSON policy provided below.
Click Next.
Name the policy iai-provisioner-policy and click Create policy.
# AWS Provisioner policy
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"IAM",
"Effect":"Allow",
"Action":[
"iam:CreateInstanceProfile",
"iam:RemoveRoleFromInstanceProfile",
"iam:DeleteInstanceProfile",
"iam:AddRoleToInstanceProfile"
],
"Resource":"arn:aws:iam::*:instance-profile/iai-*"
},
{
"Sid":"IAMPass",
"Effect":"Allow",
"Action":[
"iam:PassRole"
],
"Resource":"arn:aws:iam::*:role/iai-*"
},
{
"Sid":"IAMRead",
"Effect":"Allow",
"Action":[
"iam:GetInstanceProfile",
"iam:GetPolicy",
"iam:GetRole",
"iam:GetPolicyVersion",
"iam:ListRolePolicies",
"iam:ListAttachedRolePolicies",
"iam:ListPolicyVersions",
"iam:ListInstanceProfilesForRole"
],
"Resource":"*"
},
{
"Sid":"Batch",
"Effect":"Allow",
"Action":[
"batch:RegisterJobDefinition",
"batch:DeregisterJobDefinition",
"batch:CreateComputeEnvironment",
"batch:UpdateComputeEnvironment",
"batch:DeleteComputeEnvironment",
"batch:CreateJobQueue",
"batch:UpdateJobQueue",
"batch:DeleteJobQueue"
],
"Resource":[
"arn:aws:batch:*:*:compute-environment/iai-*",
"arn:aws:batch:*:*:job-definition/iai-*",
"arn:aws:batch:*:*:job-queue/iai-*",
"arn:aws:batch:*:*:job-definition-revision/iai-*",
"arn:aws:batch:*:*:scheduling-policy/*"
]
},
{
"Sid":"BatchRead",
"Effect":"Allow",
"Action":[
"batch:DescribeComputeEnvironments",
"batch:DescribeJobDefinitions",
"batch:DescribeJobQueues"
],
"Resource":"*"
},
{
"Sid":"CW",
"Effect":"Allow",
"Action":[
"logs:ListTagsLogGroup",
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:DeleteLogGroup",
"logs:TagResource",
"logs:PutLogEvents",
"logs:PutRetentionPolicy"
],
"Resource":[
"arn:aws:logs:*:*:log-group:iai-*"
]
},
{
"Sid":"CWRead",
"Effect":"Allow",
"Action":[
"logs:DescribeLogGroups",
"logs:DescribeLogStreams"
],
"Resource":"*"
},
{
"Sid":"ECSFargate",
"Effect":"Allow",
"Action":[
"ecs:CreateCluster",
"ecs:DescribeClusters",
"ecs:DeleteCluster",
"ecs:UpdateCluster",
"ecs:RegisterTaskDefinition",
"ecs:TagResource"
],
"Resource":[
"arn:aws:ecs:*:*:cluster/iai-*",
"arn:aws:ecs:*:*:task-definition/iai-*"
]
},
{
"Sid":"ECSFargateRead",
"Effect":"Allow",
"Action":[
"ecs:DescribeTaskDefinition",
"ecs:DeregisterTaskDefinition"
],
"Resource":"*"
},
{
"Sid":"Kms",
"Effect":"Allow",
"Action":[
"kms:ListAliases",
"kms:CreateKey",
"kms:CreateAlias",
"kms:DescribeKey",
"kms:GetKeyPolicy",
"kms:GetKeyRotationStatus",
"kms:ListResourceTags",
"kms:ScheduleKeyDeletion",
"kms:CreateGrant",
"kms:ListGrants",
"kms:RevokeGrant",
"kms:TagResource",
"kms:UntagResource"
],
"Resource":"*"
},
{
"Sid":"KmsDeleteAlias",
"Effect":"Allow",
"Action":[
"kms:DeleteAlias"
],
"Resource":[
"arn:aws:kms:*:*:alias/iai/*",
"arn:aws:kms:*:*:key/*"
]
},
{
"Sid":"S3Create",
"Effect":"Allow",
"Action":[
"s3:CreateBucket",
"s3:DeleteBucket",
"s3:DeleteBucketPolicy",
"s3:PutBucketVersioning",
"s3:PutBucketPublicAccessBlock",
"s3:PutEncryptionConfiguration"
],
"Resource":[
"arn:aws:s3:::*integrate.ai",
"arn:aws:s3:::*integrate.ai/*"
]
},
{
"Sid":"S3Read",
"Effect":"Allow",
"Action":[
"s3:GetBucketCors",
"s3:GetBucketPolicy",
"s3:PutBucketPolicy",
"s3:GetBucketWebsite",
"s3:GetBucketVersioning",
"s3:GetLifecycleConfiguration",
"s3:GetAccelerateConfiguration",
"s3:GetBucketRequestPayment",
"s3:GetBucketLogging",
"s3:GetBucketPublicAccessBlock",
"s3:GetBucketAcl",
"s3:GetBucketObjectLockConfiguration",
"s3:GetReplicationConfiguration",
"s3:GetBucketTagging",
"s3:GetEncryptionConfiguration",
"s3:ListBucket"
],
"Resource":[
"arn:aws:s3:::*integrate.ai",
"arn:aws:s3:::*integrate.ai/*"
]
},
{
"Sid":"VPCDescribe",
"Effect":"Allow",
"Action":[
"ec2:DescribeVpcs",
"ec2:DescribeSubnets",
"ec2:DescribeVpcAttribute",
"ec2:DescribeVpcClassicLinkDnsSupport",
"ec2:DescribeVpcClassicLink",
"ec2:DescribeInternetGateways",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSecurityGroupRules",
"ec2:DescribeRouteTables",
"ec2:DescribeNetworkAcls",
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeNatGateways",
"ec2:DescribeAddresses",
"ec2:DescribeAccountAttributes",
"ec2:DescribeAvailabilityZones"
],
"Resource":"*"
},
{
"Sid":"VPCCreate",
"Effect":"Allow",
"Action":[
"ec2:CreateVpc",
"ec2:CreateTags",
"ec2:AllocateAddress",
"ec2:ReleaseAddress",
"ec2:CreateSubnet",
"ec2:ModifySubnetAttribute",
"ec2:RevokeSecurityGroupEgress",
"ec2:RevokeSecurityGroupIngress",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:CreateRouteTable",
"ec2:CreateRoute",
"ec2:CreateInternetGateway",
"ec2:AttachInternetGateway",
"ec2:AssociateRouteTable",
"ec2:ModifyVpcAttribute",
"ec2:CreateSecurityGroup",
"ec2:CreateNatGateway"
],
"Resource":"*"
},
{
"Sid":"VPCDelete",
"Effect":"Allow",
"Action":[
"ec2:DeleteSubnet",
"ec2:DisassociateRouteTable",
"ec2:DeleteSecurityGroup",
"ec2:DeleteRoute",
"ec2:DeleteNatGateway",
"ec2:DeleteRouteTable",
"ec2:DisassociateAddress",
"ec2:DetachInternetGateway",
"ec2:DeleteInternetGateway",
"ec2:DeleteVpc"
],
"Resource":"*"
}
]
}
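Because the policy is long, it can be worth running a quick structural check on your copy before pasting it into the console. The sketch below is an illustrative lint, not an integrate.ai or AWS tool: the function name is hypothetical, and it only checks that each statement has the expected keys and that no action is listed twice. Remove the leading "# AWS Provisioner policy" label line before passing the text in, since JSON does not allow comments.

```python
import json
from collections import Counter

REQUIRED_KEYS = {"Effect", "Action", "Resource"}


def lint_policy(policy_text: str) -> list[str]:
    """Return human-readable findings for an identity-based policy document."""
    policy = json.loads(policy_text)  # raises ValueError if the text is not valid JSON
    findings = []
    for index, statement in enumerate(policy.get("Statement", [])):
        label = statement.get("Sid", f"statement {index}")
        missing = REQUIRED_KEYS - statement.keys()
        if missing:
            findings.append(f"{label}: missing {sorted(missing)}")
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare them in lower case.
        counts = Counter(action.lower() for action in actions)
        for action, count in counts.items():
            if count > 1:
                findings.append(f"{label}: duplicate action {action}")
    return findings
```

An empty list means the checks found nothing; it does not confirm that the permissions themselves are sufficient.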
Create AWS Provisioner Role¶
You must grant integrate.ai access to deploy task runner infrastructure in your cloud, by creating a limited permission role in AWS for the provisioner.
To create the provisioner role:
In the left navigation bar of the console, under IAM, select Roles, and click Create role.
In Step 1 - Select Trusted Entity, click Custom trust policy.
Paste in the custom trust policy provided below.
# AWS Provisioner Custom trust policy
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::919740802015:root"
},
"Action": [
"sts:AssumeRole",
"sts:TagSession"
],
"Condition": {
"ArnLike": {
"aws:PrincipalArn": "arn:aws:iam::919740802015:role/iai-taskrunner-provision-*"
}
}
}
]
}
Click Next.
In Step 2 - Add permissions, search for and select the policy you created (iai-provisioner-policy).
(Optional) If your environment requires a permission boundary, attach it to the role.
Click Next.
Provide the following Role name: iai-taskrunner-provisioner. Important: Do not edit or change this name.
Click Create role.
Copy and save the ARN for the provisioner role. Provide the ARN to any Workspace Administrator or Model Builder who will be creating task runners.
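The Condition block in the custom trust policy above restricts which principals in the integrate.ai account may assume the role: the caller's ARN must match the ArnLike pattern. The sketch below approximates that matching in Python to show which ARNs would pass. It is a simplification, not the AWS implementation: it compares the six colon-delimited ARN components one by one using fnmatch-style wildcards, and the example role names are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern copied from the provisioner trust policy above.
PATTERN = "arn:aws:iam::919740802015:role/iai-taskrunner-provision-*"


def arn_like(arn: str, pattern: str) -> bool:
    """Approximate ArnLike: match each of the six ARN components separately,
    allowing * and ? wildcards in the pattern."""
    arn_parts = arn.split(":", 5)
    pattern_parts = pattern.split(":", 5)
    if len(arn_parts) != 6 or len(pattern_parts) != 6:
        return False
    return all(fnmatchcase(a, p) for a, p in zip(arn_parts, pattern_parts))


if __name__ == "__main__":
    # Hypothetical caller ARNs, for illustration only.
    print(arn_like("arn:aws:iam::919740802015:role/iai-taskrunner-provision-example", PATTERN))
    print(arn_like("arn:aws:iam::111111111111:role/iai-taskrunner-provision-example", PATTERN))
```

A role with the right name prefix in a different account does not match, because the account ID component is compared literally.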
Create AWS Runtime policy¶
You must grant integrate.ai access to run the task runner in your cloud environment by creating a limited permission role in AWS for the runtime agent.
Important: This policy provides the specific permissions for integrate.ai to call the task runner and dispatch tasks, such as training a machine learning model. You may review this policy to comply with your organization’s policies. Contact your Customer Success Manager if this policy requires modification.
To create the runtime policy:
In the AWS Console, go to IAM, select Policies, and click Create policy.
On the Specify permissions page, click the JSON tab.
Paste in the Runtime JSON policy provided below.
Click Next.
Name the policy iai-runtime-policy and click Create policy.
# AWS Runtime policy
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowBatchDescribeJobs",
"Effect": "Allow",
"Action": [
"batch:DescribeJobs",
"batch:TagResource"
],
"Resource": "*"
},
{
"Sid": "AllowBatchAccess",
"Effect": "Allow",
"Action": [
"batch:TerminateJob",
"batch:SubmitJob",
"batch:TagResource",
"batch:CancelJob"
],
"Resource": [
"arn:aws:batch:*:*:job-definition/iai-fl-client-batch-job-*",
"arn:aws:batch:*:*:job-queue/iai-fl-client-batch-job-queue-*",
"arn:aws:batch:*:*:job/*"
]
},
{
"Sid": "AllowECSUpdateAccess",
"Effect": "Allow",
"Action": [
"ecs:DescribeContainerInstances",
"ecs:DescribeTasks",
"ecs:ListTasks",
"ecs:UpdateContainerAgent",
"ecs:StartTask",
"ecs:StopTask",
"ecs:RunTask"
],
"Resource": [
"arn:aws:ecs:*:*:cluster/iai-fl-server-ecs-cluster-*",
"arn:aws:ecs:*:*:task/iai-fl-server-ecs-cluster-*",
"arn:aws:ecs:*:*:task-definition/iai-fl-server-fargate-job-*"
]
},
{
"Sid": "AllowECSReadAccess",
"Effect": "Allow",
"Action": [
"ecs:DescribeTaskDefinition"
],
"Resource": [
"*"
]
},
{
"Sid": "AllowPassRole",
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": [
"arn:aws:iam::*:role/iai-fl-server-fargate-task-role-*-*",
"arn:aws:iam::*:role/iai-fl-server-fargate-execution-role-*-*"
]
},
{
"Sid": "AllowSSMAccessForTokens",
"Effect": "Allow",
"Action": [
"ssm:PutParameter",
"ssm:DescribeParameters",
"ssm:GetParameters",
"ssm:GetParameter",
"ssm:DeleteParameter",
"ssm:DeleteParameters"
],
"Resource": [
"arn:aws:ssm:*:*:parameter/fl-server-*-token",
"arn:aws:ssm:*:*:parameter/fl-client-*-token"
]
},
{
"Sid": "AllowS3Access",
"Effect": "Allow",
"Action": [
"s3:*Object",
"s3:ListBucket",
"s3:GetObjectAcl",
"s3:GetObjectVersion",
"s3:ListBucketVersions",
"s3:GetEncryptionConfiguration"
],
"Resource": [
"arn:aws:s3:::*.integrate.ai",
"arn:aws:s3:::*.integrate.ai/*"
]
},
{
"Sid": "DenyS3BucketReadAccess",
"Effect": "Deny",
"Action": [
"s3:*Object",
"s3:ListBucket",
"s3:GetEncryptionConfiguration",
"s3:GetObjectAcl",
"s3:ListBucketVersions",
"s3:PutObjectAcl"
],
"Resource": [
"arn:aws:s3:::*.integrate.ai/taskrunner",
"arn:aws:s3:::*.integrate.ai/taskrunner/*"
]
},
{
"Sid": "AllowKMSUsage",
"Effect": "Allow",
"Action": [
"kms:Decrypt",
"kms:DescribeKey",
"kms:Encrypt",
"kms:GenerateDataKey"
],
"Resource": "*",
"Condition": {
"ForAnyValue:StringLike": {
"kms:ResourceAliases": "alias/iai/*"
}
}
},
{
"Sid": "AllowLogGroupAndStreamAccess",
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"logs:Describe*",
"logs:List*",
"logs:StartQuery",
"logs:StopQuery",
"logs:TestMetricFilter",
"logs:FilterLogEvents",
"logs:Get*"
],
"Resource": [
"arn:aws:logs:*:*:log-group:iai-fl-server-fargate-log-group-*:log-stream:*",
"arn:aws:logs:*:*:log-group:/aws/batch/job:log-stream:iai-fl-client-batch-job-*",
"arn:aws:logs:*:*:log-group:iai-proxy-log*:log-stream:*"
]
},
{
"Sid": "AllowTaskInfo",
"Effect": "Allow",
"Action": [
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeInstances"
],
"Resource": [
"*"
]
},
{
"Effect": "Allow",
"Action": "ecr:GetAuthorizationToken",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage"
],
"Resource": "arn:aws:ecr:*:*:repository/edge*"
}
]
}
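The runtime policy pairs an Allow statement for the integrate.ai buckets with a Deny statement for the taskrunner prefix inside them. In IAM, an explicit Deny overrides any Allow. The sketch below illustrates that interaction on two statements excerpted from the policy above. It is a deliberately simplified evaluator (no conditions, principals, or other policy types), and the bucket and object names in the usage example are hypothetical.

```python
from fnmatch import fnmatchcase

# Two S3 statements excerpted (and shortened) from the runtime policy above.
STATEMENTS = [
    {
        "Effect": "Allow",
        "Action": ["s3:*Object", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::*.integrate.ai", "arn:aws:s3:::*.integrate.ai/*"],
    },
    {
        "Effect": "Deny",
        "Action": ["s3:*Object", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::*.integrate.ai/taskrunner",
            "arn:aws:s3:::*.integrate.ai/taskrunner/*",
        ],
    },
]


def _matches(value: str, patterns: list[str], ignore_case: bool = False) -> bool:
    """Return True if value matches any wildcard pattern."""
    if ignore_case:
        value, patterns = value.lower(), [p.lower() for p in patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def is_allowed(action: str, resource: str) -> bool:
    """Simplified evaluation: an explicit Deny wins, otherwise an Allow is required."""
    allowed = False
    for statement in STATEMENTS:
        # Action names are matched case-insensitively; resource ARNs are not.
        if _matches(action, statement["Action"], ignore_case=True) and _matches(
            resource, statement["Resource"]
        ):
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed
```

For example, `is_allowed("s3:GetObject", "arn:aws:s3:::example.integrate.ai/results/model.bin")` is true under this sketch, while the same action on a key under `taskrunner/` is refused.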
Create AWS Runtime role¶
In the left navigation bar of the console, under IAM, select Roles, and click Create role.
Select Custom trust policy.
Paste in the custom trust relationship JSON provided below.
# AWS Runtime custom trust policy
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::919740802015:root"
},
"Action": [
"sts:AssumeRole",
"sts:TagSession"
],
"Condition": {
"ArnLike": {
"aws:PrincipalArn": [
"arn:aws:iam::919740802015:role/IAI-API-*"
]
}
}
},
{
"Effect": "Allow",
"Principal": {
"Service": [
"ecs-tasks.amazonaws.com",
"batch.amazonaws.com",
"ec2.amazonaws.com"
]
},
"Action": [
"sts:AssumeRole",
"sts:TagSession"
],
"Condition": {}
}
]
}
Click Next.
On the Add permissions page, search for and select the policy you just created (iai-runtime-policy).
Search for and add the following AWS policies: AmazonEC2ContainerServiceforEC2Role, AWSBatchServiceRole, AmazonECSTaskExecutionRolePolicy.
Click Next.
Provide the following Role name: iai-taskrunner-runtime. Important: Do not edit or change this name.
Click Create role.
Copy and save the ARN for the runtime role. Provide the ARN to any Workspace Administrator or Model Builder who will be creating task runners.
Role and policy configuration is now complete.
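Since the role names must not be changed, a small check on the two ARNs before handing them to the Workspace Administrator can catch a mistyped or renamed role. The following sketch is illustrative only: the function name is hypothetical, and it assumes the standard "aws" partition, a 12-digit account ID, and roles created without an IAM path.

```python
import re

# Role names required by the steps above.
EXPECTED_ROLE_NAMES = {
    "provisioner": "iai-taskrunner-provisioner",
    "runtime": "iai-taskrunner-runtime",
}

# Assumes the "aws" partition, a 12-digit account ID, and no role path.
ROLE_ARN = re.compile(r"^arn:aws:iam::(?P<account>\d{12}):role/(?P<name>[\w+=,.@-]+)$")


def check_role_arn(kind: str, arn: str) -> str:
    """Return the account ID if the ARN names the expected role, else raise ValueError."""
    match = ROLE_ARN.match(arn)
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    if match["name"] != EXPECTED_ROLE_NAMES[kind]:
        raise ValueError(f"{kind} role must be named {EXPECTED_ROLE_NAMES[kind]!r}")
    return match["account"]
```

Calling `check_role_arn("runtime", arn)` returns the account ID on success, which also makes it easy to confirm that both roles live in the same account.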
Azure configuration for task runners¶
Before you get started using integrate.ai in your Microsoft Azure cloud for training sessions, there are a few configuration steps that must be completed:
You must grant integrate.ai permission to deploy task runner infrastructure in your cloud, by creating the following:
A dedicated resource group
A limited-permission provisioner service principal, used to provision a task runner.
A limited permission runtime service principal to execute Azure tasks using the task runner.
This is a one-time process - once created, you can use this infrastructure for any task runners in your environment.
This section walks you through the required configuration.
Create a resource group.
Create a custom provisioner role.
Create a provisioner service principal.
Create a custom runtime role.
Create a runtime service principal and role.
Azure Container Instance Resource Limitations¶
Azure Cloud Services impose the following hard (unchangeable) resource limitations for Azure Container Instances:
The following limits are default limits that cannot be increased through a quota request. Any quota increase requests for these limits won’t be approved.
Resource | Actual Limit
---|---
Number of containers per container group | 60
Number of volumes per container group | 20
Ports per IP | 5
Container instance log size - running instance | 4 MB
Container instance log size - stopped instance | 16 KB or 1,000 lines
The following limits are changeable limits. You can request a quota increase from Azure for the following:
Resource | Actual Limit
---|---
Standard sku container groups per region per subscription | 100
Standard sku cores (CPUs) per region per subscription | 100
Standard sku cores (CPUs) for V100 GPU per region per subscription | 0
For more information, see Resource availability & quota limits for ACI.
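As an illustration of how the hard limits in the first table apply, the sketch below checks a planned container group against them. The ContainerGroupPlan type and the function name are hypothetical and are not part of any Azure or integrate.ai API; only the three limit values come from the table above.

```python
from dataclasses import dataclass

# Hard limits copied from the first table above.
MAX_CONTAINERS_PER_GROUP = 60
MAX_VOLUMES_PER_GROUP = 20
MAX_PORTS_PER_IP = 5


@dataclass
class ContainerGroupPlan:
    """Hypothetical description of a planned container group."""
    containers: int
    volumes: int
    ports: int


def limit_violations(plan: ContainerGroupPlan) -> list[str]:
    """Return one message per hard limit that the plan exceeds."""
    checks = [
        ("containers", plan.containers, MAX_CONTAINERS_PER_GROUP),
        ("volumes", plan.volumes, MAX_VOLUMES_PER_GROUP),
        ("ports per IP", plan.ports, MAX_PORTS_PER_IP),
    ]
    return [
        f"{name}: {value} exceeds limit of {limit}"
        for name, value, limit in checks
        if value > limit
    ]
```

Because these limits cannot be raised through a quota request, a non-empty result means the plan itself has to change.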
Create Azure Resource Group¶
You must grant integrate.ai access to deploy task runner infrastructure in your cloud, by creating a dedicated resource group. You must provide the resource group name as part of the task runner creation process.
In order to provide all the necessary permissions, the user who creates the resource group and provisioner service principal must be an Azure AD Administrator.
Log in to your Azure portal and create a dedicated resource group for use with integrate.ai. For more information about resource groups, see the Microsoft documentation.
Create Custom Provisioner Role for Azure¶
This role lists all of the required permissions for integrate.ai to create the necessary infrastructure.
Use the provided JSON example to create a custom provisioner role in your resource group.
In the Azure portal, select the Resource Groups service and select the resource group created in Step 1.
Select the Access Control (IAM) section and click Add Custom Role from the drop down.
Select the JSON tab. Copy and paste the permissions code block.
# Azure Provisioner Role
{
"properties": {
"roleName": "iai-provisioner",
"description": "Permission policies to provision a task runner",
"assignableScopes": [
"/"
],
"permissions": [
{
"actions": [
"Microsoft.Resources/subscriptions/resourceGroups/read",
"Microsoft.Storage/storageAccounts/read",
"Microsoft.Storage/storageAccounts/write",
"Microsoft.Storage/storageAccounts/delete",
"Microsoft.Storage/storageAccounts/fileServices/read",
"Microsoft.Storage/storageAccounts/fileServices/shares/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/write",
"Microsoft.Storage/storageAccounts/blobServices/containers/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/delete",
"Microsoft.Storage/storageAccounts/blobServices/write",
"Microsoft.Storage/storageAccounts/blobServices/read",
"Microsoft.Storage/storageAccounts/listKeys/action",
"Microsoft.ClassicStorage/storageAccounts/listKeys/action",
"Microsoft.OperationalInsights/workspaces/read",
"Microsoft.OperationalInsights/workspaces/write",
"Microsoft.OperationalInsights/workspaces/delete",
"Microsoft.OperationalInsights/workspaces/sharedKeys/read",
"Microsoft.OperationalInsights/workspaces/sharedKeys/action",
"Microsoft.OperationalInsights/workspaces/tables/write",
"Microsoft.OperationalInsights/workspaces/tables/read",
"Microsoft.OperationalInsights/workspaces/tables/delete",
"Microsoft.OperationalInsights/workspaces/tables/query/read",
"Microsoft.Insights/dataCollectionRules/read",
"Microsoft.Insights/dataCollectionRules/write",
"Microsoft.Insights/DataCollectionRules/Delete",
"Microsoft.Insights/dataCollectionEndpoints/read",
"Microsoft.Insights/dataCollectionEndpoints/write",
"Microsoft.Insights/dataCollectionEndpoints/delete"
],
"notActions": [
],
"dataActions": [
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/read",
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/write",
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/delete",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/move/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action",
"Microsoft.Storage/storageAccounts/fileServices/readFileBackupSemantics/action",
"Microsoft.OperationalInsights/workspaces/tables/data/read"
],
"notDataActions": []
}
]
}
}
Update the code block to use the correct resource group in assignableScopes:
"assignableScopes": ["/subscriptions/<SUBSCRIPTION ID>/resourceGroups/<RESOURCE GROUP NAME>"]
Click Review + Create.
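The assignableScopes update described in the steps above can be sketched in code as follows. This is an illustrative helper, not part of the Azure tooling: the function names are hypothetical, and the role definition is assumed to be loaded as a Python dictionary with the same "properties" structure as the JSON shown above.

```python
import copy


def resource_group_scope(subscription_id: str, resource_group: str) -> str:
    """Build the scope string for a single resource group."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


def scope_role_definition(role_definition: dict, subscription_id: str, resource_group: str) -> dict:
    """Return a copy of the role definition limited to one resource group."""
    scoped = copy.deepcopy(role_definition)  # leave the caller's dictionary untouched
    scoped["properties"]["assignableScopes"] = [
        resource_group_scope(subscription_id, resource_group)
    ]
    return scoped
```

Replacing the default "/" scope this way keeps the custom role assignable only within the dedicated resource group.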
Create Provisioner Service Principal¶
If it is not already available, install the Azure CLI. For more information, see the Azure CLI documentation.
At the command prompt, type:
az ad sp create-for-rbac --name taskrunner-provisioner-sp --role "iai-provisioner-role" --scopes="/subscriptions/<your subscription ID>/resourcegroups/<your resource group>"
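Make sure that you specify the correct subscription ID and resource group name, and that the value passed to --role matches the roleName of the custom provisioner role you created.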
Copy the output and save it. This information is required to connect a new task runner.
Note: The output includes credentials that you must protect. Be sure that you do not include these credentials in your code or check the credentials into your source control. For more information, see Microsoft documentation.
Assign the Reader role to the provisioner service principal. The user who creates the resource group and provisioner service principal must be an Azure AD Administrator.
# Example CLI output:
Creating custom role assignment under scope '/subscriptions/<your subscription ID>/resourcegroups/<your resource group>'
The output includes credentials that you must protect. Be sure that you do not include these credentials in your code or check the credentials into your source control. For more information, see https://aka.ms/azadsp-cli
{
"appId": "<client ID>",
"displayName": "azure-cli-2023-04-13-14-57-09",
"password": "<secret>",
"tenant": "<tenant ID>"
}
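The CLI output mixes informational text with a JSON object, so the values needed to connect a task runner have to be picked out of it. The sketch below shows one way to do that. The function name and the output key names are hypothetical, and the mapping of appId to the service principal ID and password to the service principal secret is an assumption based on the placeholders in the example output above.

```python
import json


def parse_sp_output(cli_output: str) -> dict:
    """Extract the JSON object from the CLI output and rename its fields."""
    start = cli_output.index("{")       # skip the informational lines before the JSON
    end = cli_output.rindex("}") + 1
    raw = json.loads(cli_output[start:end])
    # Key names on the left are illustrative; the mapping itself is an assumption.
    return {
        "service_principal_id": raw["appId"],
        "service_principal_secret": raw["password"],
        "tenant_id": raw["tenant"],
    }
```

As the note above says, treat the returned secret as a credential: avoid printing it, logging it, or committing it to source control.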
The provisioner creates the following components and performs the required related tasks:
Configures infrastructure
Pulls the required client and server container images from an integrate.ai ECR (Elastic Container Registry) repository.
Create Custom Runtime Role for Azure¶
You must grant integrate.ai access to run the task runner in your cloud, by creating a limited permission role for the runtime.
To create the runtime role:
In the Azure portal, select the Resource Groups service and select the resource group created in Step 1.
Select the Access Control (IAM) section and click Add Custom Role from the drop down.
Select the JSON tab. Copy and paste the permissions code block.
# Azure Runtime Role
{
"properties": {
"roleName": "iai-runtime",
"description": "Permission policies to run a task under the task runner",
"assignableScopes": [
"/"
],
"permissions": [
{
"actions": [
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.ContainerInstance/register/action",
"Microsoft.ContainerInstance/containerGroups/read",
"Microsoft.ContainerInstance/containerGroups/write",
"Microsoft.ContainerInstance/containerGroups/delete",
"Microsoft.ContainerInstance/containerGroups/containers/exec/action",
"Microsoft.ContainerInstance/containerGroups/containers/attach/action",
"Microsoft.ContainerInstance/containerGroups/containers/buildlogs/read",
"Microsoft.ContainerInstance/containerGroups/containers/logs/read",
"Microsoft.ContainerInstance/containerGroups/detectors/read",
"Microsoft.ContainerInstance/containerGroups/outboundNetworkDependenciesEndpoints/read",
"Microsoft.ContainerInstance/containerGroups/providers/Microsoft.Insights/diagnosticSettings/read",
"Microsoft.ContainerInstance/containerGroups/providers/Microsoft.Insights/diagnosticSettings/write",
"Microsoft.ContainerInstance/containerGroups/providers/Microsoft.Insights/metricDefinitions/read",
"Microsoft.ContainerInstance/containerGroups/operationResults/read",
"Microsoft.ContainerInstance/containerGroupProfiles/read",
"Microsoft.ContainerInstance/containerGroupProfiles/write",
"Microsoft.ContainerInstance/containerGroupProfiles/delete",
"Microsoft.ContainerInstance/containerGroupProfiles/revisions/read",
"Microsoft.ContainerInstance/containerGroupProfiles/revisions/deregister/action",
"Microsoft.ContainerInstance/containerScaleSets/read",
"Microsoft.ContainerInstance/containerScaleSets/write",
"Microsoft.ContainerInstance/containerScaleSets/delete",
"Microsoft.ContainerInstance/operations/read",
"Microsoft.ContainerInstance/serviceassociationlinks/delete",
"Microsoft.ContainerInstance/locations/validateDeleteVirtualNetworkOrSubnets/action",
"Microsoft.ContainerInstance/locations/deleteVirtualNetworkOrSubnets/action",
"Microsoft.ContainerInstance/locations/cachedImages/read",
"Microsoft.ContainerInstance/locations/capabilities/read",
"Microsoft.ContainerInstance/locations/operationResults/read",
"Microsoft.ContainerInstance/locations/operations/read",
"Microsoft.ContainerInstance/locations/usages/read",
"Microsoft.ContainerRegistry/locations/operationResults/read",
"Microsoft.ContainerRegistry/registries/read",
"Microsoft.ContainerRegistry/registries/write",
"microsoft.OperationalInsights/locations/operationStatuses/read",
"Microsoft.OperationalInsights/workspaces/tables/write",
"Microsoft.OperationalInsights/workspaces/tables/read",
"Microsoft.OperationalInsights/workspaces/tables/delete",
"Microsoft.OperationalInsights/workspaces/write",
"Microsoft.OperationalInsights/workspaces/read",
"Microsoft.OperationalInsights/workspaces/delete",
"Microsoft.OperationalInsights/workspaces/sharedkeys/action",
"Microsoft.OperationalInsights/workspaces/listKeys/action",
"Microsoft.OperationalInsights/workspaces/regenerateSharedKey/action",
"Microsoft.OperationalInsights/workspaces/search/action",
"Microsoft.OperationalInsights/workspaces/purge/action",
"Microsoft.OperationalInsights/workspaces/analytics/query/action",
"Microsoft.OperationalInsights/workspaces/analytics/query/schema/read",
"Microsoft.OperationalInsights/workspaces/api/query/action",
"Microsoft.OperationalInsights/workspaces/api/query/schema/read",
"Microsoft.OperationalInsights/workspaces/availableservicetiers/read",
"Microsoft.OperationalInsights/workspaces/features/clientGroups/members/read",
"Microsoft.OperationalInsights/workspaces/configurationscopes/read",
"Microsoft.OperationalInsights/workspaces/configurationscopes/write",
"Microsoft.OperationalInsights/workspaces/configurationscopes/delete",
"Microsoft.OperationalInsights/workspaces/features/generateMap/read",
"Microsoft.OperationalInsights/workspaces/intelligencepacks/read",
"Microsoft.OperationalInsights/workspaces/intelligencepacks/enable/action",
"Microsoft.OperationalInsights/workspaces/intelligencepacks/disable/action",
"Microsoft.OperationalInsights/workspaces/linkedservices/read",
"Microsoft.OperationalInsights/workspaces/linkedservices/write",
"Microsoft.OperationalInsights/workspaces/linkedservices/delete",
"Microsoft.OperationalInsights/workspaces/features/machineGroups/read",
"microsoft.operationalinsights/workspaces/features/machineGroups/read",
"Microsoft.OperationalInsights/workspaces/managementgroups/read",
"Microsoft.OperationalInsights/workspaces/metricDefinitions/read",
"Microsoft.OperationalInsights/workspaces/notificationsettings/read",
"Microsoft.OperationalInsights/workspaces/notificationsettings/write",
"Microsoft.OperationalInsights/workspaces/notificationsettings/delete",
"Microsoft.OperationalInsights/workspaces/restoreLogs/write",
"microsoft.operationalinsights/workspaces/restoreLogs/write",
"Microsoft.OperationalInsights/workspaces/rules/read",
"microsoft.operationalinsights/workspaces/rules/read",
"Microsoft.OperationalInsights/workspaces/savedSearches/read",
"Microsoft.OperationalInsights/workspaces/savedSearches/write",
"Microsoft.OperationalInsights/workspaces/savedSearches/delete",
"Microsoft.OperationalInsights/workspaces/scopedprivatelinkproxies/read",
"Microsoft.OperationalInsights/workspaces/scopedprivatelinkproxies/write",
"Microsoft.OperationalInsights/workspaces/scopedprivatelinkproxies/delete",
"Microsoft.OperationalInsights/workspaces/searchJobs/write",
"Microsoft.OperationalInsights/workspaces/schema/read",
"Microsoft.OperationalInsights/workspaces/features/serverGroups/members/read",
"Microsoft.OperationalInsights/workspaces/tables/query/read",
"Microsoft.OperationalInsights/workspaces/providers/Microsoft.Insights/logDefinitions/read",
"Microsoft.OperationalInsights/workspaces/usages/read",
"Microsoft.OperationalInsights/workspaces/views/read",
"Microsoft.OperationalInsights/workspaces/views/delete",
"Microsoft.OperationalInsights/workspaces/views/write",
"Microsoft.OperationalInsights/workspaces/listKeys/read",
"Microsoft.OperationalInsights/workspaces/operations/read",
"Microsoft.OperationalInsights/workspaces/upgradetranslationfailures/read",
"Microsoft.OperationalInsights/workspaces/search/read",
"Microsoft.OperationalInsights/workspaces/providers/Microsoft.Insights/diagnosticSettings/Read",
"Microsoft.OperationalInsights/workspaces/providers/Microsoft.Insights/diagnosticSettings/Write",
"Microsoft.OperationalInsights/workspaces/query/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/delete",
"Microsoft.Storage/storageAccounts/blobServices/containers/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/write",
"Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action"
],
"notActions": [],
"dataActions": [
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/deleteBlobVersion/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/permanentDelete/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/filter/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/manageOwnership/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/modifyPermissions/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/runAsSuperUser/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/immutableStorage/runAsSuperUser/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags/read",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags/write",
"Microsoft.Storage/storageAccounts/fileServices/readFileBackupSemantics/action",
"Microsoft.Storage/storageAccounts/fileServices/writeFileBackupSemantics/action",
"Microsoft.Storage/storageAccounts/fileServices/takeOwnership/action",
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/read",
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/write",
"Microsoft.Storage/storageAccounts/fileServices/fileshares/files/delete",
"Microsoft.Storage/storageAccounts/queueServices/queues/messages/read",
"Microsoft.Storage/storageAccounts/queueServices/queues/messages/write",
"Microsoft.Storage/storageAccounts/queueServices/queues/messages/delete",
"Microsoft.Storage/storageAccounts/tableServices/tables/entities/read",
"Microsoft.Storage/storageAccounts/tableServices/tables/entities/write",
"Microsoft.Storage/storageAccounts/tableServices/tables/entities/delete"
],
"notDataActions": []
}
]
}
}
Update the code block to use the correct resource group in assignableScopes:
"assignableScopes": ["/subscriptions/<SUBSCRIPTION ID>/resourceGroups/<RESOURCE GROUP NAME>"]
Once completed, click Review + Create.
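If you prepare the role definition as a file before pasting it into the portal, the assignableScopes substitution can be scripted. The following is a minimal sketch and not part of the integrate.ai tooling: the function names and the sample IDs are hypothetical, the role dictionary is abbreviated, and it assumes the definition keeps its fields either under a top-level "properties" key or at the top level.

```python
import copy
import json


def resource_group_scope(subscription_id: str, resource_group: str) -> str:
    """Build a scope string in the format shown above."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


def set_assignable_scopes(role: dict, scopes: list[str]) -> dict:
    """Return a copy of the role definition with assignableScopes replaced.

    Assumption: fields live under "properties" if that key exists,
    otherwise at the top level of the definition.
    """
    updated = copy.deepcopy(role)
    target = updated.get("properties", updated)
    target["assignableScopes"] = list(scopes)
    return updated


# Placeholder values for illustration only.
role = {"properties": {"roleName": "iai-taskrunner-runtime", "assignableScopes": []}}
scope = resource_group_scope("00000000-0000-0000-0000-000000000000", "my-resource-group")
patched = set_assignable_scopes(role, [scope])
print(json.dumps(patched, indent=2))
```

The original dictionary is left unchanged, so the same template can be reused for several resource groups.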
Create Runtime Service Principal¶
In the Azure portal, select the App Registration service and click New Registration.
Create an app with a name following your policies, e.g. iai-taskrunner-runtime-app.
Once the app is registered, generate a secret for it under the Certificates & Secrets tab.
You will need the application id and secret when creating a taskrunner.
Role Assignments for Runtime Service Principal¶
In the Azure portal, select the Resource Groups service and click on the Resource Group created in Step 1.
Select the Access Control (IAM) section and click Add Role Assignment from the drop down.
Assign the following Job function Roles to the Runtime Service Principal iai-taskrunner-runtime-app:
Reader
Storage Blob Data Contributor
Log Analytics Reader
Monitoring Metrics Publisher
iai-taskrunner-runtime (Custom Role created above)
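The same assignments can also be made from the command line instead of the portal. As a rough sketch, the helper below only prints one `az role assignment create` command per role listed above, using the same --assignee, --role and --scope options as the AcrPull example later on this page. The function name and the placeholder values are hypothetical; review the printed commands before running them.

```python
import shlex

# Roles listed above; the last one is the custom role created earlier.
RUNTIME_ROLES = [
    "Reader",
    "Storage Blob Data Contributor",
    "Log Analytics Reader",
    "Monitoring Metrics Publisher",
    "iai-taskrunner-runtime",
]


def role_assignment_commands(assignee: str, scope: str) -> list[str]:
    """Return one shell-quoted `az role assignment create` command per role."""
    commands = []
    for role in RUNTIME_ROLES:
        args = [
            "az", "role", "assignment", "create",
            "--assignee", assignee,
            "--role", role,
            "--scope", scope,
        ]
        commands.append(shlex.join(args))
    return commands


# Placeholders: substitute the application ID and the resource group scope.
for cmd in role_assignment_commands(
    "<application_id>",
    "/subscriptions/<subscription_id>/resourceGroups/<resource_group>",
):
    print(cmd)
```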
Bind to an existing Azure Virtual Network (Optional)¶
Optional configuration: You can bind to an existing virtual network by creating a network and providing the subnet_id in the Advanced settings section of the Azure task runner registration.
Create a Virtual Network in your resource group.
In the network properties, click “Subnets” and add a subnet. The subnet delegation should be Microsoft.ContainerInstance/containerGroups.
Record the subnet ID and provide it to the user who will create the task runner.
Note: You can bind to a network in a resource group different from the one used to create a taskrunner.
To do so:
The network resource group must be listed in the assignableScopes section of the runtime role.
The Runtime service principal must be assigned the runtime role on the virtual network in the resource group.
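The first requirement can be checked before registration by comparing the subnet ID with the role's assignableScopes. The sketch below assumes the subnet ID follows the standard Azure resource ID layout for a virtual network subnet; the function names and sample IDs are hypothetical. It checks only that the resource group is listed, as described above, and does not verify the role assignment itself.

```python
import re

# Standard layout of a subnet resource ID (matched case-insensitively).
SUBNET_ID = re.compile(
    r"^/subscriptions/(?P<sub>[^/]+)/resourceGroups/(?P<rg>[^/]+)"
    r"/providers/Microsoft\.Network/virtualNetworks/(?P<vnet>[^/]+)"
    r"/subnets/(?P<subnet>[^/]+)$",
    re.IGNORECASE,
)


def network_scope(subnet_id: str) -> str:
    """Return the resource group scope that contains the given subnet."""
    match = SUBNET_ID.match(subnet_id)
    if match is None:
        raise ValueError(f"not a subnet resource ID: {subnet_id!r}")
    return f"/subscriptions/{match['sub']}/resourceGroups/{match['rg']}"


def scope_is_assignable(subnet_id: str, assignable_scopes: list[str]) -> bool:
    """True if the subnet's resource group is listed in assignableScopes."""
    wanted = network_scope(subnet_id).lower()
    return any(wanted == scope.rstrip("/").lower() for scope in assignable_scopes)


# Placeholder IDs for illustration only.
subnet = (
    "/subscriptions/sub1/resourceGroups/net-rg/providers/"
    "Microsoft.Network/virtualNetworks/vnet1/subnets/runners"
)
print(scope_is_assignable(subnet, ["/subscriptions/sub1/resourceGroups/net-rg"]))
```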
Creating a taskrunner with a custom Azure container registry (ACR) (Optional)¶
Optional configuration: You can host the task runner client image in an ACR provisioned in your environment. In order to do so, the Provisioner role needs additional permissions for the taskrunner to access an image in your custom ACR. These permissions are:
"permissions": [
{
"actions": [
"Microsoft.ContainerRegistry/registries/read",
"Microsoft.ContainerRegistry/registries/listCredentials/action",
"Microsoft.ContainerRegistry/registries/scopeMaps/read",
"Microsoft.ContainerRegistry/registries/scopeMaps/write",
"Microsoft.ContainerRegistry/registries/scopeMaps/delete",
"Microsoft.ContainerRegistry/registries/tokens/read",
"Microsoft.ContainerRegistry/registries/tokens/write",
"Microsoft.ContainerRegistry/registries/tokens/delete",
"Microsoft.ContainerRegistry/registries/tokens/operationStatuses/read",
"Microsoft.ContainerRegistry/registries/generateCredentials/action"
],
"notActions": [
],
"dataActions": [
],
"notDataActions": []
}
]
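If the Provisioner role definition is kept as a JSON file, these actions can be merged into it without duplicating entries. This is an illustrative sketch only: the function name is hypothetical, and it assumes the definition stores its permissions list either under a top-level "properties" key or at the top level.

```python
import copy

# Additional actions listed above for a custom ACR.
ACR_ACTIONS = [
    "Microsoft.ContainerRegistry/registries/read",
    "Microsoft.ContainerRegistry/registries/listCredentials/action",
    "Microsoft.ContainerRegistry/registries/scopeMaps/read",
    "Microsoft.ContainerRegistry/registries/scopeMaps/write",
    "Microsoft.ContainerRegistry/registries/scopeMaps/delete",
    "Microsoft.ContainerRegistry/registries/tokens/read",
    "Microsoft.ContainerRegistry/registries/tokens/write",
    "Microsoft.ContainerRegistry/registries/tokens/delete",
    "Microsoft.ContainerRegistry/registries/tokens/operationStatuses/read",
    "Microsoft.ContainerRegistry/registries/generateCredentials/action",
]


def add_acr_actions(role: dict) -> dict:
    """Return a copy of the role with the ACR actions appended once each."""
    updated = copy.deepcopy(role)
    target = updated.get("properties", updated)
    permissions = target.setdefault("permissions", [])
    if not permissions:
        permissions.append({})
    actions = permissions[0].setdefault("actions", [])
    for action in ACR_ACTIONS:
        if action not in actions:
            actions.append(action)
    return updated


# Abbreviated, hypothetical starting definition.
provisioner = {"properties": {"permissions": [{"actions": ["Microsoft.ContainerRegistry/registries/read"]}]}}
merged = add_acr_actions(provisioner)
print(len(merged["properties"]["permissions"][0]["actions"]))
```

Running the function twice on the same definition yields the same result, so it is safe to reapply after other edits to the file.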
Additional access when custom ACR is in a Private Endpoint (Optional)¶
Create a user-assigned managed identity in the resource group where the custom ACR has been created.
Assign the AcrPull role to the managed identity, with the scope limited to the ACR:
az role assignment create \
--assignee <managed_identity_id> \
--role AcrPull \
--scope /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.ContainerRegistry/registries/<registry_name>
The task runner Runtime Service Principal must be assigned the built-in role Managed Identity Operator.
Supply the resource_id of the Managed Identity in the Advanced settings section of the Azure task runner registration.
This completes the Azure resource configuration for task runners.
Back to Quickstart or Continue to Workspace Administrator Workflow
Google Cloud Platform (GCP) Support¶
A GCP task runner consists of two parts:
A task runner registered as On-premise GCP Support through the Register task runner dialog in the integrate.ai workspace.
An on-premise service that is installed in a compatible environment, such as a virtual machine (VM) or bare metal machine.
You install the task runner agent through the integrate.ai command line tool (IAI CLI). In addition to the agent, a dedicated cluster is created in the integrate.ai infrastructure to maintain the status of the GCP agents, tasks, and logs.
Pre-requisites for task runner:
A GCP environment
A VM installed in the GCP environment
A storage bucket in the GCP environment
Register a GCP task runner¶
Task runners simplify the process of running training sessions on your data.
Step 1: To register a GCP task runner:
Log in to your integrate.ai workspace.
In the left navigation bar, click Settings.
Click Token Management.
Click Create new token and save the token for later use.
In the navigation bar, click Task Runners.
Click Register to start registering a new task runner.
Select On-Premise GCP Support.
Provide the following information:
Task runner name - must be unique.
Storage path - enter the default storage path location, for example: gcs://mystorage/data/. Note that this must be a path location and not simply a folder name.
GCP service account key - click in the box and paste in your GCP service account key information.
Note: To generate a GCP service account key, in your Google Cloud account, navigate to Service Accounts, select Actions > Manage Keys, then Add key. Create a new key.
Click Register. Wait for the installation to complete.
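The storage path and service account key can be sanity-checked locally before they are pasted into the dialog. The sketch below is not part of the integrate.ai tooling and its function names are hypothetical. It checks only that the path has a scheme and a bucket name, and that the key is a JSON object containing a few fields normally present in a downloaded Google service account key file; the workspace may apply further validation.

```python
import json
from urllib.parse import urlparse

# A subset of the fields found in a Google service account key file.
REQUIRED_KEY_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_storage_path(path: str) -> list[str]:
    """Return a list of problems found in the storage path (empty if none)."""
    problems = []
    parsed = urlparse(path)
    if not parsed.scheme:
        problems.append("missing scheme, expected a path such as gcs://mystorage/data/")
    if not parsed.netloc:
        problems.append("missing bucket name")
    return problems


def check_service_account_key(text: str) -> list[str]:
    """Return a list of problems found in the pasted key (empty if none)."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["expected a JSON object"]
    missing = sorted(REQUIRED_KEY_FIELDS - set(key))
    problems = [f"missing field: {name}" for name in missing]
    if key.get("type") not in (None, "service_account"):
        problems.append('"type" should be "service_account"')
    return problems


print(check_storage_path("gcs://mystorage/data/"))
print(check_storage_path("data"))
```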
Step 2: Install the task runner agent:
Create a Python virtual env (venv) on the VM.
Agent installation must be done as root user. Run the command sudo su before installation.
python3 -m venv /home/{installation dir}
cd into the installation directory.
Install pip if necessary.
Install the IAI CLI tool:
pip install integrate-ai
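Before moving on to Step 3, the preconditions from this step can be checked with a short script run inside the virtual environment. This is a hedged sketch rather than an official check: the function names are hypothetical, and it only tests whether the process is running as root, whether a virtual environment is active, and whether an iai executable is on the PATH.

```python
import os
import shutil
import sys


def running_as_root() -> bool:
    """True if the effective user is root (Unix-like systems only)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def inside_venv() -> bool:
    """True if the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def cli_available(name: str = "iai") -> bool:
    """True if the named executable can be found on the PATH."""
    return shutil.which(name) is not None


def preflight() -> dict[str, bool]:
    """Collect the three checks into one report."""
    return {
        "root": running_as_root(),
        "venv": inside_venv(),
        "iai_cli": cli_available(),
    }


if __name__ == "__main__":
    for check, ok in preflight().items():
        print(f"{check}: {'ok' if ok else 'missing'}")
```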
Step 3: Register the VM instance as an agent for the GCP task runner created in Step 1:
At a command line on the VM, register the GCP node with the task runner using the following command:
iai onprem_node install
When prompted, provide the token you created in Step 1 (your IAI_TOKEN).
When prompted, provide the name of the task runner you created in Step 1.
Wait for registration to complete.
Deployment is now complete.
Back to Quickstart or Continue to Workspace Administrator Workflow