Quickstart

The following guide walks you through getting started with integrate.ai in 4 easy steps:

  1. Set up your Workspace

  2. Set up your Cloud Environment

  3. Register datasets

  4. Connect your data

1. Set up your Workspace

A workspace represents a private customer-controlled environment within integrate.ai. Depending on their roles, users in a workspace can register data products and run evaluation jobs on that data.

Note: Keep track of your workspace-name: it is the text located after the “login-” component of your workspace URL (e.g., login-<workspace-name>.integrateai.net). Reach out to your Customer Success Manager if you need to confirm your workspace-name.
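If you handle workspace URLs in scripts, the workspace-name can be pulled out of the hostname as the note above describes. The following is a minimal sketch, not part of any integrate.ai tooling: the helper name workspace_name_from_url is hypothetical, and it assumes the hostname follows the login-<workspace-name>.integrateai.net pattern exactly.

```python
from urllib.parse import urlparse

# Hypothetical helper for illustration; not part of any integrate.ai SDK.
PREFIX = "login-"
SUFFIX = ".integrateai.net"


def workspace_name_from_url(url: str) -> str:
    """Return the text between 'login-' and '.integrateai.net' in the hostname."""
    # Accept both full URLs and bare hostnames.
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = parsed.hostname or ""
    if not (host.startswith(PREFIX) and host.endswith(SUFFIX)):
        raise ValueError(f"not a workspace login URL: {url!r}")
    name = host[len(PREFIX):-len(SUFFIX)]
    if not name:
        raise ValueError(f"workspace-name is empty in: {url!r}")
    return name


print(workspace_name_from_url("https://login-acme.integrateai.net"))  # prints: acme
```

If the URL does not match the expected pattern, the helper raises ValueError rather than guessing, which is the point at which to confirm the value with your Customer Success Manager.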

Activate Workspace Administrator

To set up your Administrator account, first provide your integrate.ai Customer Success Manager with the full name and work email address of your Admin user. As the Administrator, you can add and remove users, and you have access to all controls and functionality of the integrate.ai platform.

Once a Customer Success Manager has created an account for you, you will receive an invitation email from support@mail.integrateai.net to activate your account.

Follow the instructions provided in the email to complete activation.

Invite additional users

The Administrator decides which users can access the workspace and assigns each user a role.

The current user roles are: Administrator, Data Custodian, and Model Builder. These roles are defined below.

  • Data Custodian: only has the ability to register datasets to the workspace.

  • Model Builder: has the ability to register datasets, and run data science and evaluation jobs on these datasets in the workspace.

  • Administrator: has full access to all platform capabilities within the workspace, including the ability to add, remove, or modify users and their roles; add or remove datasets; and run data science and evaluation jobs on the registered datasets.
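The role definitions above can be summarized as a permission matrix. The sketch below is illustrative only: the capability labels (register_datasets, remove_datasets, run_jobs, manage_users) are names invented here to mirror the descriptions and are not identifiers used by the platform.

```python
# Illustrative only: capability labels are assumptions derived from the role
# descriptions above, not identifiers from the integrate.ai platform.
ROLE_CAPABILITIES = {
    "Data Custodian": {"register_datasets"},
    "Model Builder": {"register_datasets", "run_jobs"},
    "Administrator": {
        "register_datasets",
        "remove_datasets",
        "run_jobs",
        "manage_users",
    },
}


def can(role: str, capability: str) -> bool:
    """Return True if the role description grants the given capability."""
    return capability in ROLE_CAPABILITIES.get(role, set())


# Print one line per role, listing its capabilities alphabetically.
for role, capabilities in ROLE_CAPABILITIES.items():
    print(f"{role}: {', '.join(sorted(capabilities))}")
```

A lookup such as can("Data Custodian", "run_jobs") returns False, which matches the description of the Data Custodian as a registration-only role.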

Follow these steps to invite additional users to your workspace:

  1. In the side navigation bar, click Settings -> Members -> Invite Members.

  2. Enter the details of the user and select the appropriate role.

  3. Click Invite member at the bottom of the popup. An email is sent to the member to activate their account.

2. Set up your Cloud Environment

It’s now time to set up your cloud environment and register a task runner. Once a task runner has been registered, you can easily share your data product without having to move it out of your existing cloud environment.

Follow the steps listed in the IT Administrator Workflow section to set up your appropriate cloud environment.

Register a task runner

You can now register a task runner in your workspace so that you can register a dataset with it.

To do so, log in to the workspace with your account and complete the following steps:

  • In the side navigation bar, click Settings -> Task Runners -> Register.

  • In the pop-up, select the appropriate cloud that matches your infrastructure:

3. Register datasets

Registering your data allows you and other partners to collaborate on data science tasks.

Prepare your data product(s)

Before registering your data, you must prepare your data product to enable evaluation jobs.

Your data product(s) must be engineered to meet the criteria laid out in Data Requirements.

Note: If your data is already engineered out of the box, skip to Register your data product(s) below. If you need to do additional processing, keep the original columns and append each feature-engineered column using this naming convention: <original_column_name>_processed. Keeping both versions is the most useful layout for the Data Consumer.
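The naming convention in the note above can be sketched as follows. This is a minimal illustration using plain Python rows: the helper add_processed_column, the "income" column, and the rounding transform are all made-up examples, not part of Data Requirements.

```python
# Illustrative sketch; the helper, the "income" column, and the rounding
# transform are hypothetical examples, not part of Data Requirements.
def add_processed_column(rows, column, transform):
    """Return new rows that keep `column` and add `<column>_processed`."""
    processed_name = f"{column}_processed"
    # Build new dicts so the original rows are left unmodified.
    return [{**row, processed_name: transform(row[column])} for row in rows]


rows = [{"id": 1, "income": 52340.7}, {"id": 2, "income": 48110.2}]

# Round to the nearest thousand while keeping the original value alongside.
result = add_processed_column(rows, "income", lambda v: round(v, -3))
print(result[0])
```

Each output row carries both income and income_processed, so the Data Consumer can compare the engineered feature against its source.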

Once the above criteria are met, your data is ready for registration.

Register your data product(s)

Next, register your data through your workspace by selecting a task runner, specifying the dataset URI, and uploading associated metadata.

After registration, your dataset is available in the Library page.

Data Provider Quickstart

Once your data product(s) are registered, your final step is to complete evaluation template notebooks and connect a data partner, so that they can conduct an evaluation of your data.

Complete evaluation template

Data Consumers perform data evaluation with integrate.ai to understand how valuable a Data Provider’s data product could be to their data science task(s) or business problem(s).

There are two core questions Data Consumers typically seek to answer when evaluating a model or data product:

  1. How much data from a data product is usable in reference to my internal data?

  2. How relevant or useful is the data product to my data science task, use case, or project?

integrate.ai’s evaluation templates enable you to provide a simple, guided method for your customers (i.e., Data Consumers) to answer these evaluation questions, by performing data science jobs on your data product(s).

Creating evaluation templates

You can collaborate with your Customer Success Manager to build out evaluation templates on your data products.

Follow these steps to start building out your templates:

  1. Provide the following inputs to your Customer Success Manager:

    • A definition of the data products (number of datasets, dictionaries, and supporting material)

    • List of target variables that the data products can be used for

    • Data product target variable mapping (see example table below)

Data product target variable mapping example:

|          | Feature 1 | Feature 2 | Feature N |
| -------- | --------- | --------- | --------- |
| Target 1 | X         | X         |           |
| Target 2 | X         | X         |           |
| Target N |           |           | X         |

  2. integrate.ai will provide you with pre-built evaluation templates in the form of Jupyter notebooks, based on the information above.

  3. Configure each template for your data product by following the instructions in the template.

  4. Upload these notebooks as part of your data product registration step to make them available to Data Consumers.
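Before sending the target variable mapping to your Customer Success Manager, it can help to hold it in a structured form. The sketch below is one possible representation, reusing the placeholder names from the example table; the function mapping_table is hypothetical, and nothing here is a format required by integrate.ai.

```python
# Placeholder names come from the example table; the structure itself is an
# assumption, not a format required by integrate.ai.
TARGET_FEATURES = {
    "Target 1": ["Feature 1", "Feature 2"],
    "Target 2": ["Feature 1", "Feature 2"],
    "Target N": ["Feature N"],
}


def mapping_table(target_features):
    """Render the mapping as pipe-delimited rows, one row per target."""
    # Collect features in first-seen order to form the column headers.
    features = []
    for used in target_features.values():
        for name in used:
            if name not in features:
                features.append(name)
    lines = ["| | " + " | ".join(features) + " |"]
    lines.append("| --- |" + " --- |" * len(features))
    for target, used in target_features.items():
        # Mark a cell with X when the feature applies to this target.
        marks = ["X" if name in used else "" for name in features]
        lines.append(f"| {target} | " + " | ".join(marks) + " |")
    return "\n".join(lines)


print(mapping_table(TARGET_FEATURES))
```

Keeping the mapping as target-to-features makes it easy to add a target or feature later and regenerate the table without editing cells by hand.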


4. Connect your data

Connect a partner to your data

As a data provider, you register your datasets in your workspace to make them available to your own company. You can also connect a partner or data consumer to your data to allow them to evaluate it in secure, private, federated learning sessions.

To connect a partner:

  1. In your Library, click a dataset name to select it.

  2. On the upper right of the dataset description, click Connect Partner.

  3. On the Connect data pop-up, fill in the following information:

    a. Email - address of the partner you want to connect with.

    b. Workspace name - the name of your partner’s workspace. Contact your Customer Success Manager to confirm this value.

    c. Message - the connecting message is auto-generated. You can customize it or add information.

  4. Click Connect to send the email.

The data consumer clicks a link in the email they receive to view the connected dataset in their Library.

Viewing data connections

A dataset with a connection has an icon on its description in the Library.

Click the dataset to see a list of connections in the dataset description (bottom of the right panel).

Note: Connected data consumers cannot see any other information about the connection. They cannot see other connections to the same dataset.

Disconnecting data

As the data provider, you retain full control over who is connected to your data.

To disconnect a data consumer:

  1. Click the dataset in the Library.

  2. Locate the connection at the bottom of the dataset description.

  3. Click the trash can icon to disconnect the consumer from the data.

  4. Confirm that you want to disconnect.