VFL Model Training Overview

In a vertical federated learning (VFL) process, two or more parties collaboratively train a model using datasets that share a set of overlapping records but hold different features. Each party has partial information about the overlapping subjects. Therefore, before running a VFL training session, you must run a private record linkage (PRL) session to find the intersection and align the datasets.
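To make the alignment step concrete, the following is a minimal sketch of what "finding the intersection and aligning the datasets" means. It is not the PRL protocol or the integrate.ai API: it uses a plain salted hash as a simplified stand-in, and the function names, record IDs, and salt are all hypothetical. A real PRL session relies on stronger cryptographic techniques so that parties learn nothing about non-overlapping records.

```python
import hashlib


def hash_id(record_id: str, salt: str) -> str:
    """Hash a record identifier with a shared salt (illustrative stand-in only)."""
    return hashlib.sha256((salt + record_id).encode("utf-8")).hexdigest()


def align(ids_a, ids_b, salt):
    """Return row indices into each dataset for the shared records, in matching order."""
    hashed_a = {hash_id(r, salt): i for i, r in enumerate(ids_a)}
    hashed_b = {hash_id(r, salt): i for i, r in enumerate(ids_b)}
    # Only records present in both datasets take part in VFL training.
    shared = sorted(hashed_a.keys() & hashed_b.keys())
    return [hashed_a[h] for h in shared], [hashed_b[h] for h in shared]


# Hypothetical identifiers held by two parties.
carrier_ids = ["P-001", "P-002", "P-003", "P-004"]
vendor_ids = ["P-003", "P-005", "P-001"]

rows_carrier, rows_vendor = align(carrier_ids, vendor_ids, salt="shared-secret")

# Row k of the carrier's aligned data now refers to the same subject
# as row k of the vendor's aligned data.
aligned_pairs = [(carrier_ids[i], vendor_ids[j]) for i, j in zip(rows_carrier, rows_vendor)]
```

After alignment, each party keeps only the overlapping rows, ordered identically, so that features from different parties can be combined per record during training.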

[Figure: HFL vs VFL overview]

In VFL, there are two types of parties participating in the training:

  • The Active Party owns the labels, and may or may not also contribute data.

  • The Passive Party contributes only data.
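The division of roles can be illustrated with one gradient-descent step on a vertically split linear regression. This is a sketch under stated assumptions, not the training procedure integrate.ai uses: the model type, the `train_step` helper, the data values, and the learning rate are hypothetical, and everything runs in one process. In a real deployment the residuals would be passed between parties through the server and would typically be protected.

```python
def partial_scores(weights, rows):
    """Each party scores its own rows using only its own features and weights."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def update(weights, rows, residuals, lr):
    """Each party updates its own weights from the shared residuals."""
    n = len(rows)
    grads = [
        sum(r * row[j] for r, row in zip(residuals, rows)) / n
        for j in range(len(weights))
    ]
    return [w - lr * g for w, g in zip(weights, grads)]


def mse(residuals):
    return sum(r * r for r in residuals) / len(residuals)


def train_step(labels, active_rows, active_w, passive_rows, passive_w, lr):
    """One step: only the active party ever uses the labels."""
    scores_a = partial_scores(active_w, active_rows)
    scores_p = partial_scores(passive_w, passive_rows)
    # Active party combines partial scores and compares them with its labels.
    residuals = [a + p - y for a, p, y in zip(scores_a, scores_p, labels)]
    new_active_w = update(active_w, active_rows, residuals, lr)
    new_passive_w = update(passive_w, passive_rows, residuals, lr)
    return new_active_w, new_passive_w, mse(residuals)


# Hypothetical aligned data: the active party holds labels plus one feature,
# the passive party holds two features and no labels.
labels = [3.0, 5.0]
active_rows = [[1.0], [2.0]]
passive_rows = [[1.0, 0.0], [0.0, 1.0]]

active_w, passive_w, loss_before = train_step(
    labels, active_rows, [0.0], passive_rows, [0.0, 0.0], lr=0.1
)
_, _, loss_after = train_step(
    labels, active_rows, active_w, passive_rows, passive_w, lr=0.1
)
```

Note that the passive party's weights are updated without the passive party ever seeing the labels; it only receives the residuals computed by the active party.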

[Figure: VFL overview]

Key features of VFL:

  • Data never leaves client nodes.

  • Each client's data has different features but overlapping records. One client holds the labels.

  • Computation happens client-side. The server handles data linkage and passes information between parties.

  • After training, each client holds a partial model. The final model prediction relies on all parties.
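The last point can be sketched as follows, assuming a simple linear model in which each party's partial model produces a partial score and the scores are summed. The `Party` class, the weights, the bias, and the `flood_zone_risk` feature are hypothetical illustrations, not the integrate.ai API; a production system may combine and protect intermediate values differently.

```python
from typing import Dict, List


class Party:
    """Holds a partial model: weights for this party's own features only."""

    def __init__(self, weights: Dict[str, float]):
        self.weights = weights  # never shared with other parties

    def partial_score(self, row: Dict[str, float]) -> float:
        # Computed locally; only this single number leaves the party.
        return sum(w * row[name] for name, w in self.weights.items())


def predict(partial_scores: List[float], bias: float) -> float:
    """Combine partial scores from all parties into the final prediction."""
    return bias + sum(partial_scores)


# Hypothetical partial models after training.
active = Party({"flood_zone_risk": 2.0})
passive = Party({"roof_condition": -0.5, "inspection_score": 0.25})

# The same property, as seen by each party (aligned by record linkage).
active_row = {"flood_zone_risk": 3.0}
passive_row = {"roof_condition": 4.0, "inspection_score": 8.0}

scores = [active.partial_score(active_row), passive.partial_score(passive_row)]
prediction = predict(scores, bias=1.0)
```

Neither partial model is useful for prediction on its own: dropping either party's score changes the result, which is why inference requires all parties to participate.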

VFL example use case

An example of a VFL use case is Enhancing an Insurance Flood Severity Model.

An insurance carrier wants to predict the severity of flood incidents at given property locations. To improve its model, the carrier wants to integrate proprietary features from a data vendor, such as Replacement Cost, Roof Condition, and Inspection Score.

Available VFL models

integrate.ai supports the following model types:

Horizontal Federated Learning (HFL) is also supported.