Fast and provable unlearning for split models via orthogonal gradient decomposition

Presentation Type

Abstract

Faculty Advisor

Chao Huang

Access Type

Event

Start Date

25-4-2025 10:30 AM

End Date

25-4-2025 11:29 AM

Description

Unlearning is the process of removing the influence of specific data points from a trained machine learning model. As data privacy regulations become more stringent, it is increasingly important to ensure that personal data can be effectively removed from existing models. Retraining from scratch is the simplest approach, but it is computationally expensive and impractical for real-world systems. Split learning introduces additional challenges for unlearning because the model is partitioned between a client and a server, with updates distributed across both. This work investigates unlearning in split learning environments and presents a state-of-the-art algorithm that leverages gradient orthogonalization and batch-level processing. By caching the update that each batch contributes to both the client-side and server-side layers, we eliminate the effect of individual data points by subtracting the orthogonal component of their cached gradients. In our experiments, this unlearning approach was significantly faster than retraining while removing nearly all traces of the forgotten data. We also provide a mathematical proof of convergence for the proposed method.
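
The sketch below illustrates the batch-level orthogonal subtraction described above, assuming cached gradients are flattened NumPy vectors and that training used plain SGD-style updates (params -= lr * grad). The function names (orthogonal_component, unlearn_batch), the modified Gram-Schmidt projection, and the update rule are illustrative assumptions rather than the exact implementation evaluated in this work; in a split model, the same routine would be applied once to the cached client-side updates and once to the cached server-side updates.

import numpy as np

def orthogonal_component(g_forget, retained_grads, eps=1e-12):
    # Build an orthonormal basis for the span of the retained batches'
    # cached gradients (modified Gram-Schmidt), then remove the
    # projection of the forget-batch gradient onto that span.
    basis = []
    for g in retained_grads:
        v = g.astype(float).copy()
        for b in basis:
            v -= np.dot(v, b) * b
        n = np.linalg.norm(v)
        if n > eps:
            basis.append(v / n)
    residual = g_forget.astype(float).copy()
    for b in basis:
        residual -= np.dot(residual, b) * b
    return residual  # part of g_forget orthogonal to the retained span

def unlearn_batch(params, cached_forget_grad, retained_grads, lr):
    # Reverse the forgotten batch's contribution: training applied
    # params -= lr * grad, so adding back the component of the cached
    # gradient that is orthogonal to the retained data removes the part
    # of the update not supported by the remaining batches.
    ortho = orthogonal_component(cached_forget_grad, retained_grads)
    return params + lr * ortho

In this hypothetical sketch, unlearn_batch would be called separately with the client-side and server-side parameter partitions and their corresponding cached gradients.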

Comments

Poster presentation at the 2025 Student Research Symposium.

