On Different Formulations of a Continuous CTA Model

Goran Lesaja, Ionut Iacob, Anna Oganian

Research output: Contribution to book or proceeding › Conference article › peer-review

Abstract

In this paper, we consider a Controlled Tabular Adjustment (CTA) model for statistical disclosure limitation of tabular data. The goal of the CTA model is to find the safe (masked) table closest to the original table, which contains sensitive information. Closeness is usually measured using the ℓ1 or ℓ2 norm. However, the norm-based CTA model offers no control over how well the statistical properties of the data in the original table are preserved in the masked table. Hence, we propose a different criterion of “closeness” between the masked and original tables, one that attempts to minimally change certain statistics used in the analysis of the table. The Chi-square statistic is among the most widely used measures for the analysis of data in two-dimensional tables. We therefore propose a Chi-square CTA model, which minimizes an objective function that depends on the difference between the Chi-square statistics of the original and masked tables. The model is non-linear and non-convex and therefore harder to solve, which prompted us to also consider a modification of this model that can be transformed into a linear programming model and solved more efficiently. We present numerical results for a two-dimensional table that illustrate our novel approach and provide a comparison with norm-based CTA models.
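As a rough illustration of the quantity the abstract refers to, the sketch below computes the Pearson Chi-square statistic of independence for a two-dimensional table and compares it between an original and a masked table. It is not the authors' model: the function names `chi_square` and `chi_square_gap`, the use of an absolute difference, and the two small example tables are all assumptions made for illustration, and the code assumes every row and column total is positive.

```python
def chi_square(table):
    """Pearson Chi-square statistic of independence for a 2-D table.

    Assumes every row total and column total is positive, so that
    all expected cell values are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected cell value under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_gap(original, masked):
    """Absolute difference of the two statistics (an illustrative choice,
    not necessarily the objective used in the paper)."""
    return abs(chi_square(original) - chi_square(masked))


# Hypothetical tables: the masked table keeps the same row and column
# totals as the original but changes the interior cells.
original = [[10, 20], [30, 40]]
masked = [[11, 19], [29, 41]]

if __name__ == "__main__":
    print(chi_square(original), chi_square(masked))
    print(chi_square_gap(original, masked))
```

Two masked tables at the same ℓ1 distance from the original can differ in this gap, which is the kind of effect a norm-based criterion does not control.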

Original language: English
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings
Editors: Josep Domingo-Ferrer, Krishnamurty Muralidhar
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 166-179
Number of pages: 14
ISBN (Print): 9783030575205
State: Published - 2020
Event: International Conference on Privacy in Statistical Databases, PSD 2020 - Tarragona, Spain
Duration: Sep 23 2020 → Sep 25 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12276 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Privacy in Statistical Databases, PSD 2020
Country/Territory: Spain
City: Tarragona
Period: 09/23/20 → 09/25/20

Keywords

  • Chi-square statistic
  • Controlled tabular adjustment models
  • Interior-point methods
  • Linear and non-linear optimization
  • Statistical disclosure limitation
