This doc summarizes our assumptions and requirements for using the DP Building Block Libraries in a safe way. We assume that an attacker does not have direct access to the raw user data, has limited abilities to inject data into the dataset, and has limited visibility into the resources consumed by the DP Libraries.
The DP Building Block Libraries provide differentially private output. In layperson's terms, this means that an attacker can gain only very limited additional knowledge from the output of the DP library. An upper bound on the amount of knowledge obtainable from the output is configurable via the DP parameters epsilon and delta.
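For reference, the guarantee behind these parameters is the standard (epsilon, delta)-differential-privacy definition (stated here generically, not as library-specific behavior): a mechanism M satisfies (ε, δ)-DP if, for every pair of datasets D and D' that differ in the data of one privacy unit, and for every set of possible outputs S,

    Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S] + δ

Smaller values of epsilon and delta correspond to a tighter bound, i.e. stronger protection.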
There are two intended use-cases of the library:
We assume that the DP Library is executed on trusted compute nodes. This means that clients must trust the hardware and any process running on the same node. If an attacker can control any process on these nodes, then the attack surface extends far beyond the DP libraries, since the node has access to the raw user data (whether via the network or on a hard disk).
We assume that the DP Library is executed in batch mode. After every run, the output is eventually published to a wider audience and is therefore accessible to the attacker.
The attacker does not have access to the raw user data; otherwise, there would be nothing left to protect.
The attacker might have knowledge about a subset of the raw user data:
The attacker can forge a very large number of, or even all, contributions to the raw input data set. However, from the output of the DP Libraries, it should be impossible for the attacker to learn whether there were any other entries in the dataset (i.e. whether all contributions were forged).
Note: Depending on the application logic, there can be some mitigations against malicious data, e.g., applying rounding and/or enforcing a typical number of contributions per privacy unit.
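One such application-level mitigation is bounding the number of contributions per privacy unit before the data reaches the DP Library. The sketch below is illustrative only (the function name, record layout, and limit are assumptions, not part of the library's API):

```python
from collections import defaultdict

def bound_contributions(records, max_per_user):
    """Keep at most max_per_user records per privacy unit (user).

    records: iterable of (user_id, value) pairs.
    Records beyond the per-user limit are dropped, which caps the
    influence any single privacy unit (or forger) has on the input.
    """
    counts = defaultdict(int)
    bounded = []
    for user_id, value in records:
        if counts[user_id] < max_per_user:
            counts[user_id] += 1
            bounded.append((user_id, value))
    return bounded

records = [("u1", 5), ("u1", 7), ("u1", 9), ("u2", 3)]
print(bound_contributions(records, max_per_user=2))
# [('u1', 5), ('u1', 7), ('u2', 3)]
```

In practice the contribution bound must be chosen ahead of time and applied uniformly, since picking it based on the data would itself leak information.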
The attacker can control the sequence in which user data is passed to the DP Library, including the case where this data is malicious and attacker-controlled. The DP Library protects against attacks that target the non-associativity of floating point arithmetic and similar attacks that exploit the order of events.
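To see why the order of events matters at all, note that floating point addition is not associative: summing the same values in a different order can produce a different result, which an attacker controlling the input order could try to exploit. A minimal illustration (this shows the underlying arithmetic effect only, not the library's actual mitigation):

```python
# With IEEE 754 doubles, 0.5 is below the rounding granularity (ulp) of
# values near 1e16, so it is lost when added to -1e16 first.
left = (1e16 + -1e16) + 0.5   # cancellation happens first: result 0.5
right = 1e16 + (-1e16 + 0.5)  # 0.5 is absorbed by -1e16: result 0.0

print(left, right)  # 0.5 0.0 -- same operands, different order, different sum
```

Because the two sums differ, a mechanism whose noise calibration assumed order-independent sums could leak information through which order was used; this is the class of attack the document refers to.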
The execution is hidden from the attacker. In particular, we assume that the attacker does not have additional information about: