PipelineDP4j

PipelineDP4j is an end-to-end differential privacy solution for the JVM that supports various frameworks for distributed data processing, such as Apache Beam and Apache Spark (coming soon). It is intended to be usable by all developers, regardless of their differential privacy expertise.

Internally, PipelineDP4j relies on the lower-level building blocks from the differential privacy library and combines them into an “out-of-the-box” solution that takes care of all the steps that are essential to differential privacy, including noise addition, partition selection, and contribution bounding. Thus, rather than using the lower-level differential privacy library, it is recommended to use PipelineDP4j, as it can reduce implementation mistakes.
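To make the three steps named above concrete, here is a minimal, self-contained Java sketch of a differentially private sum per partition. It is not PipelineDP4j's API and does not reflect how the library is implemented: the class DpSumSketch, its methods, and all parameters (including the caller-supplied threshold and the even split of epsilon) are hypothetical names and choices made for illustration. It assumes Laplace noise and simple noisy thresholding, and it skips details that a real implementation has to handle, such as deriving the threshold from delta, sampling contributions at random, and generating noise securely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative sketch only. This is NOT PipelineDP4j's API and is not safe for production. */
public class DpSumSketch {

  /** One record: a privacy unit (user) contributes a value to a partition. */
  public static final class Contribution {
    public final String userId;
    public final String partition;
    public final double value;

    public Contribution(String userId, String partition, double value) {
      this.userId = userId;
      this.partition = partition;
      this.value = value;
    }
  }

  /**
   * Contribution bounding: keeps at most one value per (user, partition), at most maxPartitions
   * partitions per user, and clamps each kept value to [lower, upper]. For simplicity the first
   * contributions encountered are kept; a real implementation would sample them at random.
   */
  public static List<Contribution> boundContributions(
      List<Contribution> input, int maxPartitions, double lower, double upper) {
    Map<String, Set<String>> partitionsByUser = new HashMap<>();
    List<Contribution> bounded = new ArrayList<>();
    for (Contribution c : input) {
      Set<String> kept = partitionsByUser.computeIfAbsent(c.userId, k -> new HashSet<>());
      if (kept.size() >= maxPartitions || !kept.add(c.partition)) {
        continue; // over the per-user limit, or a repeat contribution to the same partition
      }
      double clamped = Math.max(lower, Math.min(upper, c.value));
      bounded.add(new Contribution(c.userId, c.partition, clamped));
    }
    return bounded;
  }

  /** Noise addition: Laplace(0, scale), sampled as the difference of two exponential variates. */
  public static double laplace(double scale, Random random) {
    double e1 = -Math.log(1.0 - random.nextDouble());
    double e2 = -Math.log(1.0 - random.nextDouble());
    return scale * (e1 - e2);
  }

  /** Bounds contributions, selects partitions by noisy user count, and returns noisy sums. */
  public static Map<String, Double> privateSums(
      List<Contribution> input, int maxPartitions, double lower, double upper,
      double epsilon, double threshold, Random random) {
    Map<String, Double> sums = new TreeMap<>();
    Map<String, Integer> userCounts = new TreeMap<>();
    for (Contribution c : boundContributions(input, maxPartitions, lower, upper)) {
      sums.merge(c.partition, c.value, Double::sum);
      userCounts.merge(c.partition, 1, Integer::sum); // one kept record per user and partition
    }
    double halfEpsilon = epsilon / 2; // assumption: split the budget evenly between both steps
    double countScale = maxPartitions / halfEpsilon;
    double sumScale = maxPartitions * Math.max(Math.abs(lower), Math.abs(upper)) / halfEpsilon;
    Map<String, Double> result = new TreeMap<>();
    for (Map.Entry<String, Double> e : sums.entrySet()) {
      // Partition selection: release a partition only if its noisy user count clears the threshold.
      double noisyUsers = userCounts.get(e.getKey()) + laplace(countScale, random);
      if (noisyUsers >= threshold) {
        result.put(e.getKey(), e.getValue() + laplace(sumScale, random));
      }
    }
    return result;
  }
}
```

Calling DpSumSketch.privateSums with a list of contributions, the bounds, an epsilon, a threshold, and a java.util.Random returns a map from each released partition to its noisy sum. The amount of bookkeeping even in this toy version (sensitivity, budget splitting, thresholding) suggests why a ready-made pipeline is less error-prone than assembling the steps by hand.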

You can use PipelineDP4j in Java, Kotlin or Scala.

How to Build

Build the PipelineDP4j sources and dependencies using bazelisk (the ... is part of the command, not a placeholder):

cd pipelinedp4j
bazelisk build ...

How to Use

Example

Familiarize yourself with the example, which shows how to compute differentially private statistics on a real dataset using the library. The documentation also explains how to run the library on Google Cloud.

The public API of the library is located in the API package. You can look at it if you need something beyond the example.

Use the library from the Maven repository

The easiest way to start using the library in your project is to add the dependency from the Maven repository. You can find it here. After adding this dependency to your project, you can write the same code as in the example above and it will compile.