2 Probability Distributions

This section describes the distribution types and operations supported by gamble.

2.1 Distribution Operations

procedure
(dist? v) → boolean?
v : any/c

Returns #t if v is a distribution object, #f otherwise.

procedure
(integer-dist? v) → boolean?
v : any/c

Returns #t if v is a distribution object whose support is intrinsically integer-valued, #f otherwise.

The distribution types for which integer-dist? returns true consist of exactly the ones listed in Integer Distribution Types. A discrete distribution whose values happen to be integers is not considered an integer distribution.

procedure
(finite-dist? v) → boolean?
v : any/c

Returns #t if v is a distribution object whose support is finite, #f otherwise.

The distribution types for which finite-dist? returns true consist of exactly the ones listed in Integer Distribution Types and Discrete Distribution Type.

procedure
(real-dist? v) → boolean?
v : any/c

Returns #t if v is a distribution object whose support is a real interval, #f otherwise.

The distribution types for which real-dist? returns true consist of exactly the ones listed in Real Distribution Types. A discrete distribution whose values happen to be real numbers is not considered a real distribution. A distribution whose values consist of real vectors, such as a Dirichlet distribution, is not considered real-valued.

procedure
(dist-pdf d v [log?]) → real?
  d : dist?
  v : any/c
  log? : any/c = #f

Returns the probability density (or mass, as appropriate) of the value v in distribution d. If log? is true, the log density (or log mass) is returned instead.

procedure
(dist-cdf d v [log? 1-p?]) → real?
  d : dist?
  v : any/c
  log? : any/c = #f
  1-p? : any/c = #f

Returns the cumulative probability of the value v in distribution d, that is, the probability that a random variable X distributed as d satisfies (<= X v). If 1-p? is true, then the probability of (> X v) is returned instead.

If log? is true, then the log probability is returned instead of the probability.

This function applies only to integer-valued (integer-dist?) and real-valued (real-dist?) distributions.

procedure
(dist-inv-cdf d p [log? 1-p?]) → any/c
  d : dist?
  p : (real-in 0 1)
  log? : any/c = #f
  1-p? : any/c = #f

Returns the inverse of the CDF of d at p. If log? is true, then the inverse at (exp p) is used instead. If 1-p? is true, then the inverse at (- 1 p) is used instead.
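
The effect of the log? and 1-p? flags can be sketched in plain Racket for one concrete case, an exponential distribution with a given mean. The helpers exp-cdf and exp-inv-cdf below are hypothetical and not part of gamble. The order in which the two flags are combined when both are true (undo the log first, then take the complement) is an assumption of this sketch, not something the text above specifies.

```racket
#lang racket
;; Hypothetical helpers, not part of gamble: CDF and inverse CDF of an
;; exponential distribution with the given mean, with log? and 1-p? flags.

(define (exp-cdf mean x [log? #f] [1-p? #f])
  (define p (if (< x 0) 0.0 (- 1.0 (exp (- (/ x mean)))))) ; P(X <= x)
  (define q (if 1-p? (- 1.0 p) p))                          ; P(X > x) when 1-p? is true
  (if log? (log q) q))

(define (exp-inv-cdf mean p [log? #f] [1-p? #f])
  (define p1 (if log? (exp p) p))      ; assumption: undo the log first
  (define p2 (if 1-p? (- 1.0 p1) p1))  ; assumption: then take the complement
  (* (- mean) (log (- 1.0 p2))))

;; Each flag combination round-trips back to (approximately) the starting point.
(exp-inv-cdf 2 (exp-cdf 2 3))
(exp-inv-cdf 2 (exp-cdf 2 3 #t) #t)
(exp-inv-cdf 2 (exp-cdf 2 3 #f #t) #f #t)
```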

procedure
(dist-sample d) → any/c
d : dist?

Produces a sample distributed according to d. This is an unmanaged stochastic effect, like calling Racket’s random function.

Do not use dist-sample within a sampler/solver; use sample instead.

procedure
(in-dist d) → sequence?
d : finite-dist?

Returns a sequence where each element consists of two values: a value from the support of the distribution and its probability mass. The weights are normalized to 1, even if d is unnormalized.

Example:

> (for ([(v p) (in-dist (bernoulli-dist 1/3))])
    (printf "Result ~s has probability ~s.\n" v p))
Result 0 has probability 2/3.
Result 1 has probability 1/3.

procedure
(in-measure d) → sequence?
d : finite-dist?

Like in-dist, but without normalizing the weights.
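
The difference between the two sequences is only whether the weights are divided by their total. A minimal plain-Racket sketch, using an association list as a stand-in for a finite distribution (this representation and the helper names are assumptions for illustration, not gamble's own):

```racket
#lang racket
;; Hypothetical stand-in for an unnormalized finite distribution:
;; an association list from values to nonnegative weights.
(define weights '((a . 2) (b . 1) (c . 1)))

;; Analogous to in-measure: report the weights as stored.
(define (measure-pairs ws) ws)

;; Analogous to in-dist: divide each weight by the total so masses sum to 1.
(define (dist-pairs ws)
  (define total (apply + (map cdr ws)))
  (for/list ([vw (in-list ws)])
    (cons (car vw) (/ (cdr vw) total))))

(measure-pairs weights) ; => '((a . 2) (b . 1) (c . 1))
(dist-pairs weights)    ; => '((a . 1/2) (b . 1/4) (c . 1/4))
```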

procedure
(dist-mean d) → (or/c any/c +nan.0 #f)
  d : dist?
procedure
(dist-median d) → (or/c any/c +nan.0 #f)
  d : dist?
procedure
(dist-variance d) → (or/c any/c +nan.0 #f)
  d : dist?

Returns the mean, median, or variance of the distribution d, respectively.

If the distribution is integer-valued or real-valued, the statistic is a real number. Other kinds of distribution may have other types for these statistics.

A return value of +nan.0 indicates that the statistic is known to be undefined.

A return value of #f indicates that the statistic is unknown; it may not be defined, it may be infinite, or the calculation might not be implemented.

procedure
(dist-modes d) → (or/c list? #f)
d : dist?

Returns the modes of the distribution d.

A return value of '() indicates that the distribution has no mode. A return value of #f indicates that the statistic is unknown; it may not be defined, it may be infinite, or the calculation might not be implemented.

procedure
(dist-energy d x) → real?
d : dist?
x : any/c

Returns the value of the “energy” function of d evaluated at x. Minimizing energy is equivalent to maximizing likelihood.

Equivalent to (- (log (dist-pdf d x))).

Examples:

> (define N (normal-dist 0 1))
> (dist-energy N 5)
13.418938533204672
> (dist-energy N 1)
1.4189385332046727
> (dist-energy N 0)
0.9189385332046728
> (dist-energy N -1)
1.4189385332046727
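
For a normal distribution the energy has a simple closed form, which gives a way to check the values above by hand. The helper normal-energy below is hypothetical plain Racket (not a gamble function); it just evaluates the negative log of the normal density.

```racket
#lang racket
;; Hypothetical helper: energy (negative log density) of a normal
;; distribution with mean mu and standard deviation sigma, evaluated at x.
;;   energy = (x - mu)^2 / (2 sigma^2) + log(sigma) + (1/2) log(2 pi)
(define (normal-energy mu sigma x)
  (+ (/ (sqr (- x mu)) (* 2 (sqr sigma)))
     (log sigma)
     (* 0.5 (log (* 2 pi)))))

(normal-energy 0 1 5)  ; about 13.4189
(normal-energy 0 1 1)  ; about 1.4189
(normal-energy 0 1 0)  ; about 0.9189
```

The constant term (1/2) log(2 pi) is about 0.9189, so for the standard normal the energy at x is that constant plus half of x squared.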

procedure
(dist-Denergy d x [dx/dt dparam/dt ...]) → real?
  d : dist?
  x : real?
  dx/dt : real? = 1
  dparam/dt : real? = 0

Returns the value at x of the derivative of d’s energy function.

Examples:

> (define N (normal-dist 0 1))
> (dist-Denergy N 5)
5.0
> (dist-Denergy N 1)
1.0
> (dist-Denergy N 0)
-0.0
> (dist-Denergy N -1)
-1.0

The parameters of d and the position x are considered functions of a hypothetical variable t, and the derivative is taken with respect to t. Thus by varying dx/dt and the dparam/dts, mixtures of the partial derivatives of energy with respect to the distribution’s parameters can be recovered.

Examples:

> (define N (normal-dist 0 1))
> (dist-Denergy N 5 1 0 0) ; ∂energy/∂x
5.0
> (dist-Denergy N 5 0 1 0) ; ∂energy/∂μ; μ is 1st param of normal-dist
-5.0
> (dist-Denergy N 5 0 0 1) ; ∂energy/∂σ; σ is 2nd param of normal-dist
-24.0

If the derivative is not defined (such as for non-continuous distributions) or not implemented for distribution d, an exception is raised.
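
The three partial derivatives in the normal-dist example can be checked with the chain rule. The following plain-Racket sketch uses a hypothetical helper, normal-denergy, that is not part of gamble; it assumes the normal energy (x - mu)^2 / (2 sigma^2) + log(sigma) + constant.

```racket
#lang racket
;; Hypothetical helper: d(energy)/dt for a normal distribution, where x,
;; mu, and sigma are all treated as functions of t.
(define (normal-denergy mu sigma x [dx/dt 1] [dmu/dt 0] [dsigma/dt 0])
  (define z (/ (- x mu) sigma))
  (define dE/dx (/ z sigma))                  ; (x - mu) / sigma^2
  (define dE/dmu (- dE/dx))                   ; -(x - mu) / sigma^2
  (define dE/dsigma (/ (- 1 (sqr z)) sigma))  ; 1/sigma - (x - mu)^2 / sigma^3
  (+ (* dE/dx dx/dt)
     (* dE/dmu dmu/dt)
     (* dE/dsigma dsigma/dt)))

(normal-denergy 0 1 5 1 0 0) ; => 5
(normal-denergy 0 1 5 0 1 0) ; => -5
(normal-denergy 0 1 5 0 0 1) ; => -24
(normal-denergy 0 1 5 1 1 0) ; => 0, shifting x and mu together leaves energy unchanged
```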

procedure
(dist-update-prior prior dist-pattern data) → (or/c dist? #f)
  prior : dist?
  dist-pattern : any/c
  data : vector?

Returns a distribution representing a closed-form solution to Bayes’ Law applied to a distribution matching dist-pattern whose parameter is distributed according to prior and incorporating data as evidence. The result is a distribution of the same type as prior. If prior is not a (known, implemented) conjugate prior for dist-pattern, #f is returned.

The dist-pattern is an S-expression consisting of a distribution type name (a symbol) and the distribution parameters, where the parameter distributed according to prior is indicated by the symbol '_.

For example, if the prior of a Bernoulli distribution’s success probability is (beta-dist 1 1), and three successes are observed, the posterior can be calculated with:

> (dist-update-prior (beta-dist 1 1) '(bernoulli-dist _) (vector 1 1 1))
(beta-dist 4.0 1.0)

If one success and two failures were observed, the posterior is

> (dist-update-prior (beta-dist 1 1) '(bernoulli-dist _) (vector 1 0 0))
(beta-dist 2.0 3.0)

Here is another example where the mean of a normal distribution is also normally distributed, but the standard deviation is fixed.

> (dist-update-prior (normal-dist 10 1) '(normal-dist _ 1) (vector 9))
(normal-dist 9.5 0.7071067811865476)
> (dist-update-prior (normal-dist 10 1) '(normal-dist _ 0.5) (vector 9))
(normal-dist 9.2 0.4472135954999579)
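
The arithmetic behind these two conjugate pairs is short enough to write out. The helpers below are hypothetical plain Racket, not gamble functions, and they return posterior parameters as lists rather than distribution objects. They use the standard textbook update rules: a beta prior gains one count per observed success or failure, and a normal prior on a mean combines with the data by adding precisions.

```racket
#lang racket
;; Hypothetical helpers returning posterior parameters as lists.

;; Beta(a, b) prior on a Bernoulli success probability;
;; data is a vector of 0s (failures) and 1s (successes).
(define (beta-bernoulli-update a b data)
  (define successes (for/sum ([x (in-vector data)]) x))
  (define failures (- (vector-length data) successes))
  (list (+ a successes) (+ b failures)))

;; Normal(mu0, sigma0) prior on the mean of a normal distribution whose
;; standard deviation sigma is known; precision means 1 / variance.
(define (normal-mean-update mu0 sigma0 sigma data)
  (define prior-precision (/ 1 (sqr sigma0)))
  (define data-precision (/ (vector-length data) (sqr sigma)))
  (define precision (+ prior-precision data-precision))
  (define total (for/sum ([x (in-vector data)]) x))
  (list (/ (+ (* mu0 prior-precision) (/ total (sqr sigma))) precision)
        (sqrt (/ 1 precision))))

(beta-bernoulli-update 1 1 (vector 1 1 1)) ; => '(4 1)
(beta-bernoulli-update 1 1 (vector 1 0 0)) ; => '(2 3)
(normal-mean-update 10 1 1 (vector 9))     ; mean 19/2, stddev about 0.7071
(normal-mean-update 10 1 0.5 (vector 9))   ; mean about 9.2, stddev about 0.4472
```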

2.2 Integer Distribution Types

struct
(struct bernoulli-dist (p))
p : (real-in 0 1)

Represents a Bernoulli distribution with success probability p.

struct
(struct binomial-dist (n p))
n : exact-positive-integer?
p : (real-in 0 1)

Represents a binomial distribution of n trials each with success probability p.

struct
(struct categorical-dist (weights))
weights : (vectorof (>=/c 0))

Represents a categorical distribution (sometimes called a discrete distribution, multinomial distribution, or multinoulli distribution).

struct
(struct geometric-dist (p))
p : (real-in 0 1)

Represents a geometric distribution.

struct
(struct poisson-dist (mean))
mean : (>/c 0)

Represents a Poisson distribution with mean mean.

2.3 Real Distribution Types

struct
(struct beta-dist (a b))
a : (>=/c 0)
b : (>=/c 0)

Represents a beta distribution with shape parameters a and b.

struct
(struct cauchy-dist (mode scale))
mode : real?
scale : (>/c 0)

Represents a Cauchy distribution with mode mode and scale scale.

struct
(struct exponential-dist (mean))
mean : (>/c 0)

Represents an exponential distribution with mean mean.

Note: A common alternative parameterization uses the rate λ = (/ mean).

struct
(struct gamma-dist (shape scale))
shape : (>/c 0)
scale : (>/c 0)

Represents a gamma distribution with shape (k) shape and scale (θ) scale.

Note: A common alternative parameterization uses α = shape and rate β = (/ scale).

struct
(struct inverse-gamma-dist (shape scale))
shape : (>/c 0)
scale : (>/c 0)

Represents an inverse-gamma distribution, the reciprocals of whose values are distributed according to (gamma-dist shape scale).

struct
(struct logistic-dist (mean scale))
mean : real?
scale : (>/c 0)

Represents a logistic distribution with mean mean and scale scale.

struct
(struct normal-dist (mean stddev))
mean : real?
stddev : (>/c 0)

Represents a normal (Gaussian) distribution with mean (μ) mean and standard deviation (σ) stddev.

Note: A common alternative parameterization uses the variance σ².
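
The parameterization notes for exponential-dist, gamma-dist, and normal-dist amount to small conversions. The helpers below are hypothetical plain Racket (their names are not part of gamble); each returns the list of arguments that the corresponding constructor documented in this section expects.

```racket
#lang racket
;; Hypothetical conversion helpers from the alternative parameterizations
;; to the constructor arguments documented in this section.

(define (exponential-args/rate rate)   ; mean = 1/rate
  (list (/ rate)))

(define (gamma-args/rate alpha beta)   ; shape = alpha, scale = 1/rate
  (list alpha (/ beta)))

(define (normal-args/variance mu var)  ; stddev = sqrt(variance)
  (list mu (sqrt var)))

(exponential-args/rate 4)   ; => '(1/4)
(gamma-args/rate 2 1/3)     ; => '(2 3)
(normal-args/variance 0 9)  ; => '(0 3)
```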

struct
(struct pareto-dist (scale shape))
scale : (>/c 0)
shape : (>/c 0)

Represents a Pareto distribution.

struct
(struct t-dist (degrees mode scale))
  degrees : (>/c 0)
  mode : real?
  scale : (>/c 0)

Represents a Student’s t distribution.

struct
(struct uniform-dist (min max))
min : real?
max : real?

Represents a uniform distribution with lower bound min and upper bound max.

2.4 Real Distribution Transformers

The following constructors take and produce real-valued distributions.

struct
(struct affine-distx (dist a b))
  dist : real-dist?
  a : real?
  b : real?

Represents the distribution of (+ (* a t) b) where t is distributed according to dist.
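
The density of an affine transform follows from a change of variables: the density of (+ (* a t) b) at y is the density of t at (/ (- y b) a), divided by (abs a). The sketch below is plain Racket with hypothetical helper names, and it assumes a is nonzero. It checks the rule by scaling and shifting a standard normal density.

```racket
#lang racket
;; Hypothetical helpers illustrating the change of variables for
;; an affine transform; this is not gamble's implementation.

(define (std-normal-pdf t)
  (/ (exp (* -0.5 (sqr t))) (sqrt (* 2 pi))))

;; Density of (+ (* a t) b), given the density of t. Assumes a is nonzero.
(define ((affine-pdf pdf a b) y)
  (/ (pdf (/ (- y b) a)) (abs a)))

;; Density of a normal distribution with mean mu and stddev sigma, for comparison.
(define (normal-pdf mu sigma y)
  (/ (exp (* -0.5 (sqr (/ (- y mu) sigma))))
     (* sigma (sqrt (* 2 pi)))))

;; Scaling a standard normal by 2 and shifting by 3 matches a normal
;; density with mean 3 and standard deviation 2.
((affine-pdf std-normal-pdf 2 3) 4.0)
(normal-pdf 3 2 4.0)
```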

struct
(struct clip-distx (dist a b))
  dist : real-dist?
  a : real?
  b : real?

Clips dist to the closed interval [a, b].

If the interval is small, the clipped dist is sampled using the dist-inv-cdf method of dist; otherwise, rejection sampling is used.
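
The two sampling strategies can be sketched in plain Racket for an exponential distribution with mean 1. All names below are hypothetical, and the sketch does not show how gamble chooses between the strategies. The inverse-CDF method always costs one draw, while rejection sampling may need many draws when the interval carries little probability.

```racket
#lang racket
;; Hypothetical helpers for an exponential distribution with mean 1.
(define (cdf x) (if (< x 0) 0.0 (- 1.0 (exp (- x)))))
(define (inv-cdf p) (- (log (- 1.0 p))))
(define (sample) (inv-cdf (random)))

;; Inverse-CDF method: draw u uniformly between cdf(a) and cdf(b),
;; then map it back through the inverse CDF.
(define (clip-sample/inv-cdf a b)
  (define lo (cdf a))
  (define hi (cdf b))
  (inv-cdf (+ lo (* (random) (- hi lo)))))

;; Rejection method: resample until a draw lands inside [a, b].
(define (clip-sample/rejection a b)
  (let loop ()
    (define x (sample))
    (if (<= a x b) x (loop))))

(clip-sample/inv-cdf 5 5.1)     ; a value in [5, 5.1], up to rounding
(clip-sample/rejection 0.5 2)   ; a value in [0.5, 2]
```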

struct
(struct exp-distx (dist))
dist : real-dist?

Represents the distribution of (exp t) where t is distributed according to dist.

struct
(struct log-distx (dist))
dist : real-dist?

Represents the distribution of (log t) where t is distributed according to dist, whose support should include only the non-negative reals.
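
The densities of the exp and log transforms also follow from a change of variables. In the plain-Racket sketch below (hypothetical helper names, not gamble's implementation), the density of (exp t) at a positive y is the density of t at (log y) divided by y, and the density of (log t) at y is the density of t at (exp y) multiplied by (exp y).

```racket
#lang racket
;; Hypothetical helpers illustrating the change of variables for exp and log.

(define (std-normal-pdf t)
  (/ (exp (* -0.5 (sqr t))) (sqrt (* 2 pi))))

;; Density of (exp t): zero for y <= 0, otherwise pdf(log y) / y.
(define ((exp-pdf pdf) y)
  (if (<= y 0) 0.0 (/ (pdf (log y)) y)))

;; Density of (log t): pdf(exp y) * exp(y).
(define ((log-pdf pdf) y)
  (* (pdf (exp y)) (exp y)))

;; Taking exp and then log recovers the original density.
((log-pdf (exp-pdf std-normal-pdf)) 0.3)
(std-normal-pdf 0.3)
```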

2.5 Vector Distribution Types

struct
(struct dirichlet-dist (alpha))
alpha : (vectorof (>/c 0))

Represents a Dirichlet distribution. The support consists of vectors of the same length as alpha whose elements are nonnegative reals summing to 1.

struct
(struct multinomial-dist (n weights))
n : exact-nonnegative-integer?
weights : (vectorof (>=/c 0))

Represents a multinomial distribution. The support consists of vectors of the same length as weights; each vector holds the counts obtained from n independent samples of the categorical distribution whose weights are weights.

struct
(struct permutation-dist (n))
n : exact-nonnegative-integer?

Represents a uniform distribution over permutations of n elements, where a permutation is represented by a vector containing each of the integers from 0 to (sub1 n) exactly once.
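
The vector representation of a permutation can be illustrated in plain Racket. The helpers random-permutation and permutation? are hypothetical and not part of gamble; the first relies on Racket's shuffle to produce a random ordering.

```racket
#lang racket
;; Hypothetical helper: a random permutation of 0 .. n-1 as a vector.
(define (random-permutation n)
  (list->vector (shuffle (range n))))

;; Checks that a vector contains each of 0 .. (sub1 n) exactly once,
;; where n is the length of the vector.
(define (permutation? v)
  (equal? (sort (vector->list v) <)
          (range (vector-length v))))

(permutation? (random-permutation 5)) ; => #t
(permutation? (vector 0 2 2))         ; => #f
```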

2.6 Multivariate Distribution Types

struct
(struct multi-normal-dist (mean cov))
mean : col-matrix?
cov : square-matrix?

Represents a multivariate normal (Gaussian) distribution. The covariance matrix cov must be a square, symmetric, positive-definite matrix with as many rows as mean.

The support consists of column matrices having the same shape as mean.

struct
(struct wishart-dist (n V))
n : real?
V : square-matrix?

Represents a Wishart distribution with n degrees of freedom and scale matrix V. The scale matrix V must be a square, symmetric, positive-definite matrix.

The support consists of square, symmetric, positive-definite matrices having the same shape as V.

struct
(struct inverse-wishart-dist (n Vinv))
n : real?
Vinv : square-matrix?

Represents an inverse-Wishart distribution with n degrees of freedom and scale matrix Vinv. The scale matrix Vinv must be a square, symmetric, positive-definite matrix.

If X is distributed according to (wishart-dist n V), then (matrix-inverse X) is distributed according to (inverse-wishart-dist n (matrix-inverse V)).

The support consists of square, symmetric, positive-definite matrices having the same shape as Vinv.

2.7 Discrete Distribution Type

A discrete distribution is a distribution whose support is a finite collection of arbitrary Racket values. Note: this library uses the name categorical-dist for the distribution type whose support consists of the integers {0, 1, ..., N}.

The elements of a discrete distribution are distinguished using equal?. The constructors for discrete distributions detect and coalesce duplicates.
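
Coalescing duplicates under equal? can be pictured as summing weights in an equal?-based hash table. The sketch below is plain Racket with a hypothetical helper, coalesce; whether gamble sums the weights of duplicates in exactly this way is an assumption of the sketch.

```racket
#lang racket
;; Hypothetical helper: merge entries whose values are equal?, summing
;; their weights. (hash) creates an immutable equal?-based hash table.
(define (coalesce pairs)
  (for/fold ([h (hash)]) ([vw (in-list pairs)])
    (hash-update h (car vw) (λ (w) (+ w (cdr vw))) 0)))

;; The two strings are distinct objects but equal?, so they are merged.
(coalesce (list (cons "ab" 1/4)
                (cons (string-append "a" "b") 1/4)
                (cons 'c 1/2)))
;; => a hash mapping "ab" to 1/2 and 'c to 1/2
```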

procedure
(discrete-dist? v) → boolean?
v : any/c

Returns #t if v is a discrete distribution, #f otherwise.

syntax
(discrete-dist maybe-normalize
  [value-expr weight-expr] ...)

maybe-normalize =
| #:normalize? normalize?-expr

weight-expr : (>=/c 0)

Produces a discrete distribution whose values are the value-exprs and whose probability masses are the corresponding weight-exprs.

Normalization affects printing and discrete-dist-weights, but not dist-pdf.

Example:

> (discrete-dist ['apple 1/2] ['orange 1/3] ['pear 1/6])
(discrete-dist ['apple 1/2] ['orange 1/3] ['pear 1/6])

procedure
(make-discrete-dist weighted-values
                    [#:normalize? normalize?]) → discrete-dist?
weighted-values : dict?
normalize? : any/c = #t

Produces a discrete distribution from the dictionary weighted-values that maps values to weights.

Example:

> (make-discrete-dist '((apple . 1/2) (orange . 1/3) (pear . 1/6)))
(discrete-dist ['apple 1/2] ['orange 1/3] ['pear 1/6])

procedure
(make-discrete-dist* values
                     [weights
                      #:normalize? normalize?]) → discrete-dist?
  values : vector?
  weights : (vectorof (>=/c 0)) = (vector 1 ...)
  normalize? : any/c = #t

Produces a discrete distribution with the values of values and weights of weights. The two vectors must have equal lengths.

Example:

> (make-discrete-dist* (vector 'apple 'orange 'pear)
                       (vector 1/2 1/3 1/6))
(discrete-dist ['apple 1/2] ['orange 1/3] ['pear 1/6])

procedure
(normalize-discrete-dist d) → discrete-dist?
d : discrete-dist?

Normalizes a discrete distribution. If d is already normalized, the function may return d.

procedure
(discrete-dist-values d) → vector?
d : discrete-dist?
procedure
(discrete-dist-weights d) → vector?
d : discrete-dist?

Returns the values and weights of d, respectively.

syntax
(discrete-measure [value-expr weight-expr] ...)

Equivalent to (discrete-dist #:normalize? #f [value-expr weight-expr] ...).

2.8 Finite Distributions as a Monad

The following operations, despite the dist- in the names, may produce unnormalized discrete distributions.

procedure
(dist-unit v) → finite-dist?
v : any/c

Returns a distribution with all probability mass concentrated on v.

procedure
(dist-bind d f) → finite-dist?
d : finite-dist?
f : (-> any/c finite-dist?)

Given a distribution d for random variable A and a probability kernel f for B given A, forms the joint probability for (A,B), then marginalizes out A, returning the marginal distribution for B.

Examples:

> (define (ground-wet raining)
    (case raining
      [(0) (bernoulli-dist 9/10)]
      [(1) (bernoulli-dist 1/5)]))
> (define raining-dist (bernoulli-dist 1/5))
; marginal probability of Ground Wet
> (dist-bind raining-dist ground-wet)
(discrete-dist [0 6/25] [1 19/25])
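
The marginalization performed by dist-bind can be reproduced by hand. The plain-Racket sketch below represents a finite distribution as an immutable hash from values to masses; that representation and the helper names (unit, bind, bernoulli) are assumptions for illustration, not gamble's internals.

```racket
#lang racket
;; Hypothetical representation: a finite distribution as an immutable
;; hash from values to probability masses.
(define (unit v) (hash v 1))

;; For each a in d and each b in (f a), add P(a) * P(b | a) to b's mass.
(define (bind d f)
  (for*/fold ([out (hash)])
             ([(a pa) (in-hash d)]
              [(b pb) (in-hash (f a))])
    (hash-update out b (λ (w) (+ w (* pa pb))) 0)))

(define (bernoulli p) (hash 0 (- 1 p) 1 p))

(define raining (bernoulli 1/5))
(define (ground-wet r)
  (case r
    [(0) (bernoulli 9/10)]
    [(1) (bernoulli 1/5)]))

(bind raining ground-wet) ; => a hash mapping 0 to 6/25 and 1 to 19/25
```

Here the mass of 1 is 4/5 * 9/10 + 1/5 * 1/5 = 19/25, which agrees with the dist-bind result shown above.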

procedure
(dist-bindx d f) → finite-dist?
d : finite-dist?
f : (-> any/c finite-dist?)

Like dist-bind, but omits the marginalization step, returning the joint distribution.

Equivalent to (dist-bind d (λ (v1) (dist-fmap (f v1) (λ (v2) (list v1 v2))))).

Examples:

; joint distribution of (Raining, Ground Wet)
> (dist-bindx raining-dist ground-wet)
(discrete-dist ['(0 0) 2/25] ['(0 1) 18/25] ['(1 0) 4/25] ['(1 1) 1/25])

procedure
(dist-fmap d f) → finite-dist?
d : finite-dist?
f : (-> any/c any/c)

Equivalent to (dist-bind d (compose dist-unit f)).

procedure
(dist-filter d pred) → finite-dist?
d : finite-dist?
pred : (-> any/c boolean?)

Returns a distribution like d but whose support is narrowed to values accepted by the predicate pred.

Examples:

> (dist-filter (binomial-dist 10 1/2) even?)
(discrete-measure
[0 0.0009765625]
[2 0.0439453125]
[4 0.205078125]
[6 0.205078125]
[8 0.0439453125]
[10 0.0009765625])
> (dist-filter (binomial-dist 10 1/2) negative?)
(discrete-measure)
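
The first result above is printed as a discrete-measure because filtering removes mass without rescaling what remains. A plain-Racket sketch with exact rationals makes this visible; the hash representation and the helper names are hypothetical, not gamble's own.

```racket
#lang racket
;; Hypothetical helpers; a finite distribution is a hash from values to masses.

;; Binomial coefficient n-choose-k, computed with exact arithmetic.
(define (choose n k)
  (for/product ([i (in-range k)])
    (/ (- n i) (+ i 1))))

;; Masses of a binomial distribution with n trials and success probability p.
(define (binomial-masses n p)
  (for/hash ([k (in-range (add1 n))])
    (values k (* (choose n k) (expt p k) (expt (- 1 p) (- n k))))))

;; Keep only the entries accepted by pred; the weights are not rescaled.
(define (filter-masses d pred)
  (for/hash ([(v w) (in-hash d)] #:when (pred v))
    (values v w)))

(define evens (filter-masses (binomial-masses 10 1/2) even?))
(hash-ref evens 2)             ; => 45/1024, which is 0.0439453125
(apply + (hash-values evens))  ; => 1/2, so the result is not normalized
```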

2.9 Defining New Probability Distributions

syntax
(define-dist-type dist-name ([field-id field-ctc] ...)
  #:pdf pdf-fun
  #:sample sample-fun
  dist-option ...)

dist-option = #:cdf cdf-fun
            | #:inv-cdf inv-cdf-fun
            | #:guard guard-fun
            | #:extra [struct-option ...]
            | #:no-provide

   field-ctc : contract?
   pdf-fun : (field-ctc ... any/c boolean? . -> . real?)
   cdf-fun : (field-ctc ... any/c boolean? boolean? . -> . real?)
   inv-cdf-fun : (field-ctc ... real? boolean? boolean? . -> . any/c)
   sample-fun : (field-ctc ... . -> . any/c)

Defines dist-name as a new distribution type. In particular, dist-name is a struct implementing the internal generic interface representing distributions.

At a minimum, the new distribution type must supply a probability density/mass function (pdf-fun) and a sampling function (sample-fun). The type may also supply a cumulative probability function (cdf-fun) and its inverse (inv-cdf-fun). The signatures of these functions are like those of dist-pdf, dist-sample, etc., but instead of the distribution itself, they accept the fields of the distribution as the initial arguments.

The guard-fun is a struct guard function; see make-struct-type for details. Other struct options may be passed via the #:extra option. Distribution types are automatically transparent.

By default, dist-name, dist-name?, and the synthesized accessors are provided with the field-ctc contracts. The #:no-provide option disables automatic providing, in which case the field-ctcs are ignored.
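
To make the argument order concrete, here is a sketch of the four functions for a hypothetical uniform distribution on [lo, hi] with lo < hi, written in plain Racket so that it runs without gamble. The names box-pdf, box-cdf, box-inv-cdf, and box-sample are invented for illustration. The way the log? and 1-p? flags are combined is an assumption, and the define-dist-type form in the trailing comment is an untested illustration of how the functions might be supplied.

```racket
#lang racket
;; Hypothetical functions for a uniform distribution on [lo, hi] (lo < hi).
;; Fields (lo hi) come first, followed by the arguments that the
;; corresponding dist- operation would receive after the distribution.

(define (box-pdf lo hi x log?)
  (define p (if (<= lo x hi) (/ 1.0 (- hi lo)) 0.0))
  (if log? (log p) p))

(define (box-cdf lo hi x log? 1-p?)
  (define p (cond [(< x lo) 0.0]
                  [(> x hi) 1.0]
                  [else (/ (- x lo) (exact->inexact (- hi lo)))]))
  (define q (if 1-p? (- 1.0 p) p))
  (if log? (log q) q))

(define (box-inv-cdf lo hi p log? 1-p?)
  (define p1 (if log? (exp p) p))      ; assumption: undo the log first
  (define p2 (if 1-p? (- 1.0 p1) p1))  ; assumption: then take the complement
  (+ lo (* p2 (- hi lo))))

(define (box-sample lo hi)
  (+ lo (* (random) (- hi lo))))

;; With gamble loaded, these might be supplied as follows (untested sketch):
;; (define-dist-type box-dist ([lo real?] [hi real?])
;;   #:pdf box-pdf
;;   #:sample box-sample
;;   #:cdf box-cdf
;;   #:inv-cdf box-inv-cdf)
```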
