2 Probability Distributions
This section describes the distribution types and operations supported by gamble.
2.1 Distribution Operations
procedure
(integer-dist? v) → boolean?
v : any/c
The distribution types for which integer-dist? returns true consist of exactly the ones listed in Integer Distribution Types. A discrete distribution whose values happen to be integers is not considered an integer distribution.
procedure
(finite-dist? v) → boolean?
v : any/c
The distribution types for which finite-dist? returns true consist of exactly the ones listed in Integer Distribution Types and Discrete Distribution Type.
procedure
(real-dist? v) → boolean?
v : any/c
The distribution types for which real-dist? returns true consist of exactly the ones listed in Real Distribution Types. A discrete distribution whose values happen to be real numbers is not considered a real distribution. A distribution whose values consist of real vectors, such as a Dirichlet distribution, is not considered real-valued.
If log? is true, then the log probability is returned instead of the probability.
This function applies only to integer-valued (integer-dist?) and real-valued (real-dist?) distributions.
procedure
(dist-inv-cdf d p [log? 1-p?]) → any/c
d : dist? p : (real-in 0 1) log? : any/c = #f 1-p? : any/c = #f
procedure
(dist-sample d) → any/c
d : dist?
Do not use dist-sample within a sampler/solver; use sample instead.
procedure
d : finite-dist?
procedure
(in-measure d) → sequence?
d : finite-dist?
procedure
d : dist?
procedure
(dist-median d) → (or/c any/c +nan.0 #f)
d : dist?
procedure
(dist-variance d) → (or/c any/c +nan.0 #f)
d : dist?
If the distribution is integer-valued or real-valued, the statistic is a real number. Other kinds of distribution may have other types for these statistics.
A return value of +nan.0 indicates that the statistic is known to be undefined.
A return value of #f indicates that the statistic is unknown; it may not be defined, it may be infinite, or the calculation might not be implemented.
procedure
(dist-modes d) → (or/c list? #f)
d : dist?
A return value of '() indicates that the distribution has no mode. A return value of #f indicates that the statistic is unknown; it may not be defined, it may be infinite, or the calculation might not be implemented.
procedure
(dist-energy d x) → real?
d : dist? x : any/c
Equivalent to (- (log (dist-pdf d x))).
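The relationship between energy and density can be illustrated without the library. The following is a minimal standalone sketch in plain Racket; normal-pdf and normal-energy are hypothetical helper names, not part of gamble, and the normal density is written out by hand.

```racket
#lang racket/base
(require racket/math) ; provides pi

;; Density of a normal distribution with the given mean and standard deviation.
;; (Hypothetical helper; written out by hand rather than taken from gamble.)
(define (normal-pdf mean stddev x)
  (define z (/ (- x mean) stddev))
  (/ (exp (* -0.5 z z))
     (* stddev (sqrt (* 2 pi)))))

;; Energy as described above: the negative log of the density.
(define (normal-energy mean stddev x)
  (- (log (normal-pdf mean stddev x))))

;; At the mean of a standard normal, the energy is (* 0.5 (log (* 2 pi))),
;; roughly 0.9189.
(normal-energy 0.0 1.0 0.0)
```

Lower energy corresponds to higher density, so the energy of a normal distribution is smallest at its mean.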
procedure
(dist-Denergy d x [dx/dt dparam/dt ...]) → real?
d : dist? x : real? dx/dt : real? = 1 dparam/dt : real? = 0
The parameters of d and the position x are considered functions of a hypothetical variable t, and the derivative is taken with respect to t. Thus by varying dx/dt and the dparam/dts, mixtures of the partial derivatives of energy with respect to the distribution’s parameters can be recovered.
If the derivative is not defined (such as for non-continuous distributions) or not implemented for distribution d, an exception is raised.
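The chain-rule reading described above can be made concrete for a normal distribution. This is a standalone sketch in plain Racket, not the library's implementation; normal-energy and normal-Denergy are hypothetical names, and the defaults (dx/dt = 1, parameter derivatives = 0) simply mirror the documented signature.

```racket
#lang racket/base
(require racket/math) ; provides pi and sqr

;; Energy of a normal distribution:
;; log(stddev) + 0.5*log(2*pi) + (x - mean)^2 / (2*stddev^2).
(define (normal-energy mean stddev x)
  (+ (log stddev)
     (* 0.5 (log (* 2 pi)))
     (/ (sqr (- x mean)) (* 2 (sqr stddev)))))

;; Total derivative of the energy with respect to t, by the chain rule:
;; dE/dt = dE/dx * dx/dt + dE/dmean * dmean/dt + dE/dstddev * dstddev/dt.
(define (normal-Denergy mean stddev x [dx/dt 1] [dmean/dt 0] [dstddev/dt 0])
  (define r (- x mean))
  (define dE/dx (/ r (sqr stddev)))
  (define dE/dmean (- dE/dx))
  (define dE/dstddev (- (/ 1 stddev) (/ (sqr r) (expt stddev 3))))
  (+ (* dE/dx dx/dt)
     (* dE/dmean dmean/dt)
     (* dE/dstddev dstddev/dt)))

(normal-Denergy 0 1 2)       ; partial derivative with respect to x: 2
(normal-Denergy 0 1 2 0 1 0) ; partial derivative with respect to mean: -2
(normal-Denergy 0 1 2 0 0 1) ; partial derivative with respect to stddev: -3
```

Setting exactly one of the rate arguments to 1 and the others to 0 isolates a single partial derivative; other combinations give weighted sums of the partials.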
procedure
(dist-update-prior prior dist-pattern data) → (or/c dist? #f)
prior : dist? dist-pattern : any/c data : vector?
The dist-pattern is an S-expression consisting of a distribution type name (a symbol) and the distribution parameters, where the parameter distributed according to prior is indicated by the symbol '_.
> (dist-update-prior (beta-dist 1 1) '(bernoulli-dist _) (vector 1 1 1))
(beta-dist 4.0 1.0)
> (dist-update-prior (beta-dist 1 1) '(bernoulli-dist _) (vector 1 0 0))
(beta-dist 2.0 1.0)
> (dist-update-prior (normal-dist 10 1) '(normal-dist _ 1) (vector 9))
(normal-dist 9.5 0.7071067811865476)
> (dist-update-prior (normal-dist 10 1) '(normal-dist _ 0.5) (vector 9))
(normal-dist 9.2 0.4472135954999579)
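The two normal-dist results above are consistent with the standard conjugate update for a normal prior over an unknown mean with known observation noise. The sketch below works through that arithmetic in plain Racket; normal-mean-posterior is a hypothetical helper, and this is an illustration of the formula rather than the library's code.

```racket
#lang racket/base

;; Posterior over an unknown mean, given a normal prior on that mean and
;; normally distributed observations with a known standard deviation.
;; Returns two values: the posterior mean and posterior standard deviation.
(define (normal-mean-posterior prior-mean prior-stddev obs-stddev data)
  (define prior-precision (/ 1.0 (* prior-stddev prior-stddev)))
  (define obs-precision (/ 1.0 (* obs-stddev obs-stddev)))
  (define n (vector-length data))
  (define total (for/sum ([x (in-vector data)]) x))
  ;; Precisions add; the posterior mean is the precision-weighted average.
  (define post-precision (+ prior-precision (* n obs-precision)))
  (values (/ (+ (* prior-precision prior-mean) (* obs-precision total))
             post-precision)
          (sqrt (/ 1.0 post-precision))))

;; Mirrors the last two interactions above:
(normal-mean-posterior 10 1 1 (vector 9))   ; mean 9.5, stddev about 0.7071
(normal-mean-posterior 10 1 0.5 (vector 9)) ; mean about 9.2, stddev about 0.4472
```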
2.2 Integer Distribution Types
struct
(struct bernoulli-dist (p))
p : (real-in 0 1)
struct
(struct binomial-dist (n p))
n : exact-positive-integer? p : (real-in 0 1)
struct
(struct categorical-dist (weights))
weights : (vectorof (>=/c 0))
struct
(struct geometric-dist (p))
p : (real-in 0 1)
struct
(struct poisson-dist (mean))
mean : (>/c 0)
2.3 Real Distribution Types
struct
(struct cauchy-dist (mode scale))
mode : real? scale : (>/c 0)
struct
(struct exponential-dist (mean))
mean : (>/c 0)
Note: A common alternative parameterization uses the rate λ = (/ mean).
struct
(struct gamma-dist (shape scale))
shape : (>/c 0) scale : (>/c 0)
Note: A common alternative parameterization uses α = shape and rate β = (/ scale).
struct
(struct logistic-dist (mean scale))
mean : real? scale : (>/c 0)
struct
(struct normal-dist (mean stddev))
mean : real? stddev : (>/c 0)
Note: A common alternative parameterization uses the variance σ².
struct
(struct pareto-dist (scale shape))
scale : (>/c 0) shape : (>/c 0)
struct
(struct uniform-dist (min max))
min : real? max : real?
2.4 Real Distribution Transformers
The following constructors take and produce real-valued distributions.
struct
(struct affine-distx (dist a b))
dist : real-dist? a : real? b : real?
struct
(struct clip-distx (dist a b))
dist : real-dist? a : real? b : real?
If the interval is small, the clipped dist is sampled using the dist-inv-cdf method of dist; otherwise, rejection sampling is used.
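The two sampling strategies can be compared on a concrete case. The following standalone sketch clips an exponential distribution to an interval [a, b] with 0 ≤ a < b; all function names are hypothetical, the exponential CDF is written out by hand, and nothing here is taken from the library's implementation.

```racket
#lang racket/base

;; Exponential distribution with the given mean: CDF and inverse CDF.
(define (exp-cdf mean x) (- 1.0 (exp (- (/ x mean)))))
(define (exp-inv-cdf mean p) (* -1.0 mean (log (- 1.0 p))))

;; Inverse-CDF sampling: draw p uniformly between CDF(a) and CDF(b), then
;; map it back through the inverse CDF. Always costs exactly one draw.
(define (sample-clipped/inv-cdf mean a b)
  (define pa (exp-cdf mean a))
  (define pb (exp-cdf mean b))
  (define x (exp-inv-cdf mean (+ pa (* (random) (- pb pa)))))
  ;; Guard against floating-point rounding at the interval edges.
  (max a (min b x)))

;; Rejection sampling: draw from the unclipped distribution and retry
;; until the draw lands in [a, b].
(define (sample-clipped/rejection mean a b)
  (define x (exp-inv-cdf mean (random)))
  (if (<= a x b)
      x
      (sample-clipped/rejection mean a b)))

(sample-clipped/inv-cdf 2.0 1.0 2.0)
(sample-clipped/rejection 2.0 1.0 2.0)
```

The expected number of rejection attempts is the reciprocal of the probability mass inside the interval, which is one plausible reason to prefer the inverse CDF when the interval is small.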
struct
dist : real-dist?
struct
dist : real-dist?
2.5 Vector Distribution Types
struct
(struct dirichlet-dist (alpha))
alpha : (vectorof (>/c 0))
struct
(struct multinomial-dist (n weights))
n : exact-nonnegative-integer? weights : (vectorof (>=/c 0))
struct
(struct permutation-dist (n))
n : exact-nonnegative-integer?
2.6 Multivariate Distribution Types
struct
(struct multi-normal-dist (mean cov))
mean : col-matrix? cov : square-matrix?
The support consists of column matrices having the same shape as mean.
struct
(struct wishart-dist (n V))
n : real? V : square-matrix?
The support consists of square, symmetric, positive-definite matrices having the same shape as V.
struct
(struct inverse-wishart-dist (n Vinv))
n : real? Vinv : square-matrix?
If X is distributed according to (wishart-dist n V), then (matrix-inverse X) is distributed according to (inverse-wishart-dist n (matrix-inverse V)).
The support consists of square, symmetric, positive-definite matrices having the same shape as Vinv.
2.7 Discrete Distribution Type
A discrete distribution is a distribution whose support is a finite collection of arbitrary Racket values. Note: this library uses the name categorical-dist for a distribution whose support consists of the integers {0, 1, ..., N}.
The elements of a discrete distribution are distinguished using equal?. The constructors for discrete distributions detect and coalesce duplicates.
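Coalescing by equal? can be pictured with an equal?-based hash table. This is a minimal sketch in plain Racket, assuming the weights of duplicate values are summed; coalesce-duplicates is a hypothetical helper and is not the library's constructor.

```racket
#lang racket/base

;; Combine the weights of entries whose values are equal?.
;; weighted-values is a list of (cons value weight) pairs; the result is an
;; immutable equal?-based hash from value to total weight.
(define (coalesce-duplicates weighted-values)
  (for/fold ([acc (hash)]) ([vw (in-list weighted-values)])
    (hash-update acc (car vw) (λ (w) (+ w (cdr vw))) 0)))

;; "a" and (string #\a) are distinct objects but equal?, so they merge:
;; the result maps "a" to 3 and 'b to 3.
(coalesce-duplicates
 (list (cons "a" 1) (cons (string #\a) 2) (cons 'b 3)))
```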
procedure
(discrete-dist? v) → boolean?
v : any/c
syntax
(discrete-dist maybe-normalize [value-expr weight-expr] ...)
maybe-normalize =
| #:normalize? normalize?-expr
weight-expr : (>=/c 0)
Normalization affects printing and discrete-dist-weights, but not dist-pdf.
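One way to see why the probability of a value would not depend on normalization: if it is computed as the value's weight divided by the total weight, rescaling every weight leaves it unchanged. The sketch below demonstrates this in plain Racket with hypothetical helpers; it is an assumption about the arithmetic, not a description of the library's internals.

```racket
#lang racket/base

;; Probability of the value at index i: its weight over the total weight.
(define (weight->probability weights i)
  (/ (vector-ref weights i)
     (for/sum ([w (in-vector weights)]) w)))

;; Rescale the weights so that they sum to 1.
(define (normalize weights)
  (define total (for/sum ([w (in-vector weights)]) w))
  (for/vector ([w (in-vector weights)]) (/ w total)))

(define raw (vector 1 3))
(normalize raw)                         ; (vector 1/4 3/4)
(weight->probability raw 1)             ; 3/4
(weight->probability (normalize raw) 1) ; 3/4
```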
procedure
(make-discrete-dist weighted-values [#:normalize? normalize?]) → discrete-dist?
weighted-values : dict? normalize? : any/c = #t
procedure
(make-discrete-dist* values [weights #:normalize? normalize?]) → discrete-dist?
values : vector? weights : (vectorof (>=/c 0)) = (vector 1 ...) normalize? : any/c = #t
procedure
d : discrete-dist?
procedure
(discrete-dist-values d) → vector?
d : discrete-dist?
procedure
(discrete-dist-weights d) → vector?
d : discrete-dist?
syntax
(discrete-measure [value-expr weight-expr] ...)
2.8 Finite Distributions as a Monad
The following operations, despite the dist- in the names, may produce unnormalized discrete distributions.
procedure
(dist-unit v) → finite-dist?
v : any/c
procedure
(dist-bind d f) → finite-dist?
d : finite-dist? f : (-> any/c finite-dist?)
procedure
(dist-bindx d f) → finite-dist?
d : finite-dist? f : (-> any/c finite-dist?)
Equivalent to (dist-bind d (λ (v1) (dist-fmap (f v1) (λ (v2) (list v1 v2))))).
procedure
(dist-fmap d f) → finite-dist?
d : finite-dist? f : (-> any/c any/c)
procedure
(dist-filter d pred) → finite-dist?
d : finite-dist? pred : (-> any/c boolean?)
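The monad operations in this section can be modeled on plain hash tables. The following standalone sketch represents a finite measure as an immutable equal?-based hash from value to weight; every name in it is hypothetical, and it only illustrates the semantics suggested by the signatures and by the stated dist-bindx equivalence.

```racket
#lang racket/base

;; unit: all weight on a single value.
(define (unit v) (hash v 1))

;; fmap: apply f to each value, summing the weights of values that collide.
(define (fmap m f)
  (for/fold ([acc (hash)]) ([(v w) (in-hash m)])
    (hash-update acc (f v) (λ (old) (+ old w)) 0)))

;; bind: for each value v with weight w, scale the measure (f v) by w,
;; then sum all of the scaled measures.
(define (bind m f)
  (for*/fold ([acc (hash)]) ([(v w) (in-hash m)]
                             [(v2 w2) (in-hash (f v))])
    (hash-update acc v2 (λ (old) (+ old (* w w2))) 0)))

;; filter: keep only values satisfying pred. The weights are not rescaled,
;; so the result is generally unnormalized.
(define (filter-measure m pred)
  (for/hash ([(v w) (in-hash m)] #:when (pred v))
    (values v w)))

;; bindx: like bind, but pairs each outcome with the value that produced it.
(define (bindx m f)
  (bind m (λ (v1) (fmap (f v1) (λ (v2) (list v1 v2))))))

;; Flip a fair coin; on tails, flip again to decide between win and lose.
(define coin (hash 'h 1/2 't 1/2))
(define game
  (bind coin (λ (c) (if (eq? c 'h) (unit 'win) (hash 'win 1/2 'lose 1/2)))))
(hash-ref game 'win)  ; 3/4
(hash-ref game 'lose) ; 1/4
```

In this model, filtering game down to 'lose leaves a total weight of 1/4, which shows how an operation on a normalized input can yield an unnormalized result.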
2.9 Defining New Probability Distributions
syntax
(define-dist-type dist-name ([field-id field-ctc] ...) #:pdf pdf-fun #:sample sample-fun dist-option ...)
dist-option = #:cdf cdf-fun
| #:inv-cdf inv-cdf-fun
| #:guard guard-fun
| #:extra [struct-option ...]
| #:no-provide
field-ctc : contract?
pdf-fun : (field-ctc ... any/c boolean? . -> . real?)
cdf-fun : (field-ctc ... any/c boolean? boolean? . -> . real?)
inv-cdf-fun : (field-ctc ... real? boolean? boolean? . -> . any/c)
sample-fun : (field-ctc ... . -> . any/c)
At a minimum, the new distribution type must supply a probability density/mass function (pdf-fun) and a sampling function (sample-fun). The type may also supply a cumulative probability function (cdf-fun) and its inverse (inv-cdf-fun). The signatures of the functions are like those of dist-pdf, dist-sample, etc., but instead of the distribution itself, they accept the fields of the distribution as the initial arguments.
The guard-fun is a struct guard function; see make-struct-type for details. Other struct options may be passed via the #:extra option. Distribution types are automatically transparent.
By default, dist-name, dist-name?, and the synthesized accessors are provided with the field-ctc contracts. The #:no-provide option disables automatic providing, in which case the field-ctcs are ignored.
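As an illustration of the function shapes that define-dist-type expects, the sketch below writes a pdf-fun and a sample-fun for a hypothetical two-field distribution that is uniform on an interval [lo, hi]. The helper functions are plain Racket and run on their own; the define-dist-type form itself appears only in a comment, since it requires gamble, and interval-dist is an invented name.

```racket
#lang racket/base

;; pdf-fun: receives the fields (lo, hi) first, then the point x and the
;; log? flag, matching (field-ctc ... any/c boolean? . -> . real?).
(define (interval-pdf lo hi x log?)
  (define p (if (and (real? x) (<= lo x hi))
                (/ 1.0 (- hi lo))
                0.0))
  ;; (log 0.0) is -inf.0, so points outside the interval get log-density -inf.0.
  (if log? (log p) p))

;; sample-fun: receives only the fields, matching (field-ctc ... . -> . any/c).
(define (interval-sample lo hi)
  (+ lo (* (random) (- hi lo))))

;; With gamble loaded, the type might then be declared along these lines
;; (not evaluated here; interval-dist is a hypothetical name):
;;
;; (define-dist-type interval-dist ([lo real?] [hi real?])
;;   #:pdf interval-pdf
;;   #:sample interval-sample)

(interval-pdf 0 2 1 #f) ; 0.5
(interval-pdf 0 2 3 #t) ; -inf.0
```

A #:guard function could additionally reject field values with lo ≥ hi, which this sketch does not check.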