UncertaintyBase
-
class stats_arrays.UncertaintyBase
Bases: object
Abstract base class for uncertainty types.
All methods on uncertainty classes should be class methods, as instantiating uncertainty classes many times is not desired.
Defaults
- default_number_points_in_pdf : 200. The default number of points to calculate for PDF/CDF functions.
- standard_deviations_in_default_range : 3. The number of standard deviations that define the default range when calculating PDF/CDF values. In a normal distribution, the range within 3 standard deviations of the mean covers approximately 99.7% of all values.
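The coverage of a range of three standard deviations can be checked directly. The following is a small sketch using only the Python standard library; the helper name coverage is illustrative and is not part of stats_arrays.

```python
from statistics import NormalDist


def coverage(k: float) -> float:
    """Probability mass of a standard normal within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


three_sigma = coverage(3)
print(f"{three_sigma:.4f}")  # 0.9973
```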
-
classmethod bounded_random_variables(params, size, seeded_random=None, maximum_iterations=50)
Generate random variables repeatedly until all variables are within the bounds of each distribution. Raise MaximumIterationsError if this takes more than maximum_iterations iterations. Uses random_variables for random number generation.
Inputs
- params : A Parameter array.
- size : Integer. The number of values to draw from each distribution in params.
- seeded_random : Integer. Optional. Random seed to get repeatable samples.
- maximum_iterations : Integer. Optional. Maximum iterations to try to fit the given bounds before an error is raised.
Output
An array of random values, with dimensions params rows by size.
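The resampling idea described above can be sketched in plain Python for a single normal distribution. This is an illustration only and not the stats_arrays implementation: the function bounded_normal_samples and the local MaximumIterationsError class are hypothetical stand-ins, and redrawing only the out-of-bounds values is one possible strategy.

```python
import random


class MaximumIterationsError(Exception):
    """Hypothetical stand-in for the stats_arrays exception of the same name."""


def bounded_normal_samples(loc, scale, minimum, maximum, size,
                           seeded_random=None, maximum_iterations=50):
    """Draw `size` normal samples, redrawing any that fall outside the bounds."""
    rng = random.Random(seeded_random)  # a fixed seed gives repeatable samples
    samples = [rng.gauss(loc, scale) for _ in range(size)]
    for _ in range(maximum_iterations):
        out_of_bounds = [i for i, x in enumerate(samples)
                         if not minimum <= x <= maximum]
        if not out_of_bounds:
            return samples
        for i in out_of_bounds:
            samples[i] = rng.gauss(loc, scale)  # redraw only the offenders
    # The last redraw may have fixed everything, so check once more.
    if all(minimum <= x <= maximum for x in samples):
        return samples
    raise MaximumIterationsError(
        f"samples still out of bounds after {maximum_iterations} iterations"
    )


samples = bounded_normal_samples(0.0, 1.0, -1.0, 1.0, size=100, seeded_random=42)
```

Bounds that exclude almost all of the distribution make the loop give up, which is the situation check_bounds_reasonableness is meant to catch early.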
-
classmethod cdf(params, vector)
Used when a distribution is bounded, to determine where to begin or end the percentages used in calculating hypercube sampling space.
Inputs
- params : A Parameter array.
- vector : An array of values taken from the uncertainty distributions, with one row or the same number of rows as params.
Output
An array of cumulative densities, bounded on (0,1), with params rows and vector columns.
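To illustrate why the CDF matters for bounded hypercube sampling, the sketch below evaluates the CDF at the bounds of a single normal distribution, splits that percentage range into equal strata, and maps each stratum midpoint back through the inverse CDF. It uses the standard library only; stratified_bounded_samples is a hypothetical helper, not a stats_arrays function.

```python
from statistics import NormalDist


def stratified_bounded_samples(loc, scale, minimum, maximum, n):
    """Return n values, one per equal-probability stratum between the bounds."""
    dist = NormalDist(loc, scale)
    p_low = dist.cdf(minimum)    # percentage where the sampling space begins
    p_high = dist.cdf(maximum)   # percentage where the sampling space ends
    width = (p_high - p_low) / n
    # Use the midpoint of each stratum so every percentage lies strictly in (0, 1).
    percentages = [p_low + (i + 0.5) * width for i in range(n)]
    return [dist.inv_cdf(p) for p in percentages]


values = stratified_bounded_samples(loc=0.0, scale=1.0, minimum=-1.0, maximum=2.0, n=5)
```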
-
classmethod check_2d_inputs(params, vector)
Convert vector to 2 dimensions if not already, and raise stats_arrays.InvalidParamsError if vector and params dimensions don’t match.
-
classmethod check_bounds_reasonableness(params, *args, **kwargs)
Test if there is at least a threshold chance of generating random numbers within the provided bounds.
Doesn’t return anything. Raises stats_arrays.UnreasonableBoundsError if this condition is not met.
Inputs
- params : A one-row Parameter array.
- threshold : A fraction between 0 and 1. The minimum share of the distribution that the bounds must cover; an error is raised if the bounds cover less.
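A minimal sketch of this kind of check for one normal distribution follows. It is not the library code: check_normal_bounds and the local UnreasonableBoundsError class are hypothetical, and the default threshold of 0.1 is an assumption made for the example.

```python
from statistics import NormalDist


class UnreasonableBoundsError(Exception):
    """Hypothetical stand-in for stats_arrays.UnreasonableBoundsError."""


def check_normal_bounds(loc, scale, minimum, maximum, threshold=0.1):
    """Raise if the bounds cover less than `threshold` of the distribution."""
    dist = NormalDist(loc, scale)
    covered = dist.cdf(maximum) - dist.cdf(minimum)  # probability mass inside the bounds
    if covered < threshold:
        raise UnreasonableBoundsError(
            f"bounds cover only {covered:.2%} of the distribution"
        )


# About 68% of a standard normal lies in [-1, 1], so this passes silently.
check_normal_bounds(0.0, 1.0, -1.0, 1.0)
```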
-
classmethod from_dicts(*dicts)
Construct a Heterogeneous parameter array from parameter dictionaries.
Dictionary keys are the normal parameter array columns. Each distribution defines which columns are required and which are optional.
Example:
>>> from stats_arrays import UncertaintyBase
>>> import numpy as np
>>> UncertaintyBase.from_dicts(
...     {'loc': 2, 'scale': 3, 'uncertainty_type': 3},
...     {'loc': 5, 'minimum': 3, 'maximum': 10, 'uncertainty_type': 5}
... )
array([(2.0, 3.0, nan, nan, nan, False, 3), (5.0, nan, nan, 3.0, 10.0, False, 5)],
      dtype=[('loc', '<f8'), ('scale', '<f8'), ('shape', '<f8'), ('minimum', '<f8'), ('maximum', '<f8'), ('negative', '?'), ('uncertainty_type', 'u1')])
- Args:
- One or more dictionaries.
- Returns:
- A Heterogeneous parameter array
-
classmethod from_tuples(*data)
Construct a Heterogeneous parameter array from parameter tuples.
The order of the parameters is:
loc
scale
shape
minimum
maximum
negative
uncertainty_type
Each input tuple must have a length of exactly 7. For more flexibility, use from_dicts.
Example:
>>> from stats_arrays import UncertaintyBase
>>> import numpy as np
>>> UncertaintyBase.from_tuples(
...     (2, 3, np.nan, np.nan, np.nan, False, 3),
...     (5, np.nan, np.nan, 3, 10, False, 5)
... )
array([(2.0, 3.0, nan, nan, nan, False, 3), (5.0, nan, nan, 3.0, 10.0, False, 5)],
      dtype=[('loc', '<f8'), ('scale', '<f8'), ('shape', '<f8'), ('minimum', '<f8'), ('maximum', '<f8'), ('negative', '?'), ('uncertainty_type', 'u1')])
- Args:
- One or more tuples of length 7.
- Returns:
- A Heterogeneous parameter array
-
classmethod pdf(params, *args, **kwargs)
Provide a standard interface to calculate the probability density function of an uncertainty distribution. By default, this uses cls.default_number_points_in_pdf points spanning the minimum to maximum range if bounds are present, or a range of cls.standard_deviations_in_default_range standard deviations if not.
Inputs
- params : A one-row Parameter array.
- xs : Optional. A one-dimensional numpy array of input values.
Output
Important
The output format for PDF is different from that of CDF or PPF.
A tuple of a vector of x values and a vector of y values. Y values are a one-dimensional array of probability densities, bounded on (0,1), with the same length as xs, if provided, or cls.default_number_points_in_pdf otherwise.
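The default x range and the tuple return format can be sketched for a normal distribution as follows. This is a standard-library illustration under stated assumptions, not the stats_arrays code: normal_pdf is a hypothetical helper, the two module constants mirror the class defaults listed earlier, and centering the unbounded range on loc is an assumption.

```python
from statistics import NormalDist

DEFAULT_NUMBER_POINTS_IN_PDF = 200        # mirrors default_number_points_in_pdf
STANDARD_DEVIATIONS_IN_DEFAULT_RANGE = 3  # mirrors standard_deviations_in_default_range


def normal_pdf(loc, scale, minimum=None, maximum=None, xs=None):
    """Return (xs, ys): x values and the normal density at each x."""
    dist = NormalDist(loc, scale)
    if xs is None:
        spread = STANDARD_DEVIATIONS_IN_DEFAULT_RANGE * scale
        # Use a bound where one is given, otherwise fall back to loc +/- spread.
        low = minimum if minimum is not None else loc - spread
        high = maximum if maximum is not None else loc + spread
        step = (high - low) / (DEFAULT_NUMBER_POINTS_IN_PDF - 1)
        xs = [low + i * step for i in range(DEFAULT_NUMBER_POINTS_IN_PDF)]
    ys = [dist.pdf(x) for x in xs]
    return xs, ys


xs, ys = normal_pdf(loc=2.0, scale=3.0)
```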
-
classmethod ppf(params, percentages)
Return percent point function (inverse of CDF, i.e. the value in the distribution where x percent of the distribution is less than that value) for various distributions.
Inputs
- params : A Parameter array.
- percentages : An array of percentages, bounded on (0,1). Each row in percentages corresponds to a row in params.
Output
An array of values within the ranges of each distribution, with params rows and percentages columns.
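The row-by-row correspondence between params and percentages can be sketched with nested lists and the standard library. normal_ppf is a hypothetical helper for normal distributions only, and each params row is reduced to a (loc, scale) pair for the purpose of the example.

```python
from statistics import NormalDist


def normal_ppf(params, percentages):
    """params: list of (loc, scale) rows; percentages: one row per params row."""
    return [
        [NormalDist(loc, scale).inv_cdf(p) for p in row]
        for (loc, scale), row in zip(params, percentages)
    ]


params = [(0.0, 1.0), (5.0, 2.0)]
percentages = [[0.025, 0.5, 0.975], [0.025, 0.5, 0.975]]
values = normal_ppf(params, percentages)  # 2 rows (params) by 3 columns (percentages)
```

Applying the CDF to each returned value recovers the original percentage, which is what makes the two functions inverses of each other.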
-
classmethod random_variables(params, size, seeded_random=None)
Generate random variables for the given uncertainty. Should not check to ensure that random samples are within the (minimum, maximum) bounds. Bounds checking is provided by the bounded_random_variables class method.
Inputs
- params : A Parameter array.
- size : Integer. The number of values to draw from each distribution in params.
- seeded_random : Integer. Optional.
Output
An array of random values, with dimensions params rows by size.
-
classmethod statistics(params, *args, **kwargs)
Build a dictionary of mean, mode, median, and 95% confidence interval upper and lower values.
Inputs
- params : A one-row Parameter array.
Output
{'mean': mean value, 'mode': mode value, 'median': median value, 'upper': upper limit value, 'lower': lower limit value}. All values should be floats (not single-element arrays). Values that are not defined should be returned as None, not omitted.
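The shape of this dictionary can be shown with two small standard-library sketches. Both functions are hypothetical examples rather than stats_arrays methods; the uniform case is included only to show an undefined value (the mode) being returned as None instead of being left out.

```python
from statistics import NormalDist


def normal_statistics(loc, scale):
    """Statistics for a normal distribution; every entry is a plain float."""
    dist = NormalDist(loc, scale)
    return {
        "mean": float(loc),
        "mode": float(loc),
        "median": float(loc),
        "lower": dist.inv_cdf(0.025),  # lower end of the 95% interval
        "upper": dist.inv_cdf(0.975),  # upper end of the 95% interval
    }


def uniform_statistics(minimum, maximum):
    """Statistics for a uniform distribution; the mode is not defined."""
    width = maximum - minimum
    return {
        "mean": minimum + width / 2,
        "mode": None,  # undefined, but the key is still present
        "median": minimum + width / 2,
        "lower": minimum + 0.025 * width,
        "upper": minimum + 0.975 * width,
    }


normal_stats = normal_statistics(2.0, 3.0)
uniform_stats = uniform_statistics(3.0, 10.0)
```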
-
classmethod validate(params)
Validate the parameter array for uncertainty distribution.
Validation is distribution specific. The only default check is that minimum is less than or equal to maximum, and otherwise raises stats_arrays.ImproperBoundsError.
Doesn’t return anything.
- Args:
- A Parameter array.
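The default bounds check can be sketched in plain Python, with each row given as a dictionary. validate_bounds and the local ImproperBoundsError class are hypothetical stand-ins, and treating a missing bound as NaN is an assumption based on the NaN values shown in the parameter array examples.

```python
import math


class ImproperBoundsError(Exception):
    """Hypothetical stand-in for stats_arrays.ImproperBoundsError."""


def validate_bounds(rows):
    """Raise ImproperBoundsError if any row has minimum greater than maximum."""
    for index, row in enumerate(rows):
        minimum = row.get("minimum", math.nan)
        maximum = row.get("maximum", math.nan)
        # NaN marks a bound that is not set. Comparisons with NaN are False,
        # so rows with a missing bound pass the check.
        if minimum > maximum:
            raise ImproperBoundsError(
                f"row {index}: minimum {minimum} is greater than maximum {maximum}"
            )


validate_bounds([
    {"loc": 2, "scale": 3},
    {"loc": 5, "minimum": 3, "maximum": 10},
])
```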
BoundedUncertaintyBase
-
class stats_arrays.BoundedUncertaintyBase
Bases: stats_arrays.distributions.base.UncertaintyBase
An uncertainty distribution where minimum and maximum bounds are required. No bounds checking is required for these distributions, as bounds are integral inputs into the sample space generator.
-
classmethod bounded_random_variables(params, size, seeded_random=None, maximum_iterations=None)
No bounds checking because the bounds do not exclude any of the distribution.
-
classmethod cdf(params, vector)
Used when a distribution is bounded, to determine where to begin or end the percentages used in calculating hypercube sampling space.
Inputs
- params : A Parameter array.
- vector : An array of values taken from the uncertainty distributions, with one row or the same number of rows as params.
Output
An array of cumulative densities, bounded on (0,1), with params rows and vector columns.
-
classmethod check_2d_inputs(params, vector)
Convert vector to 2 dimensions if not already, and raise stats_arrays.InvalidParamsError if vector and params dimensions don’t match.
-
classmethod check_bounds_reasonableness(params, *args, **kwargs)
Always true because the bounds do not exclude any of the distribution.
-
classmethod from_dicts(*dicts)
Construct a Heterogeneous parameter array from parameter dictionaries.
Dictionary keys are the normal parameter array columns. Each distribution defines which columns are required and which are optional.
Example:
>>> from stats_arrays import UncertaintyBase
>>> import numpy as np
>>> UncertaintyBase.from_dicts(
...     {'loc': 2, 'scale': 3, 'uncertainty_type': 3},
...     {'loc': 5, 'minimum': 3, 'maximum': 10, 'uncertainty_type': 5}
... )
array([(2.0, 3.0, nan, nan, nan, False, 3), (5.0, nan, nan, 3.0, 10.0, False, 5)],
      dtype=[('loc', '<f8'), ('scale', '<f8'), ('shape', '<f8'), ('minimum', '<f8'), ('maximum', '<f8'), ('negative', '?'), ('uncertainty_type', 'u1')])
- Args:
- One or more dictionaries.
- Returns:
- A Heterogeneous parameter array
-
classmethod from_tuples(*data)
Construct a Heterogeneous parameter array from parameter tuples.
The order of the parameters is:
loc
scale
shape
minimum
maximum
negative
uncertainty_type
Each input tuple must have a length of exactly 7. For more flexibility, use from_dicts.
Example:
>>> from stats_arrays import UncertaintyBase
>>> import numpy as np
>>> UncertaintyBase.from_tuples(
...     (2, 3, np.nan, np.nan, np.nan, False, 3),
...     (5, np.nan, np.nan, 3, 10, False, 5)
... )
array([(2.0, 3.0, nan, nan, nan, False, 3), (5.0, nan, nan, 3.0, 10.0, False, 5)],
      dtype=[('loc', '<f8'), ('scale', '<f8'), ('shape', '<f8'), ('minimum', '<f8'), ('maximum', '<f8'), ('negative', '?'), ('uncertainty_type', 'u1')])
- Args:
- One or more tuples of length 7.
- Returns:
- A Heterogeneous parameter array
-
classmethod pdf(params, *args, **kwargs)
Provide a standard interface to calculate the probability density function of an uncertainty distribution. By default, this uses cls.default_number_points_in_pdf points spanning the minimum to maximum range if bounds are present, or a range of cls.standard_deviations_in_default_range standard deviations if not.
Inputs
- params : A one-row Parameter array.
- xs : Optional. A one-dimensional numpy array of input values.
Output
Important
The output format for PDF is different from that of CDF or PPF.
A tuple of a vector of x values and a vector of y values. Y values are a one-dimensional array of probability densities, bounded on (0,1), with the same length as xs, if provided, or cls.default_number_points_in_pdf otherwise.
-
classmethod ppf(params, percentages)
Return percent point function (inverse of CDF, i.e. the value in the distribution where x percent of the distribution is less than that value) for various distributions.
Inputs
- params : A Parameter array.
- percentages : An array of percentages, bounded on (0,1). Each row in percentages corresponds to a row in params.
Output
An array of values within the ranges of each distribution, with params rows and percentages columns.
-
classmethod random_variables(params, size, seeded_random=None)
Generate random variables for the given uncertainty. Should not check to ensure that random samples are within the (minimum, maximum) bounds. Bounds checking is provided by the bounded_random_variables class method.
Inputs
- params : A Parameter array.
- size : Integer. The number of values to draw from each distribution in params.
- seeded_random : Integer. Optional.
Output
An array of random values, with dimensions params rows by size.
-
classmethod rescale(params)
Rescale params to a (0,1) interval. Return adjusted_means and scale. Needed because SciPy assumes a (0,1) interval for many distributions.
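One plausible reading of this rescaling, sketched for a single row in plain Python: the scale is the width of the bounds, and the adjusted mean is the position of loc within them. The function name rescale_to_unit_interval is hypothetical and the formulas are an assumption about what adjusted_means and scale represent, not a copy of the library code.

```python
def rescale_to_unit_interval(loc, minimum, maximum):
    """Map loc from the [minimum, maximum] interval onto [0, 1]."""
    scale = maximum - minimum               # width of the original interval
    adjusted_mean = (loc - minimum) / scale  # position of loc within [0, 1]
    return adjusted_mean, scale


adjusted_mean, scale = rescale_to_unit_interval(loc=5.0, minimum=3.0, maximum=10.0)

# A value u on the unit interval maps back to the original range like this:
restored_loc = 3.0 + adjusted_mean * scale
```

With these two numbers, a distribution defined on the unit interval can be evaluated and its results shifted and stretched back to the original bounds.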
-
classmethod statistics(params, *args, **kwargs)
Build a dictionary of mean, mode, median, and 95% confidence interval upper and lower values.
Inputs
- params : A one-row Parameter array.
Output
{'mean': mean value, 'mode': mode value, 'median': median value, 'upper': upper limit value, 'lower': lower limit value}. All values should be floats (not single-element arrays). Values that are not defined should be returned as None, not omitted.
-
classmethod