API Reference
This section provides a detailed overview of the functions, classes, and submodules available in the powerlawrs package.
High-Level API
These are the primary components you will interact with.
powerlawrs: A Python package for analyzing power-law distributions.
- class powerlawrs.Powerlaw(data)
A class to fit and analyze power-law distributions in a given dataset.
Distributions (powerlawrs.dist)
Modules related to statistical distributions.
- class powerlawrs.dist.exponential.Exponential
A Python-compatible wrapper for the Exponential struct from the powerlaw crate.
Creates a new Exponential distribution instance.
- Parameters:
lambda (float) – The rate parameter of the distribution. Must be > 0.
x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.
Example:
    import powerlawrs

    # `lambda` is a reserved word in Python, so pass it by unpacking a dict.
    dist = powerlawrs.dist.exponential.Exponential(**{"lambda": 0.5}, x_min=1.0)
    pdf_val = dist.pdf(2.0)
This class does not contain any logic itself; it calls the underlying Rust implementation.
- ccdf(x)
Calls the underlying ccdf method from the powerlaw crate.
- cdf(x)
Calls the underlying cdf method from the powerlaw crate.
- loglikelihood(x)
Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.
- name()
Set the name of the distribution
- parameters()
Fitted distribution parameters
- pdf(x)
Calls the underlying pdf method from the powerlaw crate.
- rv(u)
Calls the underlying rv method from the powerlaw crate.
- Parameters:
u (float) – A random number from a Uniform(0, 1) distribution.
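Because rv(u) takes a Uniform(0, 1) draw, it reads like inverse-transform sampling. The pure-Python sketch below illustrates that idea for an exponential distribution shifted to start at x_min. The shifted parameterization and the helper names exponential_ccdf and exponential_rv are assumptions made for illustration, not the crate's confirmed implementation.

```python
import math


def exponential_ccdf(x, lam, x_min):
    """P(X > x) for an exponential with rate `lam` starting at `x_min` (assumed form)."""
    if x < x_min:
        return 1.0
    return math.exp(-lam * (x - x_min))


def exponential_rv(u, lam, x_min):
    """Map a Uniform(0, 1) draw `u` to a sample by inverting the CDF."""
    # CDF: F(x) = 1 - exp(-lam * (x - x_min)); solving F(x) = u for x gives:
    return x_min - math.log(1.0 - u) / lam


x = exponential_rv(0.5, lam=0.5, x_min=1.0)
# Evaluating the CDF at the sample recovers the uniform draw.
recovered_u = 1.0 - exponential_ccdf(x, lam=0.5, x_min=1.0)
```

Feeding the same u into the inverse CDF always yields the same sample, which makes simulations reproducible when the uniform draws are seeded.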
- class powerlawrs.dist.powerlaw.Powerlaw(alpha, x_min)
A Python-compatible wrapper for the Powerlaw struct from the powerlaw crate.
Represents a generic Power-Law distribution where the probability density function (PDF) is: f(x) = C * x^(-alpha)
This simplifies to a Pareto Type I distribution. Note: The alpha parameter here is the power-law exponent. It is equal to 1 + alpha_pareto, where alpha_pareto is the shape parameter of the standard Pareto Type I distribution.
- Parameters:
alpha (float) – The scaling exponent of the distribution. Must be > 1.
x_min (float) – The minimum value of the distribution. Must be > 0.
Example:
    import powerlawrs

    # Create a distribution with exponent 2.5 (equivalent to Pareto alpha 1.5)
    dist = powerlawrs.dist.powerlaw.Powerlaw(alpha=2.5, x_min=1.0)
    pdf_val = dist.pdf(2.0)
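The relationship between the power-law exponent and the Pareto shape parameter can be checked numerically. The sketch below is plain Python and does not call powerlawrs. It assumes the density is normalized on [x_min, infinity), which gives C = (alpha - 1) * x_min^(alpha - 1). The function names are hypothetical.

```python
def powerlaw_pdf(x, alpha, x_min):
    """Power-law density f(x) = C * x**(-alpha), normalized on [x_min, inf)."""
    # Normalizing constant under the stated assumption.
    c = (alpha - 1.0) * x_min ** (alpha - 1.0)
    return c * x ** (-alpha)


def pareto_pdf(x, alpha_pareto, x_min):
    """Pareto Type I density with shape `alpha_pareto` and scale `x_min`."""
    return alpha_pareto * x_min ** alpha_pareto / x ** (alpha_pareto + 1.0)


alpha = 2.5
p_powerlaw = powerlaw_pdf(2.0, alpha, 1.0)
# Same density expressed with the Pareto shape alpha_pareto = alpha - 1.
p_pareto = pareto_pdf(2.0, alpha - 1.0, 1.0)
```

Both calls evaluate the same density, which is why an exponent of 2.5 corresponds to a Pareto shape of 1.5.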
- ccdf(x)
Calls the underlying ccdf method from the powerlaw crate.
- cdf(x)
Calls the underlying cdf method from the powerlaw crate.
- loglikelihood(x)
Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.
- name()
Set the name of the distribution
- parameters()
Fitted distribution parameters
- pdf(x)
Calls the underlying pdf method from the powerlaw crate.
- rv(u)
Calls the underlying rv method from the powerlaw crate.
- Parameters:
u (float) – A random number from a Uniform(0, 1) distribution.
- powerlawrs.dist.powerlaw.alpha_hat(data, x_min)
Python wrapper that calls the alpha_hat function from the powerlaw crate.
Calculates the maximum likelihood estimate (MLE) for the alpha parameter of a generic Power-Law distribution. This returns the power-law exponent (f(x) ~ x^-alpha).
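For orientation, the standard continuous power-law MLE from Clauset et al. is alpha = 1 + n / sum(ln(x_i / x_min)), taken over observations at or above x_min. The sketch below implements that textbook estimator in plain Python. Whether alpha_hat filters the data to the tail itself, or uses exactly this formula, is an assumption. The name alpha_hat_sketch is hypothetical.

```python
import math


def alpha_hat_sketch(data, x_min):
    """Continuous power-law MLE: 1 + n / sum(ln(x_i / x_min)) over x_i >= x_min."""
    tail = [x for x in data if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail observations equal x_min; estimate is undefined")
    return 1.0 + len(tail) / log_sum


# The value 0.5 lies below x_min and is ignored in this sketch.
data = [0.5, math.e, math.e, math.e ** 2]
estimate = alpha_hat_sketch(data, x_min=1.0)
```

Here the three tail observations have log-ratios of 1, 1, and 2, so the estimate is 1 + 3/4.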
- class powerlawrs.dist.lognormal.Lognormal(mu, sigma, x_min=0.0)
A Python-compatible wrapper for the Lognormal struct from the powerlaw crate.
The Lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed.
- Parameters:
mu (float) – The mean of the underlying normal distribution.
sigma (float) – The standard deviation of the underlying normal distribution. Must be > 0.
x_min (float, optional) – The minimum value of the distribution (truncation point). Defaults to 0.0.
Example:
    import powerlawrs

    dist = powerlawrs.dist.lognormal.Lognormal(mu=0.0, sigma=1.0, x_min=1.0)
    pdf_val = dist.pdf(2.0)
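To make the role of the truncation point concrete, the sketch below shows one common way to truncate a lognormal density at x_min: divide the ordinary density by the probability mass above x_min. This is a plain-Python illustration under that assumption. The crate's exact handling of x_min is not confirmed here, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_pdf_truncated(x, mu, sigma, x_min=0.0):
    """Lognormal density renormalized to the region x >= x_min (assumed form)."""
    if x <= 0.0 or x < x_min:
        return 0.0
    z = (math.log(x) - mu) / sigma
    base = math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))
    if x_min <= 0.0:
        return base  # no truncation applied
    # Probability mass of the untruncated distribution above x_min.
    tail_mass = 1.0 - normal_cdf((math.log(x_min) - mu) / sigma)
    return base / tail_mass


untruncated = lognormal_pdf_truncated(2.0, mu=0.0, sigma=1.0)
truncated = lognormal_pdf_truncated(2.0, mu=0.0, sigma=1.0, x_min=1.0)
```

With mu=0 and x_min=1, exactly half of the untruncated mass lies above x_min, so the truncated density is twice the untruncated one.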
- ccdf(x)
Calls the underlying ccdf method from the powerlaw crate.
- cdf(x)
Calls the underlying cdf method from the powerlaw crate.
- static from_fitment(data, fitment)
Creates a new Lognormal distribution by fitting it to data using the results of a Pareto fit.
This method uses the Newton-Raphson method to find the maximum likelihood estimates for mu and sigma, accounting for the truncation at x_min.
- Parameters:
data (list[float]) – The dataset to fit.
fitment (ParetoFit) – The result of a previous Pareto fit (used for x_min).
- Returns:
A new Lognormal instance with fitted parameters.
- Return type:
Lognormal
- loglikelihood(x)
Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.
- name()
Set the name of the distribution
- parameters()
Fitted distribution parameters
- pdf(x)
Calls the underlying pdf method from the powerlaw crate.
- rv(u)
Calls the underlying rv method from the powerlaw crate.
- Parameters:
u (float) – A random number from a Uniform(0, 1) distribution.
- class powerlawrs.dist.pareto.Pareto(alpha, x_min)
Bases: object
A Python-compatible wrapper for the Pareto struct from the powerlaw crate.
Creates a new Pareto Type I distribution instance.
- Parameters:
alpha (float) – The shape parameter of the distribution. Must be > 0.
x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.
Example:
    import powerlawrs

    dist = powerlawrs.dist.pareto.Pareto(alpha=2.5, x_min=1.0)
    pdf_val = dist.pdf(2.0)
This class does not contain any logic itself; it calls the underlying Rust implementation.
- alpha
- ccdf(x)
Calls the underlying ccdf method from the powerlaw crate.
- cdf(x)
Calls the underlying cdf method from the powerlaw crate.
- static from_fitment(fitment)
Creates a Pareto distribution directly from a Fitment result.
This allows for a clean conversion from the results of a goodness-of-fit test to a concrete distribution instance.
- loglikelihood(x)
Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.
- name()
Set the name of the distribution
- parameters()
Fitted distribution parameters
- pdf(x)
Calls the underlying pdf method from the powerlaw crate.
- rv(u)
Calls the underlying rv method from the powerlaw crate.
- Parameters:
u (float) – A random number from a Uniform(0, 1) distribution.
- x_min
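The note that log-likelihoods "are not summed" suggests one value per observation. The plain-Python sketch below illustrates this for a Pareto Type I distribution, using ln f(x) = ln(alpha) + alpha * ln(x_min) - (alpha + 1) * ln(x). It assumes loglikelihood takes a list and returns a list of the same length. The function name is hypothetical.

```python
import math


def pareto_loglikelihoods(data, alpha, x_min):
    """Per-observation Pareto Type I log-likelihoods (one value per data point)."""
    return [
        math.log(alpha) + alpha * math.log(x_min) - (alpha + 1.0) * math.log(x)
        for x in data
    ]


per_point = pareto_loglikelihoods([1.0, 2.0, 4.0], alpha=2.5, x_min=1.0)
# Sum explicitly to obtain the log-likelihood of the whole sample.
total = sum(per_point)
```

Keeping the values separate is convenient for likelihood-ratio comparisons between distributions, where pointwise differences are needed before aggregation.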
Statistics (powerlawrs.stats)
Modules for statistical analysis and random number generation.
- powerlawrs.stats.descriptive.mean(data)
Calculates the arithmetic mean of a vector.
Example:
    import powerlawrs

    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    mu = powerlawrs.stats.descriptive.mean(data)
    # mu is 3.0
- powerlawrs.stats.descriptive.variance(data, ddof)
Calculates the variance of a vector, where ddof is the degrees of freedom. If ddof=1, the sample variance is returned; otherwise the population variance is returned.
Example:
    import powerlawrs

    data = [1.0, 2.0, 3.0, 4.0, 5.0]

    # Population variance (ddof=0)
    sigma_sq_pop = powerlawrs.stats.descriptive.variance(data, 0)
    # sigma_sq_pop is 2.0

    # Sample variance (ddof=1)
    sigma_sq_samp = powerlawrs.stats.descriptive.variance(data, 1)
    # sigma_sq_samp is 2.5
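The two documented values can be reproduced with a short plain-Python sketch. It assumes the NumPy-style convention of dividing the sum of squared deviations by n - ddof. The documentation only distinguishes ddof=1 from other values, so the general divisor is an assumption, and variance_sketch is a hypothetical name.

```python
def variance_sketch(data, ddof):
    """Variance with divisor n - ddof (assumed NumPy-style convention)."""
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` observations")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0, 5.0]
population = variance_sketch(data, 0)  # sum of squares 10.0 divided by 5
sample = variance_sketch(data, 1)      # sum of squares 10.0 divided by 4
```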
- powerlawrs.stats.random.random_choice(data, size)
Samples n elements (n = size) from data uniformly at random, with replacement.
Example:
    import powerlawrs

    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    # Get 10 random samples from data with replacement
    samples = powerlawrs.stats.random.random_choice(data, 10)
    # len(samples) will be 10
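A rough plain-Python equivalent is sketched below, reading the description as "each element is equally likely on every draw". It uses the standard library's random module rather than the crate's generator. The optional rng argument and the function name are assumptions added for reproducibility in this example.

```python
import random


def random_choice_sketch(data, size, rng=None):
    """Draw `size` elements from `data` with replacement, each equally likely."""
    if not data:
        raise ValueError("data must not be empty")
    rng = rng or random.Random()
    # randrange(len(data)) picks every index with equal probability.
    return [data[rng.randrange(len(data))] for _ in range(size)]


data = [1.0, 2.0, 3.0, 4.0, 5.0]
samples = random_choice_sketch(data, 10, rng=random.Random(0))
```

Because sampling is with replacement, size may exceed len(data) and values can repeat.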
- powerlawrs.stats.random.random_uniform(n)
Generate n random variates from U(0,1).
- powerlawrs.stats.ks.ks_1sam_sorted(sorted_x, cdf_func)
1 sample KS test based on a known cdf.
- Parameters:
sorted_x (list[float]) – A list of data points, pre-sorted in ascending order.
cdf_func (callable) – A Python function (or lambda) that takes a single float (x) and returns its cumulative probability F(x) as a float.
- Returns:
A tuple containing (D+, D-, D).
- Return type:
tuple[float, float, float]
- Raises:
ValueError – If the list is empty.
TypeError – If the cdf_func does not return a float.
(Any exception) – Any exception raised by the cdf_func will be propagated.
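The returned tuple (D+, D-, D) matches the usual one-sample Kolmogorov-Smirnov statistics. The sketch below computes them in plain Python using the standard definitions D+ = max(i/n - F(x_i)) and D- = max(F(x_i) - (i-1)/n) for i = 1..n. It is an illustration of those definitions, not the crate's code, and the function name is hypothetical.

```python
def ks_1sam_sorted_sketch(sorted_x, cdf_func):
    """Return (D+, D-, D) for pre-sorted data against a known CDF."""
    n = len(sorted_x)
    if n == 0:
        raise ValueError("sorted_x must not be empty")
    # enumerate() is zero-based, so (i + 1) / n is the ECDF just after x_i
    # and i / n is the ECDF just before x_i.
    d_plus = max((i + 1) / n - cdf_func(x) for i, x in enumerate(sorted_x))
    d_minus = max(cdf_func(x) - i / n for i, x in enumerate(sorted_x))
    return d_plus, d_minus, max(d_plus, d_minus)


# Uniform(0, 1) CDF on [0, 1] is the identity function.
d_plus, d_minus, d = ks_1sam_sorted_sketch([0.1, 0.4, 0.7], lambda x: x)
```

In this example the largest gap occurs at the last point, where the empirical CDF reaches 1 while the uniform CDF is 0.7.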
Utilities (powerlawrs.util)
Helper functions and simulation tools.
- powerlawrs.util.erf(x)
Computes the error function of x, often denoted as erf(x).
The error function is a special function of sigmoid shape that occurs in probability, statistics, and partial differential equations. It is defined as:
\[\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\]
Example:
    import powerlawrs

    result = powerlawrs.util.erf(0.0)
    # result is 0.0
    result = powerlawrs.util.erf(1.0)
    # result is approx 0.8427
- powerlawrs.util.linspace(start, end, n)
Returns n evenly spaced numbers over a specified interval. Motivated by numpy’s linspace.
Example:
    import powerlawrs

    numbers = powerlawrs.util.linspace(0.0, 1.0, 5)
    # numbers is [0.0, 0.25, 0.5, 0.75, 1.0]
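The spacing rule behind this output can be sketched in a few lines of plain Python. The sketch assumes both endpoints are included, as the documented example shows. Its handling of n <= 1 is a choice made for this illustration, and linspace_sketch is a hypothetical name.

```python
def linspace_sketch(start, end, n):
    """Return n evenly spaced values from start to end, endpoints included."""
    if n <= 0:
        return []
    if n == 1:
        return [start]
    # n points leave n - 1 equal gaps between start and end.
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


numbers = linspace_sketch(0.0, 1.0, 5)
```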
- powerlawrs.util.sim.calculate_sim_params(prec, data, x_min)
Calculates the number of simulations, the number of samples per simulation, the size of the tail given a predetermined x_min, and the probability of the tail event. The methodology follows Section 4.1 of Clauset, Aaron, et al. ‘Power-Law Distributions in Empirical Data’. SIAM Review, vol. 51, no. 4, Society for Industrial & Applied Mathematics (SIAM), Nov. 2009, pp. 661–703, [doi:10.48550/ARXIV.0706.1062](https://doi.org/10.48550/arXiv.0706.1062).
The number of simulations required for a desired precision prec in the estimate is 1/4 * prec^(-2). For example, 1/4 * 0.01^(-2) = 2500 simulations gives accuracy within 0.01.
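The quantities described above can be sketched in plain Python as follows. The dictionary keys, the use of x >= x_min to define the tail, and taking the samples per simulation to be len(data) are all assumptions made for illustration. The actual return type of calculate_sim_params is not documented here.

```python
import math


def sim_params_sketch(prec, data, x_min):
    """Illustrative simulation parameters following the 1/4 * prec^(-2) rule."""
    n = len(data)
    n_tail = sum(1 for x in data if x >= x_min)
    return {
        # Round up so the requested precision is met or exceeded.
        "num_sims": math.ceil(0.25 * prec ** -2),
        "num_samples": n,        # assumed: each synthetic dataset matches the input size
        "n_tail": n_tail,        # observations at or above x_min
        "p_tail": n_tail / n,    # empirical probability of a tail event
    }


params = sim_params_sketch(0.01, [0.5, 0.8, 1.0, 2.0, 3.0], x_min=1.0)
```

Halving prec quadruples the number of simulations, so tight precision targets become expensive quickly.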
- powerlawrs.util.sim.generate_synthetic_datasets(data, x_min, sim_params, alpha)
Generates multiple synthetic datasets using a hybrid model based on the input data and a proposed Pareto Type I fit. This process is fully parallelized, with M simulations running concurrently on separate threads.
Each simulated dataset (of size ‘n’) is constructed by mixing two sampling mechanisms:
1. Sampling from the ‘lower’ part of the original data (where x < x_min).
2. Sampling from a Pareto Type I distribution (defined by x_min and alpha).
The probability of selecting the Pareto tail is controlled by ‘p_tail’.
This approach is commonly used in bootstrapping or simulation studies for extreme value analysis.
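The hybrid sampling scheme can be sketched in plain Python as below. This version runs sequentially rather than in parallel, treats alpha as the Pareto Type I shape parameter, and falls back to the Pareto branch when no data lies below x_min. All three are assumptions of this illustration, and the function name and explicit n, p_tail, and rng arguments are hypothetical.

```python
import random


def synthetic_dataset_sketch(data, x_min, alpha, n, p_tail, rng):
    """Build one synthetic dataset of size n by mixing empirical and Pareto draws."""
    lower = [x for x in data if x < x_min]
    out = []
    for _ in range(n):
        if not lower or rng.random() < p_tail:
            # Pareto Type I draw by inverse transform; 1 - u lies in (0, 1].
            out.append(x_min * (1.0 - rng.random()) ** (-1.0 / alpha))
        else:
            # Resample the body of the distribution from the observed data.
            out.append(rng.choice(lower))
    return out


data = [0.2, 0.5, 0.8, 1.0, 2.0, 3.0]
rng = random.Random(42)
datasets = [
    synthetic_dataset_sketch(data, x_min=1.0, alpha=1.5, n=len(data), p_tail=0.5, rng=rng)
    for _ in range(3)
]
```

Every generated value is either one of the observed points below x_min or a Pareto draw at or above x_min, so the synthetic data keeps the empirical body while following the fitted tail.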