API Reference

This section provides a detailed overview of the functions, classes, and submodules available in the powerlawrs package.

High-Level API

These are the primary components you will interact with.

powerlawrs: A Python package for analyzing power-law distributions.

class powerlawrs.Powerlaw(data)

A class to fit and analyze power-law distributions in a given dataset.

fit()

Fits the data to a power-law distribution.

This method finds the optimal x_min and alpha parameters for the power-law fit and assesses the goodness of fit. The results are stored in the object’s attributes.
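The search for x_min and alpha can be pictured with a small pure-Python sketch. It assumes the approach of Clauset et al.: try each distinct data value as x_min, estimate alpha by maximum likelihood on the tail, and keep the candidate with the smallest Kolmogorov-Smirnov distance. The function name fit_sketch and its return tuple are hypothetical; this is not the package's implementation, and it assumes strictly positive data.

```python
import math

def fit_sketch(data):
    """Return (x_min, alpha, ks) for the candidate x_min with the smallest KS distance."""
    xs = sorted(data)  # assumes all values are > 0
    best = None
    # Every distinct value except the largest is a candidate x_min.
    for x_min in sorted(set(xs))[:-1]:
        tail = [x for x in xs if x >= x_min]
        n = len(tail)
        # Maximum likelihood estimate of the power-law exponent for this tail.
        alpha = 1.0 + n / sum(math.log(x / x_min) for x in tail)
        # KS distance between the empirical CDF and the model CDF on the tail.
        ks = 0.0
        for i, x in enumerate(tail):
            model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
            ks = max(ks, (i + 1) / n - model_cdf, model_cdf - i / n)
        if best is None or ks < best[2]:
            best = (x_min, alpha, ks)
    return best

# Deterministic input: exact quantiles of a power law with exponent 2.5 and x_min 1.
data = [(1.0 - (i + 0.5) / 200) ** (-1.0 / 1.5) for i in range(200)]
x_min, alpha, ks = fit_sketch(data)
```

On this input the recovered exponent lands close to 2.5. The scan is quadratic in the number of observations, which is acceptable for a sketch but not for large datasets.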

plot(): Plots the complementary cumulative distribution function (CCDF) of the data together with the fitted model. Two plots are shown: one for the entire distribution and one for the tail only.

powerlawrs.fit(data)

Fits the data to a power-law distribution.

This function is a convenience wrapper that instantiates the Powerlaw class, fits the data, and returns the ParetoFit results.

Parameters: data (list[float]) – The dataset to analyze.
Returns: The ParetoFit result object.

Distributions (powerlawrs.dist)

Modules related to statistical distributions.

class powerlawrs.dist.exponential.Exponential

A Python-compatible wrapper for the Exponential struct from the powerlaw crate.

Creates a new Exponential distribution instance.

Parameters:

lambda (float) – The rate parameter of the distribution. Must be > 0.
x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.

Example:

import powerlawrs
# lambda is a reserved word in Python, so the arguments are passed positionally
dist = powerlawrs.dist.exponential.Exponential(0.5, 1.0)
pdf_val = dist.pdf(2.0)

The class contains no logic of its own; it calls the underlying Rust implementation.

ccdf(x): Calls the underlying ccdf method from the powerlaw crate.

cdf(x): Calls the underlying cdf method from the powerlaw crate.

loglikelihood(x): Calculates the log-likelihood of the data given the distribution. Note: the log-likelihoods are not summed; one value is returned per data point.
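Because the log-likelihoods are not summed, the caller adds them up when a single total is needed. The sketch below illustrates this in plain Python. It assumes a shifted exponential density f(x) = lambda * exp(-lambda * (x - x_min)) for x >= x_min, which may differ from the crate's exact form, and the helper name is hypothetical.

```python
import math

def exponential_loglikelihoods(data, lam, x_min):
    """Per-observation log-density of a shifted exponential (points assumed >= x_min)."""
    return [math.log(lam) - lam * (x - x_min) for x in data]

lls = exponential_loglikelihoods([1.0, 2.0, 3.0], lam=0.5, x_min=1.0)
total = sum(lls)  # a single log-likelihood for the whole sample
```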

name(): Returns the name of the distribution.

parameters(): Returns the fitted distribution parameters.

pdf(x): Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters: u (float) – A random number from a Uniform(0, 1) distribution.

class powerlawrs.dist.powerlaw.Powerlaw(alpha, x_min)

A Python-compatible wrapper for the Powerlaw struct from the powerlaw crate.

Represents a generic power-law distribution whose probability density function (PDF) is f(x) = C * x^(-alpha) for x >= x_min, where C is the normalizing constant.

This is equivalent to a Pareto Type I distribution. Note: the alpha parameter here is the power-law exponent. It is equal to 1 + alpha_pareto, where alpha_pareto is the shape parameter of the standard Pareto Type I distribution.

Parameters:

alpha (float) – The scaling exponent of the distribution. Must be > 1.
x_min (float) – The minimum value of the distribution. Must be > 0.

Example:

import powerlawrs
# Create a distribution with exponent 2.5 (equivalent to Pareto alpha 1.5)
dist = powerlawrs.dist.powerlaw.Powerlaw(alpha=2.5, x_min=1.0)
pdf_val = dist.pdf(2.0)
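The relation alpha = 1 + alpha_pareto can be checked numerically. The sketch below writes both densities in plain Python, using the standard normalizing constant C = (alpha - 1) * x_min^(alpha - 1) for a continuous power law on x >= x_min. The function names are illustrative and are not part of powerlawrs.

```python
def powerlaw_pdf(x, alpha, x_min):
    """f(x) = C * x^(-alpha) with C = (alpha - 1) * x_min^(alpha - 1), for x >= x_min."""
    if x < x_min:
        return 0.0
    return (alpha - 1.0) * x_min ** (alpha - 1.0) * x ** (-alpha)

def pareto_pdf(x, shape, x_min):
    """Pareto Type I density: shape * x_min^shape / x^(shape + 1), for x >= x_min."""
    if x < x_min:
        return 0.0
    return shape * x_min ** shape / x ** (shape + 1.0)

# Exponent 2.5 corresponds to Pareto shape 1.5, so both densities agree.
p_powerlaw = powerlaw_pdf(2.0, alpha=2.5, x_min=1.0)
p_pareto = pareto_pdf(2.0, shape=1.5, x_min=1.0)
```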

ccdf(x): Calls the underlying ccdf method from the powerlaw crate.

cdf(x): Calls the underlying cdf method from the powerlaw crate.

loglikelihood(x): Calculates the log-likelihood of the data given the distribution. Note: the log-likelihoods are not summed; one value is returned per data point.

name(): Returns the name of the distribution.

parameters(): Returns the fitted distribution parameters.

pdf(x): Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters: u (float) – A random number from a Uniform(0, 1) distribution.

powerlawrs.dist.powerlaw.alpha_hat(data, x_min)

Python wrapper that calls the alpha_hat function from the powerlaw crate.

Calculates the maximum likelihood estimate (MLE) for the alpha parameter of a generic Power-Law distribution. This returns the power-law exponent (f(x) ~ x^-alpha).
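For a continuous power law, the standard closed-form MLE is alpha_hat = 1 + n / sum(ln(x_i / x_min)), taken over the n observations with x_i >= x_min. The sketch below is a plain-Python illustration of that formula. Whether the crate filters the data to the tail internally is an assumption here, and the function name is hypothetical.

```python
import math

def alpha_hat_sketch(data, x_min):
    """Closed-form MLE of the power-law exponent over the tail x >= x_min."""
    tail = [x for x in data if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

# ln(1) + ln(e) + ln(e^2) = 3 with n = 3, so the estimate is 1 + 3/3 = 2.
estimate = alpha_hat_sketch([1.0, math.e, math.e ** 2], x_min=1.0)
```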

class powerlawrs.dist.lognormal.Lognormal(mu, sigma, x_min=0.0)

A Python-compatible wrapper for the Lognormal struct from the powerlaw crate.

The Lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed.

Parameters:

mu (float) – The mean of the underlying normal distribution.
sigma (float) – The standard deviation of the underlying normal distribution. Must be > 0.
x_min (float, optional) – The minimum value of the distribution (truncation point). Defaults to 0.0.

Example:

import powerlawrs
dist = powerlawrs.dist.lognormal.Lognormal(mu=0.0, sigma=1.0, x_min=1.0)
pdf_val = dist.pdf(2.0)

ccdf(x): Calls the underlying ccdf method from the powerlaw crate.

cdf(x): Calls the underlying cdf method from the powerlaw crate.

static from_fitment(data, fitment)

Creates a new Lognormal distribution by fitting it to data using the results of a Pareto fit.

This method uses the Newton-Raphson method to find the maximum likelihood estimates for mu and sigma, accounting for the truncation at x_min.

Parameters:

data (list[float]) – The dataset to fit.
fitment (ParetoFit) – The result of a previous Pareto fit (used for x_min).

Returns:

A new Lognormal instance with fitted parameters.

Return type:

Lognormal

loglikelihood(x): Calculates the log-likelihood of the data given the distribution. Note: the log-likelihoods are not summed; one value is returned per data point.

name(): Returns the name of the distribution.

parameters(): Returns the fitted distribution parameters.

pdf(x): Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters: u (float) – A random number from a Uniform(0, 1) distribution.

class powerlawrs.dist.pareto.Pareto(alpha, x_min)

Bases: object

A Python-compatible wrapper for the Pareto struct from the powerlaw crate.

Creates a new Pareto Type I distribution instance.

Parameters:

alpha (float) – The shape parameter of the distribution. Must be > 0.
x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.

Example:

import powerlawrs
dist = powerlawrs.dist.pareto.Pareto(alpha=2.5, x_min=1.0)
pdf_val = dist.pdf(2.0)

The class contains no logic of its own; it calls the underlying Rust implementation.

alpha

ccdf(x): Calls the underlying ccdf method from the powerlaw crate.

cdf(x): Calls the underlying cdf method from the powerlaw crate.

static from_fitment(fitment)

Creates a Pareto distribution directly from a Fitment result.

This allows for a clean conversion from the results of a goodness-of-fit test to a concrete distribution instance.

loglikelihood(x): Calculates the log-likelihood of the data given the distribution. Note: the log-likelihoods are not summed; one value is returned per data point.

name(): Returns the name of the distribution.

parameters(): Returns the fitted distribution parameters.

pdf(x): Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters: u (float) – A random number from a Uniform(0, 1) distribution.
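A method that maps a Uniform(0, 1) draw to a random variate is typically an inverse-transform sampler. The sketch below shows the idea for Pareto Type I by inverting the CDF F(x) = 1 - (x_min / x)^alpha. It is an illustration under that assumption, not the crate's code; in particular, the crate may use u where this sketch uses 1 - u.

```python
def pareto_rv_sketch(u, alpha, x_min):
    """Solve F(x) = u for x, where F(x) = 1 - (x_min / x)^alpha and 0 <= u < 1."""
    return x_min * (1.0 - u) ** (-1.0 / alpha)

# u = 0 maps to the lower bound; u = 0.75 with alpha = 2 gives 0.25^(-1/2) = 2.
lowest = pareto_rv_sketch(0.0, alpha=2.5, x_min=1.0)
sample = pareto_rv_sketch(0.75, alpha=2.0, x_min=1.0)
```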

x_min

Statistics (powerlawrs.stats)

Modules for statistical analysis and random number generation.

powerlawrs.stats.descriptive.mean(data)

Calculates the arithmetic mean of a vector.

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
mu = powerlawrs.stats.descriptive.mean(data)
# mu is 3.0

powerlawrs.stats.descriptive.variance(data, ddof)

Calculates the variance of a vector, where ddof is the delta degrees of freedom. If ddof=1, the sample variance is returned; otherwise, the population variance is returned.

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Population variance (ddof=0)
sigma_sq_pop = powerlawrs.stats.descriptive.variance(data, 0)
# sigma_sq_pop is 2.0

# Sample variance (ddof=1)
sigma_sq_samp = powerlawrs.stats.descriptive.variance(data, 1)
# sigma_sq_samp is 2.5
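For reference, the two results above can be reproduced in plain Python. The sketch assumes the NumPy-style convention in which the sum of squared deviations is divided by n - ddof. The package may instead treat ddof as a simple sample/population switch, so values other than 0 and 1 could behave differently there.

```python
def variance_sketch(data, ddof):
    """Sum of squared deviations from the mean, divided by n - ddof."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
population = variance_sketch(data, 0)  # squared deviations sum to 10, divided by 5
sample = variance_sketch(data, 1)      # same sum, divided by 4
```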

powerlawrs.stats.random.random_choice(data, size)

Samples size elements from data uniformly at random, with replacement.

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Get 10 random samples from data with replacement
samples = powerlawrs.stats.random.random_choice(data, 10)
# len(samples) will be 10
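Uniform sampling with replacement can be mirrored with the standard library, which is handy for checking results without the compiled extension. The helper name and the seed argument below are illustrative additions and are not part of the powerlawrs signature.

```python
import random

def random_choice_sketch(data, size, seed=None):
    """Draw `size` elements uniformly with replacement; `seed` makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.choices(data, k=size)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
samples = random_choice_sketch(data, 10, seed=42)
```

Since sampling is with replacement, size may exceed len(data) and elements can repeat.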

powerlawrs.stats.random.random_uniform(n): Generates n random variates from U(0,1).

powerlawrs.stats.ks.ks_1sam_sorted(sorted_x, cdf_func)

One-sample Kolmogorov-Smirnov (KS) test against a known CDF.

Parameters:

sorted_x (list[float]) – A list of data points, pre-sorted in ascending order.
cdf_func (callable) – A Python function (or lambda) that takes a single float (x) and returns its cumulative probability F(x) as a float.

Returns:

A tuple containing (D+, D-, D).

Return type:

tuple[float, float, float]

Raises:

ValueError – If the list is empty.
TypeError – If the cdf_func does not return a float.
(Any exception) – Any exception raised by the cdf_func will be propagated.
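The three returned statistics follow the usual one-sample definitions: D+ is the largest amount by which the empirical CDF exceeds the reference CDF, D- is the largest amount by which it falls below, and D is the larger of the two. A plain-Python sketch of those textbook formulas, not the crate's code, is:

```python
def ks_1sam_sorted_sketch(sorted_x, cdf_func):
    """Return (D+, D-, D) for an ascending sample against the reference CDF."""
    n = len(sorted_x)
    if n == 0:
        raise ValueError("sorted_x must not be empty")
    # The empirical CDF steps from i/n to (i + 1)/n at the i-th sorted point.
    d_plus = max((i + 1) / n - cdf_func(x) for i, x in enumerate(sorted_x))
    d_minus = max(cdf_func(x) - i / n for i, x in enumerate(sorted_x))
    return d_plus, d_minus, max(d_plus, d_minus)

# Three points tested against the Uniform(0, 1) CDF, F(x) = x.
d_plus, d_minus, d = ks_1sam_sorted_sketch([0.1, 0.4, 0.7], lambda x: x)
```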

Utilities (powerlawrs.util)

Helper functions and simulation tools.

powerlawrs.util.erf(x)

Computes the error function of x, often denoted as erf(x).

The error function is a special function of sigmoid shape that occurs in probability, statistics, and partial differential equations. It is defined as:

\[\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\]

Example

import powerlawrs

result = powerlawrs.util.erf(0.0)
# result is 0.0
result = powerlawrs.util.erf(1.0)
# result is approx 0.8427
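One common use of the error function in statistics is the normal CDF, Phi(x) = (1 + erf((x - mu) / (sigma * sqrt(2)))) / 2. The sketch below uses the standard library's math.erf as a stand-in for powerlawrs.util.erf so that it runs without the package installed. The normal_cdf helper is illustrative and not part of the package.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

median_prob = normal_cdf(0.0)   # erf(0) = 0, so this is exactly one half
upper_prob = normal_cdf(1.96)   # close to 0.975
```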

powerlawrs.util.linspace(start, end, n)

Returns n evenly spaced numbers over the interval from start to end, with both endpoints included. Motivated by numpy’s linspace.

Example

import powerlawrs

numbers = powerlawrs.util.linspace(0.0, 1.0, 5)
# numbers is [0.0, 0.25, 0.5, 0.75, 1.0]
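The spacing rule behind this output is step = (end - start) / (n - 1). A plain-Python sketch of that rule follows. The handling of n = 1 is an assumption borrowed from numpy and may not match the package.

```python
def linspace_sketch(start, end, n):
    """n evenly spaced values from start to end, both endpoints included."""
    if n == 1:
        return [start]  # assumption: a single point is the start value
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]

numbers = linspace_sketch(0.0, 1.0, 5)
```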

powerlawrs.util.sim.calculate_sim_params(prec, data, x_min)

Calculates the number of simulations, the number of samples per simulation, the size of the tail for a predetermined x_min, and the probability of a tail event. The methodology follows Section 4.1 of Clauset, Aaron, et al. ‘Power-Law Distributions in Empirical Data’. SIAM Review, vol. 51, no. 4, Society for Industrial & Applied Mathematics (SIAM), Nov. 2009, pp. 661–703, [doi:10.48550/ARXIV.0706.1062](https://doi.org/10.48550/arXiv.0706.1062).

The number of simulations required for a desired level of precision prec in the estimate is 1/4 * prec^(-2). For example, 1/4 * 0.01^(-2) = 2500 simulations gives accuracy within 0.01.
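The quantities described for calculate_sim_params can be worked through in plain Python. The dictionary keys below are hypothetical names, and rounding the simulation count up to a whole number is an assumption. The tail is taken as x >= x_min.

```python
import math

def sim_params_sketch(prec, data, x_min):
    """Simulation count from 1/4 * prec^(-2), plus tail size and tail probability."""
    n = len(data)
    n_tail = sum(1 for x in data if x >= x_min)
    # Round before taking the ceiling so floating-point noise cannot add an extra run.
    num_sims = math.ceil(round(0.25 / prec ** 2, 6))
    return {"num_sims": num_sims, "sim_size": n, "n_tail": n_tail, "p_tail": n_tail / n}

params = sim_params_sketch(0.01, [1.0, 2.0, 3.0, 4.0, 5.0], x_min=3.0)
```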

powerlawrs.util.sim.generate_synthetic_datasets(data, x_min, sim_params, alpha)

Generates multiple synthetic datasets using a hybrid model based on the input data and a proposed Pareto Type I fit. This process is fully parallelized, with M simulations running concurrently on separate threads.

Each simulated dataset (of size ‘n’) is constructed by mixing two sampling mechanisms:

1. Sampling from the ‘lower’ part of the original data (where x < x_min).
2. Sampling from a Pareto Type I distribution (defined by x_min and alpha).

The probability of selecting the Pareto tail is controlled by ‘p_tail’.

This approach is commonly used in bootstrapping or simulation studies for extreme value analysis.
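The hybrid construction of one synthetic dataset can be sketched in plain Python. The version below is single-threaded and uses inverse-transform sampling for the Pareto draws. Both choices, along with the function name and argument order, are assumptions for illustration rather than the package's implementation.

```python
import random

def synthetic_dataset_sketch(data, x_min, alpha, n, p_tail, rng):
    """Build one dataset of size n by mixing lower-body resampling with Pareto tail draws."""
    lower = [x for x in data if x < x_min]
    out = []
    for _ in range(n):
        if rng.random() < p_tail or not lower:
            # Pareto Type I draw by inverting the CDF; 1 - u lies in (0, 1].
            out.append(x_min * (1.0 - rng.random()) ** (-1.0 / alpha))
        else:
            # Resample an observed value from below x_min.
            out.append(rng.choice(lower))
    return out

rng = random.Random(0)
observed = [0.5, 0.8, 1.0, 2.0, 5.0]
synthetic = synthetic_dataset_sketch(observed, x_min=1.0, alpha=1.5, n=1000, p_tail=0.6, rng=rng)
tail_fraction = sum(1 for x in synthetic if x >= 1.0) / len(synthetic)
```

Repeating this M times with independent random streams gives the full collection of synthetic datasets.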