API Reference

This section provides a detailed overview of the functions, classes, and submodules available in the powerlawrs package.

High-Level API

These are the primary components you will interact with.

powerlawrs: A Python package for analyzing power-law distributions.

class powerlawrs.Powerlaw(data)

A class to fit and analyze power-law distributions in a given dataset.

fit()

Fits the data to a power-law distribution.

This method finds the optimal x_min and alpha parameters for the power-law fit and assesses the goodness of fit. The results are stored in the object’s attributes.
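This reference does not spell out how fit() searches for x_min and alpha. One common approach, from the Clauset et al. paper cited under powerlawrs.util.sim, scans candidate x_min values, estimates the exponent on each tail by maximum likelihood, and keeps the candidate with the smallest KS distance. The pure-Python sketch below illustrates that idea only. The name fit_sketch and its return tuple are hypothetical and are not part of powerlawrs, and the package's actual procedure may differ.

```python
import math


def fit_sketch(data):
    """Scan x_min candidates and return (x_min, alpha, ks_distance).

    Hypothetical helper, not the powerlawrs implementation. Assumes all
    data are positive. alpha is the power-law exponent (Pareto shape + 1).
    """
    xs = sorted(data)
    best = None
    for x_min in sorted(set(xs)):
        tail = [x for x in xs if x >= x_min]
        n = len(tail)
        log_sum = sum(math.log(x / x_min) for x in tail)
        if log_sum <= 0.0:
            continue  # the tail has no spread, so the MLE is undefined
        shape = n / log_sum  # MLE of the Pareto Type I shape parameter

        # KS distance between the empirical and fitted CDF of the tail.
        d = 0.0
        for i, x in enumerate(tail):
            cdf = 1.0 - (x_min / x) ** shape
            d = max(d, (i + 1) / n - cdf, cdf - i / n)

        if best is None or d < best[2]:
            best = (x_min, 1.0 + shape, d)
    return best


# Deterministic Pareto-like sample: inverse CDF on an even grid of u values.
sample = [(1.0 - (i + 0.5) / 200) ** (-1.0 / 1.5) for i in range(200)]
x_min_hat, alpha_hat, ks = fit_sketch(sample)
```

The scan is quadratic in the number of distinct values, which is acceptable for a sketch but would need optimizing for large datasets.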

plot()

Plots the CCDF of the data together with the fitted model. Two plots are shown: one for the entire distribution and one for the tail only.

powerlawrs.fit(data)

Fits the data to a power-law distribution.

This function is a convenience wrapper that instantiates the Powerlaw class, fits the data, and returns the ParetoFit results.

Parameters:

data (list[float]) – The dataset to analyze.

Returns:

The ParetoFit result object.

Distributions (powerlawrs.dist)

Modules related to statistical distributions.

class powerlawrs.dist.exponential.Exponential

A Python-compatible wrapper for the Exponential struct from the powerlaw crate.

Creates a new Exponential distribution instance.

Parameters:
  • lambda (float) – The rate parameter of the distribution. Must be > 0.

  • x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.

Example:

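import powerlawrs
# `lambda` is a reserved word in Python and cannot be used as a keyword
# argument, so the parameters are passed positionally: (lambda, x_min).
dist = powerlawrs.dist.exponential.Exponential(0.5, 1.0)
pdf_val = dist.pdf(2.0)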

The class does not contain any logic itself; it calls the underlying Rust implementation.

ccdf(x)

Calls the underlying ccdf method from the powerlaw crate.

cdf(x)

Calls the underlying cdf method from the powerlaw crate.

loglikelihood(x)

Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.

name()

Returns the name of the distribution.

parameters()

Fitted distribution parameters

pdf(x)

Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters:

u (float) – A random number from a Uniform(0, 1) distribution.

class powerlawrs.dist.powerlaw.Powerlaw(alpha, x_min)

A Python-compatible wrapper for the Powerlaw struct from the powerlaw crate.

Represents a generic Power-Law distribution where the probability density function (PDF) is: f(x) = C * x^(-alpha)

This simplifies to a Pareto Type I distribution. Note: The alpha parameter here is the power-law exponent. It is equal to 1 + alpha_pareto, where alpha_pareto is the shape parameter of the standard Pareto Type I distribution.
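The relationship between the two parameterizations can be checked numerically. The sketch below is plain Python and does not call powerlawrs. It assumes the normalizing constant C = (alpha - 1) * x_min^(alpha - 1), which is what makes f integrate to 1 on [x_min, ∞). The function names are hypothetical.

```python
def powerlaw_pdf(x, alpha, x_min):
    """f(x) = C * x^(-alpha) for x >= x_min, with C chosen so f integrates to 1."""
    if x < x_min:
        return 0.0
    c = (alpha - 1.0) * x_min ** (alpha - 1.0)
    return c * x ** (-alpha)


def pareto_pdf(x, shape, x_min):
    """Standard Pareto Type I density with shape parameter `shape`."""
    if x < x_min:
        return 0.0
    return shape * x_min ** shape / x ** (shape + 1.0)


# Exponent 2.5 corresponds to Pareto shape 1.5, so the densities should agree.
p_powerlaw = powerlaw_pdf(2.0, alpha=2.5, x_min=1.0)
p_pareto = pareto_pdf(2.0, shape=1.5, x_min=1.0)
```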

Parameters:
  • alpha (float) – The scaling exponent of the distribution. Must be > 1.

  • x_min (float) – The minimum value of the distribution. Must be > 0.

Example:

import powerlawrs
# Create a distribution with exponent 2.5 (equivalent to Pareto alpha 1.5)
dist = powerlawrs.dist.powerlaw.Powerlaw(alpha=2.5, x_min=1.0)
pdf_val = dist.pdf(2.0)

ccdf(x)

Calls the underlying ccdf method from the powerlaw crate.

cdf(x)

Calls the underlying cdf method from the powerlaw crate.

loglikelihood(x)

Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.

name()

Returns the name of the distribution.

parameters()

Fitted distribution parameters

pdf(x)

Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters:

u (float) – A random number from a Uniform(0, 1) distribution.

powerlawrs.dist.powerlaw.alpha_hat(data, x_min)

Python wrapper that calls the alpha_hat function from the powerlaw crate.

Calculates the maximum likelihood estimate (MLE) for the alpha parameter of a generic Power-Law distribution. This returns the power-law exponent (f(x) ~ x^-alpha).
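For a continuous power law, the standard closed-form MLE is alpha_hat = 1 + n / Σ ln(x_i / x_min), taken over the points at or above x_min. The sketch below implements that formula in plain Python. Whether the crate filters the data to the tail in the same way is an assumption, and alpha_hat_sketch is a hypothetical name.

```python
import math


def alpha_hat_sketch(data, x_min):
    """Closed-form MLE of the power-law exponent for the tail x >= x_min."""
    tail = [x for x in data if x >= x_min]  # assumption: only the tail is used
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one point strictly above x_min")
    return 1.0 + len(tail) / log_sum


estimate = alpha_hat_sketch([1.0, 2.0, 4.0, 8.0], x_min=1.0)
```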

class powerlawrs.dist.lognormal.Lognormal(mu, sigma, x_min=0.0)

A Python-compatible wrapper for the Lognormal struct from the powerlaw crate.

The Lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed.

Parameters:
  • mu (float) – The mean of the underlying normal distribution.

  • sigma (float) – The standard deviation of the underlying normal distribution. Must be > 0.

  • x_min (float, optional) – The minimum value of the distribution (truncation point). Defaults to 0.0.
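To make the role of the truncation point concrete, the sketch below evaluates a lognormal density renormalized over [x_min, ∞). It assumes the truncated density is the ordinary lognormal density divided by the survival probability at x_min. The crate's exact formula is not given here, so this is an illustration in plain Python with a hypothetical function name.

```python
import math


def lognormal_pdf(x, mu, sigma, x_min=0.0):
    """Lognormal density, renormalized on [x_min, inf) when x_min > 0."""
    if x <= 0.0 or x < x_min:
        return 0.0
    z = (math.log(x) - mu) / sigma
    base = math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))
    if x_min <= 0.0:
        return base  # no truncation
    # Survival probability P(X >= x_min) via the normal CDF written with erf.
    z_min = (math.log(x_min) - mu) / sigma
    survival = 0.5 * (1.0 - math.erf(z_min / math.sqrt(2.0)))
    return base / survival


untruncated = lognormal_pdf(2.0, mu=0.0, sigma=1.0)
truncated = lognormal_pdf(2.0, mu=0.0, sigma=1.0, x_min=1.0)
```

With mu=0 and x_min=1 exactly half of the mass lies above the truncation point, so the truncated density is twice the untruncated one.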

Example:

import powerlawrs
dist = powerlawrs.dist.lognormal.Lognormal(mu=0.0, sigma=1.0, x_min=1.0)
pdf_val = dist.pdf(2.0)

ccdf(x)

Calls the underlying ccdf method from the powerlaw crate.

cdf(x)

Calls the underlying cdf method from the powerlaw crate.

static from_fitment(data, fitment)

Creates a new Lognormal distribution by fitting it to data using the results of a Pareto fit.

This method uses the Newton-Raphson method to find the maximum likelihood estimates for mu and sigma, accounting for the truncation at x_min.

Parameters:
  • data (list[float]) – The dataset to fit.

  • fitment (ParetoFit) – The result of a previous Pareto fit (used for x_min).

Returns:

A new Lognormal instance with fitted parameters.

Return type:

Lognormal

loglikelihood(x)

Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.

name()

Returns the name of the distribution.

parameters()

Fitted distribution parameters

pdf(x)

Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters:

u (float) – A random number from a Uniform(0, 1) distribution.

class powerlawrs.dist.pareto.Pareto(alpha, x_min)

Bases: object

A Python-compatible wrapper for the Pareto struct from the powerlaw crate.

Creates a new Pareto Type I distribution instance.

Parameters:
  • alpha (float) – The shape parameter of the distribution. Must be > 0.

  • x_min (float) – The minimum value of the distribution (scale parameter). Must be > 0.

Example:

import powerlawrs
dist = powerlawrs.dist.pareto.Pareto(alpha=2.5, x_min=1.0)
pdf_val = dist.pdf(2.0)

The class does not contain any logic itself; it calls the underlying Rust implementation.

alpha

ccdf(x)

Calls the underlying ccdf method from the powerlaw crate.

cdf(x)

Calls the underlying cdf method from the powerlaw crate.

static from_fitment(fitment)

Creates a Pareto distribution directly from a Fitment result.

This allows for a clean conversion from the results of a goodness-of-fit test to a concrete distribution instance.

loglikelihood(x)

Calculates the log-likelihood of the data given the distribution. Note: The log likelihoods are not summed.
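"Not summed" means one log-likelihood value per data point. The plain-Python sketch below shows what that looks like for a Pareto Type I density and how to obtain the total yourself. It assumes every input is at or above x_min and that the method accepts a sequence. The function name is hypothetical.

```python
import math


def pareto_loglikelihoods(xs, alpha, x_min):
    """Per-point log of the Pareto Type I density; assumes every x >= x_min."""
    const = math.log(alpha) + alpha * math.log(x_min)
    return [const - (alpha + 1.0) * math.log(x) for x in xs]


per_point = pareto_loglikelihoods([1.0, 2.0, 4.0], alpha=2.5, x_min=1.0)
total = sum(per_point)  # the caller sums when a single number is needed
```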

name()

Returns the name of the distribution.

parameters()

Fitted distribution parameters

pdf(x)

Calls the underlying pdf method from the powerlaw crate.

rv(u)

Calls the underlying rv method from the powerlaw crate.

Parameters:

u (float) – A random number from a Uniform(0, 1) distribution.
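Taking a Uniform(0, 1) draw as input suggests inverse-transform sampling. For Pareto Type I the inverse CDF has a closed form, sketched below in plain Python. Whether the crate maps u or 1 - u is not stated here, so treat the exact convention as an assumption. Both function names are hypothetical.

```python
def pareto_rv(u, alpha, x_min):
    """Map a Uniform(0, 1) draw to a Pareto Type I variate via the inverse CDF."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    return x_min * (1.0 - u) ** (-1.0 / alpha)


def pareto_cdf(x, alpha, x_min):
    """Pareto Type I CDF, used here only to check the round trip."""
    return 1.0 - (x_min / x) ** alpha if x >= x_min else 0.0


draw = pareto_rv(0.75, alpha=2.0, x_min=1.0)
round_trip = pareto_cdf(draw, alpha=2.0, x_min=1.0)  # recovers u up to rounding
```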

x_min

Statistics (powerlawrs.stats)

Modules for statistical analysis and random number generation.

powerlawrs.stats.descriptive.mean(data)

Calculates the arithmetic mean of a vector.

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
mu = powerlawrs.stats.descriptive.mean(data)
# mu is 3.0

powerlawrs.stats.descriptive.variance(data, ddof)

Calculates the variance of a vector, where ddof is the degrees-of-freedom correction. If ddof=1, the sample variance is returned; otherwise, the population variance is returned.
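The rule described above can be written out in a few lines of plain Python. This sketch follows the description literally (divisor n - 1 when ddof is 1, divisor n otherwise). It is not the crate's code, and variance_sketch is a hypothetical name.

```python
def variance_sketch(data, ddof):
    """Sample variance when ddof == 1, population variance otherwise."""
    n = len(data)
    mean = sum(data) / n
    squared_dev = sum((x - mean) ** 2 for x in data)
    divisor = n - 1 if ddof == 1 else n
    return squared_dev / divisor


values = [1.0, 2.0, 3.0, 4.0, 5.0]
population = variance_sketch(values, 0)
sample = variance_sketch(values, 1)
```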

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Population variance (ddof=0)
sigma_sq_pop = powerlawrs.stats.descriptive.variance(data, 0)
# sigma_sq_pop is 2.0

# Sample variance (ddof=1)
sigma_sq_samp = powerlawrs.stats.descriptive.variance(data, 1)
# sigma_sq_samp is 2.5

powerlawrs.stats.random.random_choice(data, size)

Samples size elements from data uniformly at random, with replacement.

Example

import powerlawrs

data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Get 10 random samples from data with replacement
samples = powerlawrs.stats.random.random_choice(data, 10)
# len(samples) will be 10

powerlawrs.stats.random.random_uniform(n)

Generate n random variates from U(0,1).

powerlawrs.stats.ks.ks_1sam_sorted(sorted_x, cdf_func)

One-sample Kolmogorov-Smirnov (KS) test against a known CDF.

Parameters:
  • sorted_x (list[float]) – A list of data points, pre-sorted in ascending order.

  • cdf_func (callable) – A Python function (or lambda) that takes a single float (x) and returns its cumulative probability F(x) as a float.

Returns:

A tuple containing (D+, D-, D).

Return type:

tuple[float, float, float]

Raises:
  • ValueError – If the list is empty.

  • TypeError – If the cdf_func does not return a float.

  • (Any exception) – Any exception raised by the cdf_func will be propagated.
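The three returned statistics have standard definitions: D+ is the largest amount by which the empirical CDF exceeds the reference CDF, D- is the largest amount by which it falls short, and D is the larger of the two. The plain-Python sketch below computes them for pre-sorted input. It is an illustration with a hypothetical name, not the crate's implementation.

```python
def ks_1sam_sorted_sketch(sorted_x, cdf_func):
    """Return (D+, D-, D) for ascending data against a reference CDF."""
    n = len(sorted_x)
    if n == 0:
        raise ValueError("sorted_x must not be empty")
    # The empirical CDF jumps from i/n to (i+1)/n at the i-th point (0-based).
    d_plus = max((i + 1) / n - cdf_func(x) for i, x in enumerate(sorted_x))
    d_minus = max(cdf_func(x) - i / n for i, x in enumerate(sorted_x))
    return d_plus, d_minus, max(d_plus, d_minus)


# Reference CDF of Uniform(0, 1) is F(x) = x on [0, 1].
d_plus, d_minus, d = ks_1sam_sorted_sketch([0.1, 0.5, 0.9], lambda x: x)
```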

Utilities (powerlawrs.util)

Helper functions and simulation tools.

powerlawrs.util.erf(x)

Computes the error function of x, often denoted as erf(x).

The error function is a special function of sigmoid shape that occurs in probability, statistics, and partial differential equations. It is defined as:

\[\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\]

Example

import powerlawrs

result = powerlawrs.util.erf(0.0)
# result is 0.0
result = powerlawrs.util.erf(1.0)
# result is approx 0.8427

powerlawrs.util.linspace(start, end, n)

Returns n evenly spaced numbers over the closed interval [start, end]. Modeled on NumPy’s linspace.
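The spacing rule is easy to state in code: with both endpoints included, the step is (end - start) / (n - 1). The sketch below is plain Python with a hypothetical name. How the real function handles n < 2 is not documented here, so that branch is an assumption.

```python
def linspace_sketch(start, end, n):
    """n evenly spaced values from start to end, both endpoints included."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [start]  # assumption: a single point returns the start value
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


numbers = linspace_sketch(0.0, 1.0, 5)
```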

Example

import powerlawrs

numbers = powerlawrs.util.linspace(0.0, 1.0, 5)
# numbers is [0.0, 0.25, 0.5, 0.75, 1.0]

powerlawrs.util.sim.calculate_sim_params(prec, data, x_min)

Calculates the number of simulations, the number of samples per simulation, the size of the tail for a predetermined x_min, and the probability of the tail event.

The methodology follows Section 4.1 of Clauset, Aaron, et al. ‘Power-Law Distributions in Empirical Data’. SIAM Review, vol. 51, no. 4, Society for Industrial & Applied Mathematics (SIAM), Nov. 2009, pp. 661–703, [doi:10.48550/ARXIV.0706.1062](https://doi.org/10.48550/arXiv.0706.1062).

The number of simulations required for a desired precision prec in the estimate is 1/4 * prec^(-2). For example, 1/4 * 0.01^(-2) = 2500 simulations gives accuracy within 0.01.
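The quantities described for calculate_sim_params can be sketched in plain Python as follows. The dictionary keys, the rounding of the simulation count, and the function name are all hypothetical. The tail is taken to be the points with x >= x_min, matching the x < x_min definition of the lower part used by generate_synthetic_datasets.

```python
def sim_params_sketch(prec, data, x_min):
    """Illustrative simulation parameters; key names are hypothetical."""
    n = len(data)
    n_tail = sum(1 for x in data if x >= x_min)
    return {
        # 1/4 * prec^(-2), rounded to absorb floating-point error.
        "num_sims": round(0.25 / (prec * prec)),
        "sim_len": n,           # samples drawn per simulated dataset
        "n_tail": n_tail,       # observations at or above x_min
        "p_tail": n_tail / n,   # empirical probability of the tail event
    }


params = sim_params_sketch(0.01, [1.0, 2.0, 3.0, 4.0, 5.0], x_min=3.0)
```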

powerlawrs.util.sim.generate_synthetic_datasets(data, x_min, sim_params, alpha)

Generates multiple synthetic datasets using a hybrid model based on the input data and a proposed Pareto Type I fit. This process is fully parallelized, with M simulations running concurrently on separate threads.

Each simulated dataset (of size ‘n’) is constructed by mixing two sampling mechanisms:

1. Sampling from the ‘lower’ part of the original data (where x < x_min).
2. Sampling from a Pareto Type I distribution (defined by x_min and alpha).

The probability of selecting the Pareto tail is controlled by ‘p_tail’.

This approach is commonly used in bootstrapping or simulation studies for extreme value analysis.
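A single synthetic dataset built with this mixing rule might look like the plain-Python sketch below. It runs sequentially rather than in parallel, draws the Pareto values by inverse-transform sampling, and treats alpha as the Pareto Type I shape parameter. The function name and argument list are hypothetical and do not mirror the real signature.

```python
import random


def synthetic_dataset_sketch(data, x_min, n, p_tail, alpha, rng):
    """Draw n values: Pareto tail with probability p_tail, else resample the lower data."""
    lower = [x for x in data if x < x_min]
    out = []
    for _ in range(n):
        # With no lower data available, every draw has to come from the tail.
        if not lower or rng.random() < p_tail:
            u = rng.random()
            out.append(x_min * (1.0 - u) ** (-1.0 / alpha))  # inverse Pareto CDF
        else:
            out.append(rng.choice(lower))
    return out


rng = random.Random(42)  # seeded so the sketch is reproducible
data = [0.5, 0.8, 1.2, 2.0, 3.5]
synthetic = synthetic_dataset_sketch(data, x_min=1.0, n=5, p_tail=0.6, alpha=1.5, rng=rng)
```

Repeating this call once per simulation, each with its own random generator, gives the M datasets that a goodness-of-fit bootstrap needs.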