Product Distribution

A product distribution is a probability distribution constructed as the distribution of the product of random variables having two other known distributions. Given two statistically independent random variables $X$ and $Y$, the distribution of the random variable $Z$ that is formed as the product $Z = XY$ is a product distribution.


1. Algebra of Random Variables

The product is one type of algebra for random variables: Related to the product distribution are the ratio distribution, sum distribution (see List of convolutions of probability distributions) and difference distribution. More generally, one may talk of combinations of sums, differences, products and ratios.

Many of these distributions are described in Melvin D. Springer's book from 1979 The Algebra of Random Variables.[1]

2. Derivation for Independent Random Variables

If $X$ and $Y$ are two independent, continuous random variables, described by probability density functions $f_X$ and $f_Y$, then the probability density function of $Z = XY$ is[2]

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z/x)\, \frac{1}{|x|}\, dx.$$
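As a rough numerical illustration of this integral, the sketch below evaluates it with a midpoint rule for two Uniform(0, 1) factors and compares the result with $-\ln z$, the closed form for that case. The helper names (`uniform01_pdf`, `product_pdf`), the step count and the test point `z = 0.3` are assumptions chosen for the example, not part of the original text.

```python
import math


def uniform01_pdf(t):
    """Density of a Uniform(0, 1) variable."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


def product_pdf(f_x, f_y, z, lo, hi, steps=200_000):
    """Midpoint-rule estimate of the integral of f_X(x) f_Y(z/x) / |x| over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h  # midpoints, so x = 0 is never evaluated when lo = 0
        total += f_x(x) * f_y(z / x) / abs(x)
    return total * h


z = 0.3  # hypothetical evaluation point
estimate = product_pdf(uniform01_pdf, uniform01_pdf, z, 0.0, 1.0)
reference = -math.log(z)  # closed form for two Uniform(0, 1) factors
print(f"numerical: {estimate:.5f}  closed form: {reference:.5f}")
```

The integration range only needs to cover the support of $f_X$; a density with unbounded support would need a truncated range or a proper quadrature routine.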

2.1. Proof [3]

We first write the cumulative distribution function of Z starting with its definition

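$$\begin{aligned}
F_Z(z) &= \Pr(Z \le z) \\
&= \Pr(XY \le z) \\
&= \Pr(XY \le z,\, X \ge 0) + \Pr(XY \le z,\, X \le 0) \\
&= \Pr(Y \le z/X,\, X \ge 0) + \Pr(Y \ge z/X,\, X \le 0) \\
&= \int_0^{\infty} f_X(x) \int_{-\infty}^{z/x} f_Y(y)\, dy\, dx + \int_{-\infty}^{0} f_X(x) \int_{z/x}^{\infty} f_Y(y)\, dy\, dx
\end{aligned}$$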

We find the desired probability density function by taking the derivative of both sides with respect to z. Since on the right hand side, z appears only in the integration limits, the derivative is easily performed using the fundamental theorem of calculus and the chain rule. (Note the negative sign that is needed when the variable occurs in the lower limit of the integration.)

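$$\begin{aligned}
f_Z(z) &= \int_0^{\infty} f_X(x)\, f_Y(z/x)\, \frac{1}{x}\, dx - \int_{-\infty}^{0} f_X(x)\, f_Y(z/x)\, \frac{1}{x}\, dx \\
&= \int_{-\infty}^{\infty} f_X(x)\, f_Y(z/x)\, \frac{1}{|x|}\, dx
\end{aligned}$$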

where the absolute value is used to conveniently combine the two terms.

2.2. Alternate Proof

A faster more compact proof begins with the same step of writing the cumulative distribution of Z starting with its definition:

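$$F_Z(z) = \Pr(Z \le z) = \Pr(XY \le z) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_X(x)\, f_Y(y)\, u(z - xy)\, dy\, dx$$

where $u(\cdot)$ is the Heaviside step function and serves to limit the region of integration to values of $x$ and $y$ satisfying $xy \le z$.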

We find the desired probability density function by taking the derivative of both sides with respect to z.

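$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_X(x)\, f_Y(y)\, \delta(z - xy)\, dy\, dx \\
&= \int_{-\infty}^{\infty} f_X(x) \left[ \int_{-\infty}^{\infty} f_Y(y)\, \delta(z - xy)\, dy \right] dx \\
&= \int_{-\infty}^{\infty} f_X(x)\, f_Y(z/x)\, \frac{1}{|x|}\, dx
\end{aligned}$$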

where we utilize the translation and scaling properties of the Dirac delta function δ.

A more intuitive description of the procedure is illustrated in the figure below. The joint pdf $f_X(x) f_Y(y)$ exists in the $x$-$y$ plane and an arc of constant $z$ value is shown as the shaded line. To find the marginal probability $f_Z(z)$ on this arc, integrate over increments of area $dx\, dy\, f(x,y)$ on this contour.

Diagram to illustrate the product distribution of two variables.

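Starting with $y = \frac{z}{x}$, we have $dy = -\frac{z}{x^2}\, dx = -\frac{y}{x}\, dx$. So the probability increment is $\delta p = f(x,y)\, dx\, |dy| = f_X(x)\, f_Y(z/x)\, \frac{y}{|x|}\, dx\, dx$. Since $z = yx$ implies $dz = y\, dx$, we can relate the probability increment to the $z$-increment, namely $\delta p = f_X(x)\, f_Y(z/x)\, \frac{1}{|x|}\, dx\, dz$. Then integration over $x$ yields $f_Z(z) = \int f_X(x)\, f_Y(z/x)\, \frac{1}{|x|}\, dx$.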

2.3. A Bayesian Interpretation

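Let $X \sim f(x)$ be a random sample drawn from probability distribution $f_x(x)$. Scaling $X$ by $\theta$ generates a sample from the scaled distribution $\theta X \sim \frac{1}{|\theta|} f_X\left(\frac{x}{\theta}\right)$, which can be written as a conditional distribution $g_x(x \mid \theta) = \frac{1}{|\theta|} f_x\left(\frac{x}{\theta}\right)$.

Letting $\theta$ be a random variable with pdf $f_\theta(\theta)$, the distribution of the scaled sample becomes $f_X(\theta x) = g_X(x \mid \theta)\, f_\theta(\theta)$ and integrating out $\theta$ we get $h_x(x) = \int_{-\infty}^{\infty} g_X(x \mid \theta)\, f_\theta(\theta)\, d\theta$, so $\theta X$ is drawn from this distribution, $\theta X \sim h_X(x)$. However, substituting the definition of $g$ we also have $h_X(x) = \int_{-\infty}^{\infty} \frac{1}{|\theta|} f_x\left(\frac{x}{\theta}\right) f_\theta(\theta)\, d\theta$, which has the same form as the product distribution above. Thus the Bayesian posterior distribution $h_X(x)$ is the distribution of the product of the two independent random samples $\theta$ and $X$.

For the case of one variable being discrete, let $\theta$ have probability $P_i$ at levels $\theta_i$ with $\sum_i P_i = 1$. The conditional density is $f_X(x \mid \theta_i) = \frac{1}{|\theta_i|} f_x\left(\frac{x}{\theta_i}\right)$. Therefore $f_X(\theta x) = \sum_i \frac{P_i}{|\theta_i|} f_X\left(\frac{x}{\theta_i}\right)$.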
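The discrete case can be sketched in a few lines: the density of $\theta X$ is a weighted sum of rescaled copies of $f_X$, with weights $P_i / |\theta_i|$. The levels, probabilities and the Uniform(0, 1) choice for $X$ below are hypothetical values picked for illustration only.

```python
def uniform01_pdf(t):
    """Density of a Uniform(0, 1) variable (an assumed choice for X)."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


def scaled_mixture_pdf(x, levels, probs, f_x):
    """Density of theta * X when theta equals levels[i] with probability probs[i]."""
    return sum(p / abs(theta) * f_x(x / theta) for theta, p in zip(levels, probs))


levels = [1.0, 2.0]   # hypothetical scale levels theta_i
probs = [0.25, 0.75]  # hypothetical probabilities P_i, summing to 1

# Midpoint-rule check that the mixture density integrates to one over (0, 2).
steps = 2_000
h = 2.0 / steps
area = h * sum(
    scaled_mixture_pdf((i + 0.5) * h, levels, probs, uniform01_pdf)
    for i in range(steps)
)
print(scaled_mixture_pdf(0.5, levels, probs, uniform01_pdf), area)
```

With these values the density is piecewise constant: both components contribute on $(0, 1)$ and only the $\theta = 2$ component contributes on $(1, 2)$.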

3. Expectation of Product of Random Variables

When two random variables are statistically independent, the expectation of their product is the product of their expectations. This can be proved from the Law of total expectation:

$$\mathrm{E}(XY) = \mathrm{E}\big(\mathrm{E}(XY \mid Y)\big)$$

In the inner expression, $Y$ is a constant. Hence:

$$\mathrm{E}(XY \mid Y) = Y \cdot \mathrm{E}[X \mid Y]$$
$$\mathrm{E}(XY) = \mathrm{E}\big(Y \cdot \mathrm{E}[X \mid Y]\big)$$

This is true even if $X$ and $Y$ are statistically dependent, in which case $\mathrm{E}[X \mid Y]$ is a function of $Y$. In the special case in which $X$ and $Y$ are statistically independent, it is a constant independent of $Y$. Hence:

$$\mathrm{E}(XY) = \mathrm{E}\big(Y \cdot \mathrm{E}[X]\big)$$
$$\mathrm{E}(XY) = \mathrm{E}(X) \cdot \mathrm{E}(Y)$$
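A small exact check of this argument, using two fair dice as an assumed example: when the dice are independent, $\mathrm{E}(XY)$ factorises into $\mathrm{E}(X)\,\mathrm{E}(Y)$; when $Y = X$ the factorisation fails, but $\mathrm{E}(XY) = \mathrm{E}(Y \cdot \mathrm{E}[X \mid Y])$ still holds. Exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # a fair six-sided die (illustrative choice)
p = Fraction(1, 6)

e_x = sum(p * x for x in faces)

# Independent dice: the joint probability of (x, y) is p * p.
e_xy_independent = sum(p * p * x * y for x, y in product(faces, faces))

# Fully dependent case Y = X: E[X | Y = y] = y, so E[Y * E[X | Y]] = E[Y^2].
e_xy_dependent = sum(p * x * x for x in faces)
e_y_times_conditional = sum(p * y * y for y in faces)

print(e_x, e_xy_independent, e_xy_dependent)
```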

4. Variance of the Product of Independent Random Variables

Let $X, Y$ be independent random variables with means $\mu_X, \mu_Y$ and variances $\sigma_X^2, \sigma_Y^2$. The variance of the product $XY$ is

$$\mathrm{Var}(XY) = (\sigma_X^2 + \mu_X^2)(\sigma_Y^2 + \mu_Y^2) - \mu_X^2 \mu_Y^2$$

In the case of the product of more than two variables, if $X_1, \ldots, X_n$, $n > 2$, are statistically independent then[4] the variance of their product is

$$\mathrm{Var}(X_1 X_2 \cdots X_n) = \prod_{i=1}^{n} (\sigma_i^2 + \mu_i^2) - \prod_{i=1}^{n} \mu_i^2$$
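The formula $\mathrm{Var}(X_1 \cdots X_n) = \prod_i (\sigma_i^2 + \mu_i^2) - \prod_i \mu_i^2$ can be checked exactly on small discrete variables by enumerating every outcome. The supports below are hypothetical, each variable being taken as uniform on its listed values, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import prod


def mean_var(values):
    """Exact mean and variance of a variable uniform on the listed values."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return mean, Fraction(sum(v * v for v in values), n) - mean * mean


def product_variance(stats):
    """Variance of a product of independent variables from their (mean, variance) pairs."""
    return prod(var + mean * mean for mean, var in stats) - prod(mean * mean for mean, _ in stats)


supports = [[1, 2, 3], [0, 1], [2, 5]]  # hypothetical supports, each uniform
stats = [mean_var(s) for s in supports]

# Brute force: every combination is equally likely under independence.
outcomes = [prod(combo) for combo in product(*supports)]
_, brute_force_var = mean_var(outcomes)

print(product_variance(stats), brute_force_var)
```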

5. Characteristic Function of Product of Random Variables

Assume $X$, $Y$ are independent random variables. The characteristic function of $X$ is $\varphi_X(t)$, and the distribution of $Y$ is known. Then from the law of total expectation, we have[5]

$$\begin{aligned}
\varphi_Z(t) &= \mathrm{E}\big(e^{itXY}\big) \\
&= \mathrm{E}\big(\mathrm{E}(e^{itXY} \mid Y)\big) \\
&= \mathrm{E}\big(\varphi_X(tY)\big)
\end{aligned}$$

If the characteristic functions and distributions of both $X$ and $Y$ are known, then alternatively, $\varphi_Z(t) = \mathrm{E}\big(\varphi_Y(tX)\big)$ also holds.
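As an illustration of $\varphi_Z(t) = \mathrm{E}(\varphi_X(tY))$, take $X$ and $Y$ to be independent standard normals (an assumed example). Then $\varphi_X(t) = e^{-t^2/2}$, and averaging $\varphi_X(tY)$ over $Y$ is a Gaussian integral whose value is $1/\sqrt{1+t^2}$. The sketch below evaluates the expectation numerically and compares it with that closed form; the truncation width and step count are arbitrary choices.

```python
import math


def phi_x(t):
    """Characteristic function of X ~ N(0, 1)."""
    return math.exp(-0.5 * t * t)


def normal_pdf(y):
    """Density of Y ~ N(0, 1)."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def phi_product(t, half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of E[phi_X(t Y)] over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = -half_width + (i + 0.5) * h
        total += phi_x(t * y) * normal_pdf(y)
    return total * h


for t in (0.0, 0.5, 2.0):
    print(t, phi_product(t), 1.0 / math.sqrt(1.0 + t * t))
```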

6. Mellin Transform

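The Mellin transform of a distribution $f(x)$ with support only on $x \ge 0$ and having a random sample $X$ is

$$\mathcal{M}f(x) = \varphi(s) = \int_0^{\infty} x^{s-1} f(x)\, dx = \mathrm{E}[X^{s-1}].$$

The inverse transform is

$$\mathcal{M}^{-1}\varphi(s) = f(x) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} x^{-s} \varphi(s)\, ds.$$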

If $X$ and $Y$ are two independent random samples from different distributions, then the Mellin transform of their product is equal to the product of their Mellin transforms:

$$\mathcal{M}_{XY}(s) = \mathcal{M}_X(s)\, \mathcal{M}_Y(s)$$

If $s$ is restricted to integer values, a simpler result is

$$\mathrm{E}[(XY)^n] = \mathrm{E}[X^n]\; \mathrm{E}[Y^n]$$

Thus the moments of the random product $XY$ are the product of the corresponding moments of $X$ and $Y$, and this extends to non-integer moments, for example

$$\mathrm{E}[(XY)^{1/p}] = \mathrm{E}[X^{1/p}]\; \mathrm{E}[Y^{1/p}].$$

The pdf of a function can be reconstructed from its moments using the saddlepoint approximation method.

A further result is that for independent X, Y

$$\mathrm{E}[X^p Y^q] = \mathrm{E}[X^p]\; \mathrm{E}[Y^q]$$
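The non-integer case can be checked numerically on a simple assumed example. For two independent Uniform(0, 1) samples the product has density $-\ln z$ on $(0, 1]$ (derived in the section on uniformly distributed variables), so the half-order moment computed from that density should equal the product of the two individual half-order moments, $(2/3)^2 = 4/9$. The `midpoint` helper is a hypothetical name introduced for this sketch.

```python
import math


def midpoint(f, lo, hi, steps=200_000):
    """Midpoint-rule quadrature; endpoints are never evaluated."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


p = 0.5  # a non-integer moment order

moment_x = midpoint(lambda x: x ** p, 0.0, 1.0)                   # E[X^p], X ~ U(0, 1)
moment_z = midpoint(lambda z: z ** p * -math.log(z), 0.0, 1.0)    # E[Z^p], Z = XY

print(moment_x ** 2, moment_z)
```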

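Gamma distribution example. To illustrate how the product of moments yields a much simpler result than finding the moments of the distribution of the product, let $X, Y$ be sampled from two Gamma distributions, $f_{\text{Gamma}}(x; \theta, 1) = \Gamma(\theta)^{-1} x^{\theta - 1} e^{-x}$, with parameters $\theta = \alpha, \beta$, whose moments are

$$\mathrm{E}[X^p] = \int_0^{\infty} x^p\, f_{\text{Gamma}}(x; \theta, 1)\, dx = \frac{\Gamma(\theta + p)}{\Gamma(\theta)}.$$

Multiplying the corresponding moments gives the Mellin transform result

$$\mathrm{E}[(XY)^p] = \mathrm{E}[X^p]\; \mathrm{E}[Y^p] = \frac{\Gamma(\alpha + p)}{\Gamma(\alpha)}\, \frac{\Gamma(\beta + p)}{\Gamma(\beta)}$$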
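The moment product for unit-scale Gamma variables, $\Gamma(\alpha+p)\Gamma(\beta+p) / (\Gamma(\alpha)\Gamma(\beta))$, is straightforward to evaluate with the standard library. The shape values below are arbitrary illustrations; for integer $p$ the result reduces to familiar rising-factorial products, which gives an easy sanity check.

```python
import math


def gamma_moment(shape, p):
    """E[X^p] for X ~ Gamma(shape, scale=1), i.e. Gamma(shape + p) / Gamma(shape)."""
    return math.gamma(shape + p) / math.gamma(shape)


def gamma_product_moment(alpha, beta, p):
    """E[(XY)^p] for independent unit-scale Gamma variables with shapes alpha and beta."""
    return gamma_moment(alpha, p) * gamma_moment(beta, p)


alpha, beta = 2.5, 1.5  # hypothetical shape parameters

print(gamma_product_moment(alpha, beta, 1))    # equals alpha * beta
print(gamma_product_moment(alpha, beta, 2))    # equals alpha (alpha + 1) beta (beta + 1)
print(gamma_product_moment(alpha, beta, 0.5))  # a non-integer moment
```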

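Independently, it is known that the product of two independent Gamma samples has the distribution

$$f(z, \alpha, \beta) = 2\, \Gamma(\alpha)^{-1} \Gamma(\beta)^{-1}\, z^{\frac{\alpha + \beta}{2} - 1} K_{\alpha - \beta}(2\sqrt{z}), \quad z \ge 0.$$

To find the moments of this, make the change of variable $y = 2\sqrt{z}$, simplifying similar integrals to:

$$\int_0^{\infty} z^p K_\nu(2\sqrt{z})\, dz = 2^{-2p-1} \int_0^{\infty} y^{2p+1} K_\nu(y)\, dy$$

thus

$$2 \int_0^{\infty} z^{p + \frac{\alpha + \beta}{2} - 1} K_{\alpha - \beta}(2\sqrt{z})\, dz = 2^{-(\alpha + \beta) - 2p + 2} \int_0^{\infty} y^{(\alpha + \beta) + 2p - 1} K_{\alpha - \beta}(y)\, dy$$

The definite integral

$$\int_0^{\infty} y^{\mu} K_\nu(y)\, dy = 2^{\mu - 1}\, \Gamma\left(\frac{1 + \mu + \nu}{2}\right) \Gamma\left(\frac{1 + \mu - \nu}{2}\right)$$

is well documented and, with $\mu = (\alpha + \beta) + 2p - 1$ and $\nu = \alpha - \beta$, we have finally

$$\begin{aligned}
\mathrm{E}[Z^p] &= \frac{2^{-(\alpha + \beta) - 2p + 2}\; 2^{(\alpha + \beta) + 2p - 2}}{\Gamma(\alpha)\, \Gamma(\beta)}\, \Gamma\left(\frac{(\alpha + \beta + 2p) + (\alpha - \beta)}{2}\right) \Gamma\left(\frac{(\alpha + \beta + 2p) - (\alpha - \beta)}{2}\right) \\
&= \frac{\Gamma(\alpha + p)\, \Gamma(\beta + p)}{\Gamma(\alpha)\, \Gamma(\beta)}
\end{aligned}$$

which, after some difficulty, agrees with the moment product result above.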

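If $X$, $Y$ are drawn independently from Gamma distributions with shape parameters $\alpha, \beta$ then

$$\mathrm{E}[X^p Y^q] = \mathrm{E}[X^p]\; \mathrm{E}[Y^q] = \frac{\Gamma(\alpha + p)}{\Gamma(\alpha)}\, \frac{\Gamma(\beta + q)}{\Gamma(\beta)}$$

This type of result is universally true, since for bivariate independent variables $f_{X,Y}(x,y) = f_X(x)\, f_Y(y)$, thus

$$\begin{aligned}
\mathrm{E}[X^p Y^q] &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f_{X,Y}(x,y)\, dy\, dx \\
&= \int_{-\infty}^{\infty} x^p \left[ \int_{-\infty}^{\infty} y^q f_Y(y)\, dy \right] f_X(x)\, dx \\
&= \int_{-\infty}^{\infty} x^p f_X(x)\, dx \int_{-\infty}^{\infty} y^q f_Y(y)\, dy \\
&= \mathrm{E}[X^p]\; \mathrm{E}[Y^q]
\end{aligned}$$

or equivalently it is clear that $X^p$ and $Y^q$ are independent variables.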

7. Special Cases

7.1. Lognormal Distributions

The distribution of the product of two random variables which have lognormal distributions is again lognormal. This is itself a special case of a more general set of results where the logarithm of the product can be written as the sum of the logarithms. Thus, in cases where a simple result can be found in the list of convolutions of probability distributions, where the distributions to be convolved are those of the logarithms of the components of the product, the result might be transformed to provide the distribution of the product. However this approach is only useful where the logarithms of the components of the product are in some standard families of distributions.

7.2. Uniformly Distributed Independent Random Variables

Let $Z$ be the product of two independent variables $Z = X_1 X_2$, each uniformly distributed on the interval [0,1], possibly the outcome of a copula transformation. As noted in "Lognormal Distributions" above, PDF convolution operations in the log domain correspond to the product of sample values in the original domain. Thus, making the transformation $u = \ln(x)$, such that $p_U(u)\, |du| = p_X(x)\, |dx|$, each variate is distributed independently on $u$ as

$$p_U(u) = \frac{p_X(x)}{|du/dx|} = \frac{1}{x^{-1}} = e^{u}, \quad -\infty < u \le 0,$$

and the convolution of the two distributions is the autoconvolution

$$c(y) = \int_{u=y}^{0} e^{u} e^{y-u}\, du = \int_{u=y}^{0} e^{y}\, du = -y e^{y}, \quad -\infty < y \le 0$$

Next retransform the variable to $z = e^{y}$, yielding the distribution

$$c_2(z) = \frac{c_Y(y)}{|dz/dy|} = \frac{-y e^{y}}{e^{y}} = -y = \ln(1/z)$$ on the interval [0,1]

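For the product of multiple ($> 2$) independent samples the characteristic function route is favorable. If we define $\tilde{u} = -u$ and $\tilde{y} = -y$, then each single-sample log density $e^{-\tilde{u}}$, $\tilde{u} \ge 0$, is a Gamma distribution of shape 1 and scale factor 1 whose known CF is $(1 - it)^{-1}$, and the two-sample result above, $c(\tilde{y}) = \tilde{y} e^{-\tilde{y}}$, is the corresponding Gamma distribution of shape 2. Note that $|d\tilde{y}| = |dy|$ so the Jacobian of the transformation is unity.

The convolution of $n$ independent samples of $\tilde{u}$ therefore has CF $(1 - it)^{-n}$, which is known to be the CF of a Gamma distribution of shape $n$:

$$c_n(\tilde{y}) = \Gamma(n)^{-1}\, \tilde{y}^{\,n-1} e^{-\tilde{y}} = \Gamma(n)^{-1} (-y)^{n-1} e^{y}.$$

Making the inverse transformation $z = e^{y}$ we get the PDF of the product of the $n$ samples:

$$f_n(z) = \frac{c_n(y)}{|dz/dy|} = \Gamma(n)^{-1} (-\log z)^{n-1}\, \frac{e^{y}}{e^{y}} = \frac{(-\log z)^{n-1}}{(n-1)!}, \quad 0 < z \le 1$$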

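The following, more conventional, derivation from Stackexchange[6] is consistent with this result. First of all, letting $Z_2 = X_1 X_2$, its CDF is

$$\begin{aligned}
F_{Z_2}(z) = \Pr[Z_2 \le z] &= \int_{x=0}^{1} \Pr\left[X_2 \le \frac{z}{x}\right] f_{X_1}(x)\, dx \\
&= \int_{x=0}^{z} 1\, dx + \int_{x=z}^{1} \frac{z}{x}\, dx \\
&= z - z \log z, \quad 0 < z \le 1
\end{aligned}$$

The density of $z_2$ is then $f(z_2) = -\log(z_2)$

Multiplying by a third independent sample gives distribution function

$$\begin{aligned}
F_{Z_3}(z) = \Pr[Z_3 \le z] &= \int_{x=0}^{1} \Pr\left[X_3 \le \frac{z}{x}\right] f_{Z_2}(x)\, dx \\
&= -\int_{x=0}^{z} \log(x)\, dx - \int_{x=z}^{1} \frac{z}{x} \log(x)\, dx \\
&= -z\big(\log(z) - 1\big) + \frac{1}{2} z \log^2(z)
\end{aligned}$$

Taking the derivative yields $f_{Z_3}(z) = \frac{1}{2} \log^2(z), \; 0 < z \le 1.$

The author of the note conjectures that, in general, $f_{Z_n}(z) = \frac{(-\log z)^{n-1}}{(n-1)!}, \; 0 < z \le 1$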
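A quick simulation gives some reassurance about the three-sample case. The sketch below compares the empirical frequency of $X_1 X_2 X_3 \le z_0$ with the distribution function $F_{Z_3}(z) = -z(\log z - 1) + \tfrac{1}{2} z \log^2 z$, and checks by a central difference that the derivative of that function matches $(-\log z)^{n-1}/(n-1)!$ for $n = 3$. The seed, sample size and evaluation point are arbitrary assumptions.

```python
import math
import random


def cdf_z3(z):
    """Distribution function of the product of three independent U(0, 1) samples."""
    return -z * (math.log(z) - 1.0) + 0.5 * z * math.log(z) ** 2


def pdf_uniform_product(z, n):
    """Density (-log z)^(n-1) / (n-1)! of the product of n independent U(0, 1) samples."""
    return (-math.log(z)) ** (n - 1) / math.factorial(n - 1)


rng = random.Random(12345)  # fixed seed so the run is repeatable
trials = 200_000
z0 = 0.1

hits = sum(1 for _ in range(trials) if rng.random() * rng.random() * rng.random() <= z0)
empirical = hits / trials

# Central-difference derivative of the CDF, to compare with the n = 3 density.
eps = 1e-6
slope = (cdf_z3(z0 + eps) - cdf_z3(z0 - eps)) / (2.0 * eps)

print(empirical, cdf_z3(z0))
print(slope, pdf_uniform_product(z0, 3))
```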

The geometry of the product distribution of two random variables in the unit square.

The figure illustrates the nature of the integrals above. The shaded area within the unit square and below the line z = xy, represents the CDF of z. This divides into two parts. The first is for 0 < x < z where the increment of area in the vertical slot is just equal to dx. The second part lies below the xy line, has y-height z/x, and incremental area dx z/x.

7.3. Independent Central-Normal Distributions

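The product of two independent Normal samples has a density given by a modified Bessel function of the second kind. Let $x, y$ be samples from a Normal(0,1) distribution and $z = xy$. Then

$$p_Z(z) = \frac{K_0(|z|)}{\pi}, \quad -\infty < z < +\infty$$

The variance of this distribution could be determined, in principle, by a definite integral from Gradshteyn and Ryzhik,[7]

$$\int_0^{\infty} x^{\mu} K_\nu(ax)\, dx = 2^{\mu - 1} a^{-\mu - 1}\, \Gamma\left(\frac{1 + \mu + \nu}{2}\right) \Gamma\left(\frac{1 + \mu - \nu}{2}\right), \quad a > 0, \quad \mu + 1 \pm \nu > 0$$

thus $$\int_{-\infty}^{\infty} \frac{z^2 K_0(|z|)}{\pi}\, dz = \frac{4}{\pi}\, \Gamma^2\left(\frac{3}{2}\right) = 1$$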

A much simpler result, stated in a section above, is that the variance of the product of zero-mean independent samples is equal to the product of their variances. Since the variance of each Normal sample is one, the variance of the product is also one.

7.4. Correlated Central-Normal Distributions

The case of the product of correlated Normal samples was recently addressed by Nadarajah and Pogány.[8] Let $X, Y$ be zero mean, unit variance, normally distributed variates with correlation coefficient $\rho$ and let $Z = XY$.

Then

$$f_Z(z) = \frac{1}{\pi \sqrt{1 - \rho^2}} \exp\left(\frac{\rho z}{1 - \rho^2}\right) K_0\left(\frac{|z|}{1 - \rho^2}\right)$$

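Mean and variance: For the mean we have $\mathrm{E}[Z] = \rho$ from the definition of correlation coefficient. The variance can be found by transforming from two unit variance, zero mean, independent normal variables $U, V$. Let

$$X = U, \quad Y = \rho U + \sqrt{1 - \rho^2}\, V$$

Then $X, Y$ are unit variance variables with correlation coefficient $\rho$ and

$$(XY)^2 = U^2 \left(\rho U + \sqrt{1 - \rho^2}\, V\right)^2 = U^2 \left(\rho^2 U^2 + 2\rho \sqrt{1 - \rho^2}\, UV + (1 - \rho^2) V^2\right)$$

Removing odd-power terms, whose expectations are obviously zero, we get

$$\mathrm{E}[(XY)^2] = \rho^2\, \mathrm{E}[U^4] + (1 - \rho^2)\, \mathrm{E}[U^2]\, \mathrm{E}[V^2]$$

Since $\mathrm{E}[U^2] = \mathrm{E}[V^2] = 1$ and $\mathrm{E}[U^4] = 3$ we have

$$\mathrm{Var}(XY) = \mathrm{E}[(XY)^2] - \big(\mathrm{E}[XY]\big)^2 = 3\rho^2 + (1 - \rho^2) - \rho^2 = 1 + \rho^2$$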
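The construction $X = U$, $Y = \rho U + \sqrt{1 - \rho^2}\, V$ with independent standard normal $U, V$ also gives a simple way to simulate the product and check the mean $\rho$ and variance $1 + \rho^2$ empirically. The following is a minimal Monte Carlo sketch; the function name, seed, sample size and the value $\rho = 0.6$ are assumptions made for the example.

```python
import math
import random


def simulate_product(rho, trials=200_000, seed=2024):
    """Sample mean and variance of Z = XY for standard normals with correlation rho."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        z = u * (rho * u + c * v)  # X = U, Y = rho * U + sqrt(1 - rho^2) * V
        total += z
        total_sq += z * z
    mean = total / trials
    return mean, total_sq / trials - mean * mean


rho = 0.6
mean, variance = simulate_product(rho)
print(f"mean {mean:.3f} (theory {rho}), variance {variance:.3f} (theory {1 + rho * rho})")
```

Setting `rho = 0.0` reproduces the independent case, where the variance of the product is one.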

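High correlation asymptote. In the highly correlated case, $\rho \to 1$, the product converges on the square of one sample. In this case the $K_0$ asymptote is $K_0(x) \to \sqrt{\frac{\pi}{2x}}\, e^{-x}$ in the limit as $x = \frac{|z|}{1 - \rho^2} \to \infty$, and

$$\begin{aligned}
p(z) &\to \frac{1}{\pi \sqrt{1 - \rho^2}} \exp\left(\frac{\rho z}{1 - \rho^2}\right) \sqrt{\frac{\pi (1 - \rho^2)}{2z}}\, \exp\left(-\frac{|z|}{1 - \rho^2}\right) \\
&= \frac{1}{\sqrt{2\pi z}} \exp\left(\frac{-|z| + \rho z}{(1 - \rho)(1 + \rho)}\right) \\
&= \frac{1}{\sqrt{2\pi z}} \exp\left(\frac{-z}{1 + \rho}\right), \quad z > 0 \\
&\to \frac{1}{\Gamma\left(\frac{1}{2}\right) \sqrt{2z}}\, e^{-z/2}, \quad \text{as } \rho \to 1
\end{aligned}$$

which is a Chi-squared distribution with one degree of freedom.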

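Multiple correlated samples. Nadarajah and Pogány further show that if $Z_1, Z_2, \ldots, Z_n$ are $n$ iid random variables sampled from $f_Z(z)$ and $\bar{Z} = \frac{1}{n} \sum Z_i$ is their mean then

$$f_{\bar{Z}}(z) = \frac{n^{n/2}\, 2^{-n/2}}{\Gamma\left(\frac{n}{2}\right)}\, |z|^{n/2 - 1} \exp\left(\frac{\beta - \gamma}{2} z\right) W_{0, \frac{1-n}{2}}\big((\beta + \gamma)|z|\big), \quad -\infty < z < \infty,$$

where $W$ is the Whittaker function while $\beta = \frac{n}{1 - \rho}, \; \gamma = \frac{n}{1 + \rho}$.

Using the identity $W_{0,\nu}(x) = \sqrt{\frac{x}{\pi}}\, K_\nu(x/2), \; x \ge 0$ (see for example the DLMF compilation, eqn (13.18.9)[9]), this expression can be somewhat simplified to

$$f_{\bar{Z}}(z) = \frac{n^{n/2}\, 2^{-n/2}}{\Gamma\left(\frac{n}{2}\right)}\, |z|^{n/2 - 1} \exp\left(\frac{\beta - \gamma}{2} z\right) \sqrt{\frac{\beta + \gamma}{\pi} |z|}\; K_{\frac{1-n}{2}}\left(\frac{\beta + \gamma}{2} |z|\right), \quad -\infty < z < \infty.$$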

The pdf gives the distribution of a sample covariance.

Multiple non-central correlated samples. The distribution of the product of correlated non-central normal samples was derived by Cui et al.[10] and takes the form of an infinite series of modified Bessel functions of the first kind.

Moments of product of correlated central normal samples

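For a central normal distribution $N(0,1)$ the moments are

$$\mathrm{E}[X^p] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^p \exp\left(-\frac{x^2}{2}\right) dx = \begin{cases} 0 & \text{if } p \text{ is odd,} \\ (p - 1)!! & \text{if } p \text{ is even,} \end{cases}$$

where $n!!$ denotes the double factorial.

If $X, Y \sim \text{Norm}(0,1)$ are central correlated variables, the simplest bivariate case of the multivariate normal moment problem described by Kan,[11] then

$$\mathrm{E}[X^p Y^q] = \begin{cases} 0 & \text{if } p + q \text{ is odd,} \\ \dfrac{p!\, q!}{2^{\frac{p+q}{2}}} \displaystyle\sum_{k=0}^{t} \dfrac{(2\rho)^{2k}}{\left(\frac{p}{2} - k\right)!\, \left(\frac{q}{2} - k\right)!\, (2k)!} & \text{if } p \text{ and } q \text{ are even,} \\ \dfrac{p!\, q!}{2^{\frac{p+q}{2}}} \displaystyle\sum_{k=0}^{t} \dfrac{(2\rho)^{2k+1}}{\left(\frac{p-1}{2} - k\right)!\, \left(\frac{q-1}{2} - k\right)!\, (2k+1)!} & \text{if } p \text{ and } q \text{ are odd,} \end{cases}$$

where

$\rho$ is the correlation coefficient and $t = \lfloor \min(p, q)/2 \rfloor$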
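The bivariate moment formula can be transcribed directly and spot-checked against values that are easy to obtain by hand, such as $\mathrm{E}[XY] = \rho$ and $\mathrm{E}[X^2 Y^2] = 1 + 2\rho^2$. The sketch below assumes the usual form of the formula, a prefactor $p!\,q!/2^{(p+q)/2}$ times a sum over $k = 0, \ldots, \lfloor \min(p,q)/2 \rfloor$ of $(2\rho)^{2k}$ or $(2\rho)^{2k+1}$ terms depending on parity; the function name is hypothetical.

```python
from math import factorial


def central_normal_cross_moment(p, q, rho):
    """E[X^p Y^q] for standard normal X, Y with correlation rho (p, q non-negative integers)."""
    if (p + q) % 2 == 1:
        return 0.0
    parity = p % 2           # 0 when p and q are both even, 1 when both are odd
    t = min(p, q) // 2
    prefactor = factorial(p) * factorial(q) / 2 ** ((p + q) // 2)
    total = 0.0
    for k in range(t + 1):
        power = 2 * k + parity
        a = (p - parity) // 2 - k
        b = (q - parity) // 2 - k
        total += (2 * rho) ** power / (factorial(a) * factorial(b) * factorial(power))
    return prefactor * total


rho = 0.5
print(central_normal_cross_moment(1, 1, rho))  # compare with rho
print(central_normal_cross_moment(2, 2, rho))  # compare with 1 + 2 rho^2
print(central_normal_cross_moment(4, 0, rho))  # compare with 3!! = 3
```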


7.5. Correlated Non-Central Normal Distributions

The distribution of the product of non-central correlated normal samples was derived by Cui et al.[10] and takes the form of an infinite series.

These product distributions are somewhat comparable to the Wishart distribution. The latter is the joint distribution of the four elements (actually only three independent elements) of a sample covariance matrix. If $x_t, y_t$ are samples from a bivariate time series then $W = \sum_{t=1}^{K} \begin{pmatrix} x_t \\ y_t \end{pmatrix} \begin{pmatrix} x_t \\ y_t \end{pmatrix}^{T}$ is a Wishart matrix with $K$ degrees of freedom. The product distributions above are the unconditional distribution of the aggregate of $K > 1$ samples of $W_{2,1}$.

7.6. Independent Complex-Valued Central-Normal Distributions

Let $u_1, v_1, u_2, v_2$ be independent samples from a normal(0,1) distribution.
Setting $z_1 = u_1 + i v_1$ and $z_2 = u_2 + i v_2$, then $z_1, z_2$ are independent zero-mean complex normal samples with circular symmetry. Their complex variances are $\mathrm{Var}(z_i) = \mathrm{E}|z_i|^2 = 2.$

The density functions of

$r_i \equiv |z_i| = (u_i^2 + v_i^2)^{1/2}, \; i = 1, 2$ are Rayleigh distributions defined as:
$$f_r(r_i) = r_i\, e^{-r_i^2/2} \quad \text{of mean } \sqrt{\frac{\pi}{2}} \text{ and variance } \frac{4 - \pi}{2}$$

The variable $y_i \equiv r_i^2$ is clearly Chi-squared with two degrees of freedom and has PDF

$$f_{y_i}(y_i) = \frac{1}{2}\, e^{-y_i/2} \quad \text{of mean value } 2$$

Wells et al.[12] show that the density function of $s \equiv |z_1 z_2|$ is

$$f_s(s) = s\, K_0(s), \quad s \ge 0$$

and the cumulative distribution function of $s$ is

$$P(a) = \Pr[s \le a] = \int_{s=0}^{a} s\, K_0(s)\, ds = 1 - a K_1(a)$$

Thus the polar representation of the product of two uncorrelated complex Gaussian samples is

$f_{s,\theta}(s, \theta) = f_s(s)\, p_\theta(\theta)$ where $p(\theta)$ is uniform on $[0, 2\pi]$.

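The first and second moments of this distribution can be found from the integral in Normal Distributions above

$$m_1 = \int_0^{\infty} s^2 K_0(s)\, ds = 2\, \Gamma^2\left(\frac{3}{2}\right) = 2 \left(\frac{\sqrt{\pi}}{2}\right)^2 = \frac{\pi}{2}$$
$$m_2 = \int_0^{\infty} s^3 K_0(s)\, ds = 2^2\, \Gamma^2\left(\frac{4}{2}\right) = 4$$

Thus its variance is $\mathrm{Var}(s) = m_2 - m_1^2 = 4 - \frac{\pi^2}{4}$.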

Further, the density of $z \equiv s^2 = |r_1 r_2|^2 = |r_1|^2 |r_2|^2 = y_1 y_2$ corresponds to the product of two independent Chi-square samples $y_i$, each with two DoF. Writing these as scaled Gamma distributions $f_y(y_i) = \frac{1}{\theta\, \Gamma(1)}\, e^{-y_i/\theta}$ with $\theta = 2$ then, from the Gamma products below, the density of the product is

$$f_Z(z) = \frac{1}{2} K_0(\sqrt{z}) \quad \text{with expectation } \mathrm{E}(z) = 4$$

7.7. Independent Complex-Valued Noncentral Normal Distributions

The product of non-central independent complex Gaussians is described by O’Donoughue and Moura[13] and forms a double infinite series of modified Bessel functions of the first and second types.

7.8. Gamma Distributions

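The product of two independent Gamma samples, $z = x_1 x_2$, defining $\Gamma(x; k_i, \theta_i) = \frac{x^{k_i - 1} e^{-x/\theta_i}}{\Gamma(k_i)\, \theta_i^{k_i}}$, follows[14]

$$\begin{aligned}
p_Z(z) &= \frac{2}{\Gamma(k_1)\, \Gamma(k_2)}\, \frac{z^{\frac{k_1 + k_2}{2} - 1}}{(\theta_1 \theta_2)^{\frac{k_1 + k_2}{2}}}\, K_{k_1 - k_2}\left(2 \sqrt{\frac{z}{\theta_1 \theta_2}}\right) \\
&= \frac{2}{\Gamma(k_1)\, \Gamma(k_2)}\, \frac{y^{\frac{k_1 + k_2}{2} - 1}}{\theta_1 \theta_2}\, K_{k_1 - k_2}\left(2 \sqrt{y}\right) \quad \text{where } y = \frac{z}{\theta_1 \theta_2}
\end{aligned}$$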

7.9. Beta Distributions

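Nagar et al.[15] define a correlated bivariate beta distribution

$$f(x, y) = \frac{x^{a-1}\, y^{b-1}\, (1 - x)^{b + c - 1}\, (1 - y)^{a + c - 1}}{B(a, b, c)\, (1 - xy)^{a + b + c}}, \quad 0 < x, y < 1$$

where

$$B(a, b, c) = \frac{\Gamma(a)\, \Gamma(b)\, \Gamma(c)}{\Gamma(a + b + c)}$$

Then the pdf of $Z = XY$ is given by

$$f_Z(z) = \frac{B(a + c, b + c)\, z^{a-1} (1 - z)^{c-1}}{B(a, b, c)}\; {}_2F_1(a + c, a + c; a + b + 2c; 1 - z), \quad 0 < z < 1$$

where ${}_2F_1$ is the Gauss hypergeometric function defined by the Euler integral

$${}_2F_1(a, b, c, z) = \frac{\Gamma(c)}{\Gamma(a)\, \Gamma(c - a)} \int_0^1 v^{a-1} (1 - v)^{c - a - 1} (1 - vz)^{-b}\, dv$$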

Note that multivariate distributions are not generally unique, apart from the Gaussian case, and there may be alternatives.

7.10. Uniform and Gamma Distributions

The distribution of the product of a random variable having a uniform distribution on (0,1) with a random variable having a gamma distribution with shape parameter equal to 2, is an exponential distribution.[16] A more general case of this concerns the distribution of the product of a random variable having a beta distribution with a random variable having a gamma distribution: for some cases where the parameters of the two component distributions are related in a certain way, the result is again a gamma distribution but with a changed shape parameter.[16]
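The first claim can be checked numerically with the product-density integral from the derivation section. Taking $X \sim$ Uniform(0, 1) and $Y \sim$ Gamma with shape 2 and unit scale (density $y e^{-y}$; the unit scale is an assumption made for the example), the integral $\int_0^1 f_Y(z/x)\, \frac{1}{x}\, dx$ should reproduce the unit exponential density $e^{-z}$.

```python
import math


def gamma2_pdf(y):
    """Density of a Gamma variable with shape 2 and unit scale."""
    return y * math.exp(-y) if y >= 0.0 else 0.0


def uniform_times_gamma2_pdf(z, steps=100_000):
    """Midpoint-rule estimate of the product density with a Uniform(0, 1) factor."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h          # the uniform density is 1 on (0, 1)
        total += gamma2_pdf(z / x) / x
    return total * h


for z in (0.5, 1.0, 2.0):
    print(z, uniform_times_gamma2_pdf(z), math.exp(-z))
```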

The K-distribution is an example of a non-standard distribution that can be defined as a product distribution (where both components have a gamma distribution).

7.11. Gamma and Pareto Distributions

The product of n Gamma and m Pareto independent samples was derived by Nadarajah.[17]

8. In Theoretical Computer Science

In computational learning theory, a product distribution $\mathcal{D}$ over $\{0,1\}^n$ is specified by the parameters $\mu_1, \mu_2, \ldots, \mu_n$. Each parameter $\mu_i$ gives the marginal probability that the $i$th bit of $x \in \{0,1\}^n$ sampled as $x \sim \mathcal{D}$ is 1; i.e. $\mu_i = \Pr_{\mathcal{D}}[x_i = 1]$. In this setting, the uniform distribution is simply a product distribution with every $\mu_i = 1/2$.

Product distributions are a key tool used for proving learnability results when the examples cannot be assumed to be uniformly sampled.[18] They give rise to an inner product $\langle \cdot, \cdot \rangle$ on the space of real-valued functions on $\{0,1\}^n$ as follows:

$$\langle f, g \rangle_{\mathcal{D}} = \sum_{x \in \{0,1\}^n} \mathcal{D}(x)\, f(x)\, g(x) = \mathrm{E}_{\mathcal{D}}[fg]$$

This inner product gives rise to a corresponding norm as follows:

$$\|f\|_{\mathcal{D}} = \sqrt{\langle f, f \rangle_{\mathcal{D}}}$$
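For small $n$ these definitions can be evaluated by brute-force enumeration of $\{0,1\}^n$. The sketch below is one such illustration; the marginals `mu` and the test functions are hypothetical, and the enumeration is exponential in $n$, so it is only meant for toy sizes.

```python
from itertools import product
from math import isclose, sqrt


def product_prob(x, mu):
    """D(x) for a product distribution on {0,1}^n with Pr[x_i = 1] = mu[i]."""
    prob = 1.0
    for bit, m in zip(x, mu):
        prob *= m if bit == 1 else 1.0 - m
    return prob


def inner(f, g, mu):
    """<f, g>_D = sum over x of D(x) f(x) g(x), i.e. E_D[f g]."""
    return sum(product_prob(x, mu) * f(x) * g(x) for x in product((0, 1), repeat=len(mu)))


def norm(f, mu):
    """||f||_D = sqrt(<f, f>_D)."""
    return sqrt(inner(f, f, mu))


def first_bit(x):
    return x[0]


def parity(x):
    return (-1) ** sum(x)  # takes values +1 or -1


def one(x):
    return 1


mu = [0.5, 0.2, 0.9]  # hypothetical marginal probabilities

print(inner(first_bit, one, mu))  # recovers the first marginal
print(norm(parity, mu))           # any +/-1-valued function has unit norm
```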

References

  1. Springer, Melvin Dale (1979). The Algebra of Random Variables. John Wiley & Sons. ISBN 978-0-471-01406-5. https://archive.org/details/algebraofrandomv0000spri. Retrieved 24 September 2012. 
  2. Rohatgi, V. K. (1976). An Introduction to Probability Theory and Mathematical Statistics. Wiley Series in Probability and Statistics. New York: Wiley. doi:10.1002/9781118165676. ISBN 978-0-19-853185-2.  https://dx.doi.org/10.1002%2F9781118165676
  3. Grimmett, G. R.; Stirzaker, D.R. (2001). Probability and Random Processes. Oxford: Oxford University Press. ISBN 978-0-19-857222-0. http://ukcatalogue.oup.com/product/9780198572220.do?keyword=grimmett+stirzaker&;sortby=bestMatches. Retrieved 4 October 2015. 
  4. Sarwate, Dilip (March 9, 2013). "Variance of product of multiple random variables". Stack Exchange. https://stats.stackexchange.com/q/52699.
  5. "How to find characteristic function of product of random variables". Stack Exchange. January 3, 2013. https://math.stackexchange.com/q/269579.
  6. heropup (1 February 2014). "product distribution of two uniform distribution, what about 3 or more". Stack Exchange. https://math.stackexchange.com/q/659278.
  7. Gradshteyn, I S; Ryzhik, I M (1980). Tables of Integrals, Series and Products. Academic Press. Section 6.561.
  8. Nadarajah, Saralees; Pogány, Tibor (2015). "On the distribution of the product of correlated normal random variables". Comptes Rendus de l'Académie des Sciences, Série I 354 (2): 201–204. doi:10.1016/j.crma.2015.10.019.  https://dx.doi.org/10.1016%2Fj.crma.2015.10.019
  9. Equ(13.18.9). "Digital Library of Mathematical Functions". https://dlmf.nist.gov.
  10. Cui, Guolong (2016). "Exact Distribution for the Product of Two Correlated Gaussian Random Variables". IEEE Signal Processing Letters 23 (11): 1662–1666. doi:10.1109/LSP.2016.2614539. Bibcode: 2016ISPL...23.1662C.  https://dx.doi.org/10.1109%2FLSP.2016.2614539
  11. Kan, Raymond (2008). "From moments of sum to moments of product". Journal of Multivariate Analysis 99 (3): 542–554. doi:10.1016/j.jmva.2007.01.013.  https://dx.doi.org/10.1016%2Fj.jmva.2007.01.013
  12. Wells, R T; Anderson, R L; Cell, J W (1962). "The Distribution of the Product of Two Central or Non-Central Chi-Square Variates". The Annals of Mathematical Statistics 33 (3): 1016–1020. doi:10.1214/aoms/1177704469.  https://dx.doi.org/10.1214%2Faoms%2F1177704469
  13. O’Donoughue, N; Moura, J M F (March 2012). "On the Product of Independent Complex Gaussians". IEEE Transactions on Signal Processing 60 (3): 1050–1063. doi:10.1109/TSP.2011.2177264. Bibcode: 2012ITSP...60.1050O.  https://dx.doi.org/10.1109%2FTSP.2011.2177264
  14. Wolfies (August 2017). "PDF of the product of two independent Gamma random variables". https://math.stackexchange.com/q/2396324.
  15. Nagar, D K; Orozco-Castañeda, J M; Gupta, A K (2009). "Product and quotient of correlated beta variables". Applied Mathematics Letters 22: 105–109. doi:10.1016/j.aml.2008.02.014.  https://dx.doi.org/10.1016%2Fj.aml.2008.02.014
  16. Johnson, Norman L.; Kotz, Samuel; Balakrishnan, N. (1995). Continuous Univariate Distributions Volume 2, Second edition. Wiley. p. 306. ISBN 978-0-471-58494-0. http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471584940.html. Retrieved 24 September 2012. 
  17. Nadarajah, Saralees (June 2011). "Exact distribution of the product of n gamma and m Pareto random variables". Journal of Computational and Applied Mathematics 235 (15): 4496–4512. doi:10.1016/j.cam.2011.04.018.  https://dx.doi.org/10.1016%2Fj.cam.2011.04.018
  18. Servedio, Rocco A. (2004), "On learning monotone DNF under product distributions", Information and Computation 193 (1): 57–74, doi:10.1016/j.ic.2004.04.003  https://dx.doi.org/10.1016%2Fj.ic.2004.04.003
Entry Collection: HandWiki
Update Date: 11 Nov 2022