Discrete Random Variable / Continuous Random Variable
Discrete Random Variable
A discrete random variable is one that can assume only a finite, or countably infinite, number of distinct values.
From: The Joy of Finite Mathematics, 2016
Basic Concepts in Probability
Oliver C. Ibe, in Markov Processes for Stochastic Modeling (Second Edition), 2013
1.2.3 Continuous Random Variables
Discrete random variables have a set of possible values that are either finite or countably infinite. However, there exists another group of random variables that can assume an uncountable set of possible values. Such random variables are called continuous random variables. Thus, we define a random variable X to be a continuous random variable if there exists a nonnegative function f_X(x), defined for all real x ∈ (−∞, ∞), having the property that for any set A of real numbers,

P[X ∈ A] = ∫_A f_X(x) dx.
The function f_X(x) is called the probability density function (PDF) of the random variable X and is defined by

f_X(x) = dF_X(x)/dx.

This means that

F_X(x) = P[X ≤ x] = ∫ from −∞ to x of f_X(u) du.
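As a concrete illustration of these definitions (an addition to the excerpt, not Ibe's text), the short Python sketch below numerically checks that an assumed exponential density integrates to one and evaluates P[X ∈ A] for an interval A by integrating the density:

```python
import numpy as np
from scipy import integrate

lam = 2.0  # assumed rate parameter for an exponential PDF (illustration only)

def f_X(x):
    """Exponential density f_X(x) = lam * exp(-lam * x), x >= 0."""
    return lam * np.exp(-lam * x)

# The PDF must integrate to 1 over its support.
total, _ = integrate.quad(f_X, 0, np.inf)
print("integral of f_X over its support:", total)   # ~1.0

# P[X in A] for the interval A = [0.5, 1.5] is the integral of f_X over A.
p_A, _ = integrate.quad(f_X, 0.5, 1.5)
print("P[0.5 <= X <= 1.5]:", p_A)                    # ~ exp(-1) - exp(-3) = 0.318
```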
URL: https://www.sciencedirect.com/science/article/pii/B9780124077959000013
Discrete Lifetime Models
N. Unnikrishnan Nair, ... N. Balakrishnan, in Reliability Modelling and Analysis in Discrete Time, 2018
Geeta Distribution
A discrete random variable X is said to have the Geeta distribution with parameters θ and α if
The distribution is L-shaped and unimodal with
and
The central moments satisfy a recurrence relation of the form
Estimation of the parameters can be done by the method of moments or by the maximum likelihood method. Moment estimates are
with x̄ and s² being the sample mean and variance, respectively. The maximum likelihood estimates are
and the remaining parameter is obtained iteratively by solving the likelihood equation
with n as the sample size and f_x as the observed frequency of x. For details, see Consul (1990). As the Geeta model is a member of the MPSD, the reliability properties can be readily obtained from those of the MPSD discussed in Section 3.2.2.
URL: https://www.sciencedirect.com/science/article/pii/B9780128019139000038
Probability Distributions of Interest
N. Balakrishnan, ... M.S. Nikulin, in Chi-Squared Goodness of Fit Tests with Applications, 2013
8.1.3 Poisson distribution
The discrete random variable X follows the Poisson distribution with parameter λ > 0 if

P(X = k) = λ^k e^(−λ) / k!,  k = 0, 1, 2, …,

and we shall denote it by X ∼ P(λ). It is easy to show that E[X] = λ and so Var[X] = λ as well.
The distribution function of X is

F_X(k) = P(X ≤ k) = ∑_{j=0}^{k} λ^j e^(−λ)/j! = Γ(k + 1, λ)/k!,  k = 0, 1, 2, …,  (8.4)

where

Γ(a, λ) = ∫ from λ to ∞ of t^(a−1) e^(−t) dt

is the (upper) incomplete gamma function. Often, for large values of λ, to compute (8.4), we can use a normal approximation, since (X − λ)/√λ is approximately standard normal, so that F_X(k) ≈ Φ((k − λ)/√λ).
Let X₁, X₂, … be a sequence of independent and identically distributed random variables following the same Bernoulli distribution with parameter p, with

P(X_i = 1) = p,  P(X_i = 0) = 1 − p.

Let

S_n = X₁ + X₂ + ⋯ + X_n.

Then, uniformly for x ∈ (−∞, ∞), we have

P((S_n − np)/√(np(1 − p)) ≤ x) → Φ(x)  as n → ∞.

From this result, it follows that for large values of n,

P(S_n ≤ k) ≈ Φ((k − np)/√(np(1 − p))).

Often this approximation is used with the so-called continuity correction given by

P(S_n ≤ k) ≈ Φ((k + 0.5 − np)/√(np(1 − p))).
We shall now describe the Poisson approximation to the binomial distribution. Let X_n be a sequence of binomial random variables, X_n ∼ B(n, p_n), n = 1, 2, …, such that

np_n → λ > 0  as n → ∞.

Then,

P(X_n = k) → λ^k e^(−λ)/k!  for every fixed k = 0, 1, 2, ….

In practice, this means that for "large" values of n and "small" values of p, we may approximate the binomial distribution by the Poisson distribution with parameter λ = np, that is,

P(X = k) ≈ (np)^k e^(−np)/k!.
It is of interest to note that (Hodges and Le Cam, 1960)
Hence, if the probability of success in Bernoulli trials is small, and the number of trials is large, then the number of observed successes in the trials can be regarded as a random variable following the Poisson distribution.
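To illustrate this approximation numerically (a supplementary sketch, not part of the excerpt, using assumed values n = 1000 and p = 0.003), the following Python code compares the binomial PMF with the Poisson PMF with parameter λ = np:

```python
import numpy as np
from scipy import stats

n, p = 1000, 0.003          # assumed "large n, small p" values for illustration
lam = n * p                 # Poisson parameter lambda = np = 3

ks = np.arange(0, 11)
binom_pmf = stats.binom.pmf(ks, n, p)
poisson_pmf = stats.poisson.pmf(ks, lam)

for k, b, q in zip(ks, binom_pmf, poisson_pmf):
    print(f"k={k:2d}  binomial={b:.5f}  poisson={q:.5f}")

# Total variation distance between the two PMFs, truncated at k = n
# (the omitted Poisson tail is negligible here).
all_k = np.arange(0, n + 1)
tv = 0.5 * np.sum(np.abs(stats.binom.pmf(all_k, n, p) - stats.poisson.pmf(all_k, lam)))
print("total variation distance:", tv)
```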
URL: https://www.sciencedirect.com/science/article/pii/B9780123971944000089
Multiple Random Variables
Oliver C. Ibe, in Fundamentals of Applied Probability and Random Processes (Second Edition), 2014
Section 5.7 Covariance and Correlation Coefficient
- 5.20 Two discrete random variables X and Y have the joint PMF given by
  - a. Are X and Y independent?
  - b. What is the covariance of X and Y?
- 5.21 Two events A and B are defined with specified probabilities for A, B, and their intersection. Let the random variable X be defined such that X = 1 if event A occurs and X = 0 if event A does not occur. Similarly, let the random variable Y be defined such that Y = 1 if event B occurs and Y = 0 if event B does not occur.
  - a. Find E[X] and the variance of X.
  - b. Find E[Y] and the variance of Y.
  - c. Find ρ_XY and determine whether or not X and Y are uncorrelated.
- 5.22 A fair die is tossed three times. Let X be the random variable that denotes the number of 1's and let Y be the random variable that denotes the number of 3's. Find the correlation coefficient of X and Y.
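As a sanity check for Exercise 5.22 (an addition, not part of the original problem set), the following Python sketch estimates the correlation coefficient by simulating the three tosses; for the multinomial counts involved, the analytical value is ρ_XY = −1/5, and the estimate should land close to it.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 200_000

# Simulate three tosses of a fair die per trial.
tosses = rng.integers(1, 7, size=(trials, 3))
X = (tosses == 1).sum(axis=1)   # number of 1's
Y = (tosses == 3).sum(axis=1)   # number of 3's

rho_hat = np.corrcoef(X, Y)[0, 1]
print("estimated correlation coefficient:", rho_hat)   # should be near -0.2
```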
URL: https://www.sciencedirect.com/science/article/pii/B9780128008522000055
Mathematical foundations
Xin-She Yang, in Introduction to Algorithms for Data Mining and Machine Learning, 2019
2.4.1 Random variables
For a discrete random variable X with distinct values x_i (such as the number of cars passing through a junction), each value x_i may occur with a certain probability p(x_i). In other words, the probability varies with, and is associated with, the corresponding value of the random variable. Traditionally, an uppercase letter such as X is used to denote a random variable, whereas a lowercase letter such as x_i represents its values. For example, if X means a coin-flipping event, then x_i = 0 (tail) or 1 (head). A probability function p(x_i) is a function that assigns probabilities to all the discrete values x_i of the random variable X.
As an event must occur inside a sample space, all the probabilities must sum to one, which leads to

∑_i p(x_i) = 1.  (2.33)

For example, the outcomes of tossing a fair coin form a sample space. The outcome of a head (H) is an event with probability P(H) = 1/2, and the outcome of a tail (T) is also an event with probability P(T) = 1/2. The sum of both probabilities should be one, that is,

P(H) + P(T) = 1/2 + 1/2 = 1.  (2.34)
The cumulative probability function of X is defined by

F(x) = P(X ≤ x) = ∑_{x_i ≤ x} p(x_i).  (2.35)
Two main measures for a random variable X with a given probability distribution are its mean and variance. The mean μ or expectation E[X] is defined by

μ = E[X] = ∫ x p(x) dx  (2.36)

for a continuous distribution, where the integration is over the relevant integration limits. If the random variable is discrete, then the integration becomes the weighted sum

μ = E[X] = ∑_i x_i p(x_i).  (2.37)

The variance is the expectation value of the squared deviation, that is, E[(X − μ)²]. We have

σ² = var[X] = E[(X − μ)²].  (2.38)
The square root of the variance is called the standard deviation, which is simply σ.
The above definition of the mean is essentially the first moment if we define the kth moment of a random variable X (with a probability density distribution p(x)) by

E[X^k] = ∫ x^k p(x) dx.  (2.39)

Similarly, we can define the kth central moment by

μ_k = E[(X − μ)^k] = ∫ (x − μ)^k p(x) dx,  (2.40)

where μ is the mean (the first moment). Thus, the zeroth central moment is the sum of all probabilities when k = 0, which gives μ₀ = 1. The first central moment is μ₁ = 0. The second central moment is the variance σ², that is, μ₂ = σ².
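To make these definitions concrete (a supplementary sketch, not from Yang's text), the Python snippet below computes the mean, variance, and the first few central moments for an assumed small discrete distribution:

```python
import numpy as np

# Assumed discrete distribution: values x_i and probabilities p(x_i) summing to 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.1, 0.4, 0.3, 0.2])
assert np.isclose(p.sum(), 1.0)          # probabilities must sum to one (Eq. 2.33)

mean = np.sum(x * p)                     # first moment (Eq. 2.37)
variance = np.sum((x - mean) ** 2 * p)   # second central moment (Eq. 2.38)

def central_moment(k):
    """kth central moment E[(X - mu)^k] of the discrete distribution."""
    return np.sum((x - mean) ** k * p)

print("mean:", mean)                                  # 1.6
print("variance:", variance)                          # 0.84
print("zeroth central moment:", central_moment(0))    # 1.0
print("first central moment:", central_moment(1))     # ~0.0
print("second central moment:", central_moment(2))    # equals the variance
```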
URL: https://www.sciencedirect.com/science/article/pii/B9780128172162000090
Sampling distributions
Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021
4.4 The normal approximation to the binomial distribution
We know that a binomial random variable Y, with parameters n and p = P(success), can be viewed as the number of successes in n trials and can be written as:

Y = X₁ + X₂ + ⋯ + X_n,

where

X_i = 1 if the ith trial is a success, and X_i = 0 otherwise.

The fraction of successes in n trials is:

Y/n = (X₁ + X₂ + ⋯ + X_n)/n.

Hence, Y/n is a sample mean. Since E(X_i) = p and Var(X_i) = p(1 − p), we have:

E(Y/n) = p

and

Var(Y/n) = p(1 − p)/n.

Because Y is a sum of independent and identically distributed random variables, by the central limit theorem Y has an approximate normal distribution with mean μ = np and variance σ² = np(1 − p). Because the calculation of the binomial probabilities is cumbersome for large sample sizes n, the normal approximation to the binomial distribution is widely used. A useful rule of thumb for the normal approximation to the binomial distribution is that n is large enough if np ≥ 5 and n(1 − p) ≥ 5. Otherwise, the binomial distribution may be so asymmetric that the normal distribution may not provide a good approximation. Other rules, such as np ≥ 10 and n(1 − p) ≥ 10, or np(1 − p) ≥ 10, are also used in the literature. Because all of these rules are only approximations, for consistency's sake we will use np ≥ 5 and n(1 − p) ≥ 5 to test for largeness of sample size in the normal approximation to the binomial distribution. If the need arises, we could use the more stringent condition np(1 − p) ≥ 10.
Recall that discrete random variables take no values between integers, and their probabilities are concentrated at the integers as shown in Fig. 4.7. However, the normal random variables have zero probability at these integers; they have nonzero probability only over intervals. Because we are approximating a discrete distribution with a continuous distribution, we need to introduce a correction factor for continuity which is explained next.
Correction for continuity for the normal approximation to the binomial distribution
- (a) To approximate P(X ≤ a) or P(X > a), the correction for continuity is (a + 0.5), that is,

  P(X ≤ a) ≈ P(Z ≤ (a + 0.5 − np)/√(np(1 − p)))

  and

  P(X > a) ≈ P(Z > (a + 0.5 − np)/√(np(1 − p))).

- (b) To approximate P(X ≥ a) or P(X < a), the correction for continuity is (a − 0.5), that is,

  P(X ≥ a) ≈ P(Z ≥ (a − 0.5 − np)/√(np(1 − p)))

  and

  P(X < a) ≈ P(Z < (a − 0.5 − np)/√(np(1 − p))).

- (c) To approximate P(a ≤ X ≤ b), treat the ends of the interval separately, calculating two distinct z-values according to steps (a) and (b), that is,

  P(a ≤ X ≤ b) ≈ P((a − 0.5 − np)/√(np(1 − p)) ≤ Z ≤ (b + 0.5 − np)/√(np(1 − p))).

- (d) Use the normal table to obtain the approximate probability of the binomial event.
The shaded area in Fig. 4.8 represents the continuity correction for P(X = i).
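A small Python sketch of these rules (an illustrative addition, not from the textbook; the function name and structure are choices made here) might look as follows:

```python
from math import sqrt
from scipy.stats import norm

def binom_normal_approx(n, p, a, b=None):
    """Continuity-corrected normal approximation to binomial probabilities.

    With b=None, returns P(X <= a); otherwise returns P(a <= X <= b),
    following rules (a)-(c) above.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    if b is None:
        return norm.cdf((a + 0.5 - mu) / sigma)          # rule (a): P(X <= a)
    lower = (a - 0.5 - mu) / sigma                        # rule (b) endpoint
    upper = (b + 0.5 - mu) / sigma                        # rule (a) endpoint
    return norm.cdf(upper) - norm.cdf(lower)              # rule (c)
```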
Example 4.4.2
A study of parallel interchange ramps revealed that many drivers do not use the entire length of parallel lanes for acceleration, but seek, as soon as possible, a gap in the major stream of traffic to merge. At one site on Interstate Highway 75, 46% of drivers used less than one-third of the lane length available before merging. Suppose we monitor the merging pattern of a random sample of 250 drivers at this site.
- (a) What is the probability that fewer than 120 of the drivers will use less than one-third of the acceleration lane length before merging?
- (b) What is the probability that more than 225 of the drivers will use less than one-third of the acceleration lane length before merging?
Solution
First we check for adequacy of the sample size:

np = 250(0.46) = 115 and n(1 − p) = 250(0.54) = 135.

Both are greater than 5. Hence, we can use the normal approximation. Let X be the number of drivers using less than one-third of the lane length available before merging. Then X can be considered to be a binomial random variable. Also,

μ = np = 250(0.46) = 115

and

σ² = np(1 − p) = 250(0.46)(0.54) = 62.1.

Thus, σ = √62.1 ≈ 7.88.
- (a) P(X < 120) = P(X ≤ 119) ≈ P(Z ≤ (119.5 − 115)/7.88) = P(Z ≤ 0.57) ≈ 0.7157; that is, we are approximately 71.57% certain that fewer than 120 drivers will use less than one-third of the acceleration length before merging.
- (b) P(X > 225) ≈ P(Z > (225.5 − 115)/7.88) = P(Z > 14.02) ≈ 0; that is, there is almost no chance that more than 225 drivers will use less than one-third of the acceleration lane length before merging.
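The figures above can be cross-checked numerically (a supplementary sketch, not part of the textbook solution) by comparing the exact binomial probability with the continuity-corrected normal approximation:

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 250, 0.46
mu, sigma = n * p, sqrt(n * p * (1 - p))

# (a) P(X < 120) = P(X <= 119)
exact_a = binom.cdf(119, n, p)
approx_a = norm.cdf((119.5 - mu) / sigma)
print(f"P(X < 120): exact {exact_a:.4f}, normal approx {approx_a:.4f}")  # both ~0.71-0.72

# (b) P(X > 225)
exact_b = 1 - binom.cdf(225, n, p)
approx_b = 1 - norm.cdf((225.5 - mu) / sigma)
print(f"P(X > 225): exact {exact_b:.2e}, normal approx {approx_b:.2e}")  # essentially 0
```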
URL: https://www.sciencedirect.com/science/article/pii/B978012817815700004X
Pairs of Random Variables
Scott L. Miller, Donald Childers, in Probability and Random Processes (Second Edition), 2012
Section 5.4 Conditional Distribution, Density and Mass Functions
- 5.17 For the discrete random variables whose joint PMF is described by the table in Exercise 5.14, find the following conditional PMFs:
  - (a) P_M(m | N = 2);
  - (b) P_M(m | N ≥ 2);
  - (c) P_N(n | M ≠ 2).
- 5.18 Consider again the random variables in Exercise 5.11 that are uniformly distributed over an ellipse.
  - (a) Find the conditional PDFs, f_X|Y(x|y) and f_Y|X(y|x).
  - (b) Find f_X|{Y > 1}(x).
  - (c) Find f_Y|{|X| < 1}(y).
- 5.19 Recall the random variables of Exercise 5.12 that are uniformly distributed over the region |X| + |Y| ≤ 1.
  - (a) Find the conditional PDFs, f_X|Y(x|y) and f_Y|X(y|x).
  - (b) Find the conditional CDFs, F_X|Y(x|y) and F_Y|X(y|x).
  - (c) Find f_X|{Y > 1/2}(x) and F_X|{Y > 1/2}(x).
- 5.20 Suppose a pair of random variables (X, Y) is uniformly distributed over a rectangular region, A: x₁ < X < x₂, y₁ < Y < y₂. Find the conditional PDF of (X, Y) given the conditioning event (X, Y) ∈ B, where the region B is an arbitrary region completely contained within the rectangle A as shown in the accompanying figure.
URL: https://www.sciencedirect.com/science/article/pii/B9780123869814500084
Introduction to Probability Theory
Scott L. Miller, Donald Childers, in Probability and Random Processes (Second Edition), 2012
2.8 Discrete Random Variables
Suppose we conduct an experiment, E, which has some sample space, S. Furthermore, let ξ be some outcome defined on the sample space, S. It is useful to define functions of the outcome ξ, X = f(ξ). That is, the function f has as its domain all possible outcomes associated with the experiment, E. The range of the function f will depend upon how it maps outcomes to numerical values but in general will be the set of real numbers or some part of the set of real numbers. Formally, we have the following definition.
Definition 2.9: A random variable is a real valued function of the elements of a sample space, S. Given an experiment, E, with sample space, S, the random variable X maps each possible outcome, ξ ∈ S, to a real number X(ξ) as specified by some rule. If the mapping X(ξ) is such that the random variable X takes on a finite or countably infinite number of values, then we refer to X as a discrete random variable; whereas, if the range of X(ξ) is an uncountably infinite number of points, we refer to X as a continuous random variable.
Since X = f(ξ) is a random variable whose numerical value depends on the outcome of an experiment, we cannot describe the random variable by stating its value; rather, we must give it a probabilistic description by stating the probabilities that the variable X takes on a specific value or values (e.g., Pr(X=3) or Pr(X > 8)). For now, we will focus on random variables which take on discrete values and will describe these random variables in terms of probabilities of the form Pr(X=x). In the next chapter when we study continuous random variables, we will find this description to be insufficient and will introduce other probabilistic descriptions as well.
Definition 2.10: The probability mass function (PMF), P_X(x), of a random variable, X, is a function that assigns a probability to each possible value of the random variable, X. The probability that the random variable X takes on the specific value x is the value of the probability mass function for x. That is, P_X(x) = Pr(X = x). We use the convention that uppercase variables represent random variables while lowercase variables represent fixed values that the random variable can assume.
Example 2.23
A discrete random variable may be defined for the random experiment of flipping a coin. The sample space of outcomes is S = {H, T}. We could define the random variable X to be X(H) = 0 and X(T) = 1. That is, the sample space {H, T} is mapped to the set {0, 1} by the random variable X. Assuming a fair coin, the resulting probability mass function is P_X(0) = 1/2 and P_X(1) = 1/2. Note that the mapping is not unique and we could have just as easily mapped the sample space {H, T} to any other pair of real numbers (e.g., {1, 2}).
Example 2.24
Suppose we repeat the experiment of flipping a fair coin n times and observe the sequence of heads and tails. A random variable, Y, could be defined to be the number of times tails occurs in n trials. It turns out that the probability mass function for this random variable is

P_Y(k) = C(n, k)(1/2)^n,  k = 0, 1, …, n,

where C(n, k) denotes the binomial coefficient.
The details of how this PMF is obtained will be deferred until later in this section.
Example 2.25
Again, let the experiment be the flipping of a coin, and this time we will continue repeating the event until the first time a heads occurs. The random variable Z will represent the number of times until the first occurrence of a heads. In this case, the random variable Z can take on any positive integer value, 1 ≤ Z < ∞. The probability mass function of the random variable Z can be worked out as follows:

Pr(Z = n) = Pr(first n − 1 flips are tails) · Pr(nth flip is heads) = (1/2)^(n−1)(1/2) = 2^(−n)  (for a fair coin).

Hence,

P_Z(n) = 2^(−n),  n = 1, 2, 3, ….
Example 2.26
In this example, we will estimate the PMF in Example 2.24 via MATLAB simulation using the relative frequency approach. Suppose the experiment consists of tossing the coin n = 10 times and counting the number of tails. We then repeat this experiment a large number of times and count the relative frequency of each number of tails to estimate the PMF. The following MATLAB code can be used to accomplish this. Results of running this code are shown in Figure 2.3.
Try running this code using a larger value for m. You should see more accurate relative frequency estimates as you increase m.
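The original MATLAB listing is not reproduced in this excerpt; the following Python sketch (an equivalent written for this page, with an assumed number of repetitions m) carries out the same relative-frequency estimate of the PMF:

```python
import numpy as np

n = 10          # coin tosses per experiment
m = 10_000      # number of repeated experiments (increase m for better estimates)
rng = np.random.default_rng()

# Count the number of tails (coded as 1) in each of the m experiments.
tails_counts = rng.integers(0, 2, size=(m, n)).sum(axis=1)

# Relative frequency of each possible count 0..n estimates the PMF.
pmf_estimate = np.bincount(tails_counts, minlength=n + 1) / m
for k, freq in enumerate(pmf_estimate):
    print(f"P_Y({k}) ~ {freq:.4f}")
```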
From the preceding examples, it should be clear that the probability mass function associated with a random variable, X, must obey certain properties. First, since P_X(x) is a probability, it must be nonnegative and no greater than 1. Second, if we sum P_X(x) over all x, then this is the same as the sum of the probabilities of all outcomes in the sample space, which must be equal to 1. Stated mathematically, we may conclude that

0 ≤ P_X(x) ≤ 1  and  ∑_x P_X(x) = 1.

When developing the probability mass function for a random variable, it is useful to check that the PMF satisfies these properties.
In the paragraphs that follow, we list some commonly used discrete random variables, along with their probability mass functions, and some real-world applications in which each might typically be used.
A. Bernoulli Random Variable
This is the simplest possible random variable and is used to represent experiments which have two possible outcomes. These experiments are called Bernoulli trials and the resulting random variable is called a Bernoulli random variable. It is most common to associate the values {0, 1} with the two outcomes of the experiment. If X is a Bernoulli random variable, its probability mass function is of the form

P_X(0) = 1 − p,  P_X(1) = p.  (2.34)
The coin tossing experiment would produce a Bernoulli random variable. In that case, we may map the outcome H to the value X = 1 and T to X = 0. Also, we would use the value p = 1/2 assuming that the coin is fair. Examples of engineering applications might include radar systems where the random variable could indicate the presence (X = 1) or absence (X = 0) of a target, or a digital communication system where X = 1 might indicate a bit was transmitted in error while X = 0 would indicate that the bit was received correctly. In these examples, we would probably expect that the value of p would be much smaller than 1/2.
B. Binomial Random Variable
Consider repeating a Bernoulli trial n times, where the outcome of each trial is independent of all others. The Bernoulli trial has a sample space of S = {0, 1} and we say that the repeated experiment has a sample space of S^n = {0, 1}^n, which is referred to as a Cartesian space. That is, outcomes of the repeated trials are represented as n-element vectors whose elements are taken from S. Consider, for example, an outcome consisting of k 1's followed by n − k 0's.

The probability of this outcome occurring is

p^k (1 − p)^(n−k).  (2.36)
In fact, the order of the 1's and 0's in the sequence is irrelevant. Any outcome with exactly k 1's and n − k 0's would have the same probability. Now let the random variable X represent the number of times the outcome 1 occurred in the sequence of n trials. This is known as a binomial random variable and takes on integer values from 0 to n. To find the probability mass function of the binomial random variable, let A_k be the set of all outcomes which have exactly k 1's and n − k 0's. Note that all outcomes in this event occur with the same probability. Furthermore, all outcomes in this event are mutually exclusive. Then,

P_X(k) = Pr(A_k) = (number of outcomes in A_k) × (probability of each outcome in A_k).  (2.37)

The number of outcomes in the event A_k is just the number of combinations of n objects taken k at a time. Referring to Theorem 2.7, this is the binomial coefficient, so that

P_X(k) = C(n, k) p^k (1 − p)^(n−k),  k = 0, 1, …, n.  (2.38)
As a check, we verify that this probability mass function is properly normalized:

∑_{k=0}^{n} C(n, k) p^k (1 − p)^(n−k) = (p + (1 − p))^n = 1^n = 1.  (2.39)

In the above calculation, we have used the binomial expansion

(a + b)^n = ∑_{k=0}^{n} C(n, k) a^k b^(n−k).  (2.40)
Binomial random variables occur, in practice, any time Bernoulli trials are repeated. For example, in a digital communication system, a packet of n bits may be transmitted and we might be interested in the number of bits in the packet that are received in error. Or, perhaps a bank manager might be interested in the number of tellers that are serving customers at a given point in time. Similarly, a medical technician might want to know how many cells from a blood sample are white and how many are red. In Example 2.24, the coin tossing experiment was repeated n times and the random variable Y represented the number of times tails occurred in the sequence of n tosses. This is a repetition of a Bernoulli trial, and hence the random variable Y should be a binomial random variable with p = 1/2 (assuming the coin is fair).
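As an illustrative aside (not from the text), the binomial PMF and its normalization can be checked numerically in Python; the values n = 10 and p = 0.5 are assumed here to match the fair-coin example:

```python
from math import comb

n, p = 10, 0.5   # assumed: 10 tosses of a fair coin

# Binomial PMF P_X(k) = C(n, k) p^k (1 - p)^(n - k)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P_X({k}) = {prob:.4f}")

print("sum of PMF values:", sum(pmf))   # should be 1 (Equation 2.39)
```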
C. Poisson Random Variable
Consider a binomial random variable, X, where the number of repeated trials, n, is very large. In that case, evaluating the binomial coefficients can pose numerical problems. If the probability of success in each individual trial, p, is very small, then the binomial random variable can be well approximated by a Poisson random variable. That is, the Poisson random variable is a limiting case of the binomial random variable. Formally, let n approach infinity and p approach 0 in such a way that the product np approaches a constant, α. Then, the binomial probability mass function converges to the form

P_X(k) = (α^k / k!) e^(−α),  k = 0, 1, 2, …,  (2.41)

which is the probability mass function of a Poisson random variable. We see that the Poisson random variable is properly normalized by noting that

∑_{k=0}^{∞} P_X(k) = ∑_{k=0}^{∞} (α^k / k!) e^(−α) = e^(−α) e^(α) = 1  (2.42)
(see Equation E.14 in Appendix E). The Poisson random variable is extremely important as it describes the behavior of many physical phenomena. It is commonly used in queuing theory and in communication networks. The number of customers arriving at a cashier in a store during some time interval may be well modeled as a Poisson random variable as may the number of data packets arriving at a node in a computer network. We will see increasingly in later chapters that the Poisson random variable plays a fundamental role in our development of a probabilistic description of noise.
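A brief Python sketch (added here for illustration, with assumed numbers) shows the limiting behavior described above: counts of rare events over many independent trials closely follow the Poisson PMF with α = np.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(1)
n, p = 10_000, 0.0003        # many trials, small success probability
alpha = n * p                # Poisson parameter alpha = np = 3

# Simulate the number of successes in n Bernoulli trials, repeated many times.
counts = rng.binomial(n, p, size=100_000)

for k in range(8):
    empirical = np.mean(counts == k)
    poisson = alpha**k * exp(-alpha) / factorial(k)
    print(f"k={k}: empirical {empirical:.4f}  Poisson {poisson:.4f}")
```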
D. Geometric Random Variable
Consider repeating a Bernoulli trial until the first occurrence of the outcome ξ0. If X represents the number of times the outcome ξ1 occurs before the first occurrence of ξ0, and ξ1 occurs on each trial with probability p (so that ξ0 occurs with probability 1 − p), then X is a geometric random variable whose probability mass function is

P_X(k) = (1 − p) p^k,  k = 0, 1, 2, ….  (2.43)

We might also formulate the geometric random variable in a slightly different way. Suppose X counted the number of trials that were performed until the first occurrence of ξ0. Then, the probability mass function would take on the form

P_X(k) = (1 − p) p^(k−1),  k = 1, 2, 3, ….  (2.44)

The geometric random variable can also be generalized to the case where the outcome ξ0 must occur exactly m times. That is, the generalized geometric random variable counts the number of Bernoulli trials that must be repeated until the mth occurrence of the outcome ξ0. We can derive the form of the probability mass function for the generalized geometric random variable from what we know about binomial random variables. For the mth occurrence of ξ0 to occur on the kth trial, the first k − 1 trials must have had m − 1 occurrences of ξ0 and k − m occurrences of ξ1. Then

P_X(k) = Pr({(m − 1) occurrences of ξ0 in the first k − 1 trials} ∩ {ξ0 occurs on the kth trial}) = C(k − 1, m − 1)(1 − p)^m p^(k−m),  k = m, m + 1, ….  (2.45)
This generalized geometric random variable sometimes goes by the name of a Pascal random variable or the negative binomial random variable.
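To tie the two formulations together, the short Python sketch below (an added illustration; p = 0.3 is an assumed value) simulates trials until the first occurrence of ξ0 and compares the empirical distribution of the trial count with the PMF in Equation (2.44):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 0.3            # assumed probability of xi_1 per trial; xi_0 then has probability 1 - p
num_runs = 100_000

# Number of trials until the first occurrence of xi_0 (support {1, 2, 3, ...}).
trial_counts = rng.geometric(1 - p, size=num_runs)

for k in range(1, 6):
    empirical = np.mean(trial_counts == k)
    theoretical = (1 - p) * p**(k - 1)    # Equation (2.44)
    print(f"k={k}: empirical {empirical:.4f}  theoretical {theoretical:.4f}")
```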
Of course, one can define many other random variables and develop the associated probability mass functions. We have chosen to introduce some of the more important discrete random variables here. In the next chapter, we will introduce some continuous random variables and the appropriate probabilistic descriptions of these random variables. However, to close out this chapter, we provide a section showing how some of the material covered herein can be used in at least one engineering application.
URL: https://www.sciencedirect.com/science/article/pii/B9780123869814500059
Introduction
Mark A. Pinsky, Samuel Karlin, in An Introduction to Stochastic Modeling (Fourth Edition), 2011
1.2.3 Moments and Expected Values
If X is a discrete random variable, then its mth moment is given by

E[X^m] = ∑_i x_i^m Pr{X = x_i}  (1.6)

[where the x_i are specified in (1.1)], provided that the infinite sum converges absolutely. Where the infinite sum diverges, the moment is said not to exist. If X is a continuous random variable with probability density function f(x), then its mth moment is given by

E[X^m] = ∫ from −∞ to ∞ of x^m f(x) dx,  (1.7)
provided that this integral converges absolutely.
The first moment, corresponding to m = 1, is commonly called the mean or expected value of X and written m_X or μ_X. The mth central moment of X is defined as the mth moment of the random variable X − μ_X, provided that μ_X exists. The first central moment is zero. The second central moment is called the variance of X and written σ_X² or Var[X]. We have the equivalent formulas Var[X] = E[(X − μ)²] = E[X²] − μ².
The median of a random variable X is any value v with the property that

Pr{X ≥ v} ≥ 1/2  and  Pr{X ≤ v} ≥ 1/2.
If X is a random variable and g is a function, then Y = g(X) is also a random variable. If X is a discrete random variable with possible values x₁, x₂, …, then the expectation of g(X) is given by

E[g(X)] = ∑_i g(x_i) Pr{X = x_i},  (1.8)

provided that the sum converges absolutely. If X is continuous and has the probability density function f_X, then the expected value of g(X) is evaluated from

E[g(X)] = ∫ from −∞ to ∞ of g(x) f_X(x) dx.  (1.9)

The general formula, covering both the discrete and continuous cases, is

E[g(X)] = ∫ from −∞ to ∞ of g(x) dF_X(x),  (1.10)

where F_X is the distribution function of the random variable X. Technically speaking, the integral in (1.10) is a Lebesgue–Stieltjes integral. We do not require knowledge of such integrals in this text, but interpret (1.10) to signify (1.8) when X is a discrete random variable, and to represent (1.9) when X possesses a probability density f_X.
Let F_Y(y) = Pr{Y ≤ y} denote the distribution function for Y = g(X). When X is a discrete random variable, then

E[Y] = ∑_i y_i Pr{Y = y_i} = ∑_i g(x_i) Pr{X = x_i}

if y_i = g(x_i) and provided that the second sum converges absolutely. In general,

E[Y] = ∫ y dF_Y(y) = ∫ g(x) dF_X(x).  (1.11)
If X is a discrete random variable, then so is Y = g (X). It may be, however, that X is a continuous random variable, while Y is discrete (the reader should provide an example). Even so, one may compute E [Y] from either form in (1.11) with the same result.
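The remark that E[Y] can be computed from either form in (1.11) can be checked with a small Python example (added here for illustration, using an assumed discrete distribution and g(x) = x²):

```python
import numpy as np
from collections import defaultdict

# Assumed discrete distribution for X and a function g.
x_vals = np.array([-2, -1, 0, 1, 2])
p_x = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
g = lambda x: x**2

# Method 1: expectation through X directly, E[g(X)] = sum g(x_i) Pr{X = x_i}.
e_via_x = np.sum(g(x_vals) * p_x)

# Method 2: build the distribution of Y = g(X) first, then E[Y] = sum y_j Pr{Y = y_j}.
p_y = defaultdict(float)
for x, p in zip(x_vals, p_x):
    p_y[g(x)] += p
e_via_y = sum(y * p for y, p in p_y.items())

print(e_via_x, e_via_y)   # both equal 1.2 for this distribution
```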
URL: https://www.sciencedirect.com/science/article/pii/B9780123814166000010
RANDOM VARIABLES AND EXPECTATION
Sheldon M. Ross, in Introduction to Probability and Statistics for Engineers and Scientists (Fourth Edition), 2009
EXAMPLE 4.2a
Consider a random variable X that is equal to 1, 2, or 3. If we know that
then it follows (since p(1) + p(2) + p(3) = 1) that
A graph of p(x) is presented in Figure 4.1.
The cumulative distribution function F can be expressed in terms of p(x) by

F(a) = ∑_{all x ≤ a} p(x).

If X is a discrete random variable whose set of possible values are x₁, x₂, x₃, …, where x₁ < x₂ < x₃ < …, then its distribution function F is a step function. That is, the value of F is constant in the intervals [x_{i−1}, x_i) and then takes a step (or jump) of size p(x_i) at x_i.
For instance, suppose X has a probability mass function given (as in EXAMPLE 4.2a) by
Then the cumulative distribution function F of X is given by
This is graphically presented in Figure 4.2.
Whereas the set of possible values of a discrete random variable is a sequence, we often must consider random variables whose set of possible values is an interval. Let X be such a random variable. We say that X is a continuous random variable if there exists a nonnegative function f(x), defined for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers

P{X ∈ B} = ∫_B f(x) dx.  (4.2.1)

The function f(x) is called the probability density function of the random variable X.
In words, Equation 4.2.1 states that the probability that X will be in B may be obtained by integrating the probability density function over the set B. Since X must assume some value, f(x) must satisfy

1 = P{X ∈ (−∞, ∞)} = ∫ from −∞ to ∞ of f(x) dx.
All probability statements about X can be answered in terms of f(x). For instance, letting B = [a, b], we obtain from Equation 4.2.1 that

P{a ≤ X ≤ b} = ∫ from a to b of f(x) dx.  (4.2.2)

If we let a = b in the above, then

P{X = a} = ∫ from a to a of f(x) dx = 0.
In words, this equation states that the probability that a continuous random variable will assume any particular value is zero. (See Figure 4.3.)
The relationship between the cumulative distribution F(·) and the probability density f(·) is expressed by

F(a) = P{X ∈ (−∞, a]} = ∫ from −∞ to a of f(x) dx.

Differentiating both sides yields

d/da F(a) = f(a).

That is, the density is the derivative of the cumulative distribution function. A somewhat more intuitive interpretation of the density function may be obtained from Equation 4.2.2 as follows:

P{a − ε/2 ≤ X ≤ a + ε/2} = ∫ from a − ε/2 to a + ε/2 of f(x) dx ≈ ε f(a)

when ε is small. In other words, the probability that X will be contained in an interval of length ε around the point a is approximately ε f(a). From this, we see that f(a) is a measure of how likely it is that the random variable will be near a.
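A quick numerical check of this interpretation (added for illustration, using the standard normal density as an assumed example):

```python
from scipy.stats import norm

a, eps = 1.0, 0.01   # point of interest and a small interval length

# Exact probability that X lies within an interval of length eps around a.
exact = norm.cdf(a + eps / 2) - norm.cdf(a - eps / 2)

# Approximation eps * f(a) from the density at a.
approx = eps * norm.pdf(a)

print(exact, approx)   # the two values agree to several decimal places
```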
URL: https://www.sciencedirect.com/science/article/pii/B9780123704832000096
Source: https://www.sciencedirect.com/topics/mathematics/discrete-random-variable