One-Dimensional Random Variables
General Notion of a Random Variable
In describing the sample space of an experiment we did not specify that an individual outcome needs to be a number. In fact, we have cited a number of examples in which the result of the experiment was not a numerical quantity. For instance, in classifying a manufactured item we might simply use the categories "defective" and "nondefective." Again, in observing the temperature during a 24-hour period we might simply keep a record of the curve traced by the thermograph. However, in many experimental situations we are going to be concerned with measuring something and recording it as a number. Even in the above-cited cases we can assign a number to each (nonnumerical) outcome of the experiment. For example, we could assign the value one to nondefective items and the value zero to defective ones. We could record the maximum temperature of the day, or the minimum temperature, or the average of the maximum and minimum temperatures. The above illustrations are quite typical of a very general class of problems: in many experimental situations we want to assign a real number x to every element s of the sample space S. That is, x = X(s) is the value of a function X from the sample space to the real numbers. With this in mind, we make the following formal definition.
Definition.
Let ε be an experiment and S a sample space associated with the experiment. A function X assigning to every element s ∈ S a real number X(s) is called a random variable.
Notes: (a) The above terminology is a somewhat unfortunate one, but it is so universally accepted that we shall not deviate from it. We have made it as clear as possible that X is a function and yet we call it a (random) variable!
(b) It turns out that not every conceivable function may be considered as a random variable. One requirement (although not the most general one) is that for every real number x the event {X(s) = x} and for every interval I the event {X(s) ∈ I} have well-defined probabilities consistent with the basic axioms. In most of the applications this difficulty does not arise and we shall make no further reference to it.
(c) In some situations the outcome s of the sample space is already the numerical characteristic which we want to record. We simply take X(s) = s, the identity function.
(d) In most of our subsequent discussions of random variables we need not indicate the functional nature of X. We are usually interested in the possible values of X, rather than where these values came from. For example, suppose that we toss two coins and consider the sample space associated with this experiment. That is,
S = {HH, HT, TH, TT}.
Define the random variable X as follows: X is the number of heads obtained in the two tosses. Hence X(HH) = 2, X(HT) = X(TH) = 1, and X(TT) = 0.
(e) It is very important to understand a basic requirement of a (single-valued) function: to every s ∈ S there corresponds exactly one value X(s). This is shown schematically in the original figure, which is not reproduced here. Different values of s may lead to the same value of X. For example, in the above illustration we found that X(HT) = X(TH) = 1.
The space Rx, the set of all possible values of X, is sometimes called the range space. In a sense we may consider Rx as another sample space. The (original) sample space S corresponds to the (possibly) nonnumerical outcome of the experiment, while Rx is the sample space associated with the random variable X, representing the numerical characteristic which may be of interest. If X(s) = s, we have S = Rx.
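The two-coin illustration can be written out as a short sketch. It treats the random variable literally as a function on the sample space and collects its values into the range space. The string encoding of outcomes (for example "HT") and the Python names S, X, and Rx are choices made for this illustration only.

```python
from itertools import product

# Sample space for two coin tosses; each outcome s is a string such as "HT".
S = ["".join(toss) for toss in product("HT", repeat=2)]

def X(s: str) -> int:
    """Random variable: the number of heads in the outcome s."""
    return s.count("H")

# Range space Rx: the set of all values that X assumes on S.
Rx = sorted({X(s) for s in S})

for s in S:
    print(s, "->", X(s))
print("Rx =", Rx)
```

Each outcome maps to exactly one value, while the two outcomes HT and TH share the value 1, which is why Rx has three elements although S has four.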
Although we are aware of the pedagogical danger inherent in giving too many explanations for the same thing, let us nevertheless point out that we may think of a random variable X in two ways:
(a) We perform the experiment ε which results in an outcome s ∈ S. We then evaluate the number X(s).
(b) We perform ε, obtaining the outcome s, and (immediately) evaluate X(s). The number X(s) is then thought of as the actual outcome of the experiment and Rx becomes the sample space of the experiment.
The difference between interpretations (a) and (b) is relatively minor, but it is worthy of attention. In (a) the experiment essentially terminates with the observation of s. The evaluation of X(s) is considered as something that is done subsequently and which is not affected by the randomness of ε. In (b) the experiment is not considered to be terminated until the number X(s) has actually been evaluated, thus resulting in the sample space Rx. Although the first interpretation, (a), is the one usually intended, the second point of view, (b), can be very helpful, and the reader should keep it in mind. What we are saying, and this will become increasingly evident in later sections, is that in studying random variables we are more concerned about the values X assumes than about its functional form. Hence in many cases we shall completely ignore the underlying sample space on which X may be defined.
EXAMPLE 1
Suppose that a light bulb is inserted into a socket. The experiment is considered at an end when the bulb ceases to burn. What is a possible outcome, say s? One way of describing s would be by simply recording the date and time of day at which the bulb burns out, for instance May 19, 4:32 p.m. Hence the sample space may be represented as S = {(d, t) | d = date, t = time of day}. Presumably the random variable of interest is X, the length of burning time. Note that once s = (d, t) is observed, the evaluation of X(s) does not involve any randomness. When s is specified, X(s) is completely determined.
The two points of view expressed above may be applied to this example as follows. In (a) we consider the experiment to be terminated with the observation s = (d, t), the date and time of day. The computation of X(s) is then performed, involving a simple arithmetic operation. In (b) we consider the experiment to be completed only after X(s) is evaluated and the number X(s) = 107 hours, say, is then considered to be the outcome of the experiment. It might be pointed out that a similar analysis could be applied to some other random variable of interest, for instance Y(s), the temperature in the room at the time the bulb burned out.
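The "simple arithmetic operation" in interpretation (a) might look like the following sketch. The text does not say when the bulb was inserted, so the insertion time below (and the year) is a made-up value, chosen only so that the result matches the 107 hours mentioned above. The function name burning_time_hours is likewise hypothetical.

```python
from datetime import datetime

# Hypothetical insertion time; the text does not state it.
INSERTED_AT = datetime(2024, 5, 15, 5, 32)

def burning_time_hours(burn_out: datetime) -> float:
    """X(s): hours of burning time for the outcome s = (date, time of day)."""
    return (burn_out - INSERTED_AT).total_seconds() / 3600

# Observed outcome s: May 19, 4:32 p.m. (year assumed for illustration).
s = datetime(2024, 5, 19, 16, 32)
x = burning_time_hours(s)
print(x)
```

Once s is fixed, the function returns the same number every time it is called: all the randomness lies in which s is observed, not in the evaluation of X(s).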
EXAMPLE 2
Three coins are tossed on a table. As soon as the coins land on the table, the "random" phase of the experiment is over. A single outcome s might consist of a detailed description of how and where the coins landed. Presumably we are only interested in certain numerical characteristics associated with this experiment. For instance, we might evaluate
X(s) = number of heads showing,
Y(s) = maximum distance between any two coins,
Z(s) = minimum distance of coins from any edge of the table.
If the random variable X is of interest, we could, as discussed in the previous example, incorporate the evaluation of X(s) into the description of our experiment and hence simply state that the sample space associated with the experiment is {0, 1, 2, 3}, corresponding to the values of X. Although we shall very often adopt precisely this point of view, it is important to realize that the counting of the number of heads is done after the random aspects of the experiment have ended.
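As a rough sketch of this example, one detailed outcome s can be stored as the face and position of each coin, and X, Y, and Z then become ordinary functions evaluated after the toss. The table dimensions, the coin positions, and the simplification that each coin is a point (its radius is ignored) are all assumptions made for this illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical rectangular table with one corner at the origin.
TABLE_WIDTH, TABLE_HEIGHT = 100.0, 60.0

# One detailed outcome s: (face, (x, y) position) for each coin; values are made up.
s = [("H", (10.0, 10.0)), ("T", (40.0, 50.0)), ("H", (70.0, 10.0))]

def X(s):
    """Number of heads showing."""
    return sum(1 for face, _ in s if face == "H")

def Y(s):
    """Maximum distance between any two coins (coins treated as points)."""
    return max(dist(p, q) for (_, p), (_, q) in combinations(s, 2))

def Z(s):
    """Minimum distance of any coin from any edge of the table."""
    return min(min(x, TABLE_WIDTH - x, y, TABLE_HEIGHT - y) for _, (x, y) in s)

print(X(s), Y(s), Z(s))
```

All three functions read the same recorded outcome s, which reflects the point made above: the counting and measuring happen after the random phase of the experiment has ended.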
Note: In referring to random variables we shall, almost without exception, use capital letters such as X, Y, Z, etc. However, when speaking of the value these random variables assume we shall in general use lower case letters such as x, y, z, etc. This is a very important distinction to be made and the student might well pause to consider it. For example, when we speak of choosing a person at random from some designated population and measuring his height (in inches, say), we could refer to the possible outcomes as a random variable X. We might then ask various questions about X, such as P(X ≥ 60). However, once we actually choose a person and measure his height we obtain a specific value of X, say x. Thus it would be meaningless to ask for P(x ≥ 60) since x either is or is not ≥ 60. This distinction between a random variable and its value is important, and we shall make subsequent references to it.
As we were concerned about the events associated with the sample space S, so we shall find the need to discuss events with respect to the random variable X, that is, subsets of the range space Rx. Quite often certain events associated with S are "related" (in a sense to be described) to events associated with Rx in the following way.
Definition. Let ε be an experiment and S its sample space. Let X be a random variable defined on S and let Rx be its range space. Let B be an event with respect to Rx; that is, B ⊂ Rx. Suppose that A is defined as
A = {s ∈ S | X(s) ∈ B}.   (1)
In words: A consists of all outcomes in S for which X(s) ∈ B. In this case we say that A and B are equivalent events.
Notes: (a) Saying the above more informally, A and B are equivalent events whenever they occur together. That is, whenever A occurs, B occurs and conversely. For if A did occur, then an outcome s occurred for which X(s) ∈ B and hence B occurred. Conversely, if B occurred, a value X(s) was observed for which s ∈ A and hence A occurred.
(b) It is important to realize that in our definition of equivalent events, A and B are associated with different sample spaces.
EXAMPLE 3
Consider the tossing of two coins. Hence S = {HH, HT, TH, TT}. Let X be the number of heads obtained. Hence Rx = {0, 1, 2}. Let B = {1}. Since X(s) = 1 if and only if s = HT or s = TH, we have that A = {HT, TH} is equivalent to B.
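The equivalent event A is simply the set of outcomes that X sends into B, and this can be computed directly. The sketch below does so for the two-coin experiment; the function name equivalent_event and the string encoding of outcomes are illustrative choices, not notation from the text.

```python
# Sample space for two coin tosses.
S = {"HH", "HT", "TH", "TT"}

def X(s: str) -> int:
    """Random variable: the number of heads in the outcome s."""
    return s.count("H")

def equivalent_event(B: set) -> set:
    """Return A = {s in S | X(s) in B}, the event in S equivalent to B."""
    return {s for s in S if X(s) in B}

A = equivalent_event({1})
print(sorted(A))
```

Here B lives in the range space {0, 1, 2} while A lives in S, which illustrates that equivalent events are associated with different sample spaces.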
We now make the following important definition.
Definition. Let B be an event in the range space Rx. We then define P(B) as follows:
P(B) = P(A), where A = {s ∈ S | X(s) ∈ B}.   (2)
In words: We define P(B) equal to the probability of the event A ⊂ S, which is equivalent to B, in the sense of Eq. (1).
Notes: (a) We are assuming that probabilities may be associated with events in S. Hence the above definition makes it possible to assign probabilities to events associated with Rx in terms of probabilities defined over S.
(b) It is actually possible to prove that P(B) must be as we defined it. However, this would involve some theoretical difficulties which we want to avoid, and hence we proceed as above.
(c) Since in the formulation of Eq. (2) the events A and B refer to different sample spaces, we should really use a different notation when referring to probabilities defined over S and for those defined over Rx, say something like P(A) and Px(B). However, we shall not do this but continue to write simply P(A) and P(B). The context in which these expressions appear should make the interpretation clear.
(d) The probabilities associated with events in the (original) sample space S are, in a sense, determined by "forces beyond our control" or, as it is sometimes put, "by nature." The makeup of a radioactive source emitting particles, the disposition of a large number of persons who might place a telephone call during a certain hour, and the thermal agitation resulting in a current or the atmospheric conditions giving rise to a storm front illustrate this point. When we introduce a random variable X and its associated range space Rx we are inducing probabilities on the events associated with Rx, which are strictly determined if the probabilities associated with events in S are specified.
EXAMPLE 4
If the coins considered in Example 3 are "fair," we have P(HT) = P(TH) = 1/4. Hence P(HT, TH) = 1/4 + 1/4 = 1/2. (The above calculations are a direct consequence of our basic assumption concerning the fairness of the coins.) Since the event {X = 1} is equivalent to the event {HT, TH}, using Eq. (1), we have that P(X = 1) = P(HT, TH) = 1/2. [There was really no choice about the value of P(X = 1) consistent with Eq. (2), once P(HT, TH) had been determined. It is in this sense that probabilities associated with events of Rx are induced.]
Note: Now that we have established the existence of an induced probability function over the range space of X (Eqs. 1 and 2) we shall find it convenient to suppress the functional nature of X. Hence we shall write (as we did in the above example) P(X = 1) = 1/2. What is meant is that a certain event in the sample space S, namely {HT, TH} = {s | X(s) = 1}, occurs with probability 1/2. Hence we assign that same probability to the event {X = 1} in the range space. We shall continue to write expressions like P(X = 1), P(X ≤ 5), etc. It is very important for the reader to realize what these expressions really represent.
Once the probabilities associated with various outcomes (or events) in the range space Rx have been determined (more precisely, induced) we shall often ignore the original sample space S which gave rise to these probabilities. Thus in the above example, we shall simply be concerned with Rx = {0, 1, 2} and the associated probabilities (1/4, 1/2, 1/4). The fact that these probabilities are determined by a probability function defined over the original sample space S need not concern us if we are simply interested in studying the values of the random variable X.
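The induced probabilities for the fair two-coin example can be checked with a short sketch. It assigns probability 1/4 to each outcome of S and sums over the equivalent event to obtain P(B). The names P_S, induced_probability, and pmf are hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Probabilities on the original sample space S, assuming fair coins.
P_S = {s: Fraction(1, 4) for s in ("HH", "HT", "TH", "TT")}

def X(s: str) -> int:
    """Random variable: the number of heads in the outcome s."""
    return s.count("H")

def induced_probability(B: set) -> Fraction:
    """P(B) = P(A), where A = {s in S | X(s) in B}."""
    return sum((p for s, p in P_S.items() if X(s) in B), Fraction(0))

# Probabilities induced on each point of the range space Rx.
pmf = {x: induced_probability({x}) for x in sorted({X(s) for s in P_S})}
for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

After pmf has been computed, questions about X can be answered from it alone, without going back to P_S, which is the sense in which the original sample space may be ignored.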
In discussing, in detail, many of the important concepts associated with random variables, we shall find it convenient to distinguish between two important cases: the discrete and the continuous random variables.