Handling Uncertainty using Random Variables

saubhagya verma
6 min read · Jun 29, 2021
A night sky filled with stars.

Have you ever tried to count the stars? Most of us have been fascinated by this question. The overwhelming number of stars has baffled humanity and, at the same time, given us an interesting perspective: we can’t calculate or quantify everything in our universe. This is where the concept of variables comes to our rescue. It may seem obvious to us now that we use variables to solve for unknown values in an equation. However, the very notion of representing uncountable and unknown things with a variable has led to the development of myriad tools for finding the hidden truths of the world. In this blog, we will discuss random variables, which are frequently used in probability and statistics to describe problems involving uncertainty in a structured manner.

Random Variables (RV)

As the name suggests, a random variable is a variable that represents the different outcomes that can occur in a random experiment. For instance, when you roll a standard six-sided die, you can get the values 1, 2, 3, 4, 5, and 6. It is easy to write down the outcomes of rolling a die along with their probabilities, since there are only 6 outcomes and, for a fair die, each one has the same probability of 1/6.
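
The die example can be written out as a small table of outcomes and probabilities. This is only a minimal sketch in Python, assuming a fair die; the names `outcomes` and `pmf` are chosen here for illustration.

```python
from fractions import Fraction

# Outcomes of one roll of a standard six-sided die
outcomes = [1, 2, 3, 4, 5, 6]

# Assumption: the die is fair, so every face has the same probability
pmf = {face: Fraction(1, len(outcomes)) for face in outcomes}

for face, prob in pmf.items():
    print(f"P(X = {face}) = {prob}")

# The probabilities of all outcomes must add up to 1
total = sum(pmf.values())
print("Total probability:", total)
```

Using `Fraction` keeps the probabilities exact, so the total comes out as exactly 1 instead of a rounded floating-point value.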

Fingers aren’t enough to keep track of every outcome

However, it’s not always that simple. Oftentimes, the number of outcomes is infinite, as we will see in the case of continuous variables. Listing every outcome then becomes impractical, and for an experiment with uncountably many outcomes it is impossible. Therefore, we denote all the possible outcome values of the experiment with a random variable X, and voilà, we have defined a function over all the outcomes of the experiment. Similarly, we can write P(X = x) for the probability that the experiment produces the value x. We’ll learn how these probabilities are calculated, but first let’s understand the various kinds of random variables.

Types of Random Variable

  • Discrete Random Variable: A discrete random variable is used to represent the results of an experiment whose outcomes are countable. In simple terms, if you can list the individual outcomes of an experiment one by one, you can use a discrete random variable to denote the results. For example, picking a random card from a standard pack of 52 cards has exactly 52 possible outcomes.
  • Continuous Random Variable: A continuous random variable is used to represent the results of an experiment whose outcomes are uncountable. In other words, you can’t list every possible outcome of the experiment, because the variable can take any value within a range. So we group the outcomes into intervals to get an idea of their frequency and calculate probabilities for those intervals. You typically see continuous random variables in experiments that measure quantities such as the height, weight, or speed of the members of a population.
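
To make the contrast concrete, here is a rough Python sketch of both kinds of experiment. The card deck is listed outcome by outcome, while the simulated heights are grouped into 10 cm intervals. The height figures (mean 170 cm, standard deviation 10 cm, normal shape) are made-up assumptions for illustration, not data from a real population.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so repeated runs give the same draws

# Discrete: every outcome of drawing a card can be listed and counted
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in ranks]
card = random.choice(deck)
print("Number of possible cards:", len(deck))
print("Drawn card:", card)

# Continuous: a height can be any value in a range, so we count
# how many simulated heights fall into each 10 cm interval instead
heights = [random.gauss(170, 10) for _ in range(1000)]  # assumed values
bin_width = 10
bins = Counter(int(h // bin_width) * bin_width for h in heights)

for start in sorted(bins):
    share = bins[start] / len(heights)
    print(f"[{start}, {start + bin_width}) cm: {share:.3f}")
```

The shares printed for the intervals are relative frequencies, which is the "range of intervals" idea described in the second bullet.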

Terminology Used for Random Variables:-

1. Probability Distribution Function: We use this function to describe how the total probability of 1 is divided amongst the individual outcomes of a random experiment. Basically, it is a function that links each outcome of an experiment with its probability of occurrence. For a discrete random variable, this function is called the Probability Mass Function (PMF), and it gives the probability of each outcome directly. For a continuous random variable, it is called the Probability Density Function (PDF). Because any single exact value of a continuous variable has probability zero, the PDF gives a density, and probabilities come from the area under the curve over an interval.
Example of Probability Mass Function (PMF)
Example of Probability Density Function (PDF)
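
The original figures for these two captions are not reproduced here, so the following Python sketch stands in as one possible illustration. It builds the PMF of the sum of two fair dice and evaluates a normal PDF. Both distributions and the helper names `normal_pdf` and `area` are assumptions picked for this example.

```python
import math
from fractions import Fraction
from itertools import product

# PMF: probability of each possible sum when rolling two fair dice
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

print("P(X = 7) =", pmf[7])  # 6 of the 36 equally likely pairs give 7


# PDF: density of a normal distribution (standard normal by default)
def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# A density value is not a probability by itself. Probabilities are
# areas under the curve, approximated here with the midpoint rule.
def area(f, lo, hi, steps=10_000):
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


print("Density at 0:", normal_pdf(0.0))
print("P(-1 <= X <= 1) is roughly", area(normal_pdf, -1.0, 1.0))
```

Note that the PMF values are probabilities that sum to 1, while the PDF values only turn into probabilities once they are integrated over an interval.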

2. Cumulative Distribution Function: The cumulative distribution function (CDF) tells us the probability of a random variable not exceeding a certain value. We write F(a) = P(X ≤ a) for the cumulative distribution function of a random variable X. The intuition behind the CDF is the same for both discrete and continuous random variables. The only difference is that in the discrete case we sum the probabilities of all outcomes up to a, whereas in the continuous case we integrate the density up to a.

Example of Cumulative Distribution Function (for Discrete Variable)
Example of Cumulative Distribution Function (for Continuous Variable)
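
The summation and integration routes can be compared side by side. The sketch below assumes a fair die for the discrete case and an exponential distribution with a made-up rate of 0.5 for the continuous case. The numerical integral is checked against the known closed form 1 − e^(−rate·a).

```python
import math
from fractions import Fraction
from itertools import accumulate

# Discrete case: F(a) is the running sum of P(X = x) for all x <= a
faces = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6
cdf = dict(zip(faces, accumulate(probabilities)))
print("P(X <= 4) =", cdf[4])  # 2/3

# Continuous case: F(a) is the integral of the density up to a
rate = 0.5  # assumed rate parameter, chosen only for illustration


def exp_pdf(x):
    # Density of an exponential distribution for x >= 0
    return rate * math.exp(-rate * x)


def cdf_numeric(a, steps=100_000):
    # Midpoint-rule approximation of the integral from 0 to a
    width = a / steps
    return sum(exp_pdf((i + 0.5) * width) for i in range(steps)) * width


a = 3.0
print("Numerical integral:", cdf_numeric(a))
print("Closed form:       ", 1 - math.exp(-rate * a))
```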

3. Expected Value of Random Variable: Now that we have discussed how random variables work, we can introduce the very interesting concept of expectation. The Expected Value is a weighted average of the outcomes, in which each outcome is weighted by its probability. The result is a single central value that summarizes the results of the entire experiment. This concept is very prevalent in economic and financial models, because quantifying uncertain outcomes is an important part of any predictive model. Now let us look at the formulas used to calculate the Expected Value of a Random Variable.

Expected Value of a random variable (E[X])
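
In the standard definitions, E[X] = Σ x · P(X = x) for a discrete random variable and E[X] = ∫ x · f(x) dx for a continuous one with density f. The Python sketch below applies both, assuming a fair die for the discrete case and a uniform distribution on [0, 10] for the continuous case. Both choices and the function names are illustrative.

```python
from fractions import Fraction

# Discrete case: multiply each outcome by its probability and add up
pmf = {face: Fraction(1, 6) for face in range(1, 7)}
expected_die = sum(x * p for x, p in pmf.items())
print("E[X] for a fair die =", expected_die)  # 7/2


# Continuous case, assumed example: X is uniform on [0, 10],
# so the density is 1/10 inside the interval and 0 outside it
def uniform_pdf(x, lo=0.0, hi=10.0):
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0


def expectation(pdf, lo, hi, steps=100_000):
    # Midpoint-rule approximation of the integral of x * pdf(x)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += x * pdf(x) * width
    return total


print("E[X] for uniform on [0, 10] is roughly", expectation(uniform_pdf, 0.0, 10.0))
```

The die result of 7/2 shows that the expected value does not have to be one of the possible outcomes.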

4. Variance of Random Variable: Now that we have talked about the central value (the expected value of X), a natural question is how much the outcomes vary around that mathematical expectation. In simple terms, just as we calculate the variance of a data set around its mean in Statistics 101, we do the same for a random variable. The Variance of the Random Variable tells us how spread out the outcomes of the random experiment are. As with Expectation, the formula for the Variance has the same form in the discrete and continuous cases, with a sum in one and an integral in the other.

The variance of a Random Variable (V[X])
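
The standard definition is V[X] = E[(X − E[X])²], which is equivalent to the shortcut E[X²] − (E[X])². The minimal Python sketch below computes both forms for a fair die (an assumed example) to show that they agree. For a continuous variable the sums would be replaced by integrals over the density.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf):
    # Probability-weighted average of the outcomes
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    # Probability-weighted average of squared distances from the mean
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


mean = expectation(pmf)
var = variance(pmf)

# Shortcut form: E[X^2] - (E[X])^2
second_moment = sum(x ** 2 * p for x, p in pmf.items())
shortcut = second_moment - mean ** 2

print("V[X] from the definition =", var)       # 35/12
print("V[X] from the shortcut   =", shortcut)  # 35/12
```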

Linear Combination of Random Variables:-

You may be wondering whether we have covered all the concepts related to random variables, but we have barely scratched the surface! In the previous section, we learned the basic terminology that is commonly used whenever we talk about random variables, along with straightforward formulas for the PDF, CDF, E[X], and V[X]. However, at times the answers won’t be that straightforward. We may be given a new random variable built from others, such as aX + b or X + Y, and asked to find its expected value or variance. For such cases, there is a set of equations that is very helpful for working with linear combinations of random variables. Let us look at some of the principles used for linear combinations of random variables.
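
The original list of principles is not reproduced here, so as a stand-in these are the standard rules, for constants a and b: E[aX + b] = a·E[X] + b, V[aX + b] = a²·V[X], E[X + Y] = E[X] + E[Y], and, when X and Y are independent, V[X + Y] = V[X] + V[Y]. The Python sketch below checks them exactly on fair dice. The values a = 3 and b = 2 and the helper `transform` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

die = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf):
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


def transform(pmf, func):
    # PMF of func(X): push every outcome through func and merge duplicates
    out = {}
    for x, p in pmf.items():
        y = func(x)
        out[y] = out.get(y, 0) + p
    return out


# Rules for aX + b, with arbitrarily chosen constants
a, b = 3, 2
scaled = transform(die, lambda x: a * x + b)
print(expectation(scaled), "should equal", a * expectation(die) + b)
print(variance(scaled), "should equal", a ** 2 * variance(die))

# Rules for X + Y, where X and Y are two independent fair dice,
# so each pair (x, y) has probability P(X = x) * P(Y = y)
total = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    total[x + y] = total.get(x + y, 0) + px * py
print(expectation(total), "should equal", 2 * expectation(die))
print(variance(total), "should equal", 2 * variance(die))
```

The additive variance rule relies on independence. For dependent variables a covariance term has to be added, which this sketch does not cover.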

That’s it for today! Next, we will discuss the various distributions of random variables that are frequently used in day-to-day modeling. Stay tuned, stay safe and stay connected!

saubhagya verma

I am a budding Data Science enthusiast who is actively learning about the various facets of Data.