Back
Quside Blog
Pseudo Random Number Generator & unpredictability

24 de octubre de 2022
3.8 min read

Randomness plays a dominant role in our daily lives. From the most trivial decision to the most relevant events, all of them seem to be imbued with a layer of unpredictability much of the time. Very often, we want to reduce this component of unpredictability; although, many times, we want just the opposite: Areas such as cryptography depend directly on the adversary’s inability to predict a message exchanged between two parties. In cases like these, unpredictability is an indispensable and often scarce resource.

The scarcity of pure sources of randomness stems from a beautiful coincidence: random phenomena and their study may well be the area of mathematics most closely related to physical phenomena. Without relying on physical systems, it is impossible to generate randomness by purely mathematical processes: we need a source of uncertainty coming from some natural process for, once processed, feeding the mathematical algorithms that make use of it.

However, although it is indeed not possible to generate randomness by mathematical processes, there is a long history of attempts to develop algorithms that, while not random in principle, do try to mimic (and in some cases very convincingly) natural sources of randomness. These algorithms, generally known as “pseudo-random number generators,” try to reproduce some of the characteristics of real random numbers, which are:

  • They are unbiased generators: none of the possible outcomes coming from the generator occurs with higher probability than another; thus, there are no “privileged” or “punished” results, being impossible to have, from the beginning, any information that could help us to predict the outcome of the generator.
  • They are uncorrelated generators: the relationship between the result of a generator and any of the previous results it has emitted is non-existent. Thus, knowing the entire history of the generator’s behavior does not give any information about future results since these are independent of each other.

Fulfilling this last property is absolutely impossible for a pseudo-generator, given that its results are produced from an algorithmic process that is affected by its previous history and will affect future results. Thus, given a sufficiently long sequence of random numbers from a pseudo-generator, one could, in theory, reverse the random number generation process. Therefore, this would compromise the (apparent) unpredictability of the pseudo-generator.

To avoid using inadequate pseudo-random number generators, these are generally studied from two different perspectives, mainly related to this hypothesis of uncorrelation and independence:

  • From a statistical perspective, the correlation between the numbers sequentially generated by a PRNG is mathematically explored using extensive randomness suites. These suites generally explore different types of correlations between the numbers and discard as “low quality” those PRNGs that show behaviors in some of these correlation metrics that are very different from those that a purely random source would theoretically show. Suites of this type are developed and supported by institutions such as NIST (United States) or the University of Montreal (Canada)
  • From a computational perspective, we explore the number of operations necessary to, once a sequence generated by the pseudo-generator is known, invert its generation algorithm and determine the execution point of the algorithm. From this point on, all the results of this pseudo-generator would lose all the (apparent) unpredictability they have. This parameter is generally referred to as the “computational complexity” of a generator.

The best existing pseudo-generators, which are a must in cryptographic environments, are those that simultaneously show no statistically notable correlation and whose computational difficulty makes it impractical to attempt to invert the generation algorithm.

Staying with only one of the two severely compromises the unpredictability of a system. For example, this is the case with the default generator of the V8 JavaScript engine, which despite passing a large number of statistical tests, is susceptible to being inverted with as little as five samples.

In any case, the best guarantee of success in environments requiring unpredictability is to have a high-quality and high-speed source of entropy. In this way, it is the physical processes that come to the rescue of cryptographic applications, providing them with the randomness they need to function correctly.

How are pseudo-random numbers generated?

For generating pseudo-random numbers, an algorithm makes some transformation on a given state of the generator. Both a new state and the pseudo-random number are returned from this transformation.

Is it useful?

What are pseudo-random number generators used for?

These PRNGs are used when so much randomness is needed that the entropy sources themselves cannot cope with the demand. They are also widely used in simulation environments.

Is it useful?

What types of pseudo-random number generators exist?

Many different types of algorithms generate pseudo-random sequences. Still, mainly they can be divided between those which are cryptographically secure (providing the best guarantees to date) and those which aren’t.

Is it useful?