RNG on Blockchain

How to Implement Random Number Generation in a Blockchain

NEOWIZ PLAY STUDIO BLOCKCHAIN LAB

Generating random numbers in a computer

Natural phenomena have uncertainties. This property makes perfect predictions of natural phenomena impossible. A most common example would be the weather forecast from a supercomputer.

It is easy to find applications using random numbers. A ladder game is a typical example. However, due to the inherent predictable nature of computers in the applications implemented with random numbers, the random number generation algorithm is not as intuitive as the user interface of those apps. Random numbers generated on a computer are, in fact, pseudo random numbers which are calculated to act like random numbers.

In general, a state value of a machine is used in generating a random number with a computer. All time dependent values can be state values and normally expressed in current time in milliseconds of a CPU time. The value used in random number generation is called a seed and if the seed numbers are equal, then the results are always the same.

Random number generation in a blockchain

In blockchain, it is difficult to apply the same method used in a centralised service of using a state value of a machine. The next two characteristics of the blockchain are the reasons behind this difficulty.  

  1. A transaction can only be written on a block once all nodes in the chain verifying the transaction with the same result.
  2. A seed determines the outcome but the seed is controlled by the provider. So  whoever controls the seed will determine the outcome.

As each node has different state values, the current method of random number generation from a state value of a single machine will not be valid. Consequently, the transaction validation will fail and cannot be written on a block.

 

In the current centralised system, in order to use the seed, we must be able to trust the people providing it, e.g. random.org. Therefore the credibility or existing history of the provider plays an important role. However in a decentralised blockchain, there always is the  possibility that all nodes can act in a malicious manner and the possibility of random number manipulation by the seed provider is more than likely.

Random number generation in existing blockchains

The way we have been generating the random numbers is similar to the commit and reveal method used in a bet game design. Also using the blocktime does have some similarity to the method of using block hash values.  

We will introduce a couple of well known methods of generating random numbers on existing blockchain and suggest a new method with better fairness by making comparisons to each method.

Method of using block state value

It literally uses the state value of each block such as Blockhash, BlockNumber, block time, and so forth. The state values ​​of individual nodes are different, but the state values ​​of the blocks are the same across all nodes. These values become the seed to generate random numbers. (As explained earlier, if the state value of a block becomes a seed, the same random number must be generated at all the nodes and hence there is not a failure in validation.)

Small entropy 

The abundance of possible seeds (higher the randomness) provides the greater entropy. In the case of using the state value of the block, the seed has a small entropy, i.e. low randomness. As the state values have to be agreed upon, most of them can be predictable.  The randomness of this particular method is not high as the block cannot supply sufficient entropy.

Miner dependency

In addition,  the state value of a block is determined by a miner (a node that generates a block). If the reward for random number generation is relatively larger than the reward of producing blocks, there is a possibility of the manipulation of the number generation by the miner.

Method of commit and reveal

It is a method of using a hash function without an inverse function. The committer requests a random number by selecting a  prioer hashed string and the revealer generates a random number using the raw string fo the hashed string as a seed. As this particular method can also use strings, it provides larger entropy than the previous method.

Vulnerability of Fork attack
There is a fatal weakness that can be exploited via a fork if the string is exposed. The possibilities are as follows;

  1. The malicious actor can observe the transactions in the most recent block if it contains a random number.
  2. Any transaction that contains a random number has a revealed raw string.
  3. The actor can attempt to fork on the previous block of the most recent block that contains the transaction with the hashed string.  The committer knows exactly which random number the hashed string will generate, so he/she will continually create the blocks including the most favourable transaction.
  4. If the fork has been successful, then the random number of the actor’s choice will be generated. And the original blocks after the fork will become stale blocks and will be ignored from all nodes. (The incentives exist when the reward is much greater than the expected cost of the fork.)

Contract storage

As an arbitrary string is provided by the revealer, it must be stored in a smart contract in order to achieve better transparency. In other words, there will be a cost as it occupies the storage of the smart contract. There also is an issue of the duplicate selection where the string is used by one or more users. In this case, only one transaction will be valid and the other will be nullified, but this may hinder the user experience. As the current volume of transactions is not high, it may not be a significant issue, but as the volume increases, we believe it is going to be a major problem as the smart contracts with the commit and reveal method will not be able to handle the large amount of traffic.

 

Oracle

It is a method of using the centralised service in a blockchain. One of the most used method is to call the random.org through oraclize.it. It is only possible when the centralised service is completely trustworthy. As the credibility of an institution is not an area that we can guarantee, it will be excluded in our discussion.

Our method of random number generation

We are proposing the following method in generating a random number in a blockchain. For the sake of the argument, let’s assume that a person who requests a random number as a revealer and who provides the number as a committer.

  1. The revealer opens the public key used in the random number generation.
  2. The committer requests the random number with an arbitrary string.
  3. The committer’s request is included in the block.
  4. A smart contract generates a message including  the arbitrary string in step 2 as well as the block time in step 3.
  5. The revealer will create a signature with the private key corresponding to the public key in step 1 and add it to the message generated in step 4 before sending it to the smart contract.
  6. The smart contract will determine the validity of the signature with the signature generated in step 5, the message in step 4 and the public key in step 1.
  7. Once the signature passes the validation, it will become the seed to generate a random number.



Resolution of the previous issues through our method

Small entropy
As the committer can increase the random numbers of the arbitrary strings to a near infinity, there is not going to be a small entropy problem.

Vulnerability of Fork attack
It manipulate the most recent block based on the revealed result. In the existing commit and reveal approach, it is not possible to change the string provided according to the users’ selection as the revealer provides an arbitrary string. As the miner is manipulating the numbers, the commit and reveal method is exposed to the same attack as the same state value of the block is used. For the sake of the argument, let’s assume that there is a game that pays a thousand times the reward when a user matches a number between one and a thousand. If the user bets on a number 678 and finds out that the result is going to be 243, then in the commit and reveal approach, the person can manipulate the recent block by going back ‘in time’ through fork and bet on 243 to receive a thousand fold profit.

In our approach, we let the committer to provide the arbitrary strings. Therefore it is possible that the string is dependent on the user’s choice. Like in the previous example of selecting a random number between one and a thousand, the user can include the bet of 678 in the arbitrary string when committing. In this scenario, the most anyone can gain is to recover the money he/she lost through fork, but the cost of fork will be considerable. Hence it will greatly reduce the incentive for anyone to try to attempt any fork.

Miner dependency
Since the committer provides the arbitrary strings, a function is required to yield different results for any reused strings. In our approach, an additional entropy is provided through the block time. Like the block time, as long as the value cannot be influenced by either the committer or the revealer, it can be used as the supply of the additional entropy. The results can be affected by the block time entered by the miner . However as the time of the reveal is ahead of the time to determine the block time, it cannot be regarded as a manipulation by the miner.

Contract storage
As the committer provides the arbitrary strings, it is possible to reduce the storage consumed in the smart contract. In the commit and reveal method, as the revealer provides these strings, they need to be in the smart contract taking up the storage of the smart contract, hence incurring the cost. There also is an issue of the duplicate selection where a string is used by one or more users. In this case, only one transaction will be valid and the other will be nullified, but this may hinder the user experience. On the other hand, in our proposed method, any data in generating random numbers is not stored in the contract. Therefore it is possible to dramatically lower the cost of smart contracts the existing method by the arbitrary string for generating seed is supplied preferentially through the commiter, not through the revealer and also be able to handle a larger amount of traffic.

Simulation result

The results are shown below in a diagram 1 showing the distribution of each random number occurrence.



Diagram 1

Chi square test

To verify the fairness, the Chi square test was conducted at 95% confidence interval as shown in table 1.

If there is n number of variation of generated random numbers, the degree of freedom is n -1. The results of the Chi square tests on 4 different cases, ( 2 number, 10 numbers, 100 numbers and 1,000 numbers) with sample size of one million each.

Degree of Freedom

Chi Square test Statistics

Critical Value

Test Result

1

1.638

3.841

Pass

9

10.352

16.919

Pass

99

103.892

123.225

Pass

999

977.726

1073.643

Pass

Table 1

We confirmed the fairness of our proposed random number generation method based on the diagram 1 and Chi square test results.