2.1 Cryptographic Principles Used in the Bitcoin System – Hashing

Let’s look at the principles of cryptography used in the Bitcoin system. Bitcoin is known as a cryptocurrency, but in fact it is not cryptographic, and the content of all transactions on the blockchain is public, including the address of the account and the amount of money transferred.
There are two main functions in cryptography used in Bitcoin. One is hashing and the other is signing.

The hash function used in cryptography is called cryptographic hash function. it has two important properties, one is called collision resistance. the collision in this place means hash collision, if there are two inputs x and y, x is not equal to y and the hash function is H. The calculated H(x) is equal to H( y), then this is called a hash collision – the hash values calculated for two different inputs are equal. We encounter hash collisions when we use hash tables, where different inputs may be mapped to the same location in the hash table.

In general, hash collisions are unavoidable. This is because the input space is much larger than the output space. For example, if we have a 256-bit hash value, how big is the output space? The probability of all the hash values being taken is 2 to the 254th power, and that’s how big the output space is. But the input space can be infinite, so it is any number of possibilities. According to the pigeonhole principle, there are bound to be two inputs that are mapped to the same output. So the collision resistance here does not mean that there will be no hash collisions. Some books on this property called collision free, this statement is not accurate, because it is easy to give people a misunderstanding, as if the collision will not occur. But in fact collisions exist. What he means by this is that there is no efficient way to artificially create hash collisions. Given an x, there is no efficient way to find another y such that the hash values of x and y are exactly equal.

What is the use of the collision resistance property? It can be used for a message digest, for example, there is a message called m, we take his hash value H(m), this hash value can be considered as the message digest used to detect tampering with the message, if someone has changed the content of the message. If someone changes the content of the message, its hash value will change, the collision resistance nature. So that you can not find another m a skim, so that this m a skim, after taking the hash, and the original hash value is equal. So there is no way to be able to tamper with the content without being detected.

This is a use of collision resistance. Note that no hash function can be mathematically proven to be collision resistance, that is to say, just said such an important property, from the theory is not proved, this can only rely on experience in practice, some hash function after a long period of practical examination, the world has so many cryptography experts. No one has been able to find a way to artificially create a hash collision, so we think. These hash functions are collision resistance, which is the practical experience there are also some hash functions. Previously, we think it is COLLISION RESISTANCE, but then everyone found a way to create hash collision, which is a very famous example is md5.
md5 used to be a very popular hash function. We originally thought he was very safe, but now no longer, we have known how to artificially create hash collisions.

Cryptography hash function, there is a second property, called hiding. Hiding means that the hash function calculation process is one-way, is irreversible. Given an input x, you can calculate its hash value H(x), but from this hash value H(x) there is no way to invert the original input x. In other words, this hash value does not disclose any information about the input, which is hiding. of course, if you want to know the input, there is a way. Still using the exhaustive method, the input of all possible values traversed once, see which hash value and that output is equal, you can know what the original input, brute force solution.

Hiding this property holds if the input space is large enough to make the brute-force method infeasible, and the input should be distributed evenly, with the probabilities of the various values being about the same. If the input space is large, but in most cases the values are concentrated in a few values, then it is easier to break.
The combination of Hiding and collision resistance properties can be used to realize digital commitment, which is sometimes called the digital equivalent of a sealed envelope.

What is a sealed envelope used for in real life? For example, you write your prediction on a piece of paper, put it in a sealed envelope, and give the envelope to a third-party notary for safekeeping. The next day you open it and verify the result. In the electronic world, this prediction can be used as input x to calculate a hash value, and then that hash value can be published. Because of the nature of hiding, so from this hash value does not know what the prediction result is, after the prediction result is published, because of the nature of collision resistance, so this prediction result is not tampered with, if it is changed, it will not be able to match with the original published hash value. This is the function of a sealed envelope.
In practice, there are some details to pay attention to, that is, the premise of hiding this property is that the input space should be large enough, the distribution should be relatively uniform. If the input does not meet this property, then the common method is to splice a random number after the input, and then take the hash together. At this time, x followed by splicing a random called nonce, and then the entire take hash H(x||nonce). This nonce is a random number selected to ensure that after splicing the entire input is random enough, and then the distribution is also uniform enough. These are some of the details to pay attention to in practice.

In addition to the above two properties required in cryptography, the hash function used in Bitcoin requires a third property called pazzle friendly. this means that the calculation of the hash value is unpredictable in advance. It’s very hard to guess what the final hash will be just by looking at the input. So, if you want to calculate the hash value is within a certain range, there is no good way, only one by one input to try to see which input calculated hash value happens to fall within the required range. For example, want to get a hash value, the first k bits are 0, the back can be any, the whole is 256 bits. What input would compute such a hash? No idea. This pazzle friendly property suggests that it is not known in advance which input is more likely to produce this hash. To get the hash value, you have to try them one by one, there is no shortcut.

Why this property is called pazzle friendly is because, as we’ll see later, the process of bitcoin mining is to find a nonce — a random number, and that nonce is combined with other information in the block header as an input, and then you get a hash that’s less than or equal to a certain value of a specified target field. specified target field value.

Bitcoin is a blockchain, which is a linked list of blocks. Each block has a block header, which has a number of fields, one of which is a random number that we can set – nonce. The process of mining is to keep trying the random numbers to make the whole block header smaller or smaller. The process of mining is to keep trying each random number, so that the hash of the entire block header falls within the specified range. This pazzle friendly nature means that there is no shortcut in the process of mining, you can only keep trying a large number of nonce to find the result that meets the requirements, that’s why this process can be used as a proof of workload, called proof of work. you dig into the mine to find the nonce that meets the requirements, it must be because you have done a lot of work. It must be because you have done a lot of work, because there is no other shortcut. Note that although the mining process requires a lot of work to find a nonce that meets the requirements, once someone finds such a nonce and releases it, it is very easy for other people to verify that the nonce meets the requirements. Just count the hash value once, take this nonce as part of the header, count the hash value once, and see if it is less than or equal to the target domain value. This shows that mining is hard, but verification is easy. This property is called difficult to solve; but easy to verify.

The hash function used in Bitcoin is called SHA-256, which stands for Securec Hash Algorithm, and it satisfies all three of the properties we’ve mentioned (collision resistance, hiding, pazzle friendly). Puzzle friendly is a bit similar to collision resistance, these two properties are related, but not exactly the same.

发表评论

您的邮箱地址不会被公开。 必填项已用 * 标注