The Software Economist Blog: multi-hash

Showing posts with label multi-hash. Show all posts

Friday, August 17, 2018

Syncing Blockchain from a certain block or time

At blockchain synchronization one of the biggest problem is that the full blockchain has to be synchronized and validated from the genesis block. Supposing we have an UTXO based system, it is actually necessary, because there might be UTXO-s which are at the beginning of the chain, but despite they can be spent. We could however consider with an account-balance based system not to download and validate the whole blockhchain just like the last thousand blocks, as the correct state is contained at the last state as well all the other blocks are related only a a consistency and security guarantee. Such an algorithm might have raise the following issues:

- Depending on the consensus mechanism downloading the last thousand block can be as much secure as downloading everything from the genesis block. In proof of work a long range attack from a thousands block in the past is as much impossible as doing a long range attack from the beginning. Similar might be true for proof of stake and other consensus algorithms as well.

- The real challenge is however to get the quasi genesis block in a reliable way. Certainly getting thousands blocks that are fake are not necessarily simple, but an attacker could simply send an older version of the chain fragment, let we call it as a replay blockchain fragment attack.

- To prevent a replay blockchain fragment attack, we can introduce a the block numbers in the block headers. So, first step of the P2P algorithm would be to query the block heights, and bases on the block heights, the blocks with the exact numbers can be queried.

- However, there might be one more attack vector. Even if we have block identity information in the block headers, and attacker might try to build up a blokchcain segment in the future, knowing simply the block id-s and broadcast this fake segment as soon the blocks from the certain id are queried. Let we call this attack as alternative future blockchain fragement attack. There is no known good to implement a chain resolution strategy that can efficiently distinguish between the valid chain and an alternative future.

Wednesday, August 15, 2018

blockid in a multi-hash blockchain system

In a multihash blockchain system, individual blocks do not have one hash identity but several ones, at least one pair for each hash link. This guarantees the consistency of the blocks. Despite if one wants to download or refer to a block, it is much more practical if we have one common id, that can be refereed either at the communication protocol or from the blockchain explorers. Such an id can be for example the hash of all of the exiting hash links and hash pointers. Certainly, it is an open question if such a construct open a new attack vector or vulnerability, because the block hash directly is not contained in the hash list of the blockchain.

Wednesday, August 8, 2018

Block headers in a multi-hash blockchain system

From a practical point of view a multi-hash blockchain means a kind of a multi-header blockchain system. For each hash pointer or each pair of hash pointers we have to save a:
- nonce for each hash pointer
- difficulty for each pair of hash pointers
- general retention policy for each pair of hash pointers
- actual retention policy for a given hash pointers

At calculating the new hash of the hash pointers, the following data should be taken into account:
- all the information that is stored in the hash pointer and was previously mentioned
- the hash values of the previous hash pointers
- merkle root of the transactions
- merkle or patricia root of the state

The block of a multi-hash system contains:
- several hash links
- the transactions
- the new state

Thursday, August 2, 2018

Block identification in a multi-hash blockchain

In a classical blockchain platform, each block is identified by a the hash of the block which contains all of the possible information of the transaction, nonce, difficult, previous block hash and so on. From a practical point of view it works as an id of the block. In a multi-hash blockchain, the situation is however a little bit more complicated as there are at least two different hashes of the block. They are certainly, not absolutely independent from each other, as they contain the same data and transactions, however the resulted hash values seem to be totally independent from each other because of the characteristics of the cryptographic hashes. There might be several ideas to identify the blocks:

- one option might be to simply use the two hashes. However as soon there is not just a pair of hashes but a complex hashmatrix, we might as well handle several docents of hash values, which makes the process relative less user-friendly.

- we might as well use something which is calculated based on the consistent information of the hashes, but itself directly is not contained in the chain. One idea would be the hash of all of the hash pointers.

- Another idea might be to use something which is "bellow" the hash pointers, like the hash of all data. It might provide the possibility to easily check if a certain data is available in the chain or not. Certainly, the disadvantage is that it might provide an extra attack vector for tricky hackers.

Tuesday, July 31, 2018

Hash matrix - a blockchain algorithm with multiply retention policy

Hunting for the holy grail of a blockchain solution with multiply retention policy, we might as well consider the extended version of the multi-hash blockchain structure. There are different transactions or different pieces of data with different retention policies hashed together with a double hashing structure where regularly one of the hashes will be reseted realizing practically a forgetting functionality on the chain. On top data from a longer retention data can be hashed together with data from a smaller retention period realizing a kind of hash-matrix that can be seen on the picture bellow:

The structure is pretty much straightforward at the first stage, as the immutable structure has only one hash chains, however the exact implementation will be questionable at later stages. The problem that in the long retention policy structure there are actually two independent hash-chains, which are of course contain the same set of transaction and data but they differ on the hash value. It is a little bit questionable which of these chain should be hashed together with the next one. Some of the possibilities might be:
- Both hash chain is from the long retention policy chain is hashed together with the short retention policy one. This structure however might cause the exponential explosion of the different hash values, as at the case of short retention policy there might be the possibility that we have to manage 4 chains. There might be some simplification of the reset time of the different policies can divide each other and they are well scheduled.
- Only the data or transaction of the previous retention policy chain is copied into the next phase, without the information of the previous hash values. It can however decrease the reliability of the whole structure.

Monday, July 30, 2018

Multi-hash blockchains and peer gossip protocol

In the bitcoin network the peer synchronization protocol works as follows: First the the peers identify which is the highest block on the blockchain by exchanging the biggest block number by getblock(). After that, the blockchain with the highest block number can send the inventory, meaning all of the hashes of the blocks that he has. With the given inventory, the peer can start to download the blocks one by one:

Supposing we have a multi-hash blockchain, there are several ways of synchronizing the blocks. In all cases however, at the "inv" call not only the block headers but the hash policy has to be transferred. If it is a simple multi-hash blockchain containing only one pair of hashes, then based on the inventory and retention policies, the node can synchronize from the last but one hash pointer reset.

If it is a hybrid blockchain containing immutable hashes and hashes with limited retention policies, then the situation is a little bit more complicated. From a practical point of view there are multiply parallel coexisting blockchains where the hash headers of the longer retention policy variables can be hashed in the shorter ones, however not in the reverse direction. At the synchronization a block has to be asked either by the shortest retention time block header, or by all possible block headers. Similarly at the information transfer it should be paid attention that expired block information should not be transferred.

Saturday, July 28, 2018

State storage in Multi-hash blockchains

Multi-hash blockchains can be a great way for storing reduced information on the chain and providing solutions for lower security blockchain solutions. There migh be a catch however, state is not necessarily stored 100% percent at each block, there might be other strategies as well. As an example, there might be an option to store only the last couple of states, because based on the transactions and initial states the new states can be calculated or reinitialized. Another option might be to store the state redundant between blocks especially regarding parts that did not change, like on the following picture:

At any case, if the hash pointer is reinitialized, it means indirectly that the old blocks and state information might be deleted without effecting the consistency of the chain. As a consequence, there should not remain any dependency regarding the old blocks or the old chain. It implies that in such situation the state tree should be built up again without external dependencies. As the state is validated by the old state and the transactions, the exact storage of the state information should be irrelevant to the consensus mechanism, it just requires more space at hash pointer resets.

Replay attack on the Blockchain

Replay attack can be interpreted in two ways on the blockchain:

1. If there is a hard-fork on the blockchain, and the system is spited into two concurrent platforms, there is a possibility to copy one signed transaction from one chain and put this transaction to the other chain. Usually at a hard-fork there is a mechanism that explicitly avoids a replay attack, like there is a modified transaction semantics, or even just one bit on the forked blockchain, so signed transactions of the old chain will not be valid on the forked one.

2. Even without forking there is the possibility to copy an old transaction and try to replay it again on the blockchain. It is not too effective at an UTXO system, because the system will know that the old transaction output has been already spent. However if it is an account/balance based system, further algorithms must be used. One way to avoid replay attack on an account/balance based system is to implement a counter at each variable that has to be increased at each new transaction. Another way can be to create a nonce for each transaction randomly and automatically, the system has to ensure that the same nonce can not be applied two times. There might be some mixed solution as well, where quasi random nonces are used in an incremental fashion, like:

nonce_next = hash(nonce_prev)

In a multi-hash blockchain system we have most likely account/balance based systems, implying that we have to use one of the nonce or counter based solution. That means that the state of the blockchain is actually not the state of all of the balances, but the tuples of balance and nonce

state_i = <balance_i, nonce_i_j >

On chain governance in multi-hash Blockchain

In case of a multi-hash blockchain system, where a pair of a hash-pointer can have different reset time, there might be the possibility to control this reset time based on a common consensus on chain consensus. At a given consensus system, the hash pointers might be stored as:

<p1, reset_time_1, remaining_reset_time_1>

<p2, reset_time_2, remaining_reset_time_2>

The reset algorithm can work as:

if (remaining_reset_time_i <= 0) {

do hash reset;

remaining_reset_time_i = reset_time_i;

}

The reset time is however stored on the blockchain, so there might be a decentralized voting mechanism, controlled either by the miners or by the community that can change the reset time to a new value. This is however not so simple. The problem is that the reset period of p1 hash pointer has to be in the middle of the hash period of the p2, otherwise the system security can be more easily compromised. One algorithm can be the following:

Supposing we have accepted a new future_reset_time, we can do the following modification at the following reset like in this example for 1, supposing that the resets have 50% time delay to each other:

if (remaining_reset_time_1 <= 0) {
do hash reset;
remaining_reset_time_1 = future_reset_time;
remaining_reset_time_2 = future_reset_time / 2;
reset_time_1 = future_reset_time
reset_time_2 = future_reset_time
}

Certainly, it is an open question how agreement on the new retention time can work in the most efficient way. One idea can be to introduce some special variables in the blockchain that store such a system information and a voting mechanism, like special voting transactions by end-users, or voting by mining power that can make a proposed variable final.

Data retention policy and multi-hash Blockchain

As we have seen in our previous blog, there is a possibility to build up a multihash blockchain system, where some of the hashes are reseted regularly. Such a structure provides the possibility to define different variables on the blockchain with different retention policies, or with different memories. There might be a system that contains several different variables with several different retention policies, like:

- classical hash pointers without reset going back to the genesis block are the real variables of the system providing real immutability.

- variables controlled by hash pointers with regularly reset have a retention time between [N/2 - N] to the system, as older values can be practically forgotten without effecting the consistency of the blockchain.

- there can not only one, but several different level of variables, with several different sets of retention policies, like storing the value for a year, for 3 years, or for forever.

One question must be still answered: what should happen if the two variables are combined but they have different retention policies. As a general rule, we can say that variables with big retention time can always influence variables with small retention time. Unfortunately, it is not true in the other direction, if a small retention time value influences a big retention time value, the retention policy might be compromised.

Another idea might be to implement the actual retention time configuration on chain, meaning that a common on.chain consensus might do some reconfiguration on-the-fly, without re-initializing the whole chain.

Consensus mechanism in a multi-hash Blockchain structure

As we have seen in our previous blogs, a blockchain system can a multi-hash system as well, that can be integrated with a multi-hash proof of work. Such a system can be easily integrated with any other consensus mechanism. As apart from proof of work, there is no nonce or nonce calculation or any similar problems, we can simply calculate the hashes, or depending on the situation one hash, and add the next block to the blockchain, with the given consensus algorithm.

Proof of work in a multi-hash Blockchain structure

A proof of work can be realized in a multi hash structure blockchain as well as we have already seen in our previous blog as well. Let we imagine the fact, that the blocks in the blockchain are connected not by one but by several hash pointers. These hash pointers might be totally independent from each other, however they might be dependent as well Supposing we want to have proof of work in such a structure, we have several options:

1. There is one hash pointer with nonce playing the role alone in proof of work

2. Each hash has a nonce, and the proof of work is equable distributed among the nonces

3. Each hash has a nonce, but the proof of work is distributed among the nonces in a weighted way.

Further difference is that the mechanism should work differently independently if there is a reset in the blockchain or not. In case of a reset, one hash pointer will be simply set to zero, as the on, or other ones should run on an increased difficulty. It is easier at an equable distributed difficulty, it might be a little bit more challenging at a weighted distribution.

Friday, July 27, 2018

Memory of a blockchain and multi-hash Blockchain

Memory of a blockchain is simple the number of blocks, pieces of information, transactions or state variables that are hashed together and are represented in the final hash pointers. Regarding the final hash pointer of a standard blockchain solution, it has infinite memory, regarding that the information is hashed back to the genesis block. However this should not always be the case, certainly an infinite memory provides a better security, however after a certain point, it seems to be irrelevant if we keep the history only for a couple of years or to forever. In this sense, it makes sense imagine blockchain platforms that have less than infinite memory.

Thursday, July 26, 2018

Forgettable Blockchain hash structure (multi-hash blockchain)

Blockchain platforms are not really optimal in the sense that actually immutability is not always an option for many different applications. Such an application is for example a GDPR conform identity management which should have the possibility to delete data or modify data in a final way, meaning that old versions not remaining in the blockchain.

One way of doing it can be simply resting the hash pointer chain time to time and forget the old values. Certainly, the problem is that at reseting the pointer, the whole system will be very much vulnerable for the different kind of attacks. This can be avoided by using two independent hash pointer chains and resting them with a delay, meaning that at resetting a hash pointer p1, the variables still need to be compatible with hash pointer p2, like on the following picture:

Certainly, such a system has less security than a classical blockchain solution and it can be embedded easily into a state based solution, and not so easily into an UTXO based ones. Further consideration is required if both transactions and state variables are stored as information. Certainly, the logic should be applied to the state variables and indirectly to the transactions. The system might be combined with classical Blockchain solutions as well, separating between variables that should be persevered in the blockchain forever from those that should be persevered only for a given time frame.


"On a long enough timeline we will all become Satoshi Nakamoto.."
Daniel Szego

	*Daniel Szego*
Having spent one and a half decade in software development, engineering, R&D, project management and leading software companies; and having two master degrees one in engineering and one in business administration, I thought I summarize some of my theoretical and practical thought on the software industry and enterprise softwares.

	Contact