...by Daniel Szego
"On a long enough timeline we will all become Satoshi Nakamoto."
Daniel Szego

Saturday, January 5, 2019

Notes on atomic and cross-chain swaps


Atomic swaps realize cooperation between two blockchains in the sense that a transaction on one chain is executed if and only if another transaction is executed on the other chain. 
The simplest working mechanism of such an atomic swap is the following: 

1. Imagine we want to make an integration between a public network P and a consortium network C in a way that a transaction Tc is executed on the consortium network if and only if a transaction Tp is executed on the public network. 

2. We create a random secret value x and a cryptographic hash H(x) of that random number. 

3. We create a timed hash contract or transaction: the transaction is on the blockchain, but it is executed if and only if x is openly presented on the chain, usually with the help of a transaction. If the value x is not presented within a certain time period, the transaction is automatically reverted. In Bitcoin, for instance, this revert means that the cryptocurrency is effectively transferred back from the multisig wallet to the sender. However, this is not the only possible use-case: any logic with a transaction execution and revert can be used. 

4. If the secret x is presented on one system, that transaction is executed. Since x then becomes public, it can be used to execute the other transaction on the other system as well. 
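The mechanism above can be sketched as a minimal in-memory hashed timelock contract (HTLC); the class and method names are illustrative, not a real chain API:

```python
import hashlib
import os
import time

class HashedTimelockContract:
    """Toy in-memory sketch of a hashed timelock contract (HTLC)."""
    def __init__(self, hashlock: str, timeout: float):
        self.hashlock = hashlock              # H(x), published with the contract
        self.deadline = time.time() + timeout
        self.state = "locked"

    def claim(self, secret: bytes) -> bool:
        # executes only if the presented secret hashes to the hashlock
        if self.state == "locked" and hashlib.sha256(secret).hexdigest() == self.hashlock:
            self.state = "executed"
            return True
        return False

    def refund(self) -> bool:
        # after the deadline the locked funds are reverted to the sender
        if self.state == "locked" and time.time() >= self.deadline:
            self.state = "reverted"
            return True
        return False

# the initiator generates the secret x and locks both chains with H(x)
x = os.urandom(32)
h = hashlib.sha256(x).hexdigest()
contract_public = HashedTimelockContract(h, timeout=3600)      # Tp on network P
contract_consortium = HashedTimelockContract(h, timeout=1800)  # Tc on network C

# revealing x on one chain makes it usable on the other one as well
assert contract_public.claim(x)
assert contract_consortium.claim(x)
```

Once x is revealed by the first claim, anyone observing that chain can copy it to unlock the counterpart contract, which is exactly the atomicity argument of step 4.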

Certainly the scheme has some disadvantages that need fine-tuning: in non-smart-contract-based systems, double spending must be avoided by the algorithm itself, and the participants have to be effectively online during the swap. On top of that, privacy might be an issue as well. 


Saturday, December 22, 2018

Designing optimal software frameworks


Programming languages and software platforms represent an interface technology between the Turing machine of the computer and the Neocortex of the human brain. This means they should be designed as much for humans as for computers. As an example, good software frameworks should be presented to humans as conveyor chains: an explicit number of steps that can be parametrized by people in a logical order, with only an explicit number of possible choices at each step. This is not because that is how computers best handle things; it is because that is how humans handle things. The design can be fine-tuned further by considering additional limitations of human thinking: given the 7±2 limitation of short-term memory, an optimal conveyor chain might contain that many steps, and each step might have 7±2 input elements as well. Considering that the Neocortex is largely hierarchical, the chain can be built up hierarchically, containing sub-conveyor chains in each step, again up to 7±2 of them. The whole structure might reflect not only the Neocortex but human collaboration and social interactions as well. As an example, different teams with different competence areas might work on different steps of the software, and the collaboration of the teams might be directly supported by the software framework itself. 

Monday, December 17, 2018

Notes on decentralized business logic platforms


The disruptive technological ideas behind blockchain applications make it possible to design better middleware frameworks for business logic as well. Such an architecture might have the following properties:
- Elements of the business logic are separated into transactions as atomic processing units.
- Transactions are signed by end-users, providing authentication and possibly privacy in the whole system.
- Processed transactions are signed by the processing units as well, providing a defense mechanism against tampering.
- Processing units can be deployed in different combinations, like on the same computer or on different machines.
- Processing units can be configured with different scaling strategies, like scaling for performance, or scaling for security, e.g. with different Byzantine fault tolerant algorithms.
- A service level agreement should be defined and measured for each service.
- Processing by a processing unit might be guaranteed by a security deposit that can be automatically paid out if the service level agreement is not met.
- Special consideration has to be taken for processing units performing serialization, like writing something into a database or ledger. 
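The first three properties can be sketched with toy HMAC "signatures" standing in for real public-key signatures; all keys, field names and values below are purely illustrative:

```python
import hashlib
import hmac
import json

def sign(key: bytes, payload: dict) -> str:
    # toy HMAC signature over a canonical JSON encoding;
    # a real system would use public-key signatures instead
    msg = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

user_key = b"end-user-secret"          # illustrative keys
unit_key = b"processing-unit-secret"

# the end-user signs the atomic transaction before submitting it
tx = {"from": "alice", "to": "bob", "amount": 10}
tx_signed = {"tx": tx, "user_sig": sign(user_key, tx)}

# the processing unit verifies the user signature, processes the
# transaction, and counter-signs the processed result
assert hmac.compare_digest(tx_signed["user_sig"], sign(user_key, tx))
result = {"tx": tx, "status": "processed"}
receipt = {"result": result, "unit_sig": sign(unit_key, result)}

# any later tampering with the processed result is detectable
tampered = {"tx": tx, "status": "rejected"}
assert sign(unit_key, tampered) != receipt["unit_sig"]
```

The chain of signatures means that neither the end-user's intent nor the unit's processing result can be modified afterwards without invalidating a signature.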

Sunday, December 16, 2018

Secure multiparty protocol on the blockchain


Implementing a secure multiparty protocol on top of the blockchain requires some special considerations. Examples of such protocols arise when semi-trusted actors want to communicate with each other with the help of a consortium distributed ledger solution: sharing salary data on the blockchain in a way that only the average salary becomes available, or similarly aggregating GHG emission data on a consortium distributed ledger so that the emission data of the individual companies is not revealed, only the sum of the data. 

Integrating blockchain with secure multiparty protocols raises two major issues:
- Visibility of the data: by default all data on the blockchain is visible to all participants, which is far from optimal for a secure multiparty protocol. As a consequence, either encryption has to be used, or some of the data and communication should happen off-chain. 
- Trust model: classical secure multiparty protocols assume that the actors are trusted. In the context of distributed ledger solutions, the trust assumption is weaker, e.g. Byzantine faults are assumed as well. 

A secure multiparty sum might be implemented on the blockchain with the following steps:
1. Each participant i ∈ {1..k} generates a private/public key pair off-chain.
2. Each participant publishes its public key to the chain.
3. Each participant i has a value Vi that should be summed with the help of the secure multiparty protocol.
4. Each participant splits its value Vi randomly into k pieces {v1, v2, ... vk}, one for each node.
5. The pieces are encrypted with the public keys of the participants, in a way that the first piece is encrypted with the public key of the first node, the second piece with the public key of the second node and so on, forming {E(v1), E(v2), ... E(vk)} encrypted values per participant.  
6. All of this data is published to the blockchain, which practically acts as a trusted communication channel.
7. Each node selects the data from the blockchain that is encrypted with its public key and decrypts it with its private key. At the end, each node knows k pieces of decrypted data, each piece coming from a different participant. 
8. Each node sums the decrypted pieces it received, producing a partial sum. 
9. The partial sums are published to the blockchain; adding them up yields the total of all Vi values, while the individual values remain hidden. We can build in some Byzantine fault tolerance here as well, e.g. publishing the partial sums with a blind voting algorithm and choosing the total supported by most of the participants (supposing that most of the participants are honest).
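The steps above can be simulated in a few lines; encryption with the per-node key pairs (steps 1, 2 and 5) is elided, only the share-splitting and aggregation logic is shown:

```python
import random

def secure_sum(values):
    """Simulate the secure multiparty sum sketched above.
    Encryption/decryption with the per-node key pairs is elided."""
    k = len(values)
    # step 4: each participant i splits V_i into k random shares
    shares = []
    for v in values:
        parts = [random.randint(-1000, 1000) for _ in range(k - 1)]
        parts.append(v - sum(parts))   # the shares add up to V_i
        shares.append(parts)
    # steps 5-7: share j of every participant goes (encrypted) to node j;
    # step 8: node j sums the k shares it received into a partial sum
    partial_sums = [sum(shares[i][j] for i in range(k)) for j in range(k)]
    # step 9: publishing the partial sums reveals only the total
    return sum(partial_sums)

salaries = [3000, 4500, 5200]
assert secure_sum(salaries) == sum(salaries)
```

No single node ever sees another participant's full value, only one random share of it, yet the published partial sums always add up to the true total.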
  

Thursday, December 13, 2018

Distributed ledger and trust model


When collecting requirements for distributed systems, one of the most important requirements of the application is the trust model. First, the general trust model must be specified exactly: 

- Untrusted model: in the untrusted model, participants do not know each other and do not trust each other. Despite this, the system has to guarantee that the participants can cooperate and exchange value. In the untrusted model, almost the only logical choice is a public blockchain, possibly with high security. 

- Semitrusted model: in the semitrusted model, the participants might know each other and might trust each other up to a certain level, but they do not fully trust each other. In such a model, a consortium blockchain might be a good solution. 

- Fully trusted model: in a fully trusted model, the participants know each other and trust each other. In such a model, a blockchain solution is pretty much overkill. 

The exact model can be fine-tuned further by considering a more complicated architecture, including storage, computation, resource-intensive computation, user interface, external oracles and communication channels. For each part we can define how much we trust the given medium, or in other words, how Byzantine fault tolerant we want that part to be.   




Wednesday, December 12, 2018

Bitcoin blockheader and on-chain governance information


The bitcoin header contains many pieces of information. Most of them are responsible for maintaining the consistency of the bitcoin blockchain. However, there is one that is a little exceptional: the difficulty target. The difficulty target is actually related to on-chain governance rather than strictly to the consistency of the chain. The model can be extended so that there is more than one piece of on-chain governance information in the block header apart from the difficulty. As a general extension, there can be a dedicated data area for on-chain governance information, with special rules for how it can be changed, secured with the help of a Merkle tree whose root is written into the block header. 
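To illustrate the idea, a Merkle root over a set of governance fields could be computed in the usual Bitcoin style (double SHA-256, last node duplicated on odd levels); the field names below are purely illustrative, not an actual proposal:

```python
import hashlib

def sha256d(b: bytes) -> bytes:
    # Bitcoin-style double SHA-256
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(leaves):
    """Merkle root over a list of byte-string leaves, duplicating the
    last node on levels with an odd number of entries."""
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# hypothetical on-chain governance fields secured by one header-level root
governance = [b"difficulty_target=0x1d00ffff",
              b"max_block_size=1MB",
              b"version_rule=bip9"]
root = merkle_root(governance)
assert len(root) == 32   # a single 32-byte commitment in the header
```

A single 32-byte root in the header then commits to an arbitrary number of governance fields, each provable individually with a standard Merkle proof.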

Pseudorandomly choosing the next leader at delegated proof of stake


Cryptographically secure random number generation is one of the biggest challenges for every blockchain protocol. It is extremely important in Nakamoto consensus and similar algorithms, because the next leader should be chosen with a cryptographically secure pseudorandom generator; otherwise a denial of service attack can easily be carried out against the leader. In a delegated proof of stake system, however, such an algorithm can be realized easily: supposing that at least one node acts honestly and produces a truly random number, the result will be pseudorandom as well. The following algorithm sketch produces the required result:
- The actual leader node creates a private/public key pair and initiates a request for the delegate nodes to produce a random number. This request also contains the public key related to the request. 
- The delegate nodes individually create a pseudorandom number with some internal or hardware algorithm. These random numbers are encrypted with the public key from the request and sent back to the leader in encrypted form. During this phase the random information is encrypted, so practically nobody can see or manipulate the exact random numbers. 
- The leader decrypts the random numbers with the help of the private key and combines them into a final random number, for instance with one of the following schemes: 

R1 XOR R2 XOR ... XOR Rn

or

sha256(... sha256(sha256(R1) XOR R2) ... XOR Rn)

At the end, both the final random number and the private key are revealed on the blockchain, so everybody can verify that the selection algorithm was correct. 

If the random numbers are generated independently from each other and there is at least one truly random number among them, the result will probably be truly random as well. If the randomness of the numbers generated by the individual nodes can be checked, a cryptoeconomic incentive mechanism can be added as well to reward or slash certain behaviors. 
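The two combination schemes above can be sketched directly; in this minimal simulation the delegate contributions are simply generated locally, and the encryption round-trip is elided:

```python
import hashlib
import os
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def combine_xor(randoms):
    # R1 XOR R2 XOR ... XOR Rn: random as long as any one input is random
    return reduce(xor_bytes, randoms)

def combine_hash_chain(randoms):
    # sha256(... sha256(sha256(R1) XOR R2) ... XOR Rn)
    acc = hashlib.sha256(randoms[0]).digest()
    for r in randoms[1:]:
        acc = hashlib.sha256(xor_bytes(acc, r)).digest()
    return acc

# three delegates each contribute a 32-byte random value
contributions = [os.urandom(32) for _ in range(3)]
final = combine_hash_chain(contributions)
assert len(final) == 32
```

Both combiners have the property the text relies on: flipping any single contribution changes the output, so one honest delegate suffices for an unpredictable result.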

There is one denial of service attack against the scheme though: if the leader does not reveal the information at the end of the round (the private key or the final random number), a new leader has to be chosen without the given pseudorandom number. This case requires further investigation. 


Tuesday, December 11, 2018

Blockchain actually doesn't need blocks


Blocks in a blockchain protocol have several functions. Their major function is clearly to impose an order on the transactions that is the same for all of the distributed, decentralized nodes, in order to avoid double spending: if there are two transactions that are double spends, every node can consider the first one valid and the second one invalid. Just for avoiding double spending, however, we do not need a total order. A partial order of the transactions is enough, in the sense that if two transactions want to consume the same account or transaction output, there must be a common order between those two transactions. So, instead of blocks and total orders, it clearly makes sense to investigate more efficient storage structures for transaction processing. 

Deletable blockchain with delegated proof of stake


The previous idea of a deletable blockchain platform can be realized not only in a quorum or consortium pattern, but with the help of a delegated proof of stake mechanism as well. In delegated proof of stake, as in EOS, some delegates are elected by the community of nodes, either by an explicit voting mechanism or indirectly by voting with cryptocurrency stakes. To produce a valid block, a majority of the delegates have to agree on the block and sign it. Similarly, external data that is not part of the chain can be validated and signed by the delegates. Certainly, there should be a cryptographic or cryptoeconomic mechanism guaranteeing that in a given round the same piece of information is not signed twice. The signature should use one-time keys of the delegate nodes, so that we can be sure the given information was signed in the given round. As the information is not stored in the blockchain itself, the architecture does not guarantee that the information exists; if the information does exist, however, we can verify with certainty whether it was signed by the blockchain. 

The external information to be signed can be a block of an account/state-based blockchain system, where the validity of the blocks and transactions must certainly be checked by the delegates during signing. There must be a mechanism making sure that the last block, or the last couple of off-chain blocks, are stored; older blocks, however, do not necessarily have to be stored. They can be deleted without causing consistency problems in the chain.  

Monday, December 10, 2018

Vending machines and replicated state machines



For distributed state machines, the usual example used in the crypto community is the vending machine, which is also the usual example for state machines in computer science. It is, however, a bad example for a distributed state machine, mainly because it is a physical object that is pretty difficult to imagine in a real decentralized implementation. It would mean something like getting your coke only if all of the vending machines around the world produce the same output, which is actually pretty weird. Instead, a better decentralized example should be used, preferably one not involving any physical object. As a simple example, a conditional money transfer can be executed if some party signs the transaction and the signatures are validated by the majority of the nodes around the world. 

Transaction ordering and double spending


Distributed systems usually suffer from the so-called double spending problem: there should not be two transactions applied to the same account in the same round. There might be several strategies depending on the exact ledger storage implementation; however, from a consensus point of view, usually a superset of double spending prevention is used: a common ordering of the transactions such that each node sees exactly the same order. This implies that there is no double spending, because if there are two transactions applied to the same account or unspent transaction output, the first one can be chosen as valid and the second one as a double spend, and the same order is used by every node in the consensus. The other direction does not hold, however; there might be consensus algorithms avoiding double spending without requiring a full ordering of the transactions. Actually, from a theoretical point of view, only a partial order is required, ordering just the transactions that want to change the same account.

Certainly, from a theoretical point of view, the situation can be a little more complicated if we consider general smart contracts instead of just cryptocurrency transfers. With smart contracts, a transaction might consume information from one account and modify another. What matters here is consistency rather than just double spending avoidance: a transaction should not depend on the value of an account variable while another transaction tries to write the same variable.

Sunday, December 9, 2018

The difference between delegated proof of stake and proof of stake with proxy staking


Delegated proof of stake and proof of stake with proxy staking represent two similar approaches, but they have differences as well. In both approaches, one financial motivation is that accounts holding cryptocurrency but not wanting to take part directly in the consensus algorithm can participate indirectly, by locking the cryptocurrency at a validating node and gaining revenue for that. In this sense, the approach is pretty similar to investing in a company and getting financial revenue for it. In delegated proof of stake, the delegates are chosen directly by the accounts that lock money into the system. The motivation here is usually similar to EOS or Tendermint: having a finite number of validators to finalize the consensus. This finite validator set is chosen directly by the stakeholders, e.g. the best-financed nodes always get into the actual set of validators. In proof of stake with proxy staking, however, the explicit validator candidates are chosen by a different algorithm, for example by a central authority. The stakeholders can choose a node to lock down some cryptocurrency with and gain revenue on it, but the exact validator set is not reorganized based on the locked-in cryptocurrency; it can only be modified by the other algorithm.  

Can a final consensus algorithm fork?


That is a difficult question, but simply put, there are systems that provide probabilistic finality and prefer availability over consistency at a network partition. These systems can fork. On the other hand, there are Byzantine fault tolerant (BFT) systems that prefer consistency over availability at a network partition, and they cannot really fork. Certainly, BFT systems do not scale as well as systems with probabilistic or economic finality. 

Saturday, December 8, 2018

Notes on off-chain information and blocks



From a practical point of view, if a piece of information can be found in the blockchain, this gives two kinds of guarantees. On the one hand, the information exists: when verifying the whole chain, the information has to exist and has to be downloaded, otherwise the consistency cannot be verified. On the other hand, the information is valid: the consistency of the whole blockchain guarantees that the piece of information is valid. However, there might be a solution that guarantees information consistency without guaranteeing availability. If a piece of information is signed by the blockchain itself but is not stored directly in the chain, this provides an off-chain proof that the information was valid, without storing that piece of information directly in the chain. Certainly, the situation becomes a little more complicated if the blockchain can actually fork.    

#InformationAvailability #InformationValidity

Sunday, November 18, 2018

How to create a native external oracle with Nakamoto consensus





Similarly to the previous blog post, designing a decentralized native external oracle can be done in the same way as a native random oracle. The principle is basically the same: on the one hand, the miners or validators measure the external data source and put the values into blocks, or at least temporary blocks. On the other hand, the imported data should be kept secret, because otherwise later miners could influence or even game the algorithm itself. 

The full multi-block native external oracle algorithm can be described as follows:

1. An initiator having a {Priv, Pub} private/public key pair creates an external oracle request that includes the public key as well:

request_extern (Pub, N, ext)

, where Pub is the public key of the requestor, ext is the external data source and N is the number of rounds during which the external data source has to be measured.    

2. At mining or validation, a miner or validator measures the value of the external data source, encrypts it with the public key of the requestor and puts it into the request itself. So after the first validation, the request will look like:

request_extern (Val1)(Pub, N-1, ext) 

where Val1 is the measured external data encrypted with the public key. To ensure security, the value Val1 should be put into the blockchain as well, not necessarily forever, but at least during the N rounds of execution of the native external oracle. In other words, the request is split into two parts:

request_extern (Val1) 

will be written into the blockchain

request_extern (Pub, N-1,ext)

remains available as a new transaction request that is propagated throughout the network in transaction pools, waiting to be mined or validated. Similarly, after k < N rounds there will be k encrypted external values in the blockchain:

request_extern (Val1)
request_extern (Val2)
...
request_extern (Valk) 

and a new request as a transaction:

request_extern (Pub, N-k, ext)

3. After N rounds, the requestor can aggregate the external values and decrypt them with the private key Priv. 

Ext1 = Decrypt_Priv(Val1)
Ext2 = Decrypt_Priv(Val2)
...
ExtN = Decrypt_Priv(ValN)

The individually measured external values should be aggregated in a way that they provide a correct average even if some of the nodes try to hack or game the system. The algorithm should contain an incentive mechanism as well, rewarding nodes that gave correct values, thus motivating nodes to produce correct data and providing a Schelling point as the decision-making algorithm. Supposing that around one third of the nodes and measurements can be faulty, we can use the following algorithm:

a. filter out the 33% most extreme values
b. average the remaining values to produce the real external value
c. reward the nodes whose values were not filtered out, based on their distance from the average.
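Steps a-c can be sketched as follows; note that the flat per-node reward here is a simplification of the distance-based reward mentioned in step c:

```python
import statistics

def aggregate_external(values):
    """Schelling-point aggregation sketch: drop the roughly 33% of
    measurements farthest from the median, average the rest, and
    reward the nodes whose values survived the filter."""
    n = len(values)
    med = statistics.median(values)
    # a. filter out the n//3 most extreme values (farthest from the median)
    by_distance = sorted(range(n), key=lambda i: abs(values[i] - med))
    kept = by_distance[: n - n // 3]
    # b. the average of the remaining values is the oracle result
    result = sum(values[i] for i in kept) / len(kept)
    # c. reward the surviving nodes (simplified to a flat reward per node)
    rewards = {i: 1.0 if i in kept else 0.0 for i in range(n)}
    return result, rewards

# five honest measurements of roughly 100 plus one dishonest outlier
measurements = [99.8, 100.1, 100.0, 250.0, 99.9, 100.2]
value, rewards = aggregate_external(measurements)
assert abs(value - 100.0) < 0.5
assert rewards[3] == 0.0   # the outlier node earns nothing
```

As long as fewer than a third of the measurements are faulty, the outliers land in the filtered tail and cannot move the average, and the reward scheme makes reporting the honest value the profitable strategy.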

The algorithm unfortunately has some performance issues: it takes N rounds to find out a real decentralized external value. This can be pretty long and costly depending on the given blockchain platform. With Nakamoto consensus the algorithm is pretty difficult to speed up, as the basic idea is that the external values come from N individual sources, which with Nakamoto consensus means N different blocks. This also implies that the data source has to keep its value for a long enough time, e.g. preserving the same value for hours. The algorithm cannot really be used with fast-changing data sources.   

A further question arises: how exactly is the data used, e.g. within a smart contract? As private keys should not be stored in the blockchain, there has to be an additional round with the wallet software to decrypt and aggregate the information, which might introduce elements of unnecessary centralization and is difficult to build directly into a decentralized smart contract. It is important to note, however, that this private key is not necessarily the same as the private key of the account, so it can actually be revealed as soon as all N nodes have produced a measurement. As a consequence, a second transaction can be used to calculate the real external value, like:

evaluate_extern (Priv)

At this point the private key can be published to the blockchain and the evaluation algorithm can be implemented in a smart contract in a decentralized way.

From a practical point of view, a smart contract with a built-in native external oracle would look like:

function with_ext (input params) return (output params) {
   ...
   variable = External(<external_datasource>);
   ...
}

For evaluating such a smart contract, two transactions should be used:

- the first one would call the function with the public key of the external oracle, initializing the external data measurement.

transaction init with_ext pub_key_ext

- the second transaction would publish the private key and perform the whole evaluation and the execution of the rest of the business logic, like:

transaction exec with_ext priv_key_ext

The proposed system unfortunately has one denial of service attack possibility: the one who initiated the external oracle has the private key and can calculate the result in advance. If the result does not benefit him, he may choose not to reveal the private key, i.e. not to execute the second transaction.  

Saturday, November 17, 2018

How to create a native random oracle with Nakamoto consensus


Creating a real native random oracle is one of the holy grails of the blockchain industry. While the problem is not so difficult if the consensus mechanism is a quorum, as the nodes participating in the consensus make their decisions independently from each other, it is more difficult with Nakamoto consensus. The problem with Nakamoto consensus is that the temporary leader creating the next block is practically a "dictator" of that block and can influence, for example, the random number. The algorithm can, however, be improved here as well with two ideas:
- creating a real random number takes several rounds, in which several nodes each guess a random number; the guesses are then aggregated at the end. Certainly, this alone is not a real solution, as later leaders can see the previous random values and might influence the next value in a way that is profitable to them. To avoid such situations, we can use the following idea:
- the random numbers are encrypted with the public key of the requestor. As a consequence, the next node does not see the values from the previous blocks, so it cannot influence the final result.

The full multi-block native random oracle algorithm can be described as follows:

1. An initiator having a {Priv, Pub} private/public key pair creates a random oracle request that includes the public key as well:

request_random (Pub, N)

, where Pub is the public key of the requestor and N is the number of rounds during which the random number has to be generated.    

2. At mining or validation, a miner or validator creates a native random number, encrypts it with the public key of the requestor and puts it into the request itself. So after the first validation, the request will look like:

request_random (Val1)(Pub, N-1) 

where Val1 is the generated random number encrypted with the public key. To ensure security, the value Val1 should be put into the blockchain as well, not necessarily forever, but at least during the N rounds of execution of the random oracle. In other words, the request is split into two parts:

request_random (Val1) 

will be written into the blockchain

request_random (Pub, N-1)

remains available as a new transaction request that is propagated throughout the network in transaction pools, waiting to be mined or validated. Similarly, after k < N rounds there will be k encrypted random values in the blockchain:

request_random (Val1)
request_random (Val2)
...
request_random (Valk) 

and a new request as a transaction:

request_random (Pub, N-k)

3. After N rounds, the requestor can aggregate the random values and decrypt them with the private key Priv. 

Rand1 = Decrypt_Priv(Val1)
Rand2 = Decrypt_Priv(Val2)
...
RandN = Decrypt_Priv(ValN)

The individually generated random numbers should be aggregated in a way that randomness is preserved even if some of the values are not really randomly generated. The exact algorithm here is an open question, but it is important that the original entropy of the requested random number is maintained even if some of the nodes cheat. Ideas might be:

sha256(Rand1, Rand2, ... RandN)
sha256(Rand1 xor Rand2 xor ... xor RandN)
sha256(... sha256(sha256(Rand1) xor Rand2) ... xor RandN)
  
The algorithm has two drawbacks that can be fine-tuned in the long run: 
- The algorithm takes N rounds to produce a real decentralized random number. This can be pretty long and costly depending on the given blockchain platform. With Nakamoto consensus the algorithm is pretty difficult to speed up, as the basic idea is that the random numbers come from N individual sources, which with Nakamoto consensus means N different blocks. 
- Based on the evaluation algorithm, we have to assume that some of the miners or validators created a good random number. With a good Byzantine evaluation function, even if some of the nodes cheat, the resulting random number can be a good, i.e. cryptographically secure one. The problem, however, is that even the honest nodes are not really incentivized to create good random numbers, since we cannot really measure or punish whether nodes produce good random numbers. We can certainly assume that a certain share of the nodes is honest, but it would be much better to measure and validate this fact. 

A further question arises: how exactly is the data used, e.g. within a smart contract? As private keys should not be stored in the blockchain, there has to be an additional round with the wallet software to decrypt and aggregate the information, which might introduce elements of unnecessary centralization and is difficult to build directly into a decentralized smart contract. It is important to note, however, that this private key is not necessarily the same as the private key of the account, so it can actually be revealed as soon as all N nodes have created a guess for a random number. As a consequence, a second transaction can be used to calculate the real random number, like:

evaluate_random (Priv)

At this point the private key can be published to the blockchain and the evaluation algorithm can be implemented in a smart contract in a decentralized way.

From a practical point of view, a smart contract with a built-in native random oracle would look like:

function with_rand (input params) return (output params) {
   ...
   variable = Rand();
   ...
}

For evaluating such a smart contract, two transactions should be used:

- the first one would call the function with the public key of the random oracle, initializing the secret random numbers.

transaction init with_rand pub_key_rand

- the second transaction would publish the private key and perform the whole evaluation and the execution of the rest of the business logic, like:

transaction exec with_rand priv_key_rand

The proposed system unfortunately has one denial of service attack possibility: the one who initiated the random oracle has the private key and can calculate the result in advance. If the result does not benefit him, he may choose not to reveal the private key, i.e. not to execute the second transaction.  

Blockchain forking and CAP theorem


According to the CAP theorem, every distributed system can guarantee at most two of three properties: consistency, availability and partition tolerance. As most systems are vulnerable to network partitions, the question is usually whether they prefer consistency over availability or vice versa. Public blockchains simply prefer availability: if there is a network partition or a disagreement in the network, the blockchain splits or forks into two different partitions with two different views of the world. Other systems, like Hashgraph, try to solve the forking problem with different mechanisms; however, there are probably no miracles: if the blockchain cannot fork in case of a network separation, it will stop working. Simply put, such systems prefer consistency over availability.