The Software Economist Blog: architecture

Showing posts with label architecture. Show all posts

Sunday, December 23, 2018

Architecting Blockchain and archiving

Realizing an archiving solution with the help of blockchain has many considerations. First of all, blockhcian is not very efficient to store a large amount of data. For this reason, we usually use a mixed architecture, namely a centralized or decentralized storage for storing the documents and a blockchain platform to store the integrity data of the document versions:

The architecture provides many different versions and combinations:

- Blockchain: can be public or a consortium one. It might work with many different consensus algorithm providing different kind of and different strength of cryptoeconomical guarantees.

- Storage: can be totally centralized, like a file storage or a cloud storage. It can be decentralized as well, like realized by IPFS, Swarm or Bittorrent.

Integrity of a document can be realized by hashing the document data with a timestamp and with some metadata and writing the data into the blockchain. This saves the integrity information into the chain and provides a hint that the document did exist. In real implementations, further consideration must be done, since the simple hash value might be vulnerable to a dictionary or rainbow table attack. For this reason, the simple hash value might be extended with a random salt, or optionally the document might be encrypted first and only the encryptoed version is written into the chain.

A further architecture possibility can be if we do not want to save even the hash value into the chain. In this scenario the blockchain is only used to track a certain number of trusted validators and a document can be regarded as valid if a majority of the tracked validators sign the document with some metadata. In this architecture there is no information about the existence of the document in the chain, but if the document exist, we can prove if it is valid.

Last but not least, we can have some consideration about the fact how the archiving logic works. The archiving logic might be somewhat more complicated, having like different rules for archiving. In such a scenario we might as well evaluate if the logic itself should run centralized or decentralized, like with the help of a Byzantine fault tolerant system.

Monday, December 17, 2018

Notes on decentralized business logic platforms

The disruptive technological ideas behind the blockchain applications gives the possibility to design better middleware frameworks as well for business logic. An architecture might have the following proprieties:
- Elements of the business logic are separated into transactions an atomic processing units.
- Transactions are signed by end-users providing authentication and maybe privacy in the whole system.
- Processed transactions are signed by the processing units as well providing a defense mechanism against tampering.
- Processing units can be configured with different combinations, like on the same computer or on different machines.
- Processing units can be configured with different scaling strategies, like scaling for performance, or scaling for security, like having different Byzantine fault tolerant algorithms.
- Service level agreement for a service should be defined and measured as well.
- Processing of a processing unit might be guaranteed by security deposit, that can be somehow automatically payed out if the service level agreement is not matched.
- Special consideration has to be taken for processing units making serialization, like writing something into a database or ledger.

Sunday, December 16, 2018

Secure multiparty protocol on the blockchain

Implementing a secure multiparty protocol on the top of the blockchain requires some special considerations. Examples might be for such protocols if semi trusted actors want to communicate with each other with the help of a consortium distributed ledger solution, like sharing salary data on the blockchain in a way that only average of the salary will be available, or similarly aggregating ghg emission data on a consortium distributed ledger, in a way that the emission data of the individual companies are not revealed only sum of the data.

Integrating blockchain with secure multiparty protocols have two major issues:

- Visibility of the data: by default all data on the blockchain is visible for all of the participants, which is not so optimal in case of a secure multiparty protocol. As a consequence, either an encryption algorithm should take place, or some of the data and communication should happen off-chain.

- Trust model: classical secure multiparty protocols assume that the actors are trusted. In the context of distributed ledger solutions, the assumption of the trust model is weaker, like assuming Byzantine faults as well.

A secure multiparty sum might be implemented on the blockchain with the following steps:

1. Each participant {1..k} generates off-chain a private and public keys

2. Each participant publishes the public key to the chain.

3. Each participant has a Vi value that should be summarized with the help of the secure multiparty protocol.

4. Each participant splits the Vi value into randomly into k pieces {v1, v2, ... vk} for each node.

5. The values are encrypted by the public keys of the participants, in a way that the first value ifs encrypted by the public key of the first node, the second value in encrypted by the public key of the second node and so on, forming {E(v1), E(v2), ... E(vk)} encrypted values for each node.

6. All of the data is published to the blockchain forming practically as a trusted communication channel.

7. Each node selects the data from the blockchain that is encrypted with its public key and decrypts them with the private key. At the end each node will know k pieces of decrypted data in a way that each value comes from different nodes.

8. Each node creates an individual sum of the different values, cause that the same summary is manifested at each individual nodes.

9. As an optional step the produced data might be published to the blockchain as well. We can build in here some kind of a Bytantine fault tolerance, like in a way that the sum values are published with the help of blind voting algorithm, where we can choose the sum values that is chosen by most of the participants (supposing that most of the participants are honest).

Wednesday, December 12, 2018

Bitcoin blockheader and on-chain governance information

The bitcoin header contains many information. Most of them are responsible to maintain the consistency of the bitcoin blockhcain. However, there is one that is a little bit exceptional and that is the difficulty target. Difficulty target is actually related rather to on-chain governance and not strongly to the consistency of the chain. The model can be actually extended in a way that there is more than one piece of on-chain information in the block header apart from difficulty. As a general extension there can be a specific data area only for on-chain governance information for which there are special rules how they can be changed and which information is secured with the help of a merkle tree which root is written into the block header.

Tuesday, December 11, 2018

Blockchain actually doesn't need blocks

Blocks in a blockchain protocol have several functions. Major function is clearly to have an order of the transactions, which order is the same for all of the distributed - decentralized nodes in order to avoid double spending. Like if there a two transactions that are double spends, every node can consider the first one as valid and the second one as invalid. Just for avoiding double spending however, we do not need an order. A partial order of transactions is enough in a sense if two transactions want to consume the same account or transaction there must be a common order among these two transactions. So, instead of blocks and full orders, it clearly makes sense to investigate more efficient storage structures for transaction processing.

Deletable blockchain with delegated proof of stake

Previous idea with the deletable blockchain platform can not only be realized in a quorum or consortium pattern, but with the help of a delegated proof of stake mechanism as well. In delegated proof of stake, like in EOS, some delegates are elected by the community of nodes either by an explicit voting mechanism or somehow indirectly with voting by cryptocurrency stakes. To produce a valid block, a majority of the delegates have to agree in the block and sign it. Similarly an external data that is not part of the chain can be validated by the delegates and signed by them. Certainly, there should be a cryptographical or cryptoeconomical mechanism that guarantees that in a round the same piece of information is not signed two times. The signature should be realized by one time keys of the delegate nodes, in order that the we can be sure that the given information was signed in the given round. As the information is not stored in the blockchain self, the architecture does not guarantees that the given information exist, if the information exist however we can 100% check if it was signed by the blockchain itself.

The external information to be signed can a block of an account - state based blockchain system, where certainly validity of the blocks and transactions must be checked by the delegates during the signature. There must be a mechanism making sure that the last, or last couple of off-chain blocks are stored, but regarding older blocks, the should not necessarily be stored. They can be deleted without causing consistency problems in the chain.

Saturday, December 8, 2018

Blueprint of a deletable blockchain platform

Based on our previous blog, creating off-chain proofs, there might be the possibility to create a blockchain algorithm where you can actually delete. For the architecture we defined the following architectural considerations:

- The nodes taking part in the consensus should have an identity, a private and public key key pair where the private key is kept secret.

- The consensus must be implicitly or explicitly based on a quorum, meaning that two thirds of the nodes should sign a given piece of information with validity.

- The consensus can be an indirect quorum, like in Tendermin, where there are several rounds for the consensus, like one for choosing a leader node and proposing a block and a second one for validating the given node by a quorum.

- On-chain governance must contain a transaction that adds new nodes of the system. After successfully adding a node to the system, the public key of a node (e.g. identity of a node) must be added to the blockchain.

- The blockchain in a classical sense contains only governance information, e.g. the public keys of the different validator nodes.

- There must be a mechanism and a special transaction for deleting validator nodes from the blockchain, like with the help of an explicit transaction, or with the help of an automatic mechanism, like if a validator does not sign a block for an amount of time, it will be deleted from the active validators.

- The architecture prefers consistency against availability at network separation, so it can not be forked and can not be long range attacked, unless more than one third of the nodes private key is leaked.

- Efficient network separation is a working denial of service against the architecture.

- Non-governance information, like smart contracts, or cryptocurrency transactions must be built up in a account-state based fashion, meaning that for a getting a valid state of the system, it is enough to read the last valid state.

- Non-governance information is not stored in blockchain but in a separate block that is stored off-chain and validated by an off-chain proof by the blockchain, see the following picture

- Validator nodes have a two level key mechanism: master key is the main key of the node, but in each round, the node creates a new private - public key pair where the public key is published into the blockchain. This round key is used to sign off chain blocks.

- At each round validity of the off-chain information and data is validated by all of the validator nodes and signed by the public keys that are specific for a certain round.

- An external observer can check if a given block is valid, by checking the public keys of the validator nodes and checking if they really signed the given block. As public keys are individual for each round, both the exact round and the fact if a given block is the last one can be easily identified.

- At signing an external block, there can be different strategies defined, like checking the last block or validating the last couple of blocks.

- Nothing guarantees that the external external block are stored. As it is on the one hand an advantage, because this way old information can be deleted, it is a drawback as well, because nothing guarantees that the old blocks are actually stored.

- There should be a mechanism that guarantees that the given last nodes are stored somewhere. This mechanism must crash fault tolerance, but not necessarily Byzantine, because based on the block proofs it can identified if the block is valid.

- The storage mechanism and probably the whole infrastructure as well must be supported by a cryptoeconomical incentive mechanism.

- The algorithm should have an economical incentive mechanism that slashes or discourages nodes signing more than one blocks on a round.

Wednesday, November 28, 2018

Architecting Blockchain platforms communicating with external data source

Integrating an external data-source with a blockchain solution is usually not an easy task. The major problem is that smart-contract systems can not directly call an external data source, because if different peers at evaluating the external data source see different pieces of data, they can not come to a consensus. So external data integration requires certainly solving some technological challenges. However even at considering the use-case and the general architecture, there are some questions that can be raised:

1. Decentralization model of the blockchain: depending on the use-case, systems can be built up to totally public and consortium blockchains as well.

2. Trust model of the oracle: in certain use-case we might as well say that there is one trusted data-source, custodian oracle, that we trust. There might be the case however that we want to integrate data from multiply data-sources in a way that no single data-source is trusted. Such a system can be implemented with the help of a game theoretical approach, usually Schelling point, providing a fully decentralized oracle algorithm. Such systems are realized for example by the prediction markets, ike Augur or Gnossis.

3. Trust of the communication medium: the communication medium is usually the internet that is pretty much untrusted, meaning that there is a need for both encrypting the data and preventing tampering, with like message authentication or authenticity proofs. There might be the case however, that we trust in the communication medium. As an example, if the oracle is an IoT source that is hosted by the same cloud provider as our consortium blockchain, we might as well can trust the communication.

Monday, November 19, 2018

Architecting for Byzantine fault tolerance

Designing computer architectures of the future will be surely extended by some new features, namely byzantine fault tolerance and trust model. As fault tolerance is usually an aspect to investigate, future systems can be designed for Byzantine fault tolerance, meaning that even if parts of the system is hacked, the system deliver correct results. One aspect that needs to be taken into account is the RAFT theorem which implies that in case of network partition the system has to choose between availability and consistency. Another important design choice is the trust model. At analyzing the trust model each component of the system has to be investigated in terms if a service of the system works only if we trust in the given component. In this sense we can distinguish between trusted, trustless or semi-trusted services or components.

Sunday, November 18, 2018

Notes on multi block algorithms and protocols

The research of the decentralisation is focusing pretty much at the moment for the scalability of different protocols and platforms. Bases on the current research directions, there might as well efficient blockchain protocols in a couple of years. So, we might as well investigate the possibilities of creating algorithms and protocols that can not e executed in one block or one transaction but can only be realized by several actions, crossing several blocks. Actually, layer 2 protocols, like lightning network or raiden are going a little bit in this direction. Multi block protocols can provide many services that are not imaginable with current one block architectures.

How to create native external oracle with Nakamoto consensus

Similarly to the previous blog desisgning a decentralized native external oracle can be realized in a same way as a native random oracle. The principle is basically, the same: on the one hand the miners or validators measure the external data sources and put them into blocks or at least temporal blocks. On the other hand, imported data should be kept in secret, because otherwise new miners could influence or even hack the algorithm itself.

The full multi-block native external oracle algorithm can be described as follows:

1. An imitator having a {Priv, Pub} private and public key pairs creates a external oracle request that includes the pubic key as well:

request_extern (Pub, N, ext)

, where Pub is the public key of the requestor, ext is the external data source and N is the number of round during the external data source has to be measured.

2. At mining or validation, a miner or validator measures the value of the external data source that is encrypted by the public key of the reqestor and put at in the request itself. So after the first validation, the request will look like:

request_extern (Val1)(Pub, N-1, ext)

where Val1 is the measured external data encrypted by the public key. To ensure security Val1 value should be put into the blockchain as well, not necessarily forever but at least during the N phase of execution of the native external oracle. So with other words, the request would be splited into two parts:

request_extern (Val1)

will be written into the blockchain

request_extern (Pub, N-1,ext)

can be available as a new transaction request that is propagated throughout the network in transaction pools, waiting to be mined or validated. Similarly, after k<N rounds, there would be k encrypted external values in the blockchain:

request_extern (Val1)

request_extern (Val2)

...

request_extern (Valk)

and a new request as a transaction which is

request_extern (Pub, N-k, ext)

3. After N rounds, the requestor can aggregate the external values and decrypt them with the Priv private key.

Ext1 = Decrypt_Priv(Val1)

Ext2 = Decrypt_Priv(Val2)

...

ExtN = Decrypt_Priv(ValN)

The individually external values should be aggregated in a way that the they provide a correct average value even if some of the nodes try to hack or game the system. The algorithm should contain an incentive mechanism as well, giving a reward to nodes that gave correct values in this way motivating nodes to produce correct data, providing a Schelling point as decision making algorithm. Supposing that around one third of the nodes and measurements can be fault, we can have the following algorithm:

a. filter 33% of the most extreme values
b. make an average of the rest that remained providing the real external value
c. reward the nodes whose values were not filtered out based on the distance of the average.

The algorithm has unfortunately some performance issues: It takes N rounds to find out a real decentralized external value. This can be pretty long and pretty costly depending on the given blockchain platform. Considering a Nakamoto consensus the algorithm is pretty difficult to speed up, as the basic idea is that the external values are comming from N individual different sources that actually means at a Nakamoto consensus N different blocks. This implies the fact as well that the data source should keep the value for a long-enough time, like preserving the same value for hours. The algorithms can not really be used with fast-changing data sources.

evaluate_extern (Priv)

At this moment the private key can be published to the blockchain and the evaluation algorithm can be implemented with a smart-contract in a decentralized way.

From a practical point of view a smart contract with built-in native external oracle would look as:

function with_ext (input params) return (output params) {

...

variable ´= External(<external_datasource>);

...

}

for evaluating such a smart contract, there should be two transaction used:

- the first one would call the function with a public key of the external oracle making the initialization in the external data measurement.

transaction init with_ext pub_key_ext

- the second transaction would publish the private key and make the whole evaluation and the execution of the rest of the business logic, like:

transaction exec with_ext priv_key_ext

The proposed system has unfortunately one denial of service attack possibility. The one who initiated the external oracle has the private key and can calculate the result previously. If this result does not make benefit for him, he choose not to reveal the private key or not to execute the second transaction.

Saturday, November 17, 2018

How to create a native random oracle with Nakamoto consensus

Creating a real native random oracle is one of the holy grail of the blockchain industry. As the problem is not so difficult if the consensus mechanism is a quorum as the nodes participating in the consensus making decisions independently from each other, it is more difficult at a Nakamoto consensus. The problem is with the Nakamoto consensus that the temporal leader creating the next block is practically a "dictator" of the next block and can influence like the random number. The algorithm can be however improved here as well with two ideas:

- creating a real random number is taking several round where several node guesses a random number which are then aggregated at the end. Certainly this is not necessarily a real solution as active leaders might see the previous random values and might influence the next influence in way that is profitable to the nodes. To avoid such a situations we could use the following idea:

- the random numbers are encrypted by a public key of a requestor. As a consequence the next node do not really see the previous values of the previous blocks, so it can not influence the final result.

The full multi-block native random oracle algorithm can be described as follows:

1. An imitator having a {Priv, Pub} private and public key pairs creates a random oracle request that includes the pubic key as well:

request_random (Pub, N)

, where Pub is the public key of the requestor and N is the number of round during the random number has to be generated.

2. At mining or validation, a miner or validator creates a native random number which is encrypted by the public key of the reqestor and put at in the request itself. So after the first validation, the request will look like:

request_random (Val1)(Pub, N-1)

where Val1 is the generated random number encrypted by the public key. To ensure security Val1 value should be put into the blockchain as well, not necessarily forever but at least during the N phase of execution of the random oracle. So with other words, the request would be splited into two parts:

request_random (Val1)

will be written into the blockchain

request_random (Pub, N-1)

request_random (Val1)

request_random (Val2)

...

request_random (Valk)

and a new request as a transaction which is

request_random (Pub, N-k)

3. After N rounds, the requestor can aggregate the random values and decrypt them with the Priv private key.

Rand1 = Decrypt_Priv(Val1)

Rand2 = Decrypt_Priv(Val2)

...

RandN = Decrypt_Priv(ValN)

The individually generated random numbers should be aggregated in a way that the randomness is preserved even if some of the values are not really randomly generated. The exact algorithm here is questionable, but it is important that the original entropy of the requested random number is maintained even if some of the nodes are cheating. Ideas might be:

sha256(Rand1, Rand2, .... RandN)

sha256(Rand1 xor Rand2 xor ... RandN)

sha256( ... sha256(sha256(Rand1))... )

The algorithm has two drawbacks that can be fine-tuned on a long run:

- The algorithm takes N rounds to find out a real decentralized random number. This can be pretty long and pretty costly depending on the given blockchain platform. Considering a Nakamoto consensus the algorithm is pretty difficult to speed up, as the basic idea is that random numbers are comming from N individual different sources that actually means at a Nakamoto consensus N different blocks.

- Based on the evaluation algorithm we should assume that some of the miners or validators created a good random number. Consider a good byzatnine evaluation function at the end even if some of the nodes cheat the resulting random number can be a good one, like cryptographically secure. The problem is however that even the hones nodes are not really incentivized to create good random numbers, hence we can not really measure or punish if nodes produce good random numbers. It can be certainly an assumption that certain number of the nodes are honest, but actually it would be much better to measure and validate this fact.

A further question can arise how the data exactly used like within a smart contract. As private keys should not be stored in the blockchain, there should be an additional round with the wallet software to encrypt and aggregate the information which might introduce elements of unnecessary centralization and difficult to build in directly into a decentralized smart contract. It is important to note however that this private key is not necessarily the same as the private key of the account so it can be actually revealed as soon as all the N nodes created a guess for a random number. As a consequence a second transaction can be used to calculate the real random number, like:

evaluate_random (Priv)

At this moment the private key can be published to the blockchain and the evaluation algorithm can be implemented with a smart-contract in a decentralized way.

From a practical point of view a smart contract with built-in native random oracle would look as:

function with_rand (input params) return (output params) {
...
variable ´= Rand();
...
}

for evaluating such a smart contract, there should be two transaction used:

- the first one would call the function with a public key of the random oracle making the initialization in the secret random numbers.

transaction init with_rand pub_key_rand

- the second transaction would publish the private key and make the whole evaluation and the execution of the rest of the business logic, like:

transaction exec with_rand priv_key_rand

The proposed system has unfortunately one denial of service attack possibility. The one who initiated the random oracle has the private key and can calculate the result previously. If this result does not make benefit for him, he choose not to reveal the private key or not to execute the second transaction.

Sunday, October 7, 2018

Designing a transaction oriented programming language

Blockchain programming languages have to be transaction oriented by design. Ideas might be taken from transaction oriented programming languages or early artificial intelligence approaches to implement procedural knowledge, like scripts or frames. Blockchain programming language has to have to following two main feature:

1. Even if the core can be realized by an imperative approach, each transaction must be supported by a strong declarative description providing elements for possible:

- defining the input and output parameters of the transaction

- defining before and after safety checks for both the input and the output parameters

- defining the states that the transaction will change

- defining after and before safety checks for implied states.

- defining the list of cooperating transactions

- defining advanced security model in a declarative way.

2. The transaction must have a good and easy human readable approach. As blockchain implementations describe collaboration models between several actors, the best way would be to use and advanced hierarchical approach of artificial intelligence for representing procedural knowledge, like scripts. Such a formalism might have the following properties:

- the spot

- the scene

- the roles - actors

- properties and objects

- scenes

- activities - set of actions

Monday, October 1, 2018

A blockchain protocol for scheduled transactions

Although many most of the blockchain protocols do not gives the possibility to schedule a transaction for execution, there is actually a simple way of doing that. First we have to use a transaction for that there is something similar already in bitcoin, saying that a given transaction can be mined only after a certain number of blocks, like:

Tr: send money from A to B after BlockNumber N signed by private key A

It practically means that the transaction stays in the pool of transactions until block number N because previously it can not be included in the chain as a valid transactions. As soon as the block N has been reached, the miners are free to mine the transaction.

The problem is however, that there is no incentive to mine this transaction at around block N. In fact storing such a transaction for a long time might be costly. As a consequence an additional incentive mechanism must be built in that incentives putting the transaction as fast as possible after block N into a block. In fact nodes can be motivated to mine the transaction as soon as possible, as only the first who mined gets the reward. The incentive mechanism, might be even pretty 'sharp' as well rewarding a transaction only if it gets mined between the blocks [N - N+3]. A timed transaction can be cancelled before the block N, by double spending it.

Tuesday, September 25, 2018

The role of blocks in a blockchain protocol

In a standard blockchain protocol, mining or minting a block has two different purposes. On the one hand, there must be a common set of ordered transactions that are acceptable to the whole network and avoids double spending. Avoiding double spending is actually equivalent with the fact that there a common order of transactions that is the same to the whole network. On the other hand, the transactions must be applied to the state. As these two steps are actually happens in the same round, this is not necessary a must. Application can be imagined that applying transactions to the state happens only a couple of steps later, as the whole network perceives the same ordered steps of transactions. In certain special circumstances applying to the state might be non-deterministic as well, containing some built-in optimization possibilities.

Sunday, September 23, 2018

On chain governance with Hashgraph

Creating a native on-chain governance on Hashgraph is similar to creating any kind of an on-chain governance system. For the first run there must be some special internal governance variables that might influence consensus parameters, node configuration or any other details of the core algorithm. Such variables should have two properties:

1. They can not be changed absolute freely only after a kind of a community voting.

2. They must be anchored somehow with the nodes or with the core algorithm.

For modifying on-chain properties there must be a special transaction, instead of a change value of a blockchain field transaction, there should be:

TR: I vote for changing the value of a governance variable - signed by the private key of an account that can actually vote for such a change. Such accounts might be associated with nodes, or with consortium members of the Hedera Hashgraph.

After the voting transactions are validated and ordered and the voting period might be at the end, each node can apply the result to the state individually. A possible algorithm might be:

- count the votes for a given change, if it is more than 50% apply to the change to the state.

Creating a Byzantine fault tolerant decentralized exchange with Hashgraph

The ordering algorithm of the Hashgraph architecture provides a way to realize a real decentralized exchange among different internal cryptoassets. Let we imagine that we have several native cryptoassets in a account/state based system. What we have to do is to introduce special transactions that represent buy and sell orders. As an example a buy order might look like:

Tr_order: buy 100 cryptoassets at 10 - signed by a private key of an account that has the balance of buying.

Similarly, a sell order might look like:

Tr_order: sell 50 cryptoassets at 11 - signed by the private key of an account that gets the cryptoasset at the end.

The order transactions are validated and ordered with the help of the gossip of the gossip protocol and after a couple of rounds every node will see exactly the same order of the order transactions. This list of order transactions is actually the order book of the decentralized exchange.

At applying the transactions to the state, we have to do two things:

- there must be a kind of order matching algorithms on the order book, with other words on the ordered order transactions. Examples might be like FIFO or Pure Pro Rata. A list of possible order matching algorithms can be found for example under the following link.

- the orders that are matched has to be applied to the state. It practically means increasing and decreasing certain balances, similarly a simple transaction is applied to the state. In this case however there might be more than two accounts that has to be modified in the same time.

As the above mentioned algorithm is deterministic and the nodes see exactly the same order of order transaction, the nodes can calculate the algorithm independently from each other and getting the same result. As the core algorithm is Byzantine fault tolerant, the exchange will probably have the same property.

The decentralized exchange can be further improved in a way that instead of hard-coding the order matching algorithm, realizing an on-chain governance process to change order-matching on the fly.

For experimental implementation take a look the following github repository.

Creating native decentralized oracle with Hashgraph

Hashgraph provides the possibility to realize a fully decentralized Byzantine native oracle service. Let we imagine that we want to call an external data source in a smart contract function, like

function () {
state = external_oracle("http://external_datasource/%22)
}

At event propagation each {N1, N2, ... Nm} node has to call the http://external_datasource/ independently from each other producing a {V1, V2 ... Vm} possibly different values as results for the calls. These values are propagated, voted and ordered by the events together with the transactions.

As soon as both the transactions and the values are communicated ordered and voted (so practically at applying to the state), we can use the following function to calculate the result of oracle:

external_oracle calculation:
step1: filter from {V1, V2, .... Vm} the extreme big or small values, like 33% of all of the values
step2: do an average of the non-filtered values
step3: the result of the function is the average

Since at the time of the calculation each node sees exactly the same set of {V2, V2, ...Vm} values and the external_oracler calculation is deterministic, at each state there will be exactly the same value calculated even if they are calculated independently from each other.

Since the communication architecture (Hashgraph) is Byzantine fault tolerant, probably the Oracle itself will be probably Byzantine fault tolerant as well (might require some more considerations)

The algorithm can be cryptoeconomically further secured, like giving a bigger transaction fee for nodes producing values that are less far for the end result of the external_oracle calculation function (providing a Schelling point from a game theoretical point of view).

Demo implementation can found in the following github repository.

Hashgraph, transaction censoring and transaction delaying

The Hashgraph algorithm provides from a mathematical point of view fair ordering and access. However it is important to point out that these mathematical proofs working on an event level, that might be a little bit different if we consider things on a transaction level or from a real architecure point of view.

One attack might be to censor a newly created transaction on a node level, before the transaction even really reaches the network, before it will be put into a node. The censorship can be realized by directly hacking the Hedera software, but it might be realized indirectly as well, like censoring the transactions at the operation system or at the network level. For such a censorship attack, there might be several resolution:

- users might use only trusted nodes. However in an untrusted or semitrusted architecture this is not a lucky algorithm.

- users might broadcast a transaction to a node and monitor the system. If the transaction was not propagated, the transaction can be broadcasted again, or to another node, or several other nodes as well. It can work in situations where delaying or timing of the transactions can not cause problems.

- to be 100% sure that the transaction is propagated as fast as possible, users has to broadcast the transaction to 33% +1 nodes. As the theoretical limit of a Byzantine fault tolerant architecture 33% in such a situation at least one node should work honestly and propagate the transaction right away.

Similar attack can be realized as a newly created transaction is propagated to the network by a node but not with the next possible event, but with a later one, cause indirectly a delay in the transaction. It might cause serious economical damage with certain timing relevant applications, like online gambling and games. To be sure that no delaying can occur with a newly created transaction, the most secure way is again to propagate the transaction to 33%+1 nodes in the system. In such situation, if one honest node propagates the transaction to the network as soon as it gets, than the transaction will be part of the hashgraph right away, independently if other faulty or hacked nodes delegate the transaction only with a delay.

Fair ordering of transactions in Hashgraph

The Hashgraph virtual voting algorithm provides a fair ordering of the events in the following sense:

- Fair access: means that no individual can stop a transaction from entering the system,

or even delay it very much.

- Fair timestamps: A consensus timestamp that reflects when the majority of the network

members received that event. This consensus timestamp is “fair”, because it is not possible for a

malicious node to corrupt it and make it differ by very much from that time.

- Fair transaction order: A common order of the events that is to be seen exactly the same way after virtual voting by each node and that can not be influenced by individual nodes, or the minority of the nodes.

There is one problem however, the mathematical side of the algorithms and the proofs guarantee a fair ordering of the events and not the transactions them self. Certainly, if each event contains at most one transaction, than fair ordering of the events equals with the fair ordering of the transactions. If however on event can contain multiply transactions, the situation is not so simple. A node creating multiply transaction for an event can game the order of the transactions in a certain event and even between multiply events.

To avoid unfair ordering of transactions in an event an improvement might be to define an order and a timestamp of the transactions inside a node. On such an order might be to consider the hashes of the transactions in an event in a increasing order, like:

sha256(Tr_1), sha256(Tr_2), ... sha256(Tr_k) in an increasing order, where {Tr_1, Tr_2, ... Tr_k} is the set of transactions associated to an event.

It does not really matter how an individual node orders the transactions in an event physically, the events are always regarded as the ordering of the transaction hashes. As a consequence it can not be gamed by the individual nodes unless transactions or signatures for transactions are forged that is usually pretty difficult because of the cryptographical algorithms.

Similarly a global order of the transactions can be defined as well:

- the global order of the events is provided by the Hashgraph algorithms.

- the order of the transactions in an event is provides by the ordering of the hashes.

Based on the global order and the Hashgraph algorithm for attaching a timestamp for each event, the transactions can be associated by a timestamp as well:

- if an event contains only one transaction, the transaction timestamp equals with the timestamp of event.

- if an event contains several transactions, the transactions must be timestamped in the event. A possible algorithm might be the following:

Consider an Ei event that contains the {Tr_1,Tr_2, .... Tr_j} transactions that are ordered as {sha256(Tr_1), sha256(Tr_2) ... sha256(Tr_j)}. The time period between Ei and Ei-1 can be evenly distributed among the transactions in a way that if a transaction gets a higher ordering, it gets a newer timestamp.

Last but not least, a fair access can be only be reached if transaction are communicated to several nodes, like one 33% +1 of the exiting nodes.


"On a long enough timeline we will all become Satoshi Nakamoto.."
Daniel Szego

	*Daniel Szego*
Having spent one and a half decade in software development, engineering, R&D, project management and leading software companies; and having two master degrees one in engineering and one in business administration, I thought I summarize some of my theoretical and practical thought on the software industry and enterprise softwares.

	Contact