The Software Economist Blog: IT


Saturday, December 1, 2018

Notes on the strengths of the cryptoeconomical guarantees

Cryptoeconomical guarantees are actually weaker than purely cryptographical ones. A cryptoeconomical guarantee means that hacking the system is not profitable: the attacker loses more money than they gain. As an example, proof of stake is based on such a cryptoeconomical guarantee: if I try to hack the system and the attempt is detected, I lose my whole stake. The underlying assumption is that most people, or rational actors, behave in an economically rational way, meaning that they tend to maximize their profit or benefit in some sense. However, the problem is that most economical guarantees exist only inside the system itself, based on assets and resources that are defined by the system itself. So it is certain that if I try to hack a proof of stake system I will lose my stake, but perhaps my economically rational profit maximization is not based one hundred percent on the system itself. As an example, if the hacking attempt becomes well known, the general trading value of the platform against, say, USD might fall. That means I can make a profit from shorting, and I might still come out ahead even if I lose my stake.
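
The shorting argument can be made concrete with a few lines of arithmetic. The following is only a minimal sketch: the function name `attack_profit` and all the numbers are hypothetical and chosen purely for illustration, and fees, slippage and the cost of the attack itself are ignored.

```python
def attack_profit(stake_lost, short_position, price_drop):
    """Net profit of an attacker, measured outside the system (e.g. in USD).

    stake_lost     -- value of the stake that is destroyed by the attack
    short_position -- notional value of a short position opened beforehand
    price_drop     -- fractional fall of the price once the attack is known
    """
    short_gain = short_position * price_drop
    return short_gain - stake_lost

# Hypothetical numbers, purely for illustration.
inside_only = attack_profit(stake_lost=1_000_000, short_position=0, price_drop=0.25)
with_short = attack_profit(stake_lost=1_000_000, short_position=5_000_000, price_drop=0.25)

print(inside_only)  # negative: unprofitable if only the system itself is considered
print(with_short)   # positive: the short position outweighs the lost stake
```

As soon as the gain realized outside the system exceeds the stake, the "not profitable" guarantee no longer holds for this attacker.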

In contrast, cryptographical proofs rely on another scarce resource, which is usually computation. Simply put, most practical cryptographical algorithms assume that breaking the algorithm is computationally infeasible, meaning that it would take millions of years even with the most advanced, state-of-the-art computers. This is a kind of practical impossibility guarantee, which is much stronger than a mere unprofitability guarantee.

As a consequence, a cryptographical or computational guarantee is always stronger than a simple cryptoeconomical one, mostly because economical rationality can only be enforced inside the system, while the economical rationality of a real actor is something more complex and also depends on what happens outside of it.

It is perhaps important to note, though, that proof of work is actually an economical guarantee and not a cryptographical one, because the amount of computational power needed to break the consensus depends on the competition among the miners. In other words, it depends on the amount of money invested in mining equipment.

Monday, November 12, 2018

Notes on DevOps, agile development and maintenance cost

Surprisingly, techniques like DevOps and Agile did not actually make the software industry easier or more user-friendly. Automated and regular software deliveries certainly made it possible to adapt the software to user requirements more frequently; however, they also made the maintenance and operation of the software more difficult. Simply put, running software that has a daily delivery cycle is not easy, but the biggest problem is that most software components do not run on their own but together with dozens or hundreds of other software components and modules. If you consider that each of these can be released on a daily basis, and that documentation is usually the last priority in such systems, the result is surely an enormous maintenance cost, if maintenance is possible at all.

One solution might be the appearance of AI-based software configuration and maintenance systems. From a purely theoretical point of view, there is also the idea of making our software systems simpler, but to be realistic, that is not going to happen.

Saturday, October 13, 2018

On the need of zero knowledge order book matching algorithms

There are many ideas for implementing a decentralized exchange (DEX) protocol on a blockchain. While it might be realized quite naturally with the help of a quorum consensus, it is not so easy with a Nakamoto one. The problem is that in a Nakamoto consensus, a leader is chosen in each round to create and start propagating the next block. This leader acts as a dictator for that block, meaning that it can influence which transactions will be mined into the block. One DEX realization idea is to create sell and buy orders as transactions and realize the matching algorithm with the help of mining; however, this can be "gamed" very strongly in a Nakamoto consensus by the given miner. Buy and sell orders can be "gamed" by the rest of the network as well, since the transactions themselves can be delayed or censored. A solution to these problems might be the definition of zero knowledge order matching algorithms, where the creator of the order and the exact amounts or other order details are hidden by cryptography. Certainly, it is an open question whether efficient order book matching algorithms can be defined in such a way.

Tuesday, October 9, 2018

Notes on semi public blockchain networks, Hashgraph and regulation

There is an interesting combination of public and private blockchain networks: the so-called semi-public network. In such a setup, the network is operated or governed by several semi-trusted actors, but the network itself is open to everybody from a development or application point of view. However, it is pretty questionable whether that model can work at all. The model is that an application or transaction can only be censored if a majority of the consortium members vote for that. It is a good question what is going to happen in practice if an application does not conform to the regulation of a certain region. One model is that an application can only be filtered out if most of the regions vote the application out. In practice, however, exactly the opposite can happen as well: an application might have to conform not just to the regulation of one region but to the regulations of all of the regions.

Sunday, October 7, 2018

Designing a transaction oriented programming language

Blockchain programming languages have to be transaction oriented by design. Ideas might be taken from transaction oriented programming languages or from early artificial intelligence approaches to representing procedural knowledge, like scripts or frames. A blockchain programming language has to have the following two main features:

1. Even if the core can be realized by an imperative approach, each transaction must be supported by a strong declarative description providing elements for:

- defining the input and output parameters of the transaction

- defining before and after safety checks for both the input and the output parameters

- defining the states that the transaction will change

- defining after and before safety checks for implied states.

- defining the list of cooperating transactions

- defining an advanced security model in a declarative way.

2. The transaction must have a good and easily human-readable representation. As blockchain implementations describe collaboration models between several actors, the best way would be to use an advanced hierarchical approach from artificial intelligence for representing procedural knowledge, like scripts. Such a formalism might have the following elements:

- the spot

- the scene

- the roles - actors

- properties and objects

- scenes

- activities - set of actions
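
The declarative description from point 1 could look roughly like the following. This is only a sketch in Python rather than in a dedicated language, and every name in it (`TransactionSpec`, `touches`, `pre`, `post`, the `transfer` example) is a hypothetical choice for illustration: the imperative core is a plain function, while the inputs, the touched state and the before and after checks are declared around it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical state: account name -> balance

@dataclass
class TransactionSpec:
    name: str
    inputs: List[str]                                  # declared input parameters
    touches: Callable[[dict], List[str]]               # state keys that may change
    pre: List[Callable[[State, dict], bool]]           # checks before execution
    post: List[Callable[[State, State, dict], bool]]   # checks after execution
    body: Callable[[State, dict], State]               # imperative core

    def execute(self, state, params):
        if sorted(params) != sorted(self.inputs):
            raise ValueError("unexpected input parameters")
        if not all(check(state, params) for check in self.pre):
            raise ValueError("precondition failed")
        new_state = self.body(dict(state), params)     # work on a copy
        allowed = set(self.touches(params))
        changed = {k for k in set(state) | set(new_state)
                   if state.get(k) != new_state.get(k)}
        if not changed <= allowed:
            raise ValueError("undeclared state change")
        if not all(check(state, new_state, params) for check in self.post):
            raise ValueError("postcondition failed")
        return new_state

def _transfer_body(state, p):
    state[p["sender"]] -= p["amount"]
    state[p["receiver"]] = state.get(p["receiver"], 0) + p["amount"]
    return state

transfer = TransactionSpec(
    name="transfer",
    inputs=["sender", "receiver", "amount"],
    touches=lambda p: [p["sender"], p["receiver"]],
    pre=[lambda s, p: p["amount"] > 0,
         lambda s, p: s.get(p["sender"], 0) >= p["amount"]],
    post=[lambda old, new, p: sum(old.values()) == sum(new.values())],
    body=_transfer_body,
)

print(transfer.execute({"alice": 10}, {"sender": "alice", "receiver": "bob", "amount": 4}))
```

The security model and the list of cooperating transactions are left out of the sketch; they would be further declarative fields of the same description.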

Saturday, October 6, 2018

Notes on blockchain programming languages

Blockchain and smart contract programming languages actually take a rather poor approach, as most of them start from an object oriented design. In an object oriented approach, the fundamental modelling unit is the object, which has some properties and some functions that are mostly meant for changing the internal state of the given object. In a blockchain system, however, the central modelling unit must be the transaction that is executed and validated throughout the distributed ledger. Transactions make some kind of change in the state, which might be represented by objects or any other kind of formalism; nevertheless the focus is on the transactions, so the central point of the modelling must be the transaction as well. That is probably the reason why object oriented initiatives, like Solidity, do not feel natural in a blockchain context and why it is sometimes difficult to interpret what their programs should do. Languages like the modelling and transaction language of Hyperledger Fabric Composer are more natural to follow.

Saturday, September 1, 2018

How to implement a Blockchain from scratch - block and blockchain objects

When building up a blockchain architecture, two of the most important elements are certainly the blocks and the blockchain. In the implementation, at least three important scenarios have to be taken into account:

- Mining: during mining, the miner creates a brand new block, adds transactions from the transaction pool to the block and solves some cryptographic puzzle. Once the block is created correctly, it is added to the local blockchain and communicated on the network with the help of a gossip protocol.

- Synchronization: during synchronization, the blocks are queried from the network one by one, validated and added to the chain. At synchronization, there is not necessarily a forking strategy; the blocks can be queried based on the block id-s, which can be provided as one consistent set of the chain. Similarly, the other side of the synchronization is to provide a set of transaction id-s which is regarded as the longest consistent blockchain.

- Gossiping blocks: if the node receives a block from the network, first of all the node should already be synchronized. If we receive a new block, it must be validated and a fork resolution algorithm has to be executed as well. If the block extends the longest chain, it should be added to the longest chain. If the block extends one of the alternative chains, it can be added to that alternative chain, and it must be decided whether we have a new longest chain. It might be the case that the block can not be added to any chain at all; if so, it can be added to a pool of orphaned blocks. Last but not least, there might be the situation that the block can not be added to the chain alone, but only together with another block that is already in the pool of orphaned blocks.
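
The gossiping scenario above can be sketched as follows. This is a minimal sketch under simplifying assumptions: block validation and the cryptographic puzzle are left out, the fork choice is simply "the highest block wins", and the names `Block`, `Chain` and `receive` are hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    prev_id: str
    payload: str

    @property
    def block_id(self):
        data = (self.prev_id + "|" + self.payload).encode()
        return hashlib.sha256(data).hexdigest()

class Chain:
    def __init__(self, genesis):
        self.blocks = {genesis.block_id: genesis}
        self.height = {genesis.block_id: 0}
        self.tip = genesis.block_id   # head of the longest chain
        self.orphans = {}             # block id -> block with unknown parent

    def receive(self, block):
        """Handle a gossiped block; returns 'known', 'orphan' or 'added'."""
        bid = block.block_id
        if bid in self.blocks or bid in self.orphans:
            return "known"
        if block.prev_id not in self.blocks:
            self.orphans[bid] = block
            return "orphan"
        self._attach(block)
        # The new block may be the missing parent of orphaned blocks.
        progress = True
        while progress:
            progress = False
            for oid, orphan in list(self.orphans.items()):
                if orphan.prev_id in self.blocks:
                    del self.orphans[oid]
                    self._attach(orphan)
                    progress = True
        return "added"

    def _attach(self, block):
        bid = block.block_id
        self.blocks[bid] = block
        self.height[bid] = self.height[block.prev_id] + 1
        if self.height[bid] > self.height[self.tip]:
            self.tip = bid            # fork resolution: longest chain wins
```

Alternative chains need no separate data structure here: every stored block knows its parent, so a fork is just a second path from the genesis block.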

Tuesday, August 28, 2018

How to implement a Blockchain from scratch - syncing accounts between state and wallet

In an account-balance based blockchain system, there are accounts both in the blockchain state and in the wallet. It is important to understand the life cycle of these elements and the synchronization between them:

- The accounts in the wallet should represent only a copy of the accounts of the state.

- Extended information can be stored at the accounts of the wallet, for example the private keys, to make signing simpler.

- The accounts of the state should contain only public keys or addresses derived from public keys; no private key should be stored in an account of the chain.

- After every new block, the wallet has to be synchronized. It is an open question how the synchronization should be carried out with the fork resolution strategy. There might be different strategies, like showing always the values of the top block of the actual state or waiting for a certain number of confirmations to avoid forks.

- If a new transaction is initiated, it might refer to accounts that are not yet in the state: only the public-private key pair or the address was generated, and they are only stored in the wallet.

- At a currency transfer transaction, the "from" account has to be in the state with sufficient funds and with a consistent nonce.

- At a currency transfer transaction, the "to" account does not necessarily have to be in the state. It can be added at mining with the amount of money that is transferred to it. It is important that the "to" account is compatible with the "from" account if we consider a multi-asset scenario.

- There must be a couple of genesis accounts and/or coinbase transactions for each cryptoasset, for the initial distribution of the monetary supply. The exact implementation depends on the issuance of the cryptoasset. For creating a genesis or coinbase of a new cryptoasset, a new validation rule, perhaps a brand new transaction type, has to be introduced.

- At a data setting transaction, the initial account does not necessarily have to exist; it can be added at any time if there is a valid signature related to the address of the account.
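
The rules for a currency transfer can be sketched as a small state transition function. This is only a sketch: signature verification and the multi-asset compatibility check are omitted, and the names `Account` and `apply_transfer` are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Account:
    balance: int = 0
    nonce: int = 0

def apply_transfer(state, sender, receiver, amount, nonce):
    """Apply a transfer to `state` (address -> Account) or raise ValueError."""
    src = state.get(sender)
    if src is None:
        raise ValueError("the from account is not in the state")
    if amount <= 0 or src.balance < amount:
        raise ValueError("insufficient funds")
    if nonce != src.nonce:
        raise ValueError("inconsistent nonce")
    src.balance -= amount
    src.nonce += 1
    # The to account is created on the fly if it is not yet in the state.
    state.setdefault(receiver, Account()).balance += amount

state = {"addr_a": Account(balance=10)}
apply_transfer(state, "addr_a", "addr_b", 3, nonce=0)
print(state)
```

A wallet would keep its own copies of these accounts, extended with the private keys, and refresh them from the state after every new block.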

Monday, August 27, 2018

Blockchain strategy and bootstrapping the ecosystem for developers

With the appearance of newer and newer blockchain platforms, every company tries to position and bootstrap its platform differently with regard to the developer community, and to attract more and more developers to work with the platform. Strategies might vary as follows:

- creating a brand new platform with a brand new language: the best example for that is Solidity; as it was one of the first languages for blockchain development, it made sense to invent a brand new language. Similar attempts were made with Vyper or with the modelling language of Hyperledger Fabric Composer.

- supporting a well-known language: many platforms try to use a well-known language that was already widely used by programmers, like Java or Javascript, and attract as many developers of the given language as possible. Examples of this strategy are the choice of Java for Hashgraph, or of Java and Javascript for Hyperledger Fabric.

- last but not least, there are platforms that support multiple programming languages, like Tendermint or Babel, indirectly trying to attract everybody in the world who is a developer.

The strategy can however be extended, as the aim of any such platform should not be to attract as many developers as possible, but as many application builders or applications as possible. In this sense, a strategic direction can be to attract business users instead of developers and to provide no-cost or low-cost application development environments. Another idea might be to provide an interface or a possibility to integrate different existing business applications, or to directly use domain specific languages for modelling businesses, like different BPMN notations, integrated with the blockchain platform.

Tuesday, August 21, 2018

How to implement a Blockchain from scratch - gossip protocol

Blockchain protocols have several different ways of communication; there are gossip and non-gossip based ones. The beginning of the network communication is usually a non-gossip based one: a peer connects to several neighboring peers, checks the versions of the peers and queries further peer information if required. Similarly, synchronizing the blockchain is not a gossip protocol either: the peer queries its neighbors for the latest block number, and based on an inventory query it synchronizes the whole blockchain. Blocks and transactions, on the other hand, are propagated with the help of a gossip protocol. The logic is something like the following:

- If the node initiates a new valid transaction, the transaction is added to the transaction pool and propagated to all neighbouring peers.

- If a node receives a transaction, first the validity of the transaction has to be checked. If the transaction is valid, it has to be checked whether it is already mined somewhere in the blockchain or is already in the transaction pool. If so, nothing has to be done. If not, the transaction has to be added to the transaction pool and propagated to the connected peers, except for the one from which we got it.

- If a miner has mined a new block, the block has to be propagated to the network, and the local wallet has to be updated based on the new block information.

- If a node gets a block from the network, first the validity of the block has to be checked. This might be a little bit difficult, because the block it builds on might not be in the blockchain yet. Therefore there should be an explicit set containing stale blocks that can not yet be added to the blockchain. A new block is valid if it can be added directly to the blockchain, or if there is already a stale block in the pool and the two blocks together can be added to the blockchain. If it can not be added to the chain, it should be saved in the stale blocks pool. If the block is already in the blockchain or in the stale blocks pool, it should not be propagated further. Otherwise the block must be propagated to the neighboring peers.

To avoid network overload, it is possible to use only the block and transaction id-s in the gossip (flooding) process and to fetch the content of the data only if it is necessary.
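
The transaction part of the gossip logic can be sketched in a few lines. This is an in-memory sketch only: the peers call each other directly instead of using a network, `is_valid` is a placeholder for the real signature and balance checks, and all names (`Node`, `on_transaction`, `connect`) are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.tx_pool = {}    # transaction id -> transaction
        self.mined = set()   # ids of transactions already in the chain

    def is_valid(self, tx):
        return bool(tx.get("id"))  # placeholder for the real validation

    def submit(self, tx):
        """Entry point for a transaction initiated locally."""
        self.on_transaction(tx, sender=None)

    def on_transaction(self, tx, sender):
        if not self.is_valid(tx):
            return
        if tx["id"] in self.mined or tx["id"] in self.tx_pool:
            return           # already known: the flooding stops here
        self.tx_pool[tx["id"]] = tx
        for peer in self.peers:
            if peer is not sender:   # do not send it back to where it came from
                peer.on_transaction(tx, sender=self)

def connect(a, b):
    a.peers.append(b)
    b.peers.append(a)

a, b, c = Node("a"), Node("b"), Node("c")
connect(a, b); connect(b, c); connect(c, a)
a.submit({"id": "tx1", "amount": 5})
print(sorted(n.name for n in (a, b, c) if "tx1" in n.tx_pool))
```

The "already known" check is what terminates the flooding even in a network with cycles, as in the triangle of the example.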

Saturday, August 18, 2018

How to implement a Blockchain from scratch - event bus

A key component of every blockchain architecture must be a reliable event bus. There are many parallel actors working with the data of a node, like

- peers gossiping and requesting information, like new transactions, or new blocks

- wallets initiating transactions

- miners or validators working directly or indirectly with a node

- blockchain explorers requesting important data regularly

- and of course an advanced logging system writing everything to a local log and supporting both standard and debug mode is also required.

For this reason, it is practical that every node implements an event bus with the following functionalities:

- different actors can push different pieces of information on the bus, with the type of information and the severity of the information or error.

- different actors can subscribe to different pieces of information. As an example, a logger would write everything into a file, while a wallet would be interested in events such as the blockchain getting synchronized, an initiated transaction getting mined or validated, the balance of a supervised account changing, and so on. Similarly, a blockchain explorer is interested in whether there is a new transaction being gossiped into the system, whether there is a new but still not validated block, whether there is a new validated block, and so on.

Even some parts of the standard protocol might work totally asynchronously from each other, realizing the central communication protocol via an internal event bus of the node.
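
Such a bus can be sketched as a simple publish / subscribe object. This is a synchronous, single-threaded sketch; a real node would probably need queues or threads for the asynchronous behaviour, and the topic names and the `"*"` wildcard are assumptions made for the example.

```python
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register callback(topic, payload, severity); '*' matches every topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload, severity="info"):
        for callback in self._subscribers[topic] + self._subscribers["*"]:
            callback(topic, payload, severity)

bus = EventBus()
log, wallet_events = [], []

# The logger wants everything, the wallet only validated blocks.
bus.subscribe("*", lambda topic, payload, severity: log.append((severity, topic, payload)))
bus.subscribe("block.validated", lambda topic, payload, severity: wallet_events.append(payload))

bus.publish("tx.gossiped", {"id": "tx1"})
bus.publish("block.validated", {"height": 7})
bus.publish("peer.lost", {"peer": "10.0.0.5"}, severity="warning")

print(len(log), wallet_events)
```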

On the accounts of an account-balance based blockchain system

Accounts of an account-balance based blockchain system are practically storage spaces whose access is controlled by a public-private key ownership or identity proof and validated by the nodes. In the simplest case an account stores the balance of a user, and a signature with the private key provides a kind of proof of identity that you are allowed to access the given account. It is important to note that in a transfer transaction, there are two accounts that are validated differently:

- the "from" account: it must be checked that the signature matches the private key of the account, and the balance must cover the amount of cryptocurrency to be transferred.

- the "to" account must be compatible as well, however with far fewer rules. Practically, it does not even have to exist; it can be added to the chain on the fly.

Ethereum extends this simple model to two different kinds of accounts: externally owned accounts and smart contract accounts. While externally owned accounts can initiate transactions to smart contracts, with proven ownership of the private-public key pair, smart contracts only react to external events. However, the model can be extended even further. In a certain blockchain system, there can be:

- Any kind of different account, with different state storage and validation rules, like accounts for miners, validators, special roles executing optimizations...

Friday, August 17, 2018

Syncing Blockchain from a certain block or time

At blockchain synchronization, one of the biggest problems is that the full blockchain has to be synchronized and validated from the genesis block. Supposing we have a UTXO based system, this is actually necessary, because there might be UTXO-s which are at the beginning of the chain but can nevertheless still be spent. With an account-balance based system, however, we could consider not downloading and validating the whole blockchain but just, say, the last thousand blocks, as the correct state is contained in the last state as well and all the other blocks serve only as a consistency and security guarantee. Such an algorithm might raise the following issues:

- Depending on the consensus mechanism, downloading the last thousand blocks can be as secure as downloading everything from the genesis block. In proof of work, a long range attack from a thousand blocks in the past is as infeasible as a long range attack from the beginning. Something similar might be true for proof of stake and other consensus algorithms as well.

- The real challenge is however to get the quasi genesis block in a reliable way. Certainly, producing a thousand fake blocks is not necessarily simple, but an attacker could simply send an older version of the chain fragment; let us call it a replay blockchain fragment attack.

- To prevent a replay blockchain fragment attack, we can introduce the block numbers in the block headers. So, the first step of the P2P algorithm would be to query the block heights, and based on the block heights, the blocks with the exact numbers can be queried.

- However, there might be one more attack vector. Even if we have block identity information in the block headers, an attacker might try to build up a blockchain segment in the future, knowing simply the block id-s, and broadcast this fake segment as soon as the blocks from the certain id are queried. Let us call this attack an alternative future blockchain fragment attack. There is no known good way to implement a chain resolution strategy that can efficiently distinguish between the valid chain and an alternative future.

Notes on blockchain system state

The state of the blockchain can be different in different platforms and representations. In a UTXO based system, the system state is indirectly represented by the outputs of the transactions. In an account-balance based system, each account contains an internal variable storing the balance, and the account can only be changed if the owner signs a transaction with the private key. The model can be extended to contain variables and other elements in the state, just like in Ethereum, where the different state variables can be changed under different validation conditions. Further improvements can be imagined, however: both the state and the validation can be one hundred percent customized, just like in Tendermint and a couple of similar blockchain systems. There might be a separation between standard and system blockchain state, giving an opportunity for on-chain governance. We might as well imagine that the blockchain state contains full scale objects, including properties and functions, where the object state can be changed only via functions whose invocation has to be signed by the private key related either to the public key of the object or to the public key of the function.

Wednesday, August 15, 2018

blockid in a multi-hash blockchain system

In a multihash blockchain system, individual blocks do not have one hash identity but several ones, at least one pair for each hash link. This guarantees the consistency of the blocks. Nevertheless, if one wants to download or refer to a block, it is much more practical to have one common id that can be referred to either in the communication protocol or from blockchain explorers. Such an id can be, for example, the hash of all of the existing hash links and hash pointers. Certainly, it is an open question whether such a construct opens a new attack vector or vulnerability, because the block hash itself is not directly contained in the hash list of the blockchain.
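
Deriving one common id from all hash links might be sketched like this. The function name `block_id`, the link names and the choice of SHA-256 for the combined id are assumptions for illustration; the only important design point is that the links are fed in a fixed order, so that every peer computes the same id.

```python
import hashlib

def block_id(hash_links):
    """Derive one block id from a mapping: link name -> hex digest."""
    h = hashlib.sha256()
    for name in sorted(hash_links):   # fixed order on every peer
        h.update(name.encode())
        h.update(b"\x00")             # separator between name and digest
        h.update(bytes.fromhex(hash_links[name]))
    return h.hexdigest()

# Hypothetical block with two hash links computed by different algorithms.
links = {
    "sha256": hashlib.sha256(b"block content").hexdigest(),
    "sha3_256": hashlib.sha3_256(b"block content").hexdigest(),
}
print(block_id(links))
```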

Tuesday, August 14, 2018

Blockchain and cryptographical rolling hash

One of the key advantages of blockchain solutions is that they are pretty much hacking resistant, because of the chain of blocks and hashes going back to the genesis block. However, most applications do not necessarily require such a huge security guarantee; it is probably enough if the data can not be hacked and is consistent over the last couple of months or years. On the other hand, there can be a requirement to delete data from a blockchain solution, or to have the possibility to forget or modify data. One solution might be to design a blockchain platform where data is not hashed back to the genesis block, but only the last couple of, say, thousand blocks are considered. Certainly, it is a question how such an algorithm can be realized in a hacking resistant way. One solution might be to use, instead of plain cryptographic hashes, a kind of cryptographical rolling hash, where only a certain number of past values are considered as input.
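
To make the idea a little more tangible, the following sketch computes a link hash over a fixed window of past blocks only. It is a naive illustration, not a proposal for a secure construction: the name `window_link`, the window size and the use of SHA-256 are all assumptions, and whether such a scheme can be made hacking resistant is exactly the open question.

```python
import hashlib

def window_link(payloads, index, window):
    """Hash over the payloads of the last `window` blocks ending at `index`."""
    start = max(0, index - window + 1)
    h = hashlib.sha256()
    for payload in payloads[start:index + 1]:
        h.update(hashlib.sha256(payload).digest())  # fixed-size pieces
    return h.hexdigest()

payloads = [b"block0", b"block1", b"block2", b"block3", b"block4"]
before = window_link(payloads, index=4, window=3)

payloads[0] = b"forgotten"   # outside the window: the link does not change
print(window_link(payloads, index=4, window=3) == before)

payloads[3] = b"tampered"    # inside the window: the link changes
print(window_link(payloads, index=4, window=3) == before)
```

In this sketch, data older than the window can be modified or deleted without invalidating the newest links, which is the property the post asks for.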

Sunday, August 12, 2018

How to implement a Blockchain from scratch - smart contract simplified

In a simple account / balance / state based blockchain system, implementing smart contracts is pretty straightforward. For a start, accounts do not necessarily represent just a balance but also a kind of general data that can be modified by the smart contracts. In order to create smart contracts, you should define the language or smart contract programming environment itself and the effect that a smart contract can have on the state. Certainly, one way of doing it is to define a virtual machine which guarantees that the smart contract is executed in exactly the same way on every peer. However, we might as well consider an existing virtual machine, like the Java virtual machine, and somehow limit the effects of the program. As an example, a simple smart contract could:

- read some of the state information of the blockchain, manifested by account data and balances. This state information is that of the previous block, on top of which we want to mine our contract.

- perform some computation on top of it.

- change the data value of a certain account.

- be stored somehow as data or as a string, for example with the help of serialization.

- be wrapped in a special transaction containing the smart contract as data, signed by the account that you want to modify, indicating indirectly that the owner of the account allows the data to be changed.

In the mining process:

- The signature of the smart contract transaction has to be checked.

- It has to be made sure that only the affected accounts are modified.

- The code has to be run and the new data value must be calculated.

- It has to be made sure that the smart contract does not cause an infinite loop. One way of doing it is to avoid general loops, or to terminate the contract after a certain number of iterations, marking the transaction as invalid. Certainly, another way can be built in as well: with the help of a cryptoeconomical mechanism, longer smart contract runtime can be disincentivized, just like in Ethereum.

- The new value or values have to be applied.

- The block must contain the valid transaction and the new valid state, which is the new value of the computed accounts.

In the validation process, practically the same steps have to be repeated, except for the last one, which is putting the transaction and the state into a new block and doing the proof of work calculation or the voting of a Byzantine consensus mechanism:

- The signature of the smart contract transaction has to be checked.

- It has to be made sure that only the affected accounts are modified.

- The code has to be run and the new data value must be calculated.

- The calculation must finish in finite time.

- It has to be checked whether the new state values in the given block are the values calculated from the state of the previous block.

The wallet functionality has to be extended:
- to have the possibility to write or integrate smart contracts.
- to transform the programs into data, like with the help of serialization.
- to create transactions based on the smart contract.
- to sign them.
- to broadcast them on the network.
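
The mining and validation steps above can be sketched with a deliberately tiny contract format. Everything here is an assumption made for the example: the contract is a JSON-serialized list of `[operation, argument]` pairs without any loops, `MAX_STEPS` is a hypothetical limit, and the signature check is omitted.

```python
import json

MAX_STEPS = 100  # hypothetical limit that guarantees termination

def run_contract(code, state, account):
    """Run a serialized contract on a copy of `state` and return the new state.

    `code` is a JSON list of [op, arg] pairs; supported ops: add, mul, set.
    """
    program = json.loads(code)
    if len(program) > MAX_STEPS:
        raise ValueError("contract exceeds the step limit")
    value = state[account]            # read the state of the previous block
    for op, arg in program:           # straight-line code, no general loops
        if op == "add":
            value += arg
        elif op == "mul":
            value *= arg
        elif op == "set":
            value = arg
        else:
            raise ValueError("unknown instruction: %s" % op)
    new_state = dict(state)
    new_state[account] = value        # only the signing account is modified
    return new_state

def validate(code, prev_state, account, claimed_state):
    """Validator side: re-run the contract and compare with the block's state."""
    return run_contract(code, prev_state, account) == claimed_state

code = json.dumps([["add", 5], ["mul", 2]])   # the contract, serialized as data
prev_state = {"acc1": 1, "acc2": 7}
mined_state = run_contract(code, prev_state, "acc1")
print(mined_state, validate(code, prev_state, "acc1", mined_state))
```

Because the miner and the validator run the same deterministic function on the same previous state, they arrive at the same new state, which is the role the virtual machine plays in the text.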

How to implement a Blockchain from scratch - extended wrapper classes

When designing and implementing a blockchain system from scratch, there might be some contradictory design perspectives: on the one hand, elements of the blockchain that are stored or transported on the network must require as little storage as possible, resulting in lower bandwidth and data storage requirements. On the other hand, some further information is usually required for efficient processing. Examples are:

- Blocks in a basic scenario should store the difficulty, the nonce and a hash value of these values together with the Merkle roots of the transactions and the state, and the previous block hash as well.

- Blocks in an extended scenario might explicitly contain a link to the previous block and to the following ones, and some information on whether it is an orphan block, or on the block height.

- Accounts in a basic scenario should contain an address of an account, which is usually a public key, and some kind of a change request, like transferring money, or changing a value. On top, certainly a nonce value.

- Accounts in an advanced scenario are rather related to the accounts of a certain wallet, so they might explicitly contain the private key and meta-information on whether the account is synchronized with the blockchain or is still not available in the blockchain.

There might be a similar consideration for other elements of the blockchain as well, like block headers, transactions or peers. Practically every object that is moved on the network can be implemented as a basic version containing only the relevant data, and as an extended version containing all the computationally relevant data.
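
For blocks, the split between a basic and an extended version might be sketched like this. The class and field names are assumptions for the example; the point is only that the basic object is what gets serialized, while the wrapper carries locally derived data such as the height or the orphan flag.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class BasicBlock:
    """Only what has to be stored or sent over the network."""
    prev_hash: str
    tx_root: str
    state_root: str
    difficulty: int
    nonce: int

    def serialize(self) -> bytes:
        return json.dumps(asdict(self), sort_keys=True).encode()

@dataclass
class ExtendedBlock:
    """Local wrapper with derived data that is never transmitted."""
    basic: BasicBlock
    height: Optional[int] = None
    is_orphan: bool = True
    parent: Optional["ExtendedBlock"] = None

    def link(self, parent: "ExtendedBlock") -> None:
        if parent.height is None:
            raise ValueError("parent is not attached to the chain yet")
        self.parent = parent
        self.height = parent.height + 1
        self.is_orphan = False

genesis = ExtendedBlock(BasicBlock("", "r0", "s0", 1, 0), height=0, is_orphan=False)
child = ExtendedBlock(BasicBlock("hash-of-genesis", "r1", "s1", 1, 42))
child.link(genesis)
print(child.height, child.is_orphan, len(child.basic.serialize()))
```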

The new divide of the information society

There will be a new divide in the information society, between people producing information and people consuming information. As information can be produced more simply and more automatically every day, for example with the help of massive online courses, chat bots and different online platforms, more and more information will be produced by fewer and fewer people.

Services will become more and more automated, meaning that more customer service will be carried out by bots or robots, resulting in savings on employee costs and enabling fully automated decentralized companies. For an average customer, it will be more and more complicated to fight through the bureaucracy of bots and robots and get to a "living" person.

Saturday, August 11, 2018

How to implement a Blockchain from scratch - network protocols

When creating a blockchain solution from scratch, one of the most important parts is designing a set of network protocols. It is important to note that these protocols are on the application level; under the hood they might be realized by simple socket communication over TCP, by something more abstract like RPC or JSON-RPC, or even with onion routing. In any case, the following protocols should be considered:

0. Get client version: connecting to a known peer and getting the version of that peer. It prevents the usage of incompatible peers, which might be highly important, especially if there are regular updates of the code.

1. Update peer information: a brand new peer is usually started with a couple of preconfigured nodes as initial peer information. These peers are queried for further known peers until the number of known peers reaches a certain limit (like 15 in Bitcoin by default). The peers may go offline and come back online again, therefore it has to be checked regularly whether the active peers are still alive; if not, different strategies can be implemented, like:
- deleting the inactive peer from the cached peer list.
- waiting for a certain time if the peer will be online again.
- querying all the still active nodes to get more peers that are hopefully active.
- or a combination of the previous strategies.

2. Synchronizing the blocks and states: before using the blockchain, the local blockchain must be more or less up to date with the rest of the world. As a first step, the node can connect to all of its peers and ask for the size of the blockchain. Based on that information, it knows whether the blockchain needs to be synchronized or not. As a second step a

3. Propagating transactions and propagating blocks: if new transactions or blocks arrive, they must be registered locally and communicated further, as fast as possible, to all of the connected peers in the form of a gossip protocol. If it is a transaction, it must be added to the local transaction pool; if it is a block, it must be added to the possible top headers of the blockchain. Speed is especially critical for new blocks, as the winner of the mining competition depends on the speed of the propagation. This phase should only be available after the blockchain has reached at least a quasi synchronized status. It is however an interesting general question how the propagation mechanism might stop without effectively flooding the network. One way might be to implement something like a finite hop system, in which a given piece of information is propagated only a given number of times. Another idea might be to implement regular handshakes, where peers first exchange information on the relevant transactions and blocks before transporting them.

In all of the communication protocols, it is always a question whether we use a kind of push or pull protocol. In both cases, the design direction should be to transfer as little data as possible, and only if it is necessary. As a consequence, most protocols should transfer the inventory first, meaning the hash values of blocks and transactions. After that, the nodes should be able to download the content behind the hash values on demand.
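
The "inventory first, content on demand" idea can be sketched as a small pull exchange. This is an in-memory sketch with direct method calls instead of network messages, and the names `Peer`, `inventory`, `get_data` and `sync_from` are assumptions made for the example.

```python
def missing_ids(inventory, known_ids):
    """Ids announced by a peer that we do not have yet, in announced order."""
    return [i for i in inventory if i not in known_ids]

class Peer:
    def __init__(self, items):
        self.items = dict(items)   # id (hash) -> content of block or transaction

    def inventory(self):
        return list(self.items)    # only the ids travel in the first step

    def get_data(self, ids):
        return {i: self.items[i] for i in ids if i in self.items}

def sync_from(local, remote):
    """Pull from `remote` only the content that `local` is missing."""
    wanted = missing_ids(remote.inventory(), local.items)
    local.items.update(remote.get_data(wanted))
    return wanted

local = Peer({"h1": "tx1"})
remote = Peer({"h1": "tx1", "h2": "tx2", "h3": "block3"})
print(sync_from(local, remote))   # ids that had to be downloaded
print(sync_from(local, remote))   # nothing left to download the second time
```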


"On a long enough timeline we will all become Satoshi Nakamoto.."
Daniel Szego

	*Daniel Szego*
Having spent one and a half decades in software development, engineering, R&D, project management and leading software companies, and having two master's degrees, one in engineering and one in business administration, I thought I would summarize some of my theoretical and practical thoughts on the software industry and enterprise software.
