Blockchain-Based Secure and Fair IoT Data Trading System with Bilateral Authorization



Introduction
Internet of Things (IoT) is the interconnection of physical devices, such as appliances, vehicles, and wearable devices, through the Internet. It has a number of benefits for industries and professionals. IoT devices are equipped with sensors to generate massive amounts of data. Moreover, these interconnected devices can communicate with each other and with humans, which leads to new applications and services that can improve efficiency, convenience, and quality of life [1]. The data generated by IoT devices can also be analyzed to gain insights and make informed decisions. With the development of IoT infrastructure and the rapid growth of IoT devices in recent years, we are experiencing an unprecedented surge in data.
Nowadays, there are many IoT application domains in our lives, such as healthcare, transportation, environment, and so on [2,3], and we are living in the era of the data economy. The huge volume of data collected by various IoT devices is regarded as a valuable asset. In particular, the data can be used for commercial purposes with the proliferation of machine learning solutions for artificial intelligence and big data analytics [4,5]. Such a trend requires a data marketplace [6,7] for buying and selling data, as shown in Fig. 1. However, online data trading platforms face some restrictions in trustworthiness. First of all, security and fairness are crucial challenges because online trading is carried out between non-face-to-face participants who do not fully trust each other. Therefore, it is essential to develop a secure and fair data trading platform where data can be traded between sellers and buyers in a trustworthy manner. Fair data trading implies that either the seller gets paid for the data and the buyer obtains the purchased data, or both parties fail to achieve their desired outcome. However, since online parties tend not to trust each other, it is hard to achieve a fair exchange without a trusted intermediary [8]. Hence, conventional data trading systems rely on a trusted third party (TTP) that has centralized control of the data trading platform and responsibility for recording the evidence of interactions [9,10]. However, the lack of accountability and transparency in such a centralized system are still concerns, as well as the single point of failure problem.
Blockchain is a decentralized tamper-resistant ledger technology that offers a secure and transparent way to record and verify transactions [11,12]. It functions as a distributed database where every node in the blockchain network has a copy of the ledger. Each transaction is verified and added to the blockchain through a consensus mechanism, which ensures that all nodes agree on the validity of the transaction. Once a transaction is added, it becomes immutable and cannot be altered or deleted. While commonly used to keep track of financial transactions, blockchain can be applied to a wide range of applications associated with smart contracts.
Since the advent of blockchain, it has been widely recognized that a blockchain with smart contracts can take over the role of the TTP owing to its reliability, transparency, and financial properties. Although blockchain provides a decentralized platform for a transparent and immutable ledger, its storage capacity is limited, so storing bulk data on the blockchain is not viable. To address this limitation, blockchain is commonly combined with external storage such as the cloud. In this approach, external storage holds and serves the actual data for sale, whereas the blockchain keeps track of the actions taken by the seller and the buyer during data trading.
However, such a system model must guarantee access control and source identification for the data managed by the external storage service. Data owners (or sellers) may want to specify, as a policy, who can access the data they entrust to external storage. Data requesters (or buyers) may want to specify an attribute of the data owners from whom they want to purchase data. Taking healthcare data as an example, a seller may want to provide its data only to hospitals or doctors but not to insurance companies, and a buyer may want to obtain data from men in their 20s. This requires bilateral authorization, where the seller and the buyer must each satisfy the other's policy. Therefore, it is necessary to design a secure and fair data trading system that enables access control and source identification for both parties.
Regarding the fair trading protocol considered in this paper, Chen et al. proposed a blockchain-based non-repudiable IoT data trading scheme [13], in which the trading behaviors of the seller and the buyer are recorded on the blockchain to facilitate dispute resolution. However, the authors do not address secure data trading: data confidentiality is not guaranteed because the secret key for data decryption is published in plaintext on the blockchain. In [14], the authors introduced the idea of a secure and fair data trading system, but did not present a detailed protocol for bilateral authorization.
Inspired by [13], in this paper, we aim to enhance the protocol of Chen et al. by taking secure data trading into account from the viewpoint of access control to the data and source identification. More specifically, in order to design a bilateral authorization-enabled secure data trading system, the proposed system makes use of the identity-based matchmaking encryption (IB-ME) scheme [15]. In the proposed system, the seller specifies the attribute of the target buyer (i.e., policy) under IB-ME encryption, and the buyer specifies the attribute of the target seller under IB-ME decryption. By exploiting the security guarantee of IB-ME, the data encrypted by the seller can be decrypted only by the intended buyer. Moreover, if the decryption is correct, it implies that the seller and the buyer are both valid parties specified by the policies of each other.
To support the use of IB-ME, a trusted off-chain arbitrator also acts as the key generator for IB-ME in the proposed system. However, the role of the arbitrator differs from that of the TTP in the conventional system. In the conventional system, the TTP is responsible for managing all trading transactions as interactive evidence which will be used to make arbitration when a dispute occurs. On the other hand, because the interactions between the seller and the buyer are handled by the smart contract on the blockchain, the arbitrator is not directly involved in dealing with data trading in the proposed system, except in a disputable situation.
At this phase, it is important to note that encrypting the whole data (in the form of a large file) by using IB-ME may not be practical due to its performance. Hence, the proposed system incorporates the use of the all-or-nothing transform (AONT) [16] to input only a few transformed data blocks to IB-ME encryption while still maintaining the confidentiality of the whole data. Due to the property of AONT, it is hard to recover the original data without knowing all parts of the transformed data. In the proposed system, the seller first transforms its data by AONT and splits the transformed data into two parts, one small part and the remaining large part. Then, the seller provides the large part through external storage and publishes the small part on the blockchain in encrypted form using IB-ME. To obtain the complete data, the buyer must purchase the encrypted small part by paying the cost through the smart contract even if the buyer can access the large part from external storage. Therefore, the proposed system reduces the storage burden of the blockchain and the computations of IB-ME while maintaining the security of the data.
The contributions of this paper are summarized as follows:
• A secure data trading system architecture based on blockchain and external storage is proposed, which enables access control by the seller and source identification by the buyer at the same time.
• To guarantee fairness, the threats caused by a dishonest seller or buyer are classified, and a fair data trading protocol is designed with the inclusion of arbitration. The proposed protocol encourages trading parties to act with honesty by utilizing the smart contract.
• To demonstrate the efficiency of the proposed protocol, the off-chain computation overhead and the on-chain costs are evaluated.
The rest of this paper is organized as follows: Section 2 presents encryption schemes for secure data sharing/trading and related work on blockchain-based data trading systems. System architecture and design goals considered in this paper are presented in Section 3. Cryptographic building blocks for the proposed system are presented in Section 4. The proposed secure and fair data trading protocol is designed in Section 5. Security and performance of the protocol are evaluated in Section 6. Finally, Section 7 concludes this paper.

Related Work
In order to implement access control for secure data sharing in IoT and cloud computing, attribute-based encryption (ABE) [17] is widely used. ABE is a promising tool that enforces receiver access control to limit access to sender data, but does not provide sender access control for source identification [18]. Access control encryption (ACE) [19] is another type of encryption that allows fine-grained control over information flow. However, ACE is more suitable for organizations with hierarchical regulation rather than IoT environments [20]. Recently, a new cryptographic primitive called matchmaking encryption (ME) was proposed [15]. ME enables the sender to specify receivers who can reveal the messages, and the receiver to determine that the received message is from the desired sender.
Research on a blockchain-based decentralized data trading model has received a great deal of attention, and several solutions for data sharing/trading in the field of the IoT industry have been introduced [21][22][23][24]. Kang et al. proposed a data trading strategy in the vehicular P2P network in which blockchain and smart contracts are adopted for secure data caching and authorized data sharing [25]. Dixit et al. proposed a decentralized platform for the digital data marketplace enabled by the blockchain, which hosts IoT data in a reliable and fault-tolerant manner [26].
With regard to secure and fair data trading, Dai et al. proposed a blockchain-based secure data trading ecosystem [27] named SDTE. In their system, the data broker conducts business with the buyer on behalf of the seller, but neither the broker nor the buyer can access the raw data owned by the seller. However, SDTE requires a special hardware security module such as Intel Software Guard Extensions (SGX). Li et al. introduced a decentralized data trading framework based on blockchain to guarantee data availability and fairness in data trading [28]. They presented two different solutions. One is to improve reliability and data availability by making use of homomorphic encryption and data sample techniques. The other is to integrate smart contract with double-authentication-preventing signatures [29] to achieve fairness during data trading.
Alsharif et al. proposed a blockchain-based medical data marketplace model [30], in which sellers enforce access control policies on the encrypted records and buyers verify the correctness of the records without revealing any information about them. In order to achieve the design goals, their model is based on ciphertext-policy attribute-based encryption (CP-ABE) [31] and zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) [32]. However, their model has the weakness that the secret value for decrypting the ciphertext is disclosed on the blockchain at the withdrawal phase.
Li et al. [33] proposed a blockchain-based secure data trading platform by using plaintext checkable encryption (PCE) [34]. Regarding transaction security and data protection, the encrypted data for sale is stored not on a blockchain but on distributed storage to alleviate the storage pressure of the blockchain. Instead, the decryption key of the data is traded on the blockchain platform. Fair exchange on this platform relies on the miners who act as arbitrators for resolving disputes and executing smart contract. However, in the event of a dispute, it causes a problem that the ciphertext and secret key are known to miners who are not trusted on the blockchain network.
The systems proposed by Alsharif et al. [30] and Li et al. [33] are particularly relevant to the proposed system. However, their systems focus only on the seller's access control for data confidentiality, and they involve complex cryptographic operations in smart contracts, which results in high computational costs on the blockchain. In contrast, the proposed system does not involve complex cryptographic operations in on-chain procedures; these operations are processed by each off-chain party. Table 1 briefly shows the features of the proposed system.

Fig. 2 shows the data trading system architecture, which consists of the data owner (or seller), the data requester (or buyer), the arbitrator, the blockchain layer, and the storage layer.
• Data Owners or Sellers ($S = \{S_1, S_2, \ldots, S_m\}$): Each seller $S_i \in S$ advertises and offers its own data for sale through the data trading system. In order to allow only the buyer desired by the seller to access the data, $S_i$ sets the access control policy, which specifies the preferred type of buyer. Then, $S_i$ publishes the trading data encrypted using IB-ME under its access policy. For this purpose, $S_i$ needs to obtain its identity-based encryption key for IB-ME from the arbitrator.
• Data Requesters or Buyers ($B$): A buyer $B_j \in B$ initiates the data purchase by requesting data of interest from the system. At this phase, $B_j$ specifies the policy required of a data seller, and then places an order with a seller $S_i$ that satisfies the policy of $B_j$. In order to obtain the access right to the data owned by $S_i$, $B_j$ must be issued the IB-ME decryption key under its identity by the arbitrator and must pay the price of the data in accordance with the smart contract. Then, $B_j$ obtains the data if and only if the encrypted data given by $S_i$ is associated with the identity of $B_j$, which corresponds to the access policy of $S_i$.
• Arbitrator (A): The arbitrator is an off-chain entity trusted by the trading participants. If a dispute occurs, the arbitrator takes part in off-chain arbitration to resolve the dispute. It is generally assumed that the arbitrator is a neutral third party to help resolve disputes fairly and efficiently. In the proposed system, the arbitrator also acts as a key generator for IB-ME which generates an encryption key for each seller and a decryption key for each buyer, respectively.
• Blockchain Layer: The blockchain layer provides a trusted platform that enforces data trading rules, coordinates the data trading process, and handles payment in digital currency. The blockchain records the result and state of the data trading protocol run between the seller and the buyer by means of transactions. Hence, the blockchain can be viewed as a recorder of evidence that proves whether the seller and the buyer comply with the data trading protocol. The smart contract defines the valid states of the data trading process and implements the transaction logic that moves one state to another. The seller and the buyer interact with each other through the blockchain client application, which invokes the smart contract to perform the agreed steps of the proposed data trading protocol.
• Storage Layer: Because blockchain is not suitable for bulk data storage, external storage services such as cloud may be adopted in the storage layer to host a huge volume of data. External storage is regarded as an untrustworthy entity, so data owners outsource their data to the storage service in the form of an encoded package for confidentiality.
In addition, to clarify the proposed system, we make the following assumptions.
• Digital currency payment is implemented on the blockchain, and each user has a digital wallet address/account with a balance.
• The operations of the blockchain network are generic and commonly understood. Each transaction submitted to the blockchain network contains the digital signature of the transaction issuer, and only confirmed transactions are included in a block appended to the blockchain.
• The underlying blockchain platform is fault-tolerant even in the presence of malicious actors or failed components. Hence, the blockchain can act as a reliable platform and trusted third party.

Threat Model and Design Goals
Sellers and buyers who participate in online data trading are not fully trusted and may attempt to cheat each other for their own benefit. Table 2 classifies the behaviors of the seller and the buyer considered in the data trading system. Each type of honest or dishonest participant is denoted as HS, HB, DS, and DB. A dishonest seller may attempt to charge the buyer for payment without providing the data or correct key, or may not provide the data by denying the receipt of the payment even though the buyer has paid. Moreover, in order to make unfair profits, a dishonest seller may provide wrong data that differs from what the buyer requested. On the other hand, even after taking the data, a dishonest buyer may deny receiving the data and refuse to pay for it. In addition, a dishonest buyer may deliberately ask for unfair compensation by falsely alleging that the data or key provided is incorrect, despite having received the accurate one.
Under the threat models, we take the following design goals into account for the proposed blockchain-based secure and fair data trading system.
• Bilateral authorization and policy matching: For secure and authorized data trading, both the seller and the buyer can specify their policies that the other party must satisfy to trade the data. That is, the data trading between the seller and the buyer is achieved only if their attributes satisfy the policies specified by each other.
• Non-repudiation and fraud prevention: The participants must carry out their responsibilities for data trading, and the transactions cannot be denied later by either of the parties involved in the data trading. Furthermore, malicious behaviors such as providing incorrect data or alleging false compensation must be prevented. In this case, no benefit should be provided to the malicious party.
• Fair exchange: At the end of the data trading protocol, either the seller receives the payment and the buyer receives the purchased data, or neither of them receives anything. In other words, if all participants behave honestly according to the protocol, then they will receive what they want.

Cryptographic Building Blocks
This section briefly outlines the IB-ME [15] and the AONT [16] which serve as cryptographic building blocks of the proposed system.

Identity-Based Matchmaking Encryption
Let $e : G \times G \to G_T$ be a bilinear pairing and $P$ be a generator of $G$, where $G$ and $G_T$ are two groups of prime order $q$. Let $snd$ and $rcv$ denote the target identities (i.e., the access policies) specified by the receiver and by the sender, respectively. IB-ME is constructed as follows.
1) Setup($1^\lambda$): On input the security parameter $1^\lambda$, the setup algorithm chooses two random values $r, s \in Z_q$ and sets $P_0 = P^r$. It outputs the master public key $mpk = (e, G, G_T, q, P, P_0, H_1,$ [...]
Enc: On input an encryption key $ek_\sigma$, a target identity $rcv = \rho$, and a message $m$, this algorithm proceeds as follows: [...]
Dec: On input a decryption key $dk_\rho$, a target identity $snd = \sigma$, and a ciphertext $C$, this algorithm proceeds as follows: [...] 4. If the padding is valid, return $m$. Otherwise, return $\perp$.
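The following is a sketch of why decryption succeeds when both policies match, assuming the pairing-based construction of [15] with $ek_\sigma = H'(\sigma)^s$, $dk_\rho = (H(\rho)^r, H(\rho)^s, H(\rho))$, and ciphertext components $T = P^t$, $U = P^u$ for random $t, u \in Z_q$. The names $H$, $H'$, $T$, $U$, $k_R$, and $k_S$ are notational assumptions of this sketch and may differ from the symbols in $mpk$.

```latex
\begin{align*}
\text{Enc:}\quad k_R &= e\big(H(rcv),\, P_0^{\,u}\big) = e\big(H(rcv), P\big)^{ru}, \\
                 k_S &= e\big(H(rcv),\, T \cdot ek_\sigma\big)
                      = e\big(H(rcv), P\big)^{t} \cdot e\big(H(rcv), H'(\sigma)\big)^{s}, \\[4pt]
\text{Dec:}\quad k_R' &= e\big(H(\rho)^{r},\, U\big) = e\big(H(\rho), P\big)^{ru}, \\
                 k_S' &= e\big(H(\rho)^{s},\, H'(snd)\big) \cdot e\big(H(\rho),\, T\big)
                       = e\big(H(\rho), H'(snd)\big)^{s} \cdot e\big(H(\rho), P\big)^{t}.
\end{align*}
```

Under these assumptions, $k_R' = k_R$ requires the receiver's identity $\rho$ to equal the sender's target $rcv$, and $k_S' = k_S$ additionally requires the receiver's target $snd$ to equal the sender's identity $\sigma$. Only then are the masks on the padded message removed correctly, so the padding check in the last decryption step passes.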

All-or-Nothing Transform
AONT is a randomized transformation that can be reversed, but it is difficult to do so without having knowledge of all the message blocks in the output. It can be used as input to an encryption algorithm. Encode and decode of AONT are constructed as follows.
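The encode/decode idea can be illustrated with a minimal hash-based sketch. This is not necessarily the construction of [16]: it substitutes SHA-256 for the block cipher of the package transform, the function names and the 32-byte stub size are assumptions made for illustration, and it has not been vetted for production use.

```python
import hashlib
import secrets

KEY_LEN = 32  # bytes; equals the SHA-256 digest size (assumption of this sketch)


def _keystream(key: bytes, length: int) -> bytes:
    """Expand `key` into `length` pseudo-random bytes (SHA-256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def aont_encode(data: bytes) -> tuple[bytes, bytes]:
    """Return (stub, pkg): pkg is the masked data, stub hides the random mask key."""
    key = secrets.token_bytes(KEY_LEN)
    pkg = _xor(data, _keystream(key, len(data)))
    # The key is masked with a digest of the whole package, so every byte
    # of pkg is needed to unmask it.
    stub = _xor(key, hashlib.sha256(pkg).digest())
    return stub, pkg


def aont_decode(stub: bytes, pkg: bytes) -> bytes:
    """Invert aont_encode; requires both the stub and the complete package."""
    key = _xor(stub, hashlib.sha256(pkg).digest())
    return _xor(pkg, _keystream(key, len(pkg)))
```

In this sketch the stub is a constant 32 bytes regardless of the data size, which mirrors the split into one small part and one large part.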

Proposed System
The proposed blockchain-based secure data trading system is presented in this section. Fig. 3 shows the state transition of the proposed data trading protocol processed by the smart contract, and Table 3 describes the notations used in the protocol. Once the seller has registered its encoded data to the storage layer, the buyer can request the access right to decrypt the data by invoking the order function of the smart contract. The seller will then offer the requested access right through the blockchain. If there is no order for the registered data or the requested access right is not offered within a predefined expiration time, then the smart contract will cancel the data trading process. When the data is successfully recovered with the purchased access right, the buyer confirms this data trade. Then, the seller receives the payment. However, if any fraudulent behavior occurs by either the seller or the buyer, a dispute resolution process will be initiated.

Suppose that a seller $S_i$ wants to sell data $D$ and a buyer $B_j$ wants to purchase the data owned by $S_i$ through the proposed data trading system. $S_i$ and $B_j$ perform the data trading protocol with the mediation of the arbitrator $A$ as described in the following sections.

Data Trading Protocol
Before describing the detailed protocol, we assume that the smart contract has been deployed on the blockchain layer and that the arbitrator, acting as a key generator, has set up its own master secret key $msk$ and public parameters $mpk$ by running the IB-ME.Setup($1^\lambda$) algorithm. The $mpk$ is publicly known to the system.

Advertisement and Request Phase
The seller $S_i$ advertises a description of the data $D$ for sale together with the policy required of buyers, and the buyer $B_j$ searches for the data of interest and submits a request with the policy required of sellers, as follows.
1) $S_i$ submits the advertisement transaction $TX^{ad}_i := \{desc_D, rcv, price_D\}$, where $desc_D$ is a brief description of the data $D$, $rcv$ is the access policy that a buyer must satisfy, and $price_D$ is the price of the data $D$.
2) If $B_j$ is interested in the data advertised by $TX^{ad}_i$, $B_j$ submits the request transaction $TX^{req}_j := \{TX^{ad}_i.id, snd\}$, where $TX^{ad}_i.id$ is the identifier referenced by the request and $snd$ is the policy specifying the attribute required of the seller.
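The two transactions and the matching condition they imply can be sketched as plain records. The class and field names below are hypothetical, and the equality test only mirrors the policy checks of the protocol; in the proposed system the match is enforced by the arbitrator's key issuance and by IB-ME itself, not by such a function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdTx:
    """Advertisement transaction published by the seller (illustrative fields)."""
    tx_id: str
    desc: str    # brief description of the data
    rcv: str     # seller's policy: attribute the buyer must hold
    price: int   # price of the data in digital currency units


@dataclass(frozen=True)
class ReqTx:
    """Request transaction published by the buyer (illustrative fields)."""
    tx_id: str
    ad_id: str   # identifier of the referenced advertisement
    snd: str     # buyer's policy: attribute the seller must hold


def bilateral_match(ad: AdTx, req: ReqTx, attr_seller: str, attr_buyer: str) -> bool:
    """True only if the request references the ad and both policies are satisfied."""
    return (
        req.ad_id == ad.tx_id
        and attr_seller == req.snd   # source identification by the buyer
        and attr_buyer == ad.rcv     # access control by the seller
    )
```

For example, an advertisement with `rcv="hospital"` and a request with `snd="male-20s"` match only for a seller holding the attribute `"male-20s"` and a buyer holding `"hospital"`.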

Register Phase
When $S_i$ finds the request for its data, $S_i$ first encodes the data using the AONT encoding algorithm and outsources the encoded data to the storage layer. Then, $S_i$ invokes the register function of the smart contract to publish proof of the data to be traded on the system. At this phase, $S_i$ also makes a guarantee deposit that is temporarily locked in the smart contract and confiscated if $S_i$ acts dishonestly.
1) For data $D$, $S_i$ prepares the AONT encoded data as $D_{stub}|D_{pkg} \leftarrow$ AONT.Encode($D$), outsources the $D_{pkg}$ part to the storage layer, and then computes the hash digests $h$, $h_{s|p}$, and $h_p$ of the data, where $h = H(D)$ and $h_p = H(D_{pkg})$.
2) $S_i$ invokes the register procedure of the smart contract by submitting the transaction $TX^{reg}_i := \{TX^{req}_j.id, uri, h, h_{s|p}, h_p, deposit_{S_i}, exp_{reg}\}$. Here, $deposit_{S_i}$ is the digital currency that the seller puts down as a deposit, $uri$ is the link for downloading the data package from the storage, and $h$, $h_{s|p}$, $h_p$ are commitments to the data to be offered later.
3) On input the register call, if the referenced $TX^{req}_j$ exists and $S_i$'s account balance is sufficient for $deposit_{S_i}$, then the smart contract sets the state as $state_{ij}$ = "registered" and publishes the register transaction.
Note that $exp_{reg}$ specified in $TX^{reg}_i$ is the expiration time by which the buyer must place an order for the registered data. If the buyer does not place an order within $exp_{reg}$, the smart contract discards this data trade and returns $deposit_{S_i}$ to $S_i$.

Order Phase
Once the encoded data is registered, $B_j$ can download $D_{pkg}$ from the storage layer. However, due to the property of AONT, $D_{pkg}$ alone is not sufficient to recover the actual data $D$, so $B_j$ needs to order the remaining part $D_{stub}$ from $S_i$ through the smart contract.
1) $B_j$ downloads $D_{pkg}$ via $TX^{reg}_i.uri$ and computes its digest $h'_p = H(D_{pkg})$.
2) If the digest matches the registered commitment (i.e., $h'_p = TX^{reg}_i.h_p$ holds), $B_j$ invokes the order procedure by submitting $TX^{ord}_j := \{TX^{reg}_i.id, pay_{B_j}, deposit_{B_j}, exp_{ord}\}$, where $pay_{B_j}$ is the digital currency that pre-pays the price of the data but stays locked until the end of the protocol, and $deposit_{B_j}$ is $B_j$'s guarantee deposit.
3) On input the order call, if the current protocol state is $state_{ij}$ = "registered", $TX^{reg}_i.exp_{reg}$ has not expired, and $B_j$'s account balance is sufficient for $pay_{B_j} + deposit_{B_j}$, then the smart contract sets the state as $state_{ij}$ = "ordered" and publishes the order transaction. Otherwise, the order of $B_j$ is discarded and the state is set as $state_{ij}$ = "canceled".
Here, $exp_{ord}$ is the expiration time by which the seller must offer the remaining data part. If $S_i$ gives no offer within $exp_{ord}$, this trade is canceled and both $pay_{B_j}$ and $deposit_{B_j}$ are returned to $B_j$.

Offer Phase
When $S_i$ is notified that the trading state is "ordered", $S_i$ publishes $D_{stub}$ on the blockchain through the smart contract so that $B_j$ can purchase the data reliably. To provide the data in a secure manner, at this phase, $S_i$ encrypts $D_{stub}$ using IB-ME under the encryption key issued by the arbitrator and the target identity $rcv$ representing the access policy for the buyer.
1) The arbitrator issues $S_i$ the encryption key $ek_{S_i} \leftarrow$ IB-ME.SKGen($msk$, $attr_{S_i}$) under $attr_{S_i}$, which identifies $S_i$'s attribute, after checking that $S_i$'s attribute satisfies the buyer's policy $snd$ (i.e., $attr_{S_i} = snd$). Key generation is an off-chain process, and we assume an out-of-band secure channel.
2) For the remaining data part $D_{stub}$, $S_i$ generates the encrypted data as $C_{stub} \leftarrow$ IB-ME.Enc($mpk$, $ek_{S_i}$, $rcv$, $D_{stub}$).
3) $S_i$ offers the encrypted data to $B_j$ through the blockchain by invoking the offer procedure with $TX^{offer}_i := \{TX^{ord}_j.id, C_{stub}, exp_{offer}\}$.
4) If the current state is $state_{ij}$ = "ordered" and $TX^{ord}_j.exp_{ord}$ has not expired, the smart contract sets the state as $state_{ij}$ = "offered" and publishes the offer transaction.

Confirmation Phase
When the trading state is set as $state_{ij}$ = "offered" in response to $B_j$'s order, $B_j$ retrieves the encrypted data $C_{stub}$ from the offer transaction $TX^{offer}_i$ and reconstructs the AONT encoded data $D_{stub}|D_{pkg}$ after decrypting $C_{stub}$ using IB-ME under the decryption key issued by the arbitrator and the target identity $snd$. If the reconstructed data is valid and the actual data $D$ is successfully decoded, $B_j$ confirms this data trade.
1) $B_j$ retrieves $C_{stub}$ from $TX^{offer}_i$ offered by the seller.
2) The arbitrator issues $B_j$ the decryption key $dk_{B_j} \leftarrow$ IB-ME.RKGen($mpk$, $msk$, $attr_{B_j}$) under $attr_{B_j}$, which identifies $B_j$'s attribute, after checking that $attr_{B_j}$ satisfies the seller's policy $rcv$ (i.e., $attr_{B_j} = rcv$).
3) $B_j$ decrypts $C_{stub}$ to get the remaining data part as $D_{stub} \leftarrow$ IB-ME.Dec($mpk$, $dk_{B_j}$, $snd$, $C_{stub}$) and combines it with $D_{pkg}$ downloaded from the storage layer to construct the full AONT encoded data $D_{stub}|D_{pkg}$, from which $D$ is decoded. $B_j$ also computes the digests $h'_{s|p}$ and $h' = H(D)$, where $h'_p = H(D_{pkg})$ is the result computed at the order phase.
4) If the digests match the registered commitments (i.e., $h'_{s|p} = TX^{reg}_i.h_{s|p}$ and $h' = TX^{reg}_i.h$ hold), then $B_j$ invokes the confirmation procedure of the smart contract by sending $TX^{conf} := \{TX^{offer}_i.id,$ "ok"$\}$ to finally confirm this data trade on the blockchain.
5) Upon receiving the confirmation call from the buyer before $TX^{offer}_i.exp_{offer}$ times out in the state $state_{ij}$ = "offered", the smart contract transfers $B_j$'s payment $TX^{ord}_j.pay_{B_j}$ to $S_i$'s account, and returns $TX^{reg}_i.deposit_{S_i}$ and $TX^{ord}_j.deposit_{B_j}$ to $S_i$ and $B_j$, respectively. The state is set as $state_{ij}$ = "completed" and the data trading between $S_i$ and $B_j$ is completed normally.
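The register, order, offer, and confirmation steps form a small state machine with locked funds. The sketch below models that flow off-chain in Python for illustration only; it is not the deployed smart contract. The class name, the "requested" start state, integer time ticks, and the choice to leave the seller's deposit locked when an offer times out (a case the protocol text does not spell out) are all assumptions.

```python
class Trade:
    """Toy model of one trade between a seller and a buyer (illustrative only)."""

    def __init__(self, balances, seller, buyer, price):
        self.bal = dict(balances)       # account balances in digital currency
        self.seller, self.buyer, self.price = seller, buyer, price
        self.state = "requested"        # hypothetical label for the pre-register state
        self.locked = {}                # funds held by the contract
        self.deadline = None            # current expiration time (integer ticks)
        self.c_stub = None              # encrypted stub published at the offer phase

    def _lock(self, who, key, amount):
        self.bal[who] -= amount
        self.locked[key] = amount

    def _refund(self, *keys):
        """Return the named locked funds to their original owners."""
        for key in keys:
            who = self.seller if key == "deposit_S" else self.buyer
            self.bal[who] += self.locked.pop(key, 0)

    def register(self, deposit_s, exp_reg):
        if self.state != "requested" or self.bal[self.seller] < deposit_s:
            return False
        self._lock(self.seller, "deposit_S", deposit_s)
        self.state, self.deadline = "registered", exp_reg
        return True

    def order(self, deposit_b, exp_ord, now):
        if self.state != "registered":
            return False
        if now > self.deadline or self.bal[self.buyer] < self.price + deposit_b:
            self._refund("deposit_S")   # order discarded, seller deposit returned
            self.state = "canceled"
            return False
        self._lock(self.buyer, "pay_B", self.price)
        self._lock(self.buyer, "deposit_B", deposit_b)
        self.state, self.deadline = "ordered", exp_ord
        return True

    def offer(self, c_stub, exp_offer, now):
        if self.state != "ordered":
            return False
        if now > self.deadline:
            # No offer in time: buyer funds go back; the handling of the
            # seller's deposit is left open here (assumption of this sketch).
            self._refund("pay_B", "deposit_B")
            self.state = "canceled"
            return False
        self.c_stub = c_stub
        self.state, self.deadline = "offered", exp_offer
        return True

    def confirm(self, now):
        if self.state != "offered" or now > self.deadline:
            return False
        self.bal[self.seller] += self.locked.pop("pay_B")  # settle the payment
        self._refund("deposit_S", "deposit_B")              # return both deposits
        self.state = "completed"
        return True
```

With a seller balance of 10, a buyer balance of 100, a price of 50, and deposits of 5 each, a fully honest run ends in the "completed" state with balances 60 and 50.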

Arbitration Protocols
When the seller and the buyer honestly follow the data trading protocol as described above, they can get the payment and the data, respectively. However, one party may repudiate its trading behavior or have complaints against the repudiation of the other party. Hence, an arbitration protocol to resolve such problematic situations is needed. The proposed arbitration protocols are divided into on-chain arbitration and off-chain arbitration. The former is carried out by the smart contract on the basis of the recorded proof on the blockchain, and the latter is carried out by the arbitrator when any party raises an objection to the on-chain arbitration result.

On-Chain Arbitration
Both the buyer and the seller can initiate the on-chain arbitration procedure of the smart contract. From the seller's viewpoint, when $S_i$ finds that the buyer $B_j$ did not confirm the data trade even though $B_j$ had taken the data, $S_i$ invokes on-chain arbitration to get $B_j$'s payment. On the other hand, at the confirmation phase, when $B_j$ finds that the digests $h'_{s|p}$ and $h' = H(D)$ computed by itself are not the same as $h_{s|p}$ and $h$ committed to the blockchain, $B_j$ invokes on-chain arbitration by giving $h'_s$ and $h'$ as the evidence. On-chain arbitration is processed as follows.
1) For $S_i$'s on-chain arbitration call, if $exp_{offer}$ has expired and the state $state_{ij}$ is not "completed" but still "offered", set result = "buyer_not_confirmed", which means that the buyer did not confirm receipt of the data and the payment has not yet been settled to the seller.
After $TX^{claim}$ is published on the blockchain, if neither party objects to the result or initiates off-chain arbitration before $exp_{claim}$ has passed, the smart contract finally settles the payment or imposes a penalty depending on the result: it transfers $TX^{ord}_j.pay_{B_j}$ to $S_i$'s account if the result is "buyer_not_confirmed", or returns $TX^{ord}_j.pay_{B_j}$ to $B_j$'s account and confiscates $S_i$'s deposit $TX^{reg}_i.deposit_{S_i}$ as the penalty if the result is "seller_data_not_correct".
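The seller-side claim rule and the final settlement can be sketched as two small functions. The function names, the dictionary layout for locked funds, integer time ticks, and the "confiscated" bucket that holds a seized deposit are assumptions for illustration; the buyer-side evidence check is not sketched here.

```python
def seller_claim(state, now, exp_offer):
    """Result of the seller's on-chain arbitration call, or None if it does not apply."""
    if state == "offered" and now > exp_offer:
        return "buyer_not_confirmed"
    return None


def settle_claim(result, locked, balances, seller, buyer, now, exp_claim, disputed=False):
    """Apply an on-chain arbitration result once exp_claim has passed without objection.

    `locked` maps "pay_B", "deposit_S", "deposit_B" to amounts held by the contract.
    Returns True if a settlement was applied.
    """
    if disputed or now <= exp_claim:
        return False                      # still open to off-chain arbitration
    if result == "buyer_not_confirmed":
        balances[seller] += locked.pop("pay_B")
    elif result == "seller_data_not_correct":
        balances[buyer] += locked.pop("pay_B")
        # Where a confiscated deposit goes is not modeled; it is parked in a
        # hypothetical "confiscated" bucket.
        locked["confiscated"] = locked.pop("deposit_S")
    else:
        raise ValueError(f"unknown result: {result}")
    return True
```

Only the transfers named in the settlement rule are performed; any funds the rule does not mention stay in `locked`.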

Off-Chain Arbitration
Unfortunately, a malicious seller may give wrong data and a malicious buyer may present false evidence, which can lead the on-chain arbitration to a wrong decision. By claiming "seller_wrong_data", the buyer $B_j$ may attempt to cheat the on-chain arbitration by giving intentionally forged $h'_s$ and $h'$ so as to avoid paying even though $B_j$ received the correct data. It is also possible that the seller $S_i$ cheats the verification by presenting wrong or useless data $D$ and recording hash values derived from that $D$ on the blockchain from the beginning. In that case, the hash-based verification of the data passes as valid even though the buyer receives data different from what the buyer wants. On-chain arbitration is not sufficient to handle such behaviors, so the trading parties must rely on the judgment of the arbitrator.
Therefore, in such cases, $S_i$ or $B_j$ initiates off-chain arbitration by sending $TX^{dispute} := \{TX^{offer}.id, TX^{claim}.id\}$ to the blockchain, upon which the smart contract sets the state as $state_{ij}$ = "disputed", in order to ask the arbitrator to resolve the dispute. Note that, in the proposed system, the arbitrator also acts as the key generator for IB-ME. We make use of the key escrow property inherited from identity-based cryptography so that the arbitrator can examine the traded data and the proofs recorded on the blockchain on behalf of $S_i$ and $B_j$. The arbitrator deals with the off-chain arbitration as follows.
1) First, collect $D_{pkg}$ from the storage and $C_{stub}$, $h$, $h_{s|p}$, $h_p$, $h'_s$, $h'$ from the blockchain.
2) Reconstruct the AONT encoded data $D_{stub}|D_{pkg}$ after decrypting $C_{stub}$ using $B_j$'s escrowed IB-ME decryption key $dk_{B_j}$. Then, recover the data $D$ by decoding $D_{stub}|D_{pkg}$, and compute the corresponding digests $h^A_p$, $h^A_{s|p}$, $h^A_s$, and $h^A$.
3) If $(h^A_p, h^A_{s|p}, h^A)$ are the same as $S_i$'s $(h_p, h_{s|p}, h)$ but $(h^A_s, h^A)$ are different from $B_j$'s $(h'_s, h')$, then "seller_wrong_data" of the on-chain arbitration is regarded as resulting from $B_j$'s forged evidence. So, judge that $B_j$ is malicious and set judgment = "buyer_malicious".
4) Even though the hash-based verification of data correctness is valid, further check the usefulness of the data. If the data $D$ does not meet the description $desc_D$ in $TX^{ad}_i$, then judge that $S_i$ offered nonsense data and set judgment = "seller_malicious".
5) Invoke the resolve procedure of the smart contract by submitting $TX^{resolve} := \{TX^{dispute}.id,$ judgment$\}$.
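The decision logic of steps 3) and 4) can be sketched as a pure function over the three sets of digests. The dictionary keys, the function name, and the boolean `data_matches_desc` (standing in for the arbitrator's manual inspection of the data against its description) are assumptions of this sketch. Cases the steps do not cover return `None` rather than a guessed verdict.

```python
def judge(seller_commit, buyer_evidence, arbitrator, data_matches_desc):
    """Return "buyer_malicious", "seller_malicious", or None if no rule applies.

    seller_commit  : digests committed by the seller, keys "h_p", "h_sp", "h"
    buyer_evidence : digests submitted by the buyer,  keys "h_s", "h"
    arbitrator     : digests recomputed by the arbitrator, all four keys
    """
    judgment = None
    seller_consistent = all(
        arbitrator[k] == seller_commit[k] for k in ("h_p", "h_sp", "h")
    )
    buyer_consistent = all(
        arbitrator[k] == buyer_evidence[k] for k in ("h_s", "h")
    )
    # Step 3: commitments check out but the buyer's evidence does not.
    if seller_consistent and not buyer_consistent:
        judgment = "buyer_malicious"
    # Step 4: hashes are valid, yet the data does not meet its description.
    # Applied after step 3, so it takes precedence when both conditions hold.
    if seller_consistent and not data_matches_desc:
        judgment = "seller_malicious"
    return judgment
```

Running the steps in order means a useless-data finding overrides a forged-evidence finding; this ordering follows the step sequence and is an interpretation, not a stated rule.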
It is worth noting that, in the blockchain platform, the data D is an external item and trading the data is a real-world event. So, it is not possible for the on-chain procedure to verify the truth of data D on its own. Therefore, in step (4), the arbitrator can judge whether the data D actually contains useful content corresponding to the description desc D or meaningless data after looking into the data D.
Upon receiving the resolve call from the arbitrator, the smart contract processes and records the final arbitration result. If the judgment is "seller_malicious" then S i 's deposit TX reg i .deposit S i is confiscated as a penalty. On the other hand, if the judgment is "buyer_malicious" then B j 's deposit TX ord j .deposit B j is confiscated as a penalty and TX ord j .pay B j is paid to S i . The resolve transaction is published on the blockchain and the protocol is completed.
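The settlement performed by the resolve procedure can be illustrated with a small off-chain model. This is a hedged sketch, not the contract code: the `Trade` record, the `resolve` function, and the "resolved" state name are hypothetical, and refunding the honest party's deposit (and the buyer's payment when the seller is malicious) is an assumption consistent with, but not spelled out in, the description above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Trade:
    """Minimal model of the funds locked for one trade (hypothetical)."""
    seller: str
    buyer: str
    deposit_seller: int
    deposit_buyer: int
    pay_buyer: int
    state: str = "disputed"
    payouts: Dict[str, int] = field(default_factory=dict)
    confiscated: int = 0


def resolve(trade: Trade, judgment: str) -> Trade:
    """Apply the arbitrator's judgment to the locked funds."""
    if trade.state != "disputed":
        raise ValueError("resolve is only valid for a disputed trade")

    if judgment == "seller_malicious":
        # Seller's deposit is confiscated as a penalty.
        trade.confiscated += trade.deposit_seller
        # Assumption: the buyer gets its deposit and prepayment back.
        trade.payouts[trade.buyer] = trade.deposit_buyer + trade.pay_buyer
    elif judgment == "buyer_malicious":
        # Buyer's deposit is confiscated and the payment goes to the seller.
        trade.confiscated += trade.deposit_buyer
        # Assumption: the seller's own deposit is returned as well.
        trade.payouts[trade.seller] = trade.pay_buyer + trade.deposit_seller
    else:
        raise ValueError(f"unknown judgment: {judgment}")

    trade.state = "resolved"  # hypothetical name for the final state
    return trade
```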
As presented so far, the proposed system is built on a blockchain with smart contracts and makes use of the IB-ME and AONT schemes. The proposed system satisfies the design goals mentioned in Section 2, assuming the security and reliability of the underlying cryptographic primitives and blockchain.

Bilateral Authorization and Policy Matching
When the data is provided through external storage rather than directly by the data owner, the seller needs to specify a policy restricting access to its own data, and the buyer needs to specify a condition on the data provider. In the proposed data trading protocol, the seller's access policy rcv, required of buyers, and the buyer's policy snd, required of sellers, are specified in TX ad and TX req at the advertisement and request phases, respectively. The seller's data D is first transformed into the AONT encoded data D stub |D pkg , and only the D pkg part is provided by way of the storage layer; it is hard to recover the data without knowing the entire encoded data. Hence, to get the actual data, the buyer has to purchase the remaining part D stub , which the seller provides in encrypted form.
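To make the all-or-nothing property concrete, the following toy encoder splits data into a short stub and a package of the same length as the data. It is a minimal sketch and not the AONT scheme used in the paper: SHA-256 in counter mode serves as a stand-in keystream, and the function names `aont_encode` and `aont_decode` are hypothetical. The random key is masked with a hash of the whole package, so the package alone reveals nothing about the key, and any change to the package yields a different key on decoding.

```python
import hashlib
import os
from typing import Tuple


def _keystream(key: bytes, length: int) -> bytes:
    """Expand key into `length` pseudo-random bytes (SHA-256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def aont_encode(data: bytes) -> Tuple[bytes, bytes]:
    """Return (d_stub, d_pkg): a 32-byte stub and a package as long as data."""
    key = os.urandom(32)
    d_pkg = _xor(data, _keystream(key, len(data)))
    # The key is hidden behind a digest of the *entire* package.
    d_stub = _xor(key, hashlib.sha256(d_pkg).digest())
    return d_stub, d_pkg


def aont_decode(d_stub: bytes, d_pkg: bytes) -> bytes:
    """Recover data; needs the stub and every byte of the package."""
    key = _xor(d_stub, hashlib.sha256(d_pkg).digest())
    return _xor(d_pkg, _keystream(key, len(d_pkg)))
```

In this sketch the stub is only 32 bytes regardless of the data size, which mirrors the idea of publishing the bulk of the encoding through the storage layer while selling only a small remaining part.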
At this phase, the data D stub is protected by using IB-ME under the policies snd and rcv. To perform data trading, the seller and the buyer must be issued an encryption key and a decryption key, respectively, by the arbitrator, who checks that the attributes of each party correspond to the policy of the other. During the protocol, the encrypted data C stub ← IB-ME.Enc (mpk, ek S i , rcv, D stub ) can be decrypted only by the buyer who has the decryption key dk B j derived from its attribute (that is, attr B j = rcv), as D stub ← IB-ME.Dec (mpk, dk B j , snd, C stub ). In addition, if the encrypted data is decrypted correctly, the buyer can be sure that the data was offered by the seller who has the encryption key ek S i derived from its attribute (that is, attr S i = snd). Therefore, the data trading is achieved if and only if the seller's policy for authorizing access to the data and the buyer's policy for source identification are both matched.
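The bilateral matching condition can be illustrated with a mock that performs no cryptography at all. It only models when decryption succeeds: the receiver's attribute must equal the sender's rcv policy, and the sender's attribute must equal the receiver's snd policy. All class and function names here are hypothetical, and the payload is stored in the clear, so this sketch offers no confidentiality and is not an IB-ME implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EncKey:
    """Stands in for ek, issued for the sender's attribute."""
    attr: str


@dataclass(frozen=True)
class DecKey:
    """Stands in for dk, issued for the receiver's attribute."""
    attr: str


@dataclass(frozen=True)
class MockCiphertext:
    sender_attr: str   # bound to the ciphertext via ek
    rcv_policy: str    # receiver policy chosen by the sender
    payload: bytes     # kept in the clear: this mock hides nothing


def mock_enc(ek: EncKey, rcv: str, message: bytes) -> MockCiphertext:
    """Model of Enc(mpk, ek, rcv, m)."""
    return MockCiphertext(ek.attr, rcv, message)


def mock_dec(dk: DecKey, snd: str, ct: MockCiphertext) -> Optional[bytes]:
    """Model of Dec(mpk, dk, snd, c): succeeds only on a two-sided match."""
    receiver_ok = dk.attr == ct.rcv_policy   # seller's access policy holds
    sender_ok = ct.sender_attr == snd        # buyer's source policy holds
    return ct.payload if receiver_ok and sender_ok else None
```

A real scheme enforces both checks inside the decryption algorithm itself, so a mismatch on either side yields no information about the plaintext.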

Non-Repudiation and Fraud Prevention
If the seller and the buyer honestly follow the protocol, the data trading completes satisfactorily. However, the seller and the buyer do not fully trust each other, and one of them may behave maliciously as mentioned in Section 2. When the seller or the buyer attempts to cheat or to repudiate its trading behavior, the system must be able to detect and prevent such attempts.
Data trading transactions between seller and buyer are processed by means of the smart contract executed in compliance with the pre-defined rules, and are recorded on the tamper-resistant blockchain ledger. Transactions recorded on the blockchain at each protocol phase can be regarded as proof of the data trading between the seller and the buyer, which enables non-repudiation, and the previous protocol step must be processed before proceeding to the next step.
In the proposed system, in order to acquire the whole data D, the buyer must obtain D stub from the seller at the offer phase after pre-paying the cost pay B j of the data to the smart contract at the order phase. That is, the trading proceeds according to the "ordered" → "offered" state transition. Assuming the secrecy of the IB-ME and AONT schemes, the buyer can hardly recover the data D without D stub , as discussed above. Therefore, the system can prevent DB1 type behavior because the buyer has no choice but to prepay the cost in order to recover the complete data.
The system can also prevent both DS1 type and DS2 type behavior of the seller even though the buyer prepays the cost. The prepaid pay B j is locked by the smart contract until the buyer confirms the trade after checking the correctness of the received data. After the buyer's order transaction, if no offer is given before exp ord expires, pay B j is returned to the buyer by the smart contract and the dishonest seller gains nothing. Furthermore, at the confirmation phase, if the buyer received incorrect data whose hashes are not the same as the hashes recorded on the blockchain, the buyer can claim compensation for the incorrect data by invoking on-chain arbitration. Due to the collision resistance of the cryptographic hash function, it is hard for the seller to forge different data with the same hashes ( , s|p , p ) as those recorded on the blockchain. Although a DS2 type seller may record both the wrong data and its hashes on the blockchain in the first place to defraud the on-chain arbitration, the wrong data cannot evade the off-chain examination by the arbitrator. So, DS2 type fraud of the seller can be handled and prevented.
Another concern, from the seller's perspective, is cheating by a DB2 type malicious buyer who falsely claims compensation. A malicious buyer may submit intentionally forged evidence ( s , ) that does not match the true values, for the purpose of falsely leading the on-chain arbitration to "seller_data_not_correct". To cope with such a case, in the proposed system, the settlement of the on-chain arbitration result is suspended for a while so that the seller can ask the arbitrator to determine which party committed fraud by invoking off-chain arbitration. Then, the arbitrator collects and examines the data and the proofs, ( P , s|p , ) recorded by the seller and ( s , ) by the buyer. DB2 type fraud of the buyer can be detected, because the hashes computed by the arbitrator ( A s , A ) would be consistent with the seller's proofs but not with the buyer's.

Fair Exchange
Once again, when the seller and the buyer honestly follow the data trading protocol, the data trading completes successfully. The buyer indeed gets the data D, and the seller gets the payment pay B j , following the state transition "registered" → "ordered" → "offered" → "completed". However, when the trading is canceled in the middle of the protocol run (i.e., ij = 'canceled'), neither the buyer's payment nor the seller's data is handed over to the other party, so both of them get nothing. In addition, due to the non-repudiation and dispute resolution functionality described above, it is difficult for them to obtain unfair benefits by cheating each other, and penalties are even deducted if they perform the data trading protocol dishonestly.
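The ordering constraint on protocol phases can be modeled as a small state machine. This is a sketch under assumptions: the happy-path sequence is taken from the text, whereas the exact points at which 'canceled' and 'disputed' may be entered, and the "resolved" state following a dispute, are hypothetical choices made here for illustration.

```python
# Allowed transitions. Only the happy path
# registered -> ordered -> offered -> completed comes from the text;
# the remaining edges are assumptions for illustration.
ALLOWED = {
    "registered": {"ordered", "canceled"},
    "ordered": {"offered", "canceled"},     # e.g. no offer before expiry
    "offered": {"completed", "disputed"},
    "disputed": {"resolved"},               # hypothetical final state
}


class TradeState:
    """Tracks the state of one trade and rejects out-of-order steps."""

    def __init__(self) -> None:
        self.state = "registered"

    def advance(self, new_state: str) -> None:
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
```

Rejecting any transition that is not listed captures the requirement that each protocol step must be processed before the next one can begin.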
Each party makes a guarantee deposit, deposit S i and deposit B j , which are locked in the smart contract until the protocol is completed. If no dispute occurs, the smart contract unlocks the deposits and returns them to each honest participant. However, if a dispute occurs, the dishonest party forfeits its deposit as a penalty, depending on the on-chain or the off-chain arbitration result. Consequently, the seller and the buyer lose their guarantee money if they act maliciously during the trading protocol, which encourages both of them to perform the data trading honestly. Thus, the proposed system guarantees fairness in that either both parties obtain what they want or neither gains any benefit.

Performance Evaluation
The basic design principle of the proposed system is to keep complex cryptographic operations out of the smart contract as far as possible, so that the on-chain procedures remain lightweight. Although blockchain platforms such as Ethereum allow smart contracts to implement complicated cryptographic algorithms with big-number arithmetic, the more complex the operations placed on the contract, the more time and cost are spent. Therefore, the on-chain procedure runs only basic, simple operations, while the more complex cryptographic operations are processed off-chain by each local entity, whose results are given to the on-chain procedure as input. Table 4 shows the cryptographic overhead processed by each entity during the data trading protocol. In the system, the most time-consuming cryptographic scheme is IB-ME. To measure the overhead of IB-ME operations as shown in Table 5, we used the benchmark results of the Miracl cryptography library [35] on an Intel Core i7 3 GHz with a supersingular curve over a 512-bit base field (i.e., |G| = 512 bits). This result shows that each entity can process the operations efficiently.
In the proposed system, the interaction between the seller and the buyer is handled by the smart contract, whose procedures are triggered by the transactions submitted by the seller and the buyer. We do not restrict the underlying blockchain platform to Ethereum, but we estimate the storage overhead and the gas units consumed in Ethereum to show the cost of the proposed protocols. According to [36], executing any transaction costs a fixed initial fee of 21000 gas units plus an input data fee. The input data fee is 4 gas units per zero-valued byte of data and 16 gas units per nonzero-valued byte of data. So, the gas costs vary depending on the input data values. Table 6 shows the data fields and sizes used in the proposed system.
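The fee rule above translates directly into a short calculation. The sketch below covers only the fixed fee and the input data fee, not the code execution cost of the contract, and the helper names `intrinsic_gas` and `gas_cost_usd` are hypothetical. The gas price and the Ether price are caller-supplied parameters rather than fixed values.

```python
BASE_TX_GAS = 21_000          # fixed initial fee per transaction
GAS_PER_ZERO_BYTE = 4         # input data fee, zero-valued byte
GAS_PER_NONZERO_BYTE = 16     # input data fee, nonzero-valued byte
GWEI_PER_ETHER = 10**9


def intrinsic_gas(input_data: bytes) -> int:
    """Fixed fee plus input data fee; excludes code execution costs."""
    zeros = input_data.count(0)
    nonzeros = len(input_data) - zeros
    return (BASE_TX_GAS
            + GAS_PER_ZERO_BYTE * zeros
            + GAS_PER_NONZERO_BYTE * nonzeros)


def gas_cost_usd(gas_units: int, gas_price_gwei: float,
                 usd_per_ether: float) -> float:
    """Convert a gas amount into US dollars at the given prices."""
    return gas_units * gas_price_gwei * usd_per_ether / GWEI_PER_ETHER
```

For example, at the gas price of 19 Gwei and the Ether price of $1660 used for the cost estimate in this section, `gas_cost_usd(1_300_000, 19, 1660)` comes to about $41.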
We evaluated the performance of the on-chain procedures in terms of storage overhead, gas consumption, and economic cost. We implemented the smart contract by using Solidity 0.8.7 with Remix IDE and Ethereum test network, then compared the costs of the proposed system with Alsarif et al.'s [30] and Li et al.'s [33]. The storage and the gas costs of Alsarif et al.'s and Li et al.'s are estimated by using the quantities presented in their work. Strictly speaking, it may not be appropriate to directly compare the measurements because each experiment was done separately in different environments by the authors. Nevertheless, we intend to show that the proposed system can be as efficient as others or better.
In Table 7, the gas costs are total gas units including transaction costs and code execution costs. As mentioned before, heavy cryptographic operations are not included in the proposed on-chain procedures, while digital signature verification and PCE check are included in Alsarif et al.'s and Li et al.'s. Hence, the proposed system consumes less gas than the others to carry out data trading even though it takes more input data. Furthermore, to show the economic expenses, we also converted the gas cost into US dollars. At the time of writing this paper, the gas price is on average 19 Gwei for 1 gas unit, 1 Gwei is 1 × 10^-9 Ether, and 1 Ether is $1660. Therefore, the cost of data trading is about 13 × 10^5 gas ($41.002) in the normal case and about 17 × 10^5 gas ($53.618) when disputed.

Conclusion
In today's data-driven economy, the vast amount of data gathered by various IoT devices is regarded as a valuable asset, and this trend demands a data trading platform to sell and buy data. The emergence of the blockchain promotes the development of a decentralized, trustworthy data trading platform. Therefore, in this paper, we presented a secure and fair data trading system by taking advantage of the blockchain integrated with smart contracts and matchmaking encryption. The proposed system enables bilateral authorization by making use of the security features of IB-ME, so that access control by the seller and source identification by the buyer are guaranteed at the same time. Moreover, we designed a fair data trading protocol that incorporates on-chain and off-chain arbitration, leveraging the smart contract to make the parties carry out the protocol honestly. In addition, we evaluated the off-chain computation overhead and the on-chain costs of the proposed protocol. In comparison with existing systems relevant to our work, the proposed protocol makes it possible to implement a cost-efficient data trading system. Even though we assumed a single arbitrator, it is more desirable to adopt a decentralized arbitration system with multiple arbitrators to increase the reliability of dispute resolution. Development of a decentralized arbitration platform that allows for fair and transparent dispute resolution remains future work.