Bee RFCs

Bee RFC Book

This process is modelled after the approach taken by the Rust programming language; see the Rust RFC repository for more information. Also see maidsafe's RFC process for another project in the crypto space. Our approach is adapted from these.

To make more substantial changes to Bee, we ask for these to go through a more organized design process --- an RFC (request for comments) process. The goal is to organize work between the different developers affiliated with the IOTA Foundation and the wider open source community. We want to vet ideas early on, get and give feedback, and only start the implementation once the biggest questions are taken care of.

What is substantial and when to follow this process

You need to follow this process if you intend to make substantial changes to Bee or any of its constituent subcrates.

  • Anything that constitutes a breaking change as understood in the context of semantic versioning.
  • Any semantic or syntactic change to the existing algorithms that is not a bug fix.
  • Any proposed additional functionality and feature.
  • Anything that reduces interoperability (e.g. changes to public interfaces, the wire protocol, or data serialization).

Some changes do not require an RFC:

  • Rephrasing, reorganizing, refactoring, or otherwise "changing shape does not change meaning".
  • Additions that strictly improve objective, numerical quality criteria (warning removal, speedup, better platform coverage, more parallelism, trap more errors, etc.)
  • Internal additions, i.e. additions that are invisible to users of the public API, and which probably will be only noticed by developers of Bee.

If you submit a pull request to implement a new feature without going through the RFC process, it may be closed with a polite request to submit an RFC first.

The workflow of the RFC process

With both Rust and maidsafe being significantly larger projects, not all the steps below might be relevant to Bee's RFC process (for example, at the moment there are no dedicated subteams). However, to instill good open source governance, we attempt to follow this process from the start.

In short, to get a major feature added to Bee, one must first get the RFC merged into the RFC repository as a markdown file. At that point the RFC is "active" and may be implemented with the goal of eventual inclusion into Bee.

  • Fork the RFC repository
  • Copy 0000-template.md to text/0000-my-feature.md (where "my-feature" is descriptive; don't assign an RFC number yet; extra documents such as graphics or diagrams go into the new folder).
  • Fill in the RFC. Put care into the details: RFCs that do not present convincing motivation, demonstrate lack of understanding of the design's impact, or are disingenuous about the drawbacks or alternatives tend to be poorly-received.
  • Submit a pull request. As a pull request the RFC will receive design feedback from the larger community, and the author should be prepared to revise it in response.
  • Each pull request will be labeled with the most relevant sub-team, which will lead to its being triaged by that team in a future meeting and assigned to a member of the subteam.
  • Build consensus and integrate feedback. RFCs that have broad support are much more likely to make progress than those that don't receive any comments. Feel free to reach out to the RFC assignee in particular to get help identifying stakeholders and obstacles.
  • The sub-team will discuss the RFC pull request, as much as possible in the comment thread of the pull request itself. Offline discussion will be summarized on the pull request comment thread.
  • RFCs rarely go through this process unchanged, especially as alternatives and drawbacks are shown. You can make edits, big and small, to the RFC to clarify or change the design, but make changes as new commits to the pull request, and leave a comment on the pull request explaining your changes. Specifically, do not squash or rebase commits after they are visible on the pull request.
  • At some point, a member of the subteam will propose a "motion for final comment period" (FCP), along with a disposition for the RFC (merge, close, or postpone).
    • This step is taken when enough of the tradeoffs have been discussed that the subteam is in a position to make a decision. That does not require consensus amongst all participants in the RFC thread (which is usually impossible). However, the argument supporting the disposition on the RFC needs to have already been clearly articulated, and there should not be a strong consensus against that position outside of the subteam. Subteam members use their best judgment in taking this step, and the FCP itself ensures there is ample time and notification for stakeholders to push back if it is made prematurely.
    • For RFCs with lengthy discussion, the motion to FCP is usually preceded by a summary comment trying to lay out the current state of the discussion and major tradeoffs/points of disagreement.
    • Before actually entering FCP, all members of the subteam must sign off; this is often the point at which many subteam members first review the RFC in full depth.
  • The FCP lasts ten calendar days, so that it is open for at least five business days. It is also advertised widely, e.g. on Discord or in a blog post. This way all stakeholders have a chance to lodge any final objections before a decision is reached.
  • In most cases, the FCP period is quiet, and the RFC is either merged or closed. However, sometimes substantial new arguments or ideas are raised, the FCP is canceled, and the RFC goes back into development mode.

The RFC life-cycle

Once an RFC becomes active then authors may implement it and submit the feature as a pull request to the repo. Being "active" is not a rubber stamp and in particular still does not mean the feature will ultimately be merged. It does mean that in principle all the major stakeholders have agreed to the feature and are amenable to merging it.

Furthermore, the fact that a given RFC has been accepted and is "active" implies nothing about what priority is assigned to its implementation, nor does it imply anything about whether a developer has been assigned the task of implementing the feature. While it is not necessary that the author of the RFC also write the implementation, it is by far the most effective way to see an RFC through to completion. Authors should not expect that other project developers will take on responsibility for implementing their accepted feature.

Modifications to active RFCs can be done in follow-up pull requests. We strive to write each RFC in a manner that reflects the final design of the feature; however, the nature of the process means that we cannot expect every merged RFC to actually reflect what the end result will be at the time of the next major release. We therefore try to keep each RFC document somewhat in sync with the network feature as planned, tracking such changes via follow-up pull requests to the document.

An RFC that makes it through the entire process to implementation is considered "implemented" and is moved to the "implemented" folder. An RFC that fails after becoming active is "rejected" and moves to the "rejected" folder.

Reviewing RFCs

While the RFC pull request is up, the sub-team may schedule meetings with the author and/or relevant stakeholders to discuss the issues in greater detail, and in some cases the topic may be discussed at a sub-team meeting. In either case a summary from the meeting will be posted back to the RFC pull request.

A sub-team makes final decisions about RFCs after the benefits and drawbacks are well understood. These decisions can be made at any time, but the sub-team will regularly issue decisions. When a decision is made, the RFC pull request will either be merged or closed. In either case, if the reasoning is not clear from the discussion in thread, the sub-team will add a comment describing the rationale for the decision.

Implementing an RFC

Some accepted RFCs represent vital features that need to be implemented right away. Other accepted RFCs can represent features that can wait until some arbitrary developer feels like doing the work. Every accepted RFC has an associated issue tracking its implementation in the affected repositories. Therefore, the associated issue can be assigned a priority via the triage process that the team uses for all issues in the appropriate repositories.

The author of an RFC is not obligated to implement it. Of course, the RFC author (like any other developer) is welcome to post an implementation for review after the RFC has been accepted.

If you are interested in working on the implementation for an "active" RFC, but cannot determine if someone else is already working on it, feel free to ask (e.g. by leaving a comment on the associated issue).

RFC postponement

Some RFC pull requests are tagged with the "postponed" label when they are closed (as part of the rejection process). An RFC closed as "postponed" is marked as such because we want neither to think about evaluating the proposal nor about implementing the described feature until some time in the future, and we believe that we can afford to wait until then to do so. Historically, "postponed" was used to postpone features until after 1.0. Postponed pull requests may be re-opened when the time is right. We don't have any formal process for that; you should ask members of the relevant sub-team.

Usually an RFC pull request marked as "postponed" has already passed an informal first round of evaluation, namely the round of "do we think we would ever possibly consider making this change, as outlined in the RFC pull request, or some semi-obvious variation of it." (When the answer to the latter question is "no", then the appropriate response is to close the RFC, not postpone it.)

Help! This is all too informal

The process is intended to be as lightweight as reasonable for the present circumstances. As usual, we are trying to let the process be driven by consensus and community norms, not impose more structure than necessary.

Contributions, license, copyright

This project is licensed under Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0). Any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be licensed as above, without any additional terms or conditions.

Summary

This RFC introduces the IOTA protocol messages that were initially added in IRI#1393.

Motivation

To be able to take part in the IOTA networks, Bee nodes need to implement the exact same protocol presented in this RFC, which is currently used by IRI nodes and HORNET nodes. However, that does not necessarily mean implementing the same versions of the protocol. A design decision - explained later - means that Bee nodes and IRI nodes will not be able to communicate with each other.

Detailed design

This section details:

  • The Message trait that provides serialization and deserialization of messages to and from byte buffers;
  • A type-length-value protocol - on top of the trait - that adds metadata in order to send and receive the messages over a transport layer;
  • The current Message implementations representing handshake, requests, responses, events, ...;

Message trait

The Message trait is protocol agnostic and only provides serialization and deserialization to and from byte buffers. It should not be used as is but rather be paired with a higher layer - like a type-length-value encoding - and as such does not provide any bounds checks on input/output buffers.


#![allow(unused)]
fn main() {
/// A trait describing the behavior of a message.
trait Message {
    /// The unique identifier of the message within the protocol.
    const ID: u8;

    /// Returns the size range of the message as it can be compressed.
    fn size_range() -> Range<usize>;

    /// Deserializes a byte buffer into a message.
    /// Panics if the provided buffer has an invalid size.
    /// The size of the buffer should be within the range returned by the `size_range` method.
    fn from_bytes(bytes: &[u8]) -> Self;

    /// Returns the size of the message.
    fn size(&self) -> usize;

    /// Serializes a message into a byte buffer.
    /// Panics if the provided buffer has an invalid size.
    /// The size of the buffer should be equal to the one returned by the `size` method.
    fn into_bytes(self, bytes: &mut [u8]);
}
}

Notes:

  • size_range returns an allowed range for the message size because some parts of some messages can be trimmed. It is used to check if a message coming from a transport layer has a valid size. More details on compression below;
  • from_bytes/into_bytes panic if incorrectly used; only the following safe TLV module should use them directly;
  • into_bytes does not allocate a buffer, because the following TLV protocol implies prepending a header, which would induce another allocation. Since this is a hot path, a slice of an already allocated buffer for both the header and the payload is expected, limiting allocations to the bare minimum;
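To make the contract concrete, here is a sketch of how a fixed-size message could implement the trait, using the MilestoneRequest message defined later in this RFC (type ID 3, a single big-endian u32 index). The trait is repeated here for self-containment; the actual Bee implementation may differ in details:

```rust
use std::ops::Range;

/// The `Message` trait as defined above.
trait Message {
    const ID: u8;
    fn size_range() -> Range<usize>;
    fn from_bytes(bytes: &[u8]) -> Self;
    fn size(&self) -> usize;
    fn into_bytes(self, bytes: &mut [u8]);
}

/// A request for the milestone with the given index (sketch).
struct MilestoneRequest {
    index: u32,
}

impl Message for MilestoneRequest {
    const ID: u8 = 3;

    // A fixed-size message: exactly 4 bytes.
    fn size_range() -> Range<usize> {
        4..5
    }

    fn from_bytes(bytes: &[u8]) -> Self {
        // Panics on invalid sizes, as documented by the trait.
        assert!(Self::size_range().contains(&bytes.len()), "invalid buffer size");
        let mut index = [0u8; 4];
        index.copy_from_slice(&bytes[0..4]);
        // All multi-byte fields of the protocol are big-endian.
        Self { index: u32::from_be_bytes(index) }
    }

    fn size(&self) -> usize {
        4
    }

    fn into_bytes(self, bytes: &mut [u8]) {
        assert_eq!(bytes.len(), self.size(), "invalid buffer size");
        bytes.copy_from_slice(&self.index.to_be_bytes());
    }
}
```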

Type-length-value protocol

The type-length-value module is a safe layer on top of the messages. It allows serialization/deserialization to/from a byte buffer ready to be sent/received to/from a transport layer by prepending or reading a header containing the type and length of the payload.

Header


#![allow(unused)]
fn main() {
/// A header for the type-length-value encoding.
struct Header {
    /// Type of the message.
    message_type: u8,
    /// Length of the message.
    message_length: u16,
}
}

Methods


#![allow(unused)]
fn main() {
/// Deserializes a TLV header and a byte buffer into a message.
/// * The advertised message type should match the required message type.
/// * The advertised message length should match the buffer length.
/// * The buffer length should be within the allowed size range of the required message type.
fn tlv_from_bytes<M: Message>(header: &Header, bytes: &[u8]) -> Result<M, TlvError> {
    ...
}

/// Serializes a TLV header and a message into a byte buffer.
fn tlv_into_bytes<M: Message>(message: M) -> Vec<u8> {
    ...
}
}
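The two functions could be realized along these lines - a sketch assuming a 3-byte header (type followed by a big-endian length, per the Endianness section), with trimmed copies of the trait and header from above and a hypothetical one-byte message for illustration; the actual error variants and implementation may differ:

```rust
use std::ops::Range;

/// Trimmed copies of the `Message` trait and `Header` defined above.
trait Message {
    const ID: u8;
    fn size_range() -> Range<usize>;
    fn from_bytes(bytes: &[u8]) -> Self;
    fn size(&self) -> usize;
    fn into_bytes(self, bytes: &mut [u8]);
}

struct Header {
    message_type: u8,
    message_length: u16,
}

/// Possible TLV errors (hypothetical variants).
#[derive(Debug)]
enum TlvError {
    InvalidAdvertisedType,
    InvalidAdvertisedLength,
    InvalidLength,
}

/// A hypothetical one-byte message used for illustration only.
struct Byte(u8);

impl Message for Byte {
    const ID: u8 = 42;
    fn size_range() -> Range<usize> { 1..2 }
    fn from_bytes(bytes: &[u8]) -> Self { Byte(bytes[0]) }
    fn size(&self) -> usize { 1 }
    fn into_bytes(self, bytes: &mut [u8]) { bytes[0] = self.0; }
}

fn tlv_from_bytes<M: Message>(header: &Header, bytes: &[u8]) -> Result<M, TlvError> {
    // All three checks documented above guard the unsafe-to-misuse trait methods.
    if header.message_type != M::ID {
        return Err(TlvError::InvalidAdvertisedType);
    }
    if header.message_length as usize != bytes.len() {
        return Err(TlvError::InvalidAdvertisedLength);
    }
    if !M::size_range().contains(&bytes.len()) {
        return Err(TlvError::InvalidLength);
    }
    Ok(M::from_bytes(bytes))
}

fn tlv_into_bytes<M: Message>(message: M) -> Vec<u8> {
    // Single allocation for the 3-byte header plus the payload, as discussed above.
    let size = message.size();
    let mut bytes = vec![0u8; 3 + size];
    bytes[0] = M::ID;
    // The length field is big-endian, like all multi-byte fields.
    bytes[1..3].copy_from_slice(&(size as u16).to_be_bytes());
    message.into_bytes(&mut bytes[3..]);
    bytes
}
```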

Messages

Since the various types of messages are constructed from different kinds of data, there cannot be a single constructor signature in the Message trait. Implementations are instead expected to provide a convenient new method to build them.

Endianness

All multi-byte number fields of the messages of the protocol are represented as big-endian.

Version 0

Handshake

Type ID: 1

A message that allows two nodes to pair. It contains useful information to verify that the pairing node is operating on the same configuration. Any difference in configuration will result in the connection being closed and the nodes not pairing.

Name                      Description                                                    Type      Length
port                      Protocol port of the node (1).                                 u16       2
timestamp                 Timestamp - in ms - when the message was created by the node.  u64       8
coordinator               Public key of the coordinator being tracked by the node.       [u8; 49]  49
minimum_weight_magnitude  Minimum Weight Magnitude of the node.                          u8        1
supported_versions        Protocol versions supported by the node (2).                   Vec<u8>   1-32

(1) When an incoming connection is created, a random port is assigned. This field contains the actual port being used by the node and is used to match the connection with a potential whitelisted peer.

(2) Bit-masks are used to denote which protocol versions the node supports. The LSB acts as a starting point. Up to 32 bytes are supported, limiting the number of protocol versions to 256. Examples:

  • [0b00000001] denotes that the node supports protocol version 1.
  • [0b00000111] denotes that the node supports protocol versions 1, 2 and 3.
  • [0b01101110] denotes that the node supports protocol versions 2, 3, 4, 6 and 7.
  • [0b01101110, 0b01010001] denotes that the node supports protocol versions 2, 3, 4, 6, 7, 9, 13 and 15.
  • [0b01101110, 0b01010001, 0b00010001] denotes that the node supports protocol versions 2, 3, 4, 6, 7, 9, 13, 15, 17 and 21.
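The bit-mask expansion illustrated above can be sketched as a small helper - a hypothetical function for illustration, not one of the protocol messages:

```rust
/// Expands a `supported_versions` byte field into the list of protocol
/// versions it denotes. The LSB of the first byte is version 1 (sketch).
fn supported_versions(bytes: &[u8]) -> Vec<u8> {
    let mut versions = Vec::new();
    for (i, byte) in bytes.iter().copied().enumerate() {
        for bit in 0..8usize {
            if byte & (1 << bit) != 0 {
                // Byte `i`, bit `bit` denotes version `i * 8 + bit + 1`.
                versions.push((i * 8 + bit + 1) as u8);
            }
        }
    }
    versions
}
```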

Version 1

LegacyGossip

Type ID: 2

A legacy message to send a transaction and request another one at the same time.

Name         Description                                  Type      Length
transaction  Transaction to send. Can be compressed (1).  Vec<u8>   292-1604
hash         Hash of the requested transaction.           [u8; 49]  49

(1) Compression is detailed at the end.

Note: This message is the original IRI protocol message before the TLV protocol was introduced. It was kept by HORNET for compatibility with IRI but is not used between HORNET nodes. Its "ping-pong" concept has complex consequences on the node design and as such will not be implemented by Bee.

Version 2

MilestoneRequest

Type ID: 3

A message to request a milestone.

Name   Description                        Type  Length
index  Index of the requested milestone.  u32   4

Transaction

Type ID: 4

A message to send a transaction.

Name         Description                                  Type     Length
transaction  Transaction to send. Can be compressed (1).  Vec<u8>  292-1604

(1) Compression is detailed at the end.

TransactionRequest

Type ID: 5

A message to request a transaction.

Name  Description                         Type      Length
hash  Hash of the requested transaction.  [u8; 49]  49

Heartbeat

Type ID: 6

A message that informs about the part of the Tangle currently being fully stored by a node. This message is sent when a node:

  • just got paired to another node;
  • did a local snapshot and pruned away a part of the Tangle;
  • solidified a new milestone;

It also helps other nodes to know if they can ask it for a specific transaction.

Name                      Description                          Type  Length
solid_milestone_index     Index of the last solid milestone.   u32   4
snapshot_milestone_index  Index of the snapshotted milestone.  u32   4

Compression

A transaction encoded in bytes - using the T5B1 codec - has a length of 1604 bytes. The payload field itself occupies 1312 bytes and is often partially or completely filled with 0s. For this reason, trailing 0s of the payload field are removed, providing a compression rate of up to nearly 82%. Only the payload field is altered during this compression and the order of the fields stays the same.

Proposed functions:


#![allow(unused)]
fn main() {
fn compress_transaction_bytes(bytes: &[u8]) -> Vec<u8> {
    ...
}

fn uncompress_transaction_bytes(bytes: &[u8]) -> [u8; 1604] {
    ...
}
}
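One possible implementation of these functions - a sketch under the assumption that the payload occupies the first 1312 bytes of the encoded transaction and the remaining 292 bytes follow it (which matches the 292-1604 length range in the message tables above); the actual Bee layout and implementation may differ:

```rust
const TRANSACTION_SIZE: usize = 1604;
const PAYLOAD_SIZE: usize = 1312;
const NON_PAYLOAD_SIZE: usize = TRANSACTION_SIZE - PAYLOAD_SIZE; // 292

fn compress_transaction_bytes(bytes: &[u8]) -> Vec<u8> {
    assert_eq!(bytes.len(), TRANSACTION_SIZE);
    // Number of payload bytes kept once trailing 0s are removed.
    let payload_len =
        PAYLOAD_SIZE - bytes[..PAYLOAD_SIZE].iter().rev().take_while(|&&b| b == 0).count();
    let mut compressed = Vec::with_capacity(payload_len + NON_PAYLOAD_SIZE);
    compressed.extend_from_slice(&bytes[..payload_len]);
    // The non-payload fields are kept verbatim, in the same order.
    compressed.extend_from_slice(&bytes[PAYLOAD_SIZE..]);
    compressed
}

fn uncompress_transaction_bytes(bytes: &[u8]) -> [u8; TRANSACTION_SIZE] {
    let payload_len = bytes.len() - NON_PAYLOAD_SIZE;
    let mut uncompressed = [0u8; TRANSACTION_SIZE];
    // Restore the trimmed payload, padding the rest with 0s.
    uncompressed[..payload_len].copy_from_slice(&bytes[..payload_len]);
    uncompressed[PAYLOAD_SIZE..].copy_from_slice(&bytes[payload_len..]);
    uncompressed
}
```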

Drawbacks

Since IRI nodes only implement version 0 and 1 and Bee nodes only implement versions 0 and 2, they will not be able to communicate with each other.

Rationale and alternatives

There are alternatives to a type-length-value protocol, but TLV is very efficient and easily extensible without breaking changes. Moreover, since this is the protocol that has been chosen for the IOTA network, there is no real alternative for Bee.

Unresolved questions

There are no open questions at this point. This protocol has been used for a long time and this RFC will be updated with new message types when/if needed.

Summary

This RFC proposes a configuration pattern for Bee binary and library crates.

Motivation

This RFC contains a set of recommendations regarding configuration management in order to ensure consistency across the different Bee crates.

Detailed design

Libraries

  • serde: the go-to framework for serializing and deserializing Rust data structures efficiently and generically;
  • toml-rs: a TOML encoding/decoding library for Rust with serialization/deserialization on top of serde;

Recommendations

With the following recommendations, one can:

  • read a configuration builder from a file;
  • write a configuration to a file;
  • manually override fields of a configuration builder;
  • construct a configuration from a configuration builder;

Configuration Builder

The Builder pattern is very common in Rust when it comes to constructing complex objects like configurations.

A configuration builder type should:

  • have a name suffixed by ConfigBuilder;
  • derive the following traits;
    • Default to easily implement the new method as convention;
    • Deserialize, from serde, to deserialize from a configuration file;
  • provide a new method;
  • have Option fields;
    • if there is a configuration file, the fields should have the same names as the keys;
  • provide setters to set/override the fields;
  • provide a finish method constructing the actual configuration object;
  • have default values defined as const and set with Option::unwrap_or/Option::unwrap_or_else;

Here is a small example fitting all these requirements.

config.toml

[snapshot]
meta_file_path  = "./data/snapshot/mainnet.snapshot.meta"
state_file_path = "./data/snapshot/mainnet.snapshot.state"

config.rs


#![allow(unused)]
fn main() {
const DEFAULT_META_FILE_PATH: &str = "./data/snapshot/mainnet.snapshot.meta";
const DEFAULT_STATE_FILE_PATH: &str = "./data/snapshot/mainnet.snapshot.state";

#[derive(Default, Deserialize)]
pub struct SnapshotConfigBuilder {
    meta_file_path: Option<String>,
    state_file_path: Option<String>,
}

impl SnapshotConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn meta_file_path(mut self, meta_file_path: String) -> Self {
        self.meta_file_path.replace(meta_file_path);
        self
    }

    pub fn state_file_path(mut self, state_file_path: String) -> Self {
        self.state_file_path.replace(state_file_path);
        self
    }

    pub fn finish(self) -> SnapshotConfig {
        SnapshotConfig {
            meta_file_path: self.meta_file_path.unwrap_or(DEFAULT_META_FILE_PATH.to_string()),
            state_file_path: self.state_file_path.unwrap_or(DEFAULT_STATE_FILE_PATH.to_string()),
        }
    }
}
}
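A quick usage sketch of the conventions above, with the builder trimmed to a single field for brevity (the serde derive is omitted here, and the custom path is illustrative):

```rust
const DEFAULT_META_FILE_PATH: &str = "./data/snapshot/mainnet.snapshot.meta";

// Trimmed-down copy of the builder/config pair from the example above.
#[derive(Default)]
pub struct SnapshotConfigBuilder {
    meta_file_path: Option<String>,
}

impl SnapshotConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    // Setter that can also override a value read from a configuration file.
    pub fn meta_file_path(mut self, meta_file_path: String) -> Self {
        self.meta_file_path.replace(meta_file_path);
        self
    }

    pub fn finish(self) -> SnapshotConfig {
        SnapshotConfig {
            // Fall back to the `const` default if the field was never set.
            meta_file_path: self
                .meta_file_path
                .unwrap_or_else(|| DEFAULT_META_FILE_PATH.to_string()),
        }
    }
}

pub struct SnapshotConfig {
    meta_file_path: String,
}
```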

Configuration

A configuration type should:

  • have a name suffixed by Config;
  • derive the following traits;
    • Clone: since all configurations are most probably aggregated into a common configuration after being read from a file, this trait is needed to give components unique ownership of their own configuration;
    • Serialize if the configuration is expected to be updated and saved;
  • provide a build method that returns a new instance of the associated builder;
  • have the same fields with the same names, without Option, as the builder;
  • have no public fields;
  • provide setters/updaters only on fields that are expected to be updatable;
  • have getters or pub(crate) fields;

Here is a small example fitting all these requirements:

config.toml

[snapshot]
meta_file_path  = "./data/snapshot/mainnet.snapshot.meta"
state_file_path = "./data/snapshot/mainnet.snapshot.state"

config.rs


#![allow(unused)]
fn main() {
#[derive(Clone)]
pub struct SnapshotConfig {
    meta_file_path: String,
    state_file_path: String,
}

impl SnapshotConfig {
    pub fn build() -> SnapshotConfigBuilder {
        SnapshotConfigBuilder::new()
    }

    pub fn meta_file_path(&self) -> &String {
        &self.meta_file_path
    }

    pub fn state_file_path(&self) -> &String {
        &self.state_file_path
    }
}

}

Read a configuration builder from a file


#![allow(unused)]
fn main() {
let config_builder = match fs::read_to_string("config.toml") {
    Ok(toml) => match toml::from_str::<SnapshotConfigBuilder>(&toml) {
        Ok(config_builder) => config_builder,
        Err(e) => {
            // Handle error
        }
    },
    Err(e) => {
        // Handle error
    }
};

// Override fields if necessary e.g. with CLI arguments.

let config = config_builder.finish();
}

Write a configuration to a file


#![allow(unused)]
fn main() {
match toml::to_string(&config) {
    Ok(toml) => match fs::File::create("config.toml") {
        Ok(mut file) => {
            if let Err(e) = file.write_all(toml.as_bytes()) {
                // Handle error
            }
        }
        Err(e) => {
            // Handle error
        }
    },
    Err(e) => {
        // Handle error
    }
}
}

Sub-configuration

It is also very easy to create sub-configurations by nesting configuration builders and configurations.

config.toml

[snapshot]
[snapshot.local]
meta_file_path  = "./data/snapshot/local/mainnet.snapshot.meta"
state_file_path = "./data/snapshot/local/mainnet.snapshot.state"
[snapshot.global]
file_path       = "./data/snapshot/global/mainnet.txt"

config.rs


#![allow(unused)]
fn main() {
#[derive(Default, Deserialize)]
pub struct LocalSnapshotConfigBuilder {
    meta_file_path: Option<String>,
    state_file_path: Option<String>,
}

#[derive(Default, Deserialize)]
pub struct GlobalSnapshotConfigBuilder {
    file_path: Option<String>,
}

#[derive(Default, Deserialize)]
pub struct SnapshotConfigBuilder {
    local: LocalSnapshotConfigBuilder,
    global: GlobalSnapshotConfigBuilder,
}

impl SnapshotConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    // Setters

    pub fn finish(self) -> SnapshotConfig {
        SnapshotConfig {
            local: LocalSnapshotConfig {
                meta_file_path: self
                    .local
                    .meta_file_path
                    .unwrap_or(DEFAULT_LOCAL_SNAPSHOT_META_FILE_PATH.to_string()),
                state_file_path: self
                    .local
                    .state_file_path
                    .unwrap_or(DEFAULT_LOCAL_SNAPSHOT_STATE_FILE_PATH.to_string()),
            },
            global: GlobalSnapshotConfig {
                file_path: self
                    .global
                    .file_path
                    .unwrap_or(DEFAULT_GLOBAL_SNAPSHOT_FILE_PATH.to_string()),
            },
        }
    }
}
#[derive(Clone)]
pub struct LocalSnapshotConfig {
    meta_file_path: String,
    state_file_path: String,
}

#[derive(Clone)]
pub struct GlobalSnapshotConfig {
    file_path: String,
}

#[derive(Clone)]
pub struct SnapshotConfig {
    local: LocalSnapshotConfig,
    global: GlobalSnapshotConfig,
}

// Impl

}

Drawbacks

No specific drawbacks to this approach.

Rationale and alternatives

Several configuration formats were considered.

This RFC and its expected implementations choose TOML as the first default configuration format because it is the preferred option in the Rust ecosystem. However, this RFC does not exclude other formats being provided in the future, serde making it very easy to support them. It is also important to note that TOML only provides a limited number of nesting levels due to its non-recursive syntax, which may eventually become an issue.

Serde itself has been chosen because it is the standard for serialization/deserialization in Rust.

Unresolved questions

For binary crates, e.g. bee-node, configuration with CLI arguments is not described in this RFC, but everything is already set up to support it seamlessly. The builder setters allow setting fields, or overriding fields that may have already been pre-filled by the parsing of a configuration file. A CLI parser library like clap may be used on top of the builders.

Summary

This RFC proposes a networking crate (bee-network) to be added to the Bee framework that provides means to exchange byte messages with connected endpoints.

Motivation

Bee nodes need a way to share messages with each other and other compatible node implementations as part of the IOTA gossip protocol or they won't be able to form a network and synchronize with each other.

Detailed design

The functionalities of this crate are relatively low-level from a node perspective, in the sense that they are independent from any specifics defined by the IOTA protocol. Consequently, it has no dependencies on other crates of the framework and can easily be reused in different contexts.

The aim of this crate is to make it simple and straightforward for developers to build layers on top of it (e.g. a protocol layer) by abstracting away all the underlying networking logic. It is therefore a lot easier for them to focus on the modelling of other important aspects of a node software, like:

  • Peers: other nodes in the network,
  • Messages: the information exchanged between peers, and its serialization/deserialization.

Given some identifier epid, sending a message to its corresponding endpoint becomes a single line of asynchronous code:


#![allow(unused)]
fn main() {
network.send(SendMessage { epid, bytes: "hello".as_bytes() }).await?;
}

The purpose of this crate is to provide the following functionalities:

  • maintain a list of endpoints,
  • establish and maintain connections with endpoints,
  • allow sending, multicasting, or broadcasting byte-encoded messages of variable size to any of the endpoints,
  • allow receiving byte-encoded messages of variable size from any of the endpoints,
  • reject connections from unknown, i.e. not whitelisted, endpoints,
  • manage all connections and message transfers asynchronously,
  • provide an extensible, yet convenient-to-use, and documented API.

The key design decisions are discussed in the following sub-sections.

async/await

This crate is by nature very much dependent on events happening outside of the control of the program, e.g. listening for incoming connections from peers, waiting for packets on a specific socket, etc. Hence - under the hood - this crate makes heavy use of Rust's concurrency abstractions. Luckily, with the stabilization of the async/await syntax, asynchronous code has become almost as easy to read, write, and maintain as synchronous code. An experimental implementation of this crate used the async_std library, which comes with asynchronous drop-in replacements for the synchronous types found in Rust's standard library std. Additionally, asynchronous mpsc channels were taken from the futures crate.

Message passing

This crate favors message passing via channels over globally shared state and locks. Instead of keeping the list of endpoints in a globally accessible hashmap, this crate separates and moves such state into workers that run asynchronously, listen for commands and events to act upon, and notify listeners by sending events.
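To illustrate the worker pattern, here is a simplified, synchronous sketch using std channels and a thread instead of the async channels and tasks the crate actually relies on; all names (Command, Event, the u64 endpoint id) are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical commands the worker acts upon.
enum Command {
    AddEndpoint { epid: u64, url: String },
    RemoveEndpoint { epid: u64 },
    Shutdown,
}

// Hypothetical events the worker notifies listeners with.
enum Event {
    EndpointAdded { epid: u64 },
    EndpointRemoved { epid: u64 },
}

fn spawn_endpoint_worker(
    commands: mpsc::Receiver<Command>,
    events: mpsc::Sender<Event>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // The endpoint list is owned by the worker - no globally shared hashmap, no locks.
        let mut endpoints: HashMap<u64, String> = HashMap::new();
        for command in commands {
            match command {
                Command::AddEndpoint { epid, url } => {
                    endpoints.insert(epid, url);
                    let _ = events.send(Event::EndpointAdded { epid });
                }
                Command::RemoveEndpoint { epid } => {
                    endpoints.remove(&epid);
                    let _ = events.send(Event::EndpointRemoved { epid });
                }
                Command::Shutdown => break,
            }
        }
    })
}
```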

Initialization

In order for this crate to be used, it has to be initialized first by providing a NetworkConfig:


#![allow(unused)]
fn main() {
struct NetworkConfig {
    // The server or binding TCP port that remote endpoints can connect and send to.
    binding_port: u16,
    // The binding address that remote endpoints can connect and send to.
    binding_addr: IpAddr,
    // The interval between two connection attempts in case of connection failures.
    reconnect_interval: u64,
    // The maximum message length in terms of bytes.
    max_message_length: usize,

    ...
}
}

This has the advantage of being easily extensible and keeping the signature of the static init function small:


#![allow(unused)]
fn main() {
fn init(config: NetworkConfig) -> (Network, Events, Shutdown) { ... }
}

This function returns a tuple that allows the consumer of this crate to send commands (Network), listen to events (Events), and gracefully shut down all running asynchronous tasks that were spawned within this system (Shutdown).

See also RFC 31, which describes a configuration pattern for Bee binary and library crates.

Port, Address, Protocol, and Url

In order to send a message to an endpoint, a node needs to know the endpoint's address and the communication protocol it uses. This crate provides the following abstractions to deal with that:

  • Port: a 0-cost wrapper around a u16, which is introduced for type safety and better readability:

    
    #![allow(unused)]
    fn main() {
    struct Port(u16);
    }
    

    For convenience, Port dereferences to u16.

  • Address: a 0-cost wrapper around a SocketAddr which provides an adjusted, but overall simpler API than its inner type:

    
    #![allow(unused)]
    fn main() {
    #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
    struct Address(SocketAddr);
    }
    

    An Address can be constructed from Ipv4 and Ipv6 addresses and a port:

    
    #![allow(unused)]
    fn main() {
    // Infallible construction of an `Address` from Ipv4.
    fn from_v4_addr_and_port(address: Ipv4Addr, port: Port) -> Self { ... }
    
    // Infallible construction of an `Address` from Ipv6.
    fn from_v6_addr_and_port(address: Ipv6Addr, port: Port) -> Self { ... }
    
    // Fallible construction of an `Address` from `&str`.
    async fn from_addr_str(address: &str) -> Result<Self> { ... }
    }
    

    Note that the last function is async, because domain name resolution may incur delays.

  • Protocol: an enumeration of supported communication protocols:

    
    #![allow(unused)]
    fn main() {
    #[non_exhaustive]
    enum Protocol {
        Tcp,
    }
    }
    

    Note that future updates may support additional protocols, which is why this enum is declared as non_exhaustive; see the Unresolved Questions section.

  • Url: an enumeration that can always be constructed from an Address and a Protocol (infallible), or from a &str, which can fail if parsing or domain name resolution fails. If successful, it resolves to an Ipv4 or Ipv6 address stored in the enum variant matching the url scheme. Note that this crate therefore expects the Url string to always provide a scheme (e.g. tcp://) and a port (e.g. 15600) when specifying an endpoint's address.

    
    #![allow(unused)]
    fn main() {
    #[non_exhaustive]
    enum Url {
        // Represents the address of an endpoint communicating over TCP.
        Tcp(Address),
    }
    }
    


    
    #![allow(unused)]
    fn main() {
    // Infallible construction of a `Url` from an `Address` and a `Protocol`.
    fn new(addr: Address, proto: Protocol) -> Self { ... }
    
    // Fallible construction of a `Url` from `&str`.
    async fn from_url_str(url: &str) -> Result<Self> { ... }
    }
    

    Note that the second function is async, because domain name resolution may incur delays.
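As an illustration of the newtype pattern used for Port, a minimal sketch with a Deref impl might look as follows (the name matches the RFC; the body is hypothetical and may differ from the actual crate):

```rust
use std::ops::Deref;

// Hypothetical minimal version of the zero-cost `Port` newtype.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Port(u16);

impl Deref for Port {
    type Target = u16;

    fn deref(&self) -> &u16 {
        &self.0
    }
}

fn main() {
    let port = Port(15600);
    // Deref lets a `Port` be read as a plain `u16`.
    assert_eq!(*port, 15600);
}
```

Since the wrapper contains nothing but the u16, this abstraction has no runtime cost.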

Endpoint and EndpointId

To model the remote part of a connection, this crate introduces the Endpoint type:


#![allow(unused)]
fn main() {
struct Endpoint {
    // The id of the endpoint.
    id: EndpointId,

    // The address of the endpoint.
    address: Address,

    // The protocol used to communicate with that endpoint.
    protocol: Protocol,
}
}

Note: In a peer-to-peer (p2p) network peers usually have two endpoints initially as they are actively trying to connect, but also need to accept connections from their peers. This crate is agnostic about how to handle duplicate connections as this is usually resolved by a handshaking protocol defined at a higher layer.

To uniquely identify an Endpoint, this crate proposes the EndpointId type, which can be implemented as a wrapper around an Address.


#![allow(unused)]
fn main() {
#[derive(Clone, Copy, Eq, PartialEq, Hash)]
struct EndpointId(Address);
}

Note that EndpointId should be Clone and Copy, and must implement Eq, PartialEq, and Hash, because it is used as a hashmap key in several places.
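The derived traits allow an EndpointId to serve directly as a hashmap key. A minimal sketch (with hypothetical stand-ins for the RFC's types, not the crate's actual code):

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

// Hypothetical stand-ins for the RFC's `Address` and `EndpointId`.
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
struct Address(SocketAddr);

#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
struct EndpointId(Address);

fn main() {
    let addr = Address(SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 15600));
    let epid = EndpointId(addr);

    // `Eq + Hash` allow `EndpointId` to be used as a hashmap key,
    // and `Copy` makes passing it around cheap.
    let mut endpoints: HashMap<EndpointId, &str> = HashMap::new();
    endpoints.insert(epid, "neighbor-1");
    assert_eq!(endpoints.get(&epid), Some(&"neighbor-1"));
}
```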

Command

Commands are messages that are supposed to be issued by higher level business logic like a protocol layer. They are implemented as an enum, which in Rust is one way of expressing a polymorphic type:


#![allow(unused)]
fn main() {
enum Command {
    // Adds an endpoint to the list.
    AddEndpoint { url: Url, ... },

    // Removes an endpoint from the list.
    RemoveEndpoint { epid: EndpointId, ... },

    // Attempts to connect to an endpoint.
    Connect { epid: EndpointId, ... },

    // Disconnects from an endpoint.
    Disconnect { epid: EndpointId, ... },

    // Sends a message to a particular endpoint.
    SendMessage { epid: EndpointId, bytes: Vec<u8>, ... },

    // Sends a message to multiple endpoints.
    MulticastMessage { epids: Vec<EndpointId>, bytes: Vec<u8>, ...},

    // Sends a message to all endpoints in its list.
    BroadcastMessage { bytes: Vec<u8>, ... },
}
}

This enum makes the things the consumer can do with the crate very explicit and descriptive, and also easily extensible.
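To illustrate how such a command enum is typically consumed, here is a hypothetical sketch using a std mpsc channel and a thread in place of the crate's async channels and tasks (only two trimmed-down variants are shown):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical, trimmed-down `Command` with only the message variants.
enum Command {
    SendMessage { epid: u64, bytes: Vec<u8> },
    BroadcastMessage { bytes: Vec<u8> },
}

fn main() {
    let (tx, rx) = mpsc::channel();

    // A worker thread plays the role of the `EndpointWorker`, draining commands.
    let worker = thread::spawn(move || {
        let mut sent = 0usize;
        for command in rx {
            match command {
                Command::SendMessage { epid, bytes } => {
                    println!("send {} bytes to endpoint {}", bytes.len(), epid);
                    sent += bytes.len();
                }
                Command::BroadcastMessage { bytes } => {
                    println!("broadcast {} bytes", bytes.len());
                    sent += bytes.len();
                }
            }
        }
        sent
    });

    tx.send(Command::SendMessage { epid: 1, bytes: vec![0u8; 3] }).unwrap();
    tx.send(Command::BroadcastMessage { bytes: vec![0u8; 5] }).unwrap();
    drop(tx); // closing the channel lets the worker loop end

    assert_eq!(worker.join().unwrap(), 8);
}
```

The exhaustive match forces the worker to handle every command variant, which is one reason the enum-based design is easy to extend safely.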

Event

Similarly to Commands, the different kinds of Events in the system are implemented as an enum, allowing various concrete event types to be sent over event channels:


#![allow(unused)]
fn main() {
#[non_exhaustive]
enum Event {
    // An endpoint was added to the list.
    EndpointAdded { epid: EndpointId, ... },

    // An endpoint was removed from the list.
    EndpointRemoved { epid: EndpointId, ... },

    // A new connection is being set up.
    NewConnection { ep: Endpoint, ... },

    // A broken connection.
    LostConnection { epid: EndpointId },

    // An endpoint was successfully connected.
    EndpointConnected { epid: EndpointId, address: Address, ... },

    // An endpoint was disconnected.
    EndpointDisconnected { epid: EndpointId, ... },

    // A message was sent.
    MessageSent { epid: EndpointId, num_bytes: usize },

    // A message was received.
    MessageReceived { epid: EndpointId, bytes: Vec<u8> },

    // A connection attempt should occur.
    TryConnect { epid: EndpointId, ... },
}

}

In contrast to Commands though, Events are messages created by the system, and those that are considered relevant are published for the consumer to execute custom logic. It is attributed as non_exhaustive to accommodate for possible additions in the future.

Workers

There are two essential asynchronous workers in the system that are running for the application's whole lifetime. That is the EndpointWorker and the TcpWorker.

EndpointWorker

This worker manages the list of Endpoints and processes the Commands issued by the consumer of this crate, and publishes Events to be consumed by the user.

TcpWorker

This worker is responsible for accepting incoming TCP connection requests from other endpoints. Once a connection is established two additional asynchronous tasks are spawned that respectively handle incoming and outgoing messages.

Note: connection attempts from unknown IP addresses (i.e. not part of the static whitelist) will be rejected.

Shutdown

Shutdown is a struct that facilitates a graceful shutdown. It stores the sender halves of oneshot channels (ShutdownNotifier) and the task handles (WorkerTask).


#![allow(unused)]
fn main() {
struct Shutdown {
    notifiers: Vec<ShutdownNotifier>,
    tasks: Vec<WorkerTask>,
}
}

Apart from providing the necessary API to register channels and tasks, executing the shutdown happens by calling:


#![allow(unused)]

fn main() {
async fn execute(self) -> Result<()> { ... }

}

This method will then try to send a shutdown signal to all registered workers and wait for those tasks to complete.
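As a rough synchronous analogue of this shutdown flow (std threads and channels standing in for the async tasks and oneshot channels; all names here are illustrative, not the crate's API):

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

// Synchronous stand-ins for the async `ShutdownNotifier` / `WorkerTask` pair.
struct Shutdown {
    notifiers: Vec<Sender<()>>,
    tasks: Vec<JoinHandle<()>>,
}

impl Shutdown {
    fn execute(self) {
        // Signal every worker, then wait for each of them to wind down.
        for notifier in self.notifiers {
            let _ = notifier.send(());
        }
        for task in self.tasks {
            task.join().expect("worker panicked");
        }
    }
}

fn main() {
    let mut shutdown = Shutdown { notifiers: Vec::new(), tasks: Vec::new() };
    for id in 0..2 {
        let (tx, rx) = mpsc::channel();
        shutdown.notifiers.push(tx);
        shutdown.tasks.push(thread::spawn(move || {
            // Block until the shutdown signal arrives (or the sender is dropped).
            let _ = rx.recv();
            println!("worker {} shut down", id);
        }));
    }
    shutdown.execute();
    println!("all workers finished");
}
```

The important property is the ordering: all notifications are sent first, then all handles are awaited, so workers can wind down concurrently.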

Drawbacks

  • This design might not be the most performant solution as it requires each byte message to be wrapped into a SendMessage command, that is then sent through a channel to the EndpointWorker. Proper benchmarking is necessary to determine if this design can be a performance bottleneck;
  • Currently UDP is not a requirement, so the proposal focuses on TCP only. Nonetheless, the crate is designed in a way that allows for simple addition of UDP support;

Rationale and alternatives

  • The reason for not choosing an already available crate is that there aren't many crates yet that are implemented in the proposed way and also widely used in the Rust ecosystem. Additionally, we reduce dependencies and retain the option to tailor the crate exactly to our needs, even on short notice;
  • There are many ways to implement such a crate, many of which are probably more performant than the proposed one. However, the main focus of this proposal is ease-of-use and extensibility. It leaves performance optimization to later versions and/or RFCs, when more data has been collected through stress tests and benchmarks;
  • In particular, the Command abstraction could be removed entirely, and replaced by an additional static function for each command. This might be a more efficient thing to do, but it is still unclear how much there is to be gained by doing this, and if this would be necessary taking into account the various other CPU expensive things like hashing, signature verification, etc, that a node needs to do;
  • Since mpsc channels are still unstable in async_std at the time of this writing, this RFC suggests using those from the futures crate. Once stabilized, a switch to the async_std channels might improve performance, since to the author's knowledge those are based on the more efficient crossbeam implementation;
  • To implement this RFC, any async library or mix can be used. Other options to consider are tokio, and for asynchronous channels flume. Using those libraries, might result in a much more efficient implementation;
  • EndpointId may be something different than a wrapper around an Address, but instead be derived from some sort of cryptographic hash;

Unresolved questions

  • The design has been tested in a prototype, and it seems to work reliably, so there are no unresolved questions in terms of the validity of this proposal;
  • It is still unclear - due to the lack of benchmark results - how efficient this design is; however, due to its asynchronous design it should be able to make use of multicore systems;
  • Handling of endpoints with dynamic IPs;

Summary

This RFC proposes the ternary Hash type, the Sponge trait, and two cryptographic hash functions CurlP and Kerl implementing it. The 3 cryptographic hash functions used in the current IOTA networks (i.e. as of IOTA Reference Implementation iri v1.8.6) are Kerl, CurlP27, and CurlP81.

Useful links:

Motivation

In order to participate in the IOTA network, an application needs to be able to construct valid messages that can be verified by other participants in the network. Among other operations, this is accomplished by validating transaction signatures, transaction hashes and bundle hashes.

The two hash functions currently used are both sponge constructions: CurlP, which is specified entirely in balanced ternary, and Kerl, which first converts ternary input to a binary representation, applies keccak-384 to it, and then converts its binary output back to ternary. For CurlP specifically, its variants CurlP27 and CurlP81 are used.

Detailed design

Hash

This RFC defines a ternary type Hash which is the base input and output of the Sponge. The exact definition is an implementation detail but an example definition could simply be the following:


#![allow(unused)]
fn main() {
struct Hash([i8; HASH_LENGTH]);
}

Where the length of a hash in units of binary-coded balanced trits would be defined as:


#![allow(unused)]
fn main() {
const HASH_LENGTH: usize = 243;
}

Sponges

CurlP and Kerl are cryptographic sponge constructions. They are equipped with a memory state and a function that transforms that state using some input string, with a portion of the memory state then serving as the output. In the sponge metaphor, the process of feeding an input string into the memory state is said to absorb the input, while the process of producing an output is said to squeeze out an output.

The hash functions are expected to be used like this:

  • curlp

    
    #![allow(unused)]
    fn main() {
    // Create a CurlP instance with 81 rounds.
    // This is equivalent to calling `CurlP::new(CurlPRounds::Rounds81)`.
    let mut curlp = CurlP81::new();
    
    // Assume there are some transaction trits, all zeroes for the sake of this example.
    let transaction = TritBuf::<T1B1Buf>::zeros(6561);
    let mut hash = TritBuf::<T1B1Buf>::zeros(243);
    
    // Absorb the transaction.
    curlp.absorb(&transaction);
    // Squeeze out a hash.
    curlp.squeeze_into(&mut hash);
    }
    
  • kerl

    
    #![allow(unused)]
    fn main() {
    // Create a Kerl instance.
    let mut kerl = Kerl::new();
    
    // `Kerl::digest` is a function that combines `Kerl::absorb` and `Kerl::squeeze`.
    // `Kerl::digest_into` combines `Kerl::absorb` with `Kerl::squeeze_into`.
    let hash = kerl.digest(&transaction);
    }
    

The main proposal of this RFC are the Sponge trait and the CurlP and Kerl types that are implementing it. This RFC relies on the presence of the types TritBuf and Trits, as defined by RFC36, which are assumed to be owning and borrowing collections of binary-encoded ternary in the T1B1 encoding (one trit per byte).


#![allow(unused)]
fn main() {
/// The common interface of cryptographic hash functions that follow the sponge construction and that act on ternary.
trait Sponge {
    /// An error indicating that a failure has occured during a sponge operation.
    type Error;

    /// Absorb `input` into the sponge.
    fn absorb(&mut self, input: &Trits) -> Result<(), Self::Error>;

    /// Reset the inner state of the sponge.
    fn reset(&mut self);

    /// Squeeze the sponge into a buffer.
    fn squeeze_into(&mut self, buf: &mut Trits) -> Result<(), Self::Error>;

    /// Convenience function using `Sponge::squeeze_into` to return an owned version of the hash.
    fn squeeze(&mut self) -> Result<TritBuf, Self::Error> {
        let mut output = TritBuf::zeros(HASH_LENGTH);
        self.squeeze_into(&mut output)?;
        Ok(output)
    }

    /// Convenience function to absorb `input`, squeeze the sponge into a buffer `buf`, and reset the sponge in one go.
    fn digest_into(&mut self, input: &Trits, buf: &mut Trits) -> Result<(), Self::Error> {
        self.absorb(input)?;
        self.squeeze_into(buf)?;
        self.reset();
        Ok(())
    }

    /// Convenience function to absorb `input`, squeeze the sponge, and reset the sponge in one go.
    /// Returns an owned version of the hash.
    fn digest(&mut self, input: &Trits) -> Result<TritBuf, Self::Error> {
        self.absorb(input)?;
        let output = self.squeeze()?;
        self.reset();
        Ok(output)
    }
}
}

Following the sponge metaphor, an input provided by the user is absorbed, and an output will be squeezed from the data structure. digest is a convenience method calling absorb and squeeze in one go. The *_into versions of these methods are for passing a buffer, into which the calculated hashes are written. The internal state will not be cleared unless reset is called.

Design of CurlP

CurlP is designed as a hash function that acts on a T1B1 binary-encoded ternary buffer, with a hash length of 243 trits and an inner state of 729 trits:


#![allow(unused)]
fn main() {
const STATE_LENGTH: usize = HASH_LENGTH * 3;
}

In addition, a lookup table is used as part of the absorption step. For two input trits it yields an output trit:

          -1    0   +1
     -1    1    1   -1
      0    0   -1    1
     +1   -1    0    0

Given two balanced trits t and t', the table can easily be accessed by shifting them to unbalanced trits:


#![allow(unused)]
fn main() {
const TRUTH_TABLE: [[i8; 3]; 3] = [[1, 0, -1], [1, -1, 0], [-1, 1, 0]];
}

The way CurlP is defined, it cannot actually fail, because inputs and outputs can be of arbitrary size; hence the associated type is Error = Infallible.
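The shifted lookup can be sketched as follows (a hypothetical helper, not the crate's actual code): adding +1 to each balanced trit in {-1, 0, +1} yields an index in {0, 1, 2}.

```rust
// Truth table as given in the RFC.
const TRUTH_TABLE: [[i8; 3]; 3] = [[1, 0, -1], [1, -1, 0], [-1, 1, 0]];

// Hypothetical helper: shift two balanced trits into unbalanced space
// and use them as indices into the table.
fn lookup(a: i8, b: i8) -> i8 {
    TRUTH_TABLE[(a + 1) as usize][(b + 1) as usize]
}

fn main() {
    // -1 maps to row/column 0, 0 to 1, +1 to 2.
    assert_eq!(lookup(-1, -1), 1);
    assert_eq!(lookup(0, 1), 0);
    assert_eq!(lookup(1, -1), -1);
    println!("lookup(-1, 0) = {}", lookup(-1, 0));
}
```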

CurlP has two common variants depending on the number of rounds of hashing to apply before a hash is squeezed.


#![allow(unused)]
fn main() {
#[derive(Copy, Clone)]
enum CurlPRounds {
    Rounds27 = 27,
    Rounds81 = 81,
}
}

Type definition:


#![allow(unused)]
fn main() {
struct CurlP {
    /// The number of rounds of hashing to apply before a hash is squeezed.
    rounds: CurlPRounds,

    /// The internal state.
    state: TritBuf,

    /// Workspace for performing transformations.
    work_state: TritBuf,
}
}

#![allow(unused)]
fn main() {
impl Sponge for CurlP {
    type Error = Infallible;
    ...
}
}

In addition, this RFC proposes two wrapper types for the very common CurlP variants with 27 and 81 rounds. In most use cases, Sponge is required to implement Default, so these variants need to be new types instead of just new27 or new81 methods on CurlP.


#![allow(unused)]
fn main() {
struct CurlP27(CurlP);

impl CurlP27 {
    fn new() -> Self {
        Self(CurlP::new(CurlPRounds::Rounds27))
    }
}

struct CurlP81(CurlP);

impl CurlP81 {
    fn new() -> Self {
        Self(CurlP::new(CurlPRounds::Rounds81))
    }
}
}

For convenience, they should both dereference to an actual CurlP:


#![allow(unused)]
fn main() {
impl Deref for CurlP27 {
    type Target = CurlP;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for CurlP27 {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

impl Deref for CurlP81 {
    type Target = CurlP;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for CurlP81 {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}
}

This allows using them as a Sponge as well if there is a blanket implementation like this:


#![allow(unused)]
fn main() {
impl<T: Sponge, U: DerefMut<Target = T>> Sponge for U {
    type Error = T::Error;

    fn absorb(&mut self, input: &Trits) -> Result<(), Self::Error> {
        T::absorb(self, input)
    }

    fn reset(&mut self) {
        T::reset(self)
    }

    fn squeeze_into(&mut self, buf: &mut Trits) -> Result<(), Self::Error> {
        T::squeeze_into(self, buf)
    }
}
}

Design of Kerl

Type definition:


#![allow(unused)]
fn main() {
struct Kerl {
    /// Actual keccak hash function.
    keccak: Keccak,
    /// Binary working state.
    binary_state: I384<BigEndian, U8Repr>,
    /// Ternary working state.
    ternary_state: T243<Btrit>,
}
}

#![allow(unused)]
fn main() {
impl Sponge for Kerl {
    type Error = Error;
    ...
}
}

The actual cryptographic hash function underlying Kerl is keccak-384. The real task here is to transform an input of 243 (balanced) trits to 384 bits in a correct and performant way. This is done by interpreting the 243 trits as a signed integer I and converting it to a binary basis. In other words, the ternary encoded integer is expressed as the series:

I = t_0 * 3^0 + t_1 * 3^1 + ... + t_241 * 3^241 + t_242 * 3^242,

where t_i is the trit at the i-th position in the ternary input array. The challenge is then to convert this integer to base 2, i.e. find a series such that:

I = b_0 * 2^0 + b_1 * 2^1 + ... + b_382 * 2^382 + b_383 * 2^383,

with b_i the bit at the i-th position.

Assuming there exists an implementation of keccak, the main work in implementing Kerl is writing an efficient converter between the ternary array interpreted as an integer, and its binary representation. For the binary representation, one can either use an existing big-integer library, or write one from scratch with only a subset of required methods to make the conversion work.

Important implementation details

When absorbing into the sponge the conversion flows like this (little and big are short for little and big endian, respectively):

Balanced t243 -> unbalanced t242 -> u32 little u384 -> u32 little i384 -> u8 big i384

When squeezing and thus converting back to ternary the conversion flows like this:

u8 big i384 -> u32 little i384 -> u32 little u384 -> unbalanced t243 -> balanced t243 -> balanced t242

These steps will now be explained in detail.

Conversion is done via accumulation

Because a number like 3^242 does not fit into any existing primitive, it needs to be constructed from scratch by taking a big integer, setting its least significant digit to 1, and multiplying it by 3 242 times. This is the core of the conversion mechanism. Essentially, the polynomial used to express the integer in ternary can be rewritten like this:

I = t_0 * 3^0 + t_1 * 3^1 + ... + t_241 * 3^241 + t_242 * 3^242,
  = ((...((t_242 * 3 + t_241) * 3 + t_240) * 3 + ...) * 3 + t_1) * 3 + t_0

Thus, one iterates through the ternary buffer starting from the most significant trit, adds t_242 onto the binary big integer (initially filled with 0s), and then keeps looping through it, multiplying it by 3 and adding the next t_i.
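The accumulation scheme can be demonstrated on a small buffer, using i128 in place of a 384-bit big integer (trits stored least significant first, as elsewhere in this RFC; this is an illustrative sketch, not the crate's implementation):

```rust
// Hypothetical demonstration of the accumulation (Horner) scheme.
fn trits_to_int(trits: &[i8]) -> i128 {
    // Trits are stored least significant first, so iterate from the most
    // significant trit and fold: acc = acc * 3 + trit.
    trits.iter().rev().fold(0i128, |acc, &t| acc * 3 + i128::from(t))
}

fn main() {
    // 1*3^0 + 0*3^1 + (-1)*3^2 + 1*3^3 = 1 - 9 + 27 = 19
    assert_eq!(trits_to_int(&[1, 0, -1, 1]), 19);
    println!("{}", trits_to_int(&[1, 0, -1, 1]));
}
```

The real implementation performs the same loop, but the multiply-by-3 and add steps act on a 384-bit integer spread over fixed-size arrays.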

Conversion is done in unsigned space

First and foremost, IOTA is primarily written with balanced ternary in mind, meaning that each trit represents an integer in the set {-1, 0, +1}. Because it is easier to do the conversion in positive space, the trits are shifted into unbalanced space, by adding +1 at each position, so that each unbalanced trit represents an integer in {0, 1, 2}.

For example, the balanced ternary buffer (using only 9 trits to demonstrate the point) becomes after the shift (leaving out signs in the unbalanced case):

[0, -1, +1, -1, -1, 0, 0, 0, +1] -> [1, 0, 2, 0, 0, 1, 1, 1, 2]

Remembering that each ternary array actually represents some integer I, this shift is akin to adding another integer H whose ternary representation has all trits set to 1. The shifted integer is I', and the subscript _t denotes a ternary representation (either balanced or unbalanced):

I_t + H_t = [0, -1, +1, -1, -1, 0, 0, 0, +1] + [+1, +1, +1, +1, +1, +1, +1, +1, +1] = I'_t

After changing the base to binary using some function to_bin, which is required to distribute over addition, H needs to be subtracted again. We use _b to signify a binary representation:

I'_b = to_bin(I'_t) = to_bin(I_t + H_t) = to_bin(I_t) + to_bin(H_t) = I_b + H_b
=>
I_b = I'_b - H_b

In other words, the addition of the ternary buffer filled with 1s that shifts all trits into unbalanced space is reverted after conversion to binary, where the buffer of 1s is also converted to binary and then subtracted from the binary unsigned big integer. The result then is the integer I in binary.
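The shift-and-subtract identity can be checked on a small example, again with i128 standing in for the 384-bit big integer (an illustrative sketch under those assumptions):

```rust
// Interpret a least-significant-first trit buffer as an integer.
fn trits_to_int(trits: &[i8]) -> i128 {
    trits.iter().rev().fold(0i128, |acc, &t| acc * 3 + i128::from(t))
}

fn main() {
    let balanced: Vec<i8> = vec![0, -1, 1, -1, -1, 0, 0, 0, 1];
    // Shift every trit by +1 into unbalanced space {0, 1, 2}.
    let unbalanced: Vec<i8> = balanced.iter().map(|t| t + 1).collect();
    // H is the buffer with all trits set to 1.
    let h: Vec<i8> = vec![1; balanced.len()];

    let i = trits_to_int(&balanced);
    let i_shifted = trits_to_int(&unbalanced);
    let h_int = trits_to_int(&h);

    // I = I' - H once everything is in integer space.
    assert_eq!(i, i_shifted - h_int);
    println!("I = {}, I' = {}, H = {}", i, i_shifted, h_int);
}
```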

243 trits do not fit into 384 bits

Since 243 trits do not fit into 384 bits, a choice has to be made about how to treat the most significant trit. For example, one could take the maximum 384-bit big integer, convert it to ternary, and then check whether the 243 trits encode a smaller integer. With Kerl, the choice was made to disregard the most significant trit, so that one only ever converts 242 trits to 384 bits, which always fits.

For the direction ternary -> binary this does not pose challenges, other than making sure that one sets the most significant trit to 0 after the shift by applying +1 (if one chooses to reuse the array of 243 trits), and by making sure to subtract H_b (see previous section) with the most significant trit set to 0 and all others set to 1.

The direction binary -> ternary (the conversion of the 384-bit hash squeezed from keccak) is the challenging part: one needs to ensure that the most significant trit is set to 0 before the binary-encoded integer is converted to ternary. However, this has to happen in binary space!

Take J_b as the 384 bit integer coming out of the sponge. Then after conversion to ternary, to_ter(J_b) = J_t, the most significant trit (MST) of J_t might be +1 or -1. However, since by convention the MST has to be 0, one needs to check whether J_b would cause J_t to have its MST set after conversion. This is done the following way:

if J_b > to_bin([0, 1, 1, ..., 1]):
    J_b <- J_b - to_bin([1, 0, 0, ..., 0])

if J_b < (to_bin([0, 0, 0, ..., 0]) - to_bin([0, 1, 1, ..., 1])):
    J_b <- J_b + to_bin([1, 0, 0, ..., 0])

Kerl updates the inner state by applying logical not

The upstream keccak implementation uses a complicated permutation to update the inner state of the sponge construction after a hash has been squeezed from it. Kerl instead applies logical not, !, to the bytes squeezed from keccak and updates keccak's inner state with these.

Design and implementation

The main goal of the implementation was to ensure that the representation of the integer is encoded in the type system. To that end, the following ternary types are defined as wrappers around TritBuf (read T243 the same way you would think of u32 or i64):


#![allow(unused)]
fn main() {
struct T242<T: Trit> {
    inner: TritBuf<T1B1Buf<T>>,
}

struct T243<T: Trit> {
    inner: TritBuf<T1B1Buf<T>>,
}
}

where TritBuf, T1B1Buf, and Trit are types and traits defined in the bee-ternary crate. The bound Trit ensures that the structs only contain ternary buffers of Btrit (for balanced ternary) or Utrit (for unbalanced ternary). Methods on T242 and T243 assume that the most significant trit is always in the last position (think “little endian”).

For the binary representation of the integer, the types I384 (“signed” integer akin to i64), and U384 (“unsigned” akin to u64) are defined:


#![allow(unused)]
fn main() {
struct I384<E, T> {
    inner: T,
    _phantom: PhantomData<E>,
}

struct U384<E, T> {
    inner: T,
    _phantom: PhantomData<E>,
}
}

The type parameter T encodes the inner fixed-size array used for storage, holding either bytes, u8, or integers, u32:


#![allow(unused)]
fn main() {
type U8Repr = [u8; 48];
type U32Repr = [u32; 12];
}

The phantom type E is used to encode endianness, BigEndian and LittleEndian. These are just used as marker types without any methods.


#![allow(unused)]
fn main() {
struct BigEndian {}
struct LittleEndian {}

trait EndianType {}

impl EndianType for BigEndian {}
impl EndianType for LittleEndian {}
}

The underlying arrays and endianness are important, because keccak expects bytes as input, and because Kerl made the choice to revert the order of the integers in binary big int.
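Since the endianness markers are zero-sized, the phantom parameter distinguishes types without any runtime cost. A small sketch mirroring (not reproducing) these definitions:

```rust
use std::marker::PhantomData;

// Zero-sized endianness markers, as in the RFC.
struct BigEndian;
struct LittleEndian;

// Simplified stand-in for the RFC's `I384`.
struct I384<E, T> {
    inner: T,
    _phantom: PhantomData<E>,
}

fn main() {
    let le = I384::<LittleEndian, [u32; 12]> { inner: [0u32; 12], _phantom: PhantomData };
    let be = I384::<BigEndian, [u8; 48]> { inner: [0u8; 48], _phantom: PhantomData };

    // The phantom marker adds no size: each I384 is exactly its inner array.
    assert_eq!(std::mem::size_of_val(&le), std::mem::size_of::<[u32; 12]>());
    assert_eq!(std::mem::size_of_val(&be), 48);
    println!("le inner[0] = {}, be inner[0] = {}", le.inner[0], be.inner[0]);
}
```

The payoff is that mixing up byte orders becomes a compile-time type error rather than a runtime bug.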

To understand the implementation, the most important methods are:


#![allow(unused)]
fn main() {
T242<Btrit>::from_i384_ignoring_mst
T243<Utrit>::from_u384
I384<LittleEndian, U32Repr>::from_t242
I384<LittleEndian, U32Repr>::try_from_t242
U384<LittleEndian, U32Repr>::add_inplace
U384<LittleEndian, U32Repr>::add_digit_inplace
U384<LittleEndian, U32Repr>::sub_inplace
U384<LittleEndian, U32Repr>::from_t242
U384<LittleEndian, U32Repr>::try_from_t243
}

Drawbacks

  • All hash functions, whether they can fail or not, have to define the associated Error type;

Rationale and alternatives

  • CurlP and Kerl are fundamental to the iri v1.8.6 mainnet. They are thus essential for compatibility with it;
  • These types are idiomatic in Rust, and users are not required to know the implementation details of each hash algorithm;

Unresolved questions

  • Parameters are slice references for both input and output. Should values be consumed instead, with new instances created as return values?;
  • Implementations of each hash function and other utilities like HMAC should have separate RFCs;
  • A decision on the implementation of Troika is still pending;
  • Can (should?) the CurlP algorithm be explained in more detail?;
  • Is it important to encode that the most significant trit is 0 by having a T242?;

Summary

This RFC introduces a logging library - log - and some recommendations on how to log in the Bee project.

Motivation

Logging is done across almost all binary and library crates in the Bee project, so consistency is needed to provide a better user experience.

Detailed design

Logging in Bee should be done through the log crate (A Rust library providing a lightweight logging facade) which is the de-facto standard library for logging in the Rust ecosystem.

The log crate itself is just a frontend and supports the implementation of many different backends.

It provides a single logging API that abstracts over the actual logging implementation. Libraries can use the logging API provided by this crate, and the consumer of those libraries can choose the logging implementation that is most suitable for its use case.

Frontend

Libraries should link only to the log crate, and use the provided macros to log whatever information will be useful to downstream consumers.

The log crate provides the following macros, from lowest priority to highest priority:

  • trace!
  • debug!
  • info!
  • warn!
  • error!

These macros have a usage similar to the println! macro.

A log request consists of a target, a level, and a message.

By default, the target represents the location of the log request, but may be overridden.

Example with default target:


#![allow(unused)]
fn main() {
info!("Connected to port {} at {} Mb/s", conn_info.port, conn_info.speed);
}

Example with overridden target:


#![allow(unused)]
fn main() {
info!(target: "connection_events", "Successful connection, port: {}, speed: {}", conn_info.port, conn_info.speed);
}

Backend

Executables should choose a logging implementation and initialize it early in the runtime of the program. Logging implementations will typically include a function to do this. Any log messages generated before the implementation is initialized will be ignored.

There are a lot of available backends for the log frontend.

This RFC opts for the fern backend which is a very complete one with an advanced configuration.

Initialization

The backend API is limited to one function call to initialize the logger. It is designed to be updatable without breaking changes by taking a LoggerConfig configuration object and returning a Result.


#![allow(unused)]
fn main() {
fn logger_init(config: LoggerConfig) -> Result<(), Error>;
}

Configuration

The following configuration - compliant with RFC 31 - primarily allows configuring different outputs with different log levels.

config.rs


#![allow(unused)]
fn main() {
#[derive(Clone)]
struct LoggerOutputConfig {
    name: String,
    level: LevelFilter,
    ...
}

#[derive(Clone)]
struct LoggerConfig {
    color: bool,
    outputs: Vec<LoggerOutputConfig>,
    ...
}
}

config.toml

[logger]
color = true
[[logger.outputs]]
name  = "stdout"
level = "info"
[[logger.outputs]]
name  = "errors.log"
level = "error"

Note: stdout is a reserved name for the standard output of the console.

Errors

Since different backends may have different errors, we need to abstract them to provide a consistent logger_init function.


#![allow(unused)]
fn main() {
#[derive(Debug)]
#[non_exhaustive]
enum Error {
    ...
}
}

Note: The Error enum has to be non_exhaustive to allow further improving / extending the logger without risking a breaking change.

Format

The following elements should appear in this order:

  • Date in the %Y-%m-%d format e.g. 2020-05-25;
  • Time in the %H:%M:%S format e.g. 10:23:03;
  • Target e.g. bee_node::node;
    • The default target is preferred but can be overridden if needed;
  • Level e.g. INFO;
  • Message e.g. Initializing...;

All elements should be enclosed in brackets [...] with only a space between the level and the message.

Example: [2020-05-25][10:23:03][bee_node::node][INFO] Initializing...

Note: All messages should end either with ... (3 dots) to indicate something potentially long-lasting is happening, or with . (1 dot) to indicate events that just happened.
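A hypothetical helper assembling a line in this format illustrates the layout (in practice, a fern format closure would produce this; the function name and parameters here are assumptions for the sketch):

```rust
// Hypothetical formatter: date, time, target, and level are passed in
// already formatted, and joined in the bracketed layout described above.
fn format_log(date: &str, time: &str, target: &str, level: &str, msg: &str) -> String {
    format!("[{}][{}][{}][{}] {}", date, time, target, level, msg)
}

fn main() {
    let line = format_log("2020-05-25", "10:23:03", "bee_node::node", "INFO", "Initializing...");
    assert_eq!(line, "[2020-05-25][10:23:03][bee_node::node][INFO] Initializing...");
    println!("{}", line);
}
```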

Drawbacks

No specific drawbacks using this library.

Rationale and alternatives

  • The log crate is maintained by the Rust team, making it a widely used and trusted dependency for the Bee framework;
  • Most of the logging crates in the Rust ecosystem are actually backends for the log crate with different features;
  • It should be very easy to switch to a different backend in the future by only changing the initialization function;
  • There are a lot of available backends but fern offers a very fine-grained logging configuration;

Unresolved questions

There are no open questions as this RFC is pretty straightforward and the topic of logging is not a complex one.

Summary

This RFC introduces both the API and implementation of the bee-ternary crate, a general-purpose ternary manipulation crate written in Rust that may be used to handle ternary data in the IOTA ecosystem.

Motivation

Ternary has been a fundamental part of the IOTA technology and is used throughout, including for fundamental features like wallet addresses and transaction identification.

More information about ternary in IOTA can be found here.

Note that the IOTA foundation is seeking to reduce the prevalence of ternary in the IOTA protocol to better accommodate modern binary architectures. See Chrysalis for more information.

Manipulating ternary ergonomically on top of existing binary-driven code is complex and, up until now, has come with either a lot of boilerplate, implementations that are difficult to verify as correct, or a lack of features.

bee-ternary seeks to become the canonical ternary crate in the Rust IOTA ecosystem by providing an API that is efficient, ergonomic, and featureful all at once.

bee-ternary will allow IOTA developers to more easily write code that works with ternary and allow the simplification of existing codebases through use of the API.

Broadly, there are 3 main benefits that bee-ternary aims to provide over use-specific implementations of ternary manipulation code.

  • Ergonomics: The API should be trivial to use correctly and should naturally guide developers towards correct and efficient solutions.

  • Features: The API should provide a fundamental set of core features that can be used together to cover most developer requirements. Examples of such features include multiple encoding schemes, binary/ternary conversion, (de)serialization, and tryte string de/encoding.

  • Performance: The API should allow 'acceptably' efficient manipulation of ternary data. This is difficult to rigorously define, and implementation details may change later, but broadly this RFC intends to introduce an API that does not inherently inhibit high performance through poor design choices.

Detailed design

bee-ternary is designed to be extensible and is built on top of a handful of fundamental abstraction types. Below is a summary of the crate's features and its core API types and traits.

Features

  • Efficient manipulation of ternary buffers (trits and trytes).
  • Multiple encoding schemes.
  • Extensible design that allows it to sit on top of existing data structures, avoiding unnecessary allocation and copying.
  • An array of utility functions to allow for easy manipulation of ternary data.
  • Zero-cost conversion between trit and tryte formats (i.e: no slower than the equivalent code would be if hand-written).

Key Types & Traits

The crate supports both balanced and unbalanced trit representations through two types, Btrit and Utrit. Btrit is the more common representation and so is the implicit default for most operations in the crate.

Btrit and Utrit


#![allow(unused)]
fn main() {
enum Btrit { NegOne, Zero, PlusOne }
enum Utrit { Zero, One, Two }
}

Trits


#![allow(unused)]
fn main() {
struct Trits<T: RawEncoding> { ... }
}
  • Generic across different trit encoding schemes (see below).
  • Analogous to str or [T].
  • Unsized type, represents a buffer of trits of a specified encoding.
  • Most commonly used from behind a reference (as with &str and &[T]), representing a 'trit slice'.
  • Can be created from a variety of types such as TritBuf and [i8] with minimal overhead.
  • Sub-slices can be created from trit slices at a per-trit level.

TritBuf


#![allow(unused)]
fn main() {
struct TritBuf<T: RawEncodingBuf> { ... }
}
  • Generic across different trit encoding schemes (see below).
  • Analogous to String or Vec<T>.
  • Most common way to manipulate trits.
  • Allows pushing, popping, and collecting trits.
  • Implements Deref<Target=Trits> (in the same way that Vec<T> implements Deref<[T]>).

RawEncoding


#![allow(unused)]
fn main() {
trait RawEncoding { ... }
}
  • Represents a raw trit buffer of some encoding.
  • Common interface implemented by all trit encodings (T1B1, T3B1, T5B1, etc.).
  • Largely an implementation detail, not something you need to care about unless implementing your own encodings.
  • Minimal implementation requirements: safety and most utility functionality is provided by Trits instead.

RawEncodingBuf


#![allow(unused)]
fn main() {
trait RawEncodingBuf { ... }
}
  • Buffer counterpart of RawEncoding, always associated with a specific RawEncoding.
  • Distinct from RawEncoding to permit different buffer-like data structures in the future (linked lists, stack-allocated arrays, etc.).

T1B1/T2B1/T3B1/T4B1/T5B1


#![allow(unused)]
fn main() {
struct TXB1 { ... }
}
  • Types that implement RawEncoding.
  • Allow different encodings in Trits and TritBuf types.

T1B1Buf/T2B1Buf/T3B1Buf/T4B1Buf/T5B1Buf


#![allow(unused)]
fn main() {
struct TXB1Buf { ... }
}
  • Types that implement RawEncodingBuf.
  • Allow different encodings in TritBuf types.
  • Each type is associated with a RawEncoding type.

Tryte


#![allow(unused)]
fn main() {
enum Tryte {
    N = -13,
    O = -12,
    P = -11,
    Q = -10,
    R = -9,
    S = -8,
    T = -7,
    U = -6,
    V = -5,
    W = -4,
    X = -3,
    Y = -2,
    Z = -1,
    Nine = 0,
    A = 1,
    B = 2,
    C = 3,
    D = 4,
    E = 5,
    F = 6,
    G = 7,
    H = 8,
    I = 9,
    J = 10,
    K = 11,
    L = 12,
    M = 13,
}
}
  • Type that represents a ternary tryte.
  • Has the same representation as 3 T3B1 byte-aligned trits.
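For illustration, the mapping between the tryte discriminants above and the tryte alphabet '9ABCDEFGHIJKLMNOPQRSTUVWXYZ' can be sketched as a standalone function (this is not the crate's API, just a demonstration of the value layout):

```rust
// The tryte alphabet: '9' encodes 0, 'A'..='M' encode 1..=13,
// 'N'..='Z' encode -13..=-1.
const TRYTE_ALPHABET: &[u8] = b"9ABCDEFGHIJKLMNOPQRSTUVWXYZ";

fn tryte_to_char(value: i8) -> char {
    // Negative values -13..=-1 wrap around to indices 14..=26 (N..Z).
    let index = if value < 0 { value + 27 } else { value } as usize;
    TRYTE_ALPHABET[index] as char
}

fn main() {
    assert_eq!(tryte_to_char(0), '9');
    assert_eq!(tryte_to_char(1), 'A');
    assert_eq!(tryte_to_char(13), 'M');
    assert_eq!(tryte_to_char(-13), 'N');
    assert_eq!(tryte_to_char(-1), 'Z');
    println!("ok");
}
```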

TryteBuf


#![allow(unused)]
fn main() {
struct TryteBuf { ... }
}
  • A growable linear buffer of Trytes.
  • Roughly analogous to Vec.
  • Has utility methods for converting to/from tryte strings.

API

The API makes up the body of this RFC. Due to its considerable length, this RFC simply refers to the documentation in question. You can find those docs in the bee repository.

Encodings

bee-ternary supports many different trit encodings. Notable encodings are explained in the crate documentation.

Common Patterns

When using the API, the most common types interacted with are Trits and TritBuf. These types are designed to play well with the rest of the Rust ecosystem. Here follows some examples of common patterns that you may wish to make use of.

Turning some i8s into a trit buffer


#![allow(unused)]
fn main() {
[-1, 0, 1, 1, -1, 0]
	.iter()
	.map(|x| Btrit::try_from(*x).unwrap())
	.collect::<TritBuf>()
}

Alternatively, for the T1B1 encoding only, you may directly reinterpret the i8 slice.


#![allow(unused)]
fn main() {
let xs = [-1, 0, 1, 1, -1, 0];

Trits::try_from_raw(&xs, xs.len())
	.unwrap()
}

If you are certain that the slice contains only valid trits then it is possible to unsafely reinterpret with amortised O(1) cost (i.e: it's basically free). However, scenarios in which this is necessary and sound are exceedingly uncommon. If you find yourself doing this, ask yourself first whether it is necessary.


#![allow(unused)]
fn main() {
let xs = [-1, 0, 1, 1, -1, 0];

unsafe { Trits::from_raw_unchecked(&xs, xs.len()) }
}

Turning a trit slice into a tryte string


#![allow(unused)]
fn main() {
trits
	.iter_trytes()
	.map(|trit| char::from(trit))
	.collect::<String>()
}

This becomes even easier and more efficient with a T3B1 trit slice since it has the same underlying representation as a tryte slice.


#![allow(unused)]
fn main() {
trits
	.as_trytes()
	.iter()
	.map(|trit| char::from(*trit))
	.collect::<String>()
}

Turning a tryte string into a tryte buffer


#![allow(unused)]
fn main() {
tryte_str
	.chars()
	.map(Tryte::try_from)
	.collect::<Result<TryteBuf, _>>()
	.unwrap()
}

Since this is a common operation, there exists a shorthand.


#![allow(unused)]
fn main() {
TryteBuf::try_from_str(tryte_str).unwrap()
}

Turning a trit slice into a trit buffer


#![allow(unused)]
fn main() {
trits.to_buf()
}

Turning a trit slice into a trit buffer of a different encoding


#![allow(unused)]
fn main() {
trits.encode::<T5B1>()
}

Overwriting a sub-slice of a trit slice with copies of a trit


#![allow(unused)]
fn main() {
trits[start..end].fill(Btrit::Zero)
}

Copying trits from a source slice to a destination slice


#![allow(unused)]
fn main() {
tgt.copy_from(&src[start..end])
}

Drawbacks

This RFC does not have any particular drawbacks over an alternative approach. Ternary is currently essential to the function of much of the IOTA ecosystem and Bee requires a ternary API of some sort in order to effectively operate within it.

Rationale and alternatives

No suitable alternatives exist to this RFC. Rust does not have a mature ternary manipulation crate immediately available that suits our needs.

Unresolved questions

The API has now been in use in Bee for several months. Questions about additional API features exist, but the current API has proven suitable.

Summary

Add a way to convert to and from numeric binary types for ternary types.

Motivation

Conversion between binary and ternary is often useful, so a good solution to this problem is required.

For example, this feature permits the following conversions.


#![allow(unused)]
fn main() {
let x = 42;
let x_trits = TritBuf::from(x);
let y = i64::try_from(x_trits.as_slice()).unwrap();
assert_eq!(x, y);
}

Detailed design

This RFC introduces the following trait implementations:


#![allow(unused)]
fn main() {
// Signed binary to balanced ternary
impl<T: RawEncodingBuf> From<i128> for TritBuf<T> where T::Slice: RawEncoding<Trit = Btrit> {}
impl<T: RawEncodingBuf> From<i64> for TritBuf<T> where T::Slice: RawEncoding<Trit = Btrit> {}
impl<T: RawEncodingBuf> From<i32> for TritBuf<T> where T::Slice: RawEncoding<Trit = Btrit> {}
impl<T: RawEncodingBuf> From<i16> for TritBuf<T> where T::Slice: RawEncoding<Trit = Btrit> {}
impl<T: RawEncodingBuf> From<i8> for TritBuf<T> where T::Slice: RawEncoding<Trit = Btrit> {}

// Balanced ternary to signed binary
impl<'a, T: RawEncoding<Trit = Btrit> + ?Sized> TryFrom<&'a Trits<T>> for i128 {}
impl<'a, T: RawEncoding<Trit = Btrit> + ?Sized> TryFrom<&'a Trits<T>> for i64 {}
impl<'a, T: RawEncoding<Trit = Btrit> + ?Sized> TryFrom<&'a Trits<T>> for i32 {}
impl<'a, T: RawEncoding<Trit = Btrit> + ?Sized> TryFrom<&'a Trits<T>> for i16 {}
impl<'a, T: RawEncoding<Trit = Btrit> + ?Sized> TryFrom<&'a Trits<T>> for i8 {}

// Unsigned binary to unbalanced ternary
impl<T: RawEncodingBuf> From<u128> for TritBuf<T> where T::Slice: RawEncoding<Trit = Utrit> {}
impl<T: RawEncodingBuf> From<u64> for TritBuf<T> where T::Slice: RawEncoding<Trit = Utrit> {}
impl<T: RawEncodingBuf> From<u32> for TritBuf<T> where T::Slice: RawEncoding<Trit = Utrit> {}
impl<T: RawEncodingBuf> From<u16> for TritBuf<T> where T::Slice: RawEncoding<Trit = Utrit> {}
impl<T: RawEncodingBuf> From<u8> for TritBuf<T> where T::Slice: RawEncoding<Trit = Utrit> {}

// Unbalanced ternary to unsigned binary
impl<'a, T: RawEncoding<Trit = Utrit> + ?Sized> TryFrom<&'a Trits<T>> for u128 {}
impl<'a, T: RawEncoding<Trit = Utrit> + ?Sized> TryFrom<&'a Trits<T>> for u64 {}
impl<'a, T: RawEncoding<Trit = Utrit> + ?Sized> TryFrom<&'a Trits<T>> for u32 {}
impl<'a, T: RawEncoding<Trit = Utrit> + ?Sized> TryFrom<&'a Trits<T>> for u16 {}
impl<'a, T: RawEncoding<Trit = Utrit> + ?Sized> TryFrom<&'a Trits<T>> for u8 {}
}

The aforementioned trait implementations allow conversion to and from a variety of numeric binary types for ternary types.

In addition, there exist utility functions that allow implementing this behaviour for arbitrary numeric values that implement relevant traits from num_traits.


#![allow(unused)]
fn main() {
pub fn trits_to_int<
    I: Clone + num_traits::CheckedAdd + num_traits::CheckedSub + PartialOrd + num_traits::Num,
    T: RawEncoding + ?Sized,
>(trits: &Trits<T>) -> Result<I, Error> { ... }

pub fn signed_int_trits<I: Clone
    + num_traits::AsPrimitive<i8>
    + num_traits::FromPrimitive
    + num_traits::Signed>(x: I) -> impl Iterator<Item=Btrit> + Clone { ... }

pub fn unsigned_int_trits<I: Clone
    + num_traits::AsPrimitive<u8>
    + num_traits::FromPrimitive
    + num_traits::Num>(x: I) -> impl Iterator<Item=Utrit> + Clone { ... }
}

The AsPrimitive trait comes from the num_traits crate, common throughout the Rust ecosystem.

All of the aforementioned numeric conversion functions and traits operate correctly for all possible values including numeric limits.
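As an illustration of the underlying principle (not the bee-ternary implementation itself), a round-trip between a signed integer and balanced-ternary digits can be sketched with plain integer arithmetic:

```rust
// Convert a signed integer into little-endian balanced-ternary digits
// (-1, 0, 1). Illustrative sketch only.
fn int_to_balanced_trits(mut x: i64) -> Vec<i8> {
    let mut trits = Vec::new();
    while x != 0 {
        let mut r = (x % 3) as i8;
        x /= 3;
        // Fold remainders 2 and -2 into the balanced range with a carry.
        if r == 2 {
            r = -1;
            x += 1;
        } else if r == -2 {
            r = 1;
            x -= 1;
        }
        trits.push(r);
    }
    if trits.is_empty() {
        trits.push(0);
    }
    trits
}

// Evaluate little-endian balanced-ternary digits back into an integer.
fn balanced_trits_to_int(trits: &[i8]) -> i64 {
    trits.iter().rev().fold(0i64, |acc, &t| acc * 3 + t as i64)
}

fn main() {
    let x = 42;
    let trits = int_to_balanced_trits(x);
    assert_eq!(balanced_trits_to_int(&trits), x);
    println!("{} -> {:?}", x, trits);
}
```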

Drawbacks

  • No support for fractional ternary floating-point conversion (although this is probably not useful for IOTA's use case).

  • Rust's orphan rules do not allow implementing foreign traits for foreign types so automatically implementing TryFrom for all numeric types is not possible.

  • Rust's type system cannot yet reason about type equality through associated type bindings and so it is not possible to implement Btrit trit slice to/from unsigned integers yet (although it's possible to use the utility functions to achieve this).

Rationale and alternatives

No competing alternatives exist at this time.

Unresolved questions

None so far.

Summary

This RFC proposes a unified way of achieving a graceful shutdown of arbitrarily many async workers and their utilized resources in the Bee framework.

Motivation

Rust provides one of the most powerful, performant, and still safest abstractions to deal with concurrency. You can read about Rust's concurrency story here.

One problem that arises when dealing with many asynchronous tasks within a system is ensuring that all of them terminate gracefully once an agreed-upon shutdown signal has been issued. An example of this is software intercepting the press of ^C in the terminal and then executing specialized instructions to prevent data corruption before finally exiting the application.

This feature solves this problem by introducing a type called Shutdown that allows registering asynchronous workers and executing a shutdown in a 3-step process:

  1. Send a shutdown signal to all registered workers;
  2. Wait for all the workers to terminate;
  3. Perform final actions (like flushing buffers to databases, saving files, etc.)

Detailed design

As the shutdown signal itself contains no data, the zero-sized unit type () is used to model it. Furthermore, as the synchronizing mechanism, this RFC proposes the use of oneshot channels from the futures crate. Those channels are consumed, i.e. destroyed, once send is called; this way Rust ensures at compile time that no further redundant shutdown signals can be sent to the same listener.

To make the code more readable and less verbose, this RFC proposes the following type aliases:


#![allow(unused)]
fn main() {
// The "notifier" is the origin of the shutdown signal. For each spawned worker a
// dedicated notifier will be created and kept in a list.
type ShutdownNotifier = oneshot::Sender<()>;

// The "listener" is the receiver of the shutdown signal. It awaits the shutdown
// signal as part of the event loop of the corresponding worker.
type ShutdownListener = oneshot::Receiver<()>;

// For each async worker the async runtime creates a `Future`, that completes when the
// worker terminates, i.e. shuts down. To make sure, the shutdown is complete, one has
// to store those `Future`s in a list, and await all of them iteratively before exiting
// the program.
type WorkerShutdown = Box<dyn Future<Output = Result<(), WorkerError>> + Unpin>;

// Before shutting down a system, workers oftentimes need to perform one or more final
// actions to prevent data loss and/or data corruption. Therefore each worker can register
// an arbitrary amount of actions, that will be executed as the final step of the shutdown
// process.
type Action = Box<dyn FnOnce()>;
}

NOTE: WorkerShutdown and Action are boxed trait objects; hence they have a single owner and are stored on the heap. The Unpin marker trait requirement ensures that the given trait object can be moved into the internal data structure. If the worker terminates with an error, it will be wrapped by a variant of the WorkerError enum. If the worker terminates without any error, () is returned, this time indicating that the worker doesn't produce a return value.

Shutdown

The shutdown functionality can be implemented using a single central type that is essentially an append-only registry with a single execute operation. There is no need for removal or for identifying individual items, as the operation is applied to all of them equally; once done, the whole program can be expected to terminate.

This Shutdown type looks like this:


#![allow(unused)]
fn main() {
struct Shutdown {
    // A list of all registered shutdown signal senders, briefly labeled as "notifiers".
    notifiers: Vec<ShutdownNotifier>,

    // A list of all registered worker termination `Future`s.
    worker_shutdowns: Vec<WorkerShutdown>,

    // A list of all registered finalizing actions performed once all async workers
    // have terminated.
    actions: Vec<Action>,
}
}

The API of Shutdown is very small. It only needs to allow filling those internal lists, and provide an execute method to facilitate the actual shutdown:


#![allow(unused)]
fn main() {
impl Shutdown {

    // Creates a new instance
    fn new() -> Self {
        Self::default()
    }

    // Registers a worker shutdown future, which completes once the worker terminates in one
    // way or another, and registers a shutdown notification channel by providing its sender half.
    fn add_worker_shutdown(
        &mut self,
        notifier: ShutdownNotifier,
        worker: impl Future<Output = Result<(), WorkerError>> + Unpin + 'static,
    ) {
        self.notifiers.push(notifier);
        self.worker_shutdowns.push(Box::new(worker));
    }

    // Registers an action to perform when the shutdown is executed.
    fn add_action(&mut self, action: impl FnOnce() + 'static) {
        self.actions.push(Box::new(action));
    }

    // Executes the shutdown.
    async fn execute(mut self) -> Result<(), Error> {
        // Step 1: notify all registrees.
        while let Some(notifier) = self.notifiers.pop() {
            notifier.send(()).map_err(|_| Error::SendingShutdownSignalFailed)?
        }

        // Step 2: await workers to terminate their event loop.
        while let Some(worker_shutdown) = self.worker_shutdowns.pop() {
            if let Err(e) = worker_shutdown.await {
                error!("Awaiting worker failed: {:?}.", e);
            }
        }

        // Step 3: perform finalizing actions to prevent data/resource corruption.
        while let Some(action) = self.actions.pop() {
            action();
        }

        Ok(())
    }
}
}

About the execute method, there are three things worth mentioning:

  1. It takes self by value, in other words claims ownership of the Shutdown instance, which means the instance will be deallocated at the end of the method body and can no longer be used;
  2. It is decorated with the async keyword, which means that under the hood it returns a Future that needs to be polled by an async runtime in order to make progress. In the specific case of a shutdown, it almost always makes sense to block_on this method;
  3. It shuts down workers in the reverse order of their registration.
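The overall flow can be illustrated with a self-contained, thread-based sketch in which std mpsc channels stand in for the futures oneshot channels used by the actual design:

```rust
use std::sync::mpsc;
use std::thread;

// Illustrative std-only sketch of the 3-step shutdown process; the real
// implementation uses futures oneshot channels and async workers instead
// of threads and mpsc channels.
fn run_shutdown() -> usize {
    let mut notifiers = Vec::new();
    let mut handles = Vec::new();

    for _ in 0..3 {
        let (tx, rx) = mpsc::channel::<()>();
        notifiers.push(tx);
        handles.push(thread::spawn(move || {
            // Worker event loop: block until the shutdown signal arrives.
            rx.recv().ok();
        }));
    }

    // Step 1: send the shutdown signal to all registered workers.
    for tx in notifiers {
        tx.send(()).unwrap();
    }

    // Step 2: wait for all workers to terminate, in reverse registration order.
    let mut terminated = 0;
    while let Some(handle) = handles.pop() {
        handle.join().unwrap();
        terminated += 1;
    }

    // Step 3: finalizing actions would run here (flush buffers, save files, ...).
    terminated
}

fn main() {
    println!("{} workers shut down", run_shutdown());
}
```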

Drawbacks

  • No known drawbacks; the proposal uses trait objects, which come with a certain performance overhead, but since the shutdown is a one-time operation, performance is not a major concern;

Rationale and alternatives

  • No viable alternatives known to the author;

Unresolved questions

  • Should we have a forced shutdown after a certain time interval, in case some worker fails to terminate? When would such a thing actually happen?

Summary

This RFC introduces two procedural macros, SecretDebug and SecretDisplay, to derive the traits Debug and Display for secret material types in order to avoid leaking their internals.

Motivation

Secret materials, private keys for example, are types that are expected to remain private and under no circumstances be shared or leaked. However, these types may be wrapped in higher level types, a wallet for example, that may want to implement Debug and/or Display; if the secret types did not implement these traits, the wrapping types could not derive them either. At the same time, secret materials should not be exposed by logging them. The consensus is then to implement Debug and Display but to not actually log the inner secret. To make things easier, two procedural macros are proposed to automatically derive such implementations.

Detailed design

Crate

Procedural macros need to be defined in a dedicated type of crate, so a new crate has to be created with the following in its Cargo.toml:

[lib]
proc-macro = true

Derivation

This feature is expected to be used by deriving SecretDebug and/or SecretDisplay.


#![allow(unused)]
fn main() {
#[derive(SecretDebug, SecretDisplay)]
struct PrivateKey {
    ...
}
}

Output

When an attempt to display or debug a secret type is made, a generic message should be displayed instead. This RFC proposes <Omitted secret>.
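For a concrete picture of the intended behaviour, the code that SecretDebug is expected to generate is equivalent to the following hand-written implementation (a sketch; the type name is illustrative):

```rust
// A hypothetical secret type; the inner bytes must never be printed.
struct PrivateKey([u8; 32]);

// Equivalent of what #[derive(SecretDebug)] is expected to generate.
impl std::fmt::Debug for PrivateKey {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "<Omitted secret>")
    }
}

fn main() {
    let key = PrivateKey([0u8; 32]);
    // Prints the generic message instead of the key bytes.
    println!("{:?}", key);
}
```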

Implementation

This feature makes use of the syn and quote crates to create derivation macros.

Implementation of SecretDebug:


#![allow(unused)]
fn main() {
#[proc_macro_derive(SecretDebug)]
pub fn derive_secret_debug(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    // Parse the input tokens into a syntax tree.
    let input = parse_macro_input!(input as DeriveInput);
    // Used in the quasi-quotation below as `#name`.
    let name = input.ident;
    // Get the different implementation elements from the input.
    let (impl_generics, ty_generics, _) = input.generics.split_for_impl();

    let expanded = quote! {
        // The generated implementation.
        impl #impl_generics std::fmt::Debug for #name #ty_generics {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                write!(f, "<Omitted secret>")
            }
        }
    };

    // Hand the output tokens back to the compiler.
    expanded.into()
}
}

Implementation of SecretDisplay is similar.

Drawbacks

Users may actually want to debug/display their secret types. With this feature in place, they have to rely on APIs providing getters to the inner secret and/or serialization primitives.

Rationale and alternatives

By providing macros, we offer a consistent way to approach the problem and potentially reduce the risk of users leaking their secret materials by accident.

Unresolved questions

No questions at the moment.

Summary

This RFC does not really introduce a feature but rather a recommendation when dealing with sensitive data: their memory should be securely erased when it is no longer used.

Motivation

When dealing with sensitive data like key materials (e.g. seeds, private keys, ...), it is wise to explicitly clear their memory to avoid having them continue existing and potentially exposing them to an attack.

Since such an attack already assumes unauthorized access to the data, this technique is merely an exploit mitigation by ensuring sensitive data is no longer available.


Detailed design

This task is made more complex by compilers, which tend to remove zeroing of memory they deem pointless when optimizing. It is therefore not as trivial as simply resetting the memory to 0.

These optimizations can, however, be bypassed by using volatile memory.
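A minimal sketch of this idea using volatile writes from the standard library follows; note that the zeroize crate implements this properly, including compiler fences and support for many types, and should be preferred in practice:

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

// Zero a buffer with volatile writes so the compiler cannot elide them.
fn zero_memory(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // A volatile write is guaranteed not to be optimized away.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Prevent the compiler from reordering later accesses before the zeroing.
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let mut secret = *b"super secret key";
    zero_memory(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
    println!("zeroed");
}
```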

Zeroize and Drop

This RFC recommends:

  • implementing the following traits on sensitive data types:
    • Zeroize, a simple trait to securely zero memory, built on stable Rust primitives which guarantee the operation will not be "optimized away";
    • Drop, which runs some code when a value goes out of scope (sometimes called a 'destructor'), making it call self.zeroize();
  • having Zeroize and Drop as requirements of sensitive data traits;

Examples

Traits requirements:


#![allow(unused)]
fn main() {
trait PrivateKey: Zeroize + Drop {
    ...
}
}

If the type is trivial, Zeroize and Drop can be derived:


#![allow(unused)]
fn main() {
#[derive(Zeroize)]
#[zeroize(drop)]
struct ExamplePrivateKey([u8; 32]);
}

Otherwise, Zeroize and Drop need to be manually implemented.

  • Zeroize implementation:

    
    #![allow(unused)]
    fn main() {
    impl Zeroize for ExamplePrivateKey {
        fn zeroize(&mut self) {
            ...
        }
    }
    }
    
  • Drop implementation:

    
    #![allow(unused)]
    fn main() {
    impl Drop for ExamplePrivateKey {
        fn drop(&mut self) {
            self.zeroize()
        }
    }
    }
    

    A derive macro SecretDrop could be written to derive the Drop implementation, which is always expected to be the same.


Drawbacks

There are no major drawbacks to doing this, apart from the small overhead it comes with.

Rationale and alternatives

The alternative is to not recommend or enforce anything and risk users exposing their sensitive data to attacks.

Unresolved questions

Even though this technique uses volatile memory and is probably effective most of the time, the compiler is still free to move all of the inner data to another part of memory without zeroing the original, since it does not consider the value to be dropped yet. A solution to prevent the compiler from moving things, whenever possible, would be to box secrets and access them through Pin.