Octavia is a secure, decentralized network filesystem. Although there are already many network filesystems, such as the Common Internet File System (CIFS, the protocol that Samba implements), Sun’s Network File System, and Apple Filing Protocol, they are unsafe for use on the wider Internet. (In fact, they are unsafe for all but the smallest, physically private networks, as well.) By contrast, Octavia aims to provide specific security and reliability guarantees while still providing good performance and usability.
In particular, most existing network filesystems rely on a single central file server computer to serve data to many client computers. This centralization is both a strength and a weakness: while it is easy to understand and develop a centralized system, it proves brittle in practice. If the server becomes overloaded or crashes, clients see degraded performance or lose access to their data entirely. It is often (but not necessarily) the case that people must place complete trust in the central server as well: if the server is compromised, people can no longer be certain that their data has not been accessed by the attacker.
The Tahoe-LAFS filesystem most closely resembles Octavia: it shares Octavia’s goals of security and reliability. Octavia aims to explore different design alternatives in the same problem space. For other significant work in this area, perhaps most notably including SFS, see the references page.
Octavia provides four crucial security guarantees. We currently know of no network filesystem besides Tahoe-LAFS that claims to provide these guarantees.
Confidentiality. Before sending data to the Octavia server, the Octavia client encrypts the data in such a way that only the client itself can decrypt it. The server cannot decrypt the data, nor can any other user. The client only ever sends the encrypted form of the data over the internet. The confidentiality of the client’s data is thus protected from servers and from eavesdroppers on the internet.
The encryption function is AES-256 in cipher block chaining (CBC) mode with a cryptographically random key and a unique, cryptographically random initialization vector (IV) created for each encryption operation. Each directory of files has its own cryptographically random key.
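The key and IV handling described above can be sketched as follows. This is a minimal illustration, not Octavia's actual code: the `DirectoryKeys` class, the `fresh_iv` function, and the in-memory dictionary are hypothetical names chosen for the example. The AES-256-CBC call itself is left out because Python's standard library has no AES implementation; only the key and IV generation is shown.

```python
import secrets

AES_256_KEY_BYTES = 32  # AES-256 uses a 256-bit key
AES_BLOCK_BYTES = 16    # a CBC IV is one AES block (128 bits)


class DirectoryKeys:
    """Hypothetical in-memory key store: one random key per directory."""

    def __init__(self):
        self._keys = {}

    def key_for(self, directory: str) -> bytes:
        # Generate the directory's key on first use, then reuse it.
        if directory not in self._keys:
            self._keys[directory] = secrets.token_bytes(AES_256_KEY_BYTES)
        return self._keys[directory]


def fresh_iv() -> bytes:
    # A new cryptographically random IV for every encryption operation.
    return secrets.token_bytes(AES_BLOCK_BYTES)
```

A client following this sketch would call `key_for` once per directory and `fresh_iv` once per encryption, then store the IV alongside the ciphertext so that it can decrypt later.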
Integrity. In addition to encrypting the data before storing it on the server, the client also computes a cryptographic message authentication code (MAC) over the data. The MAC acts as a symmetric signature: it is created with a secret signing key that only the client and the server share, and it cannot be computed or verified without that key. Thus, the server and the client can verify that the data has not been damaged in transit over the network, and that only the true client or server could have signed it.
The MAC function is HMAC-SHA-512, again with a cryptographically random symmetric key. The key is pre-shared, out of band, between client and server using a server-defined protocol.
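As a rough sketch of the MAC step, the following uses Python's standard `hmac` and `hashlib` modules. The function names `make_tag` and `verify_tag`, the 64-byte key length, and the sample payload are assumptions made for illustration; the text does not specify Octavia's wire format or key size.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA-512 over the (already encrypted) data.
    return hmac.new(key, data, hashlib.sha512).digest()


def verify_tag(key: bytes, data: bytes, tag: bytes) -> bool:
    # compare_digest runs in constant time, which avoids leaking
    # how many leading bytes of a forged tag were correct.
    return hmac.compare_digest(make_tag(key, data), tag)


# Assumed for the example: a random key shared out of band.
shared_key = secrets.token_bytes(64)
ciphertext = b"encrypted segment bytes"
tag = make_tag(shared_key, ciphertext)
```

The receiver recomputes the tag with its copy of the key and rejects the message if `verify_tag` returns `False`, which covers both accidental damage and tampering by anyone without the key.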
When reading data from servers, clients ask for segments of data identified by a cryptographic hash of the data itself. When the servers respond, clients check that the hash is correct by re-computing the hash for the received data. It is (believed to be) computationally infeasible to forge these hashes or to find two segments with the same hash value, so clients can trust that servers have sent the correct data and that no attacker has damaged the data in transit.
The cryptographic hash is SHA-512d.
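The read-side check can be sketched as below, assuming that SHA-512d denotes SHA-512 applied twice (the usual meaning of the "d" suffix). The `fetch_segment` function and the dictionary standing in for a server are hypothetical; a real client would fetch the bytes over the network.

```python
import hashlib


def sha512d(data: bytes) -> bytes:
    # Assumption: SHA-512d = SHA-512(SHA-512(data)).
    return hashlib.sha512(hashlib.sha512(data).digest()).digest()


def fetch_segment(store: dict, segment_id: bytes) -> bytes:
    """Fetch a segment by its hash and verify it before returning it."""
    data = store[segment_id]  # stand-in for a network request
    if sha512d(data) != segment_id:
        raise ValueError("segment does not match the requested hash")
    return data
```

Because the identifier is derived from the content, the client needs no extra metadata to detect a wrong or corrupted response.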
Authentication. Since the client and the server share the signing key, they can verify that a signed block of data could only have been signed by a computer that has the secret key installed. Octavia uses this feature to enable servers to ensure that incoming data are sent by authorized clients, and to enable clients to know that the true server received the client's request.
Servers can implement essentially arbitrary policies and arbitrary client registration protocols. They can leverage centralized authentication, authorization, and accounting (AAA) systems like Active Directory, they can accept anonymous clients, they can enforce per-client quotas, and so on. Clients can interact transparently with many servers implementing many policies; in fact, clients generally have no understanding of server policy at all, except indirectly.
Availability. Octavia clients can store copies of their data on many servers. Ideally, people will configure their clients to store copies of their data on servers that are distributed in many different locations, thus providing protection against various types of calamity: bad weather, natural disasters, power outages, human errors, etc.
Octavia simply copies the data in full to many servers, making no use of clever techniques such as erasure coding to provide efficient redundancy. Although space-inefficient and not as reliable for the money as erasure coding, this approach is time-efficient, simple to implement and to understand, requires only simple configuration, and maximizes read parallelism (any valid response from any server satisfies the request).
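The whole-copy replication and parallel read described above can be sketched like this. Everything here is illustrative: `InMemoryServer`, `store_everywhere`, and `read_from_any` are hypothetical names, the servers are plain in-memory objects rather than network peers, and the segment identifier is assumed to be SHA-512 applied twice.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, as_completed


def segment_id_for(data: bytes) -> bytes:
    # Assumed identifier: SHA-512 applied twice to the segment.
    return hashlib.sha512(hashlib.sha512(data).digest()).digest()


class InMemoryServer:
    """Hypothetical stand-in for a remote Octavia server."""

    def __init__(self):
        self.segments = {}

    def put(self, segment_id: bytes, data: bytes) -> None:
        self.segments[segment_id] = data

    def get(self, segment_id: bytes) -> bytes:
        return self.segments[segment_id]  # KeyError if missing


def store_everywhere(servers, data: bytes) -> bytes:
    # Full replication: every server receives a complete copy.
    segment_id = segment_id_for(data)
    for server in servers:
        server.put(segment_id, data)
    return segment_id


def read_from_any(servers, segment_id: bytes) -> bytes:
    # Ask all servers in parallel; accept the first response that verifies.
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = [pool.submit(s.get, segment_id) for s in servers]
        for future in as_completed(futures):
            try:
                data = future.result()
            except Exception:
                continue  # this server failed or lacks the segment
            if segment_id_for(data) == segment_id:
                return data
    raise LookupError("no server returned a valid copy")
```

In this sketch a read succeeds as long as one server returns an intact copy, which is the property the full-copy design trades storage space for.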
Octavia is designed to be deployed on the internet. Since the internet has no centralized authority, Octavia does not rely on one. Octavia clients and servers can associate and disassociate on an ad hoc basis. Clients don’t depend on any particular server, and servers are thus free to implement any policy toward clients they choose. Clients can work with servers on any network, in any location, implementing any policy, in any combination.
Octavia also scales down to intranets and LANs. In a sense, these centrally-managed networks are special cases of the general, decentralized case: the same mechanisms that work to associate clients and servers in a decentralized manner also work in a centrally-managed deployment.