Understanding Vanish – And Defeating Vanish

Objective: all data and copies vanish after an interval

How

  • Correspondent must get key each time (central control) or
  • Key is stored locally for a while for offline use

Vanish relies on two algorithms:

The first is an encapsulate algorithm that takes a data object D as input and produces a Vanishing Data Object (VDO) as output.

The second is a decapsulate algorithm that accepts a VDO as input and reproduces the original data, with the caveat that decapsulation must be done within a certain time T of the VDO’s creation.

Email with a vanishing data object

Options:

  • Detect and prevent entry, like spam
  • Allow in, but prevent acquisition of keys, through network policy.
  • Allow in, but decode while passing through the gateway
  • Allow in with quarantine & special handling

Is there a duty to preserve it? For e-Discovery?
Would a court consider the unpacked copy equivalent to the original?
To prove that it is equivalent, you’d need the key.

 

 

Distributed Hash Tables

  • Used for many P2P applications; studied academically since 2001
  • Unless refreshed, a DHT times out its entries

How Vanish uses the DHT to encapsulate data:

  • Pick a random symmetric key, K
  • Encrypt the user data locally, yielding C
  • Pick a seed, L, for pseudo-random number generation
  • Use L to generate indices in the hash table, x1..xn
  • Divide the key into pieces k1..kn, where m pieces are needed to compute the key K (Shamir Secret Sharing)
  • Store each piece in the DHT: put(xi, ki) for i = 1 to n
  • Destroy the local copy of the key
  • Send {C, L} to the correspondent
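The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Vanish code: a plain dict stands in for the DHT, `toy_cipher` is a hash-based stand-in for a real symmetric cipher, the shares are computed over a prime field (a common textbook variant), and the defaults n=10, m=7 are made up for the example. All function names are hypothetical.

```python
import hashlib
import random
import secrets

PRIME = 2**521 - 1  # assumed prime field, far larger than a 128-bit key


def split_key(secret: int, n: int, m: int):
    """Shamir split: n points on a random polynomial of degree m-1."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_key(shares) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 keystream. A stand-in only, not for real use."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def indices(seed: int, n: int):
    """Derive the n DHT indices x1..xn from the seed L."""
    rng = random.Random(seed)  # not a cryptographic PRNG; illustration only
    return [rng.getrandbits(160) for _ in range(n)]


def encapsulate(data: bytes, dht: dict, n: int = 10, m: int = 7):
    key = secrets.token_bytes(16)                    # K
    ciphertext = toy_cipher(key, data)               # C
    seed = secrets.randbits(64)                      # L
    shares = split_key(int.from_bytes(key, "big"), n, m)
    for index, share in zip(indices(seed, n), shares):
        dht[index] = share                           # put(xi, ki)
    del key                                          # forget the local key
    return ciphertext, seed                          # {C, L}


def decapsulate(ciphertext: bytes, seed: int, dht: dict, n: int = 10, m: int = 7):
    found = [dht[i] for i in indices(seed, n) if i in dht]
    if len(found) < m:
        raise KeyError("too few shares left in the DHT")
    key = recover_key(found[:m]).to_bytes(16, "big")
    return toy_cipher(key, ciphertext)
```

Calling `encapsulate(b"secret", dht)` on an empty dict leaves ten shares in it and returns the pair to send; `decapsulate` succeeds as long as at least seven of those entries are still present.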

1- A DHT is a class of decentralized distributed system that provides a lookup service similar to a hash table.
2- (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key.
3- Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption.
4- This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
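Point 3 can be seen in a small experiment. The sketch below assumes one particular assignment rule, "a key belongs to the node whose ID is closest by XOR distance", which is one common DHT design and is used here purely for illustration; the node and key names are invented. When a newcomer joins, the only keys that change owner are the ones the newcomer takes over.

```python
import hashlib


def ident(name: str) -> int:
    """Map a name to a 160-bit identifier."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def owner(key_id: int, node_ids):
    """The responsible node is the one closest to the key by XOR distance."""
    return min(node_ids, key=lambda node_id: node_id ^ key_id)


nodes = [ident(f"node-{i}") for i in range(50)]
keys = [ident(f"key-{i}") for i in range(2000)]

before = {k: owner(k, nodes) for k in keys}

newcomer = ident("node-new")
after = {k: owner(k, nodes + [newcomer]) for k in keys}

# Keys whose responsible node changed when the newcomer joined.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Every key in `moved` now belongs to the newcomer; no key is shuffled between two existing nodes.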

Why DHTs

1- They provide an infrastructure on which to build complex services: anycast, webcast, distributed file systems, IM, P2P file sharing, and so on.

Vanish encapsulates data objects so that they “self-destruct” after a specified time, becoming permanently unreadable. It encrypts the data using a randomly generated key and then uses Shamir secret sharing [36] to break the key into n shares where k of them are needed to reconstruct the key. Vanish stores these shares at random indices in a large, pre-existing distributed hash table (DHT), a kind of peer-to-peer network that holds key-value pairs. The encrypted data object together with the list of random indices comprise a “Vanishing Data Object” (VDO).

DHTs have a property that seemingly makes them ideal for this application: they make room for new data by discarding older data after a set time. The DHT policy to age out data is what makes Vanish data vanish. A user in possession of a VDO can retrieve the plaintext prior to the expiration time T by simply reading the secret shares from at least k indices in the DHT and reconstructing the decryption key. When the expiration time passes, the DHT will expunge the stored shares, and, the Vanish authors assert, the information needed to reconstruct the key will be permanently lost.
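The aging-out behaviour is easy to model. Here is a minimal sketch of a store that forgets entries once they are older than a fixed lifetime unless they are stored again. The class name, the explicit `now` argument, and the lifetime used in the example are all assumptions made for illustration and do not describe any real DHT client.

```python
class ExpiringStore:
    """Toy key-value store that drops entries older than ttl_seconds."""

    def __init__(self, ttl_seconds: int):
        self.ttl = ttl_seconds
        self.entries = {}  # index -> (value, time stored)

    def put(self, index, value, now: int):
        # Storing again refreshes the timestamp.
        self.entries[index] = (value, now)

    def get(self, index, now: int):
        item = self.entries.get(index)
        if item is None:
            return None
        value, stored_at = item
        if now - stored_at >= self.ttl:
            del self.entries[index]  # aged out: the share is gone
            return None
        return value


store = ExpiringStore(ttl_seconds=8 * 3600)   # hypothetical lifetime
store.put("x1", b"share-1", now=0)
print(store.get("x1", now=3600))       # still readable after one hour
print(store.get("x1", now=8 * 3600))   # None once the lifetime has passed
```

Time is passed in explicitly so the behaviour is deterministic; a real node would use its own clock.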

 

Attacking Vanish:

We present two Sybil attacks against the current Vanish implementation, which stores its encryption keys in the million-node Vuze BitTorrent DHT. These attacks work by continuously crawling the DHT and saving each stored value before it ages out. They can efficiently recover keys for more than 99% of Vanish messages.

Vanish’s security depends on the assumption that an attacker cannot efficiently extract VDO key shares from the DHT before they expire. Suppose an adversary could continuously crawl the DHT and record a copy of everything that gets stored. Later, if he wished to decrypt a Vanish message, he could simply look up the key shares in his logs. Such an attacker might even run a commercial service, offering to provide the keys for any Vanish message for a fee. Thus, a method of efficiently crawling the DHT enables a major attack against Vanish.
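The "record everything, look it up later" idea can be sketched as follows. The code assumes a node that obeys the protocol outwardly (its live table expires entries) while keeping a private archive that never expires. The class and function names are hypothetical and no real DHT messages are modelled.

```python
class LoggingNode:
    """A DHT participant that secretly keeps everything it is asked to store."""

    def __init__(self):
        self.live = {}      # what the protocol expects the node to hold
        self.archive = {}   # the attacker's private copy, never expired

    def store(self, index, value):
        self.live[index] = value
        self.archive[index] = value

    def expire(self, index):
        # The live entry ages out; the archive is deliberately left alone.
        self.live.pop(index, None)


def lookup_shares(archive: dict, indices, threshold: int):
    """Return the archived shares for a VDO's indices if enough were seen."""
    found = {i: archive[i] for i in indices if i in archive}
    return found if len(found) >= threshold else None
```

Given the indices from a VDO, `lookup_shares` returns the logged shares whenever at least `threshold` of them were observed, even after every live entry has expired.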

Defenses:

There are a number of possible defenses that could be applied to future versions of Vanish and Vuze, including reducing replication, imposing further restrictions on node IDs, and employing client puzzles.

 

Another approach would be to switch from a public DHT, where anyone can serve as a peer, to a privately run system like OpenDHT [35]. Though this would remove the threat from Sybils, the private system would essentially act as a trusted third party, which Vanish was designed to avoid.

 

Vanish’s weaknesses are not only of academic concern: users may already be treating it as a production system and entrusting it with sensitive data.

What the user assumes:

“Why bother to prune my own data,” the user may ask, “if Vanish is doing it for me?”

 

What the Vanish team did:

Vanish stores keys in a distributed hash table (DHT); DHTs erase old data after a period of time to make room for new stores, and Vanish exploits this property to ensure that its keys will expire at a predictable time with no intervention from the user.

 

Get Set Go – Attack:

One way to attack Vanish is with a large Sybil attack against the underlying Vuze DHT. Vuze nodes replicate the data they store to up to 20 neighboring nodes.

Two properties of Vuze’s replication strategy make this easy. First, Vuze replicates values to new clients as soon as they join the network. Second, to ensure resiliency as nodes rapidly join and leave, Vuze nodes replicate the data they know to their neighbors at frequent intervals, usually every 30 minutes.

Suppose the attacker wants an 80% chance of learning each stored share. Our experiments suggest that this would require more than 60,000 Sybils. Although Vuze allows each IP address the attacker owns to participate with up to 65,535 node IDs (one for each UDP port), the attacker may not have sufficient computing resources to maintain so many Sybils concurrently, and the necessary bandwidth might also be prohibitively high.
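To see what a per-share probability means for a whole key, here is a small worked calculation. It assumes that shares are learned independently, and it uses n=10 shares with a threshold of k=7 purely as example parameters; neither assumption comes from the notes above.

```python
from math import comb


def key_recovery_probability(p: float, n: int, k: int) -> float:
    """Chance of learning at least k of n shares, each seen with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# With an 80% chance per share and the assumed n=10, k=7:
print(round(key_recovery_probability(0.8, n=10, k=7), 3))  # 0.879
```

Under these assumptions, an 80% chance per share gives roughly an 88% chance of recovering the key.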

The attacker can do much better by exploiting the fact that he does not need continuous control over such a large fraction of the network. Rather, he need only observe each stored value briefly, at some point during its lifetime.
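A back-of-the-envelope calculation shows why brief observation helps. If each Sybil occupies a node ID only for a short dwell time and then hops to a fresh one, the number of distinct IDs visited during one value's lifetime is the number of concurrent Sybils multiplied by the number of hops. The figures below are hypothetical inputs chosen for the arithmetic, not measurements.

```python
def identities_visited(concurrent_sybils: int, lifetime_min: int, dwell_min: int) -> int:
    """Distinct node IDs occupied while a stored value is still alive."""
    hops = lifetime_min // dwell_min  # ID changes that fit in one lifetime
    return concurrent_sybils * hops


# Hypothetical: 2,000 concurrent Sybils, 480-minute lifetime, 30-minute dwell.
print(identities_visited(2_000, lifetime_min=480, dwell_min=30))  # 32000
```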

 


 

 

Strategy #01 : Simple Hopping Implementation (Unvanish)

  • Unvanish records keys and values it receives from neighboring nodes upon joining the network.
  • Unvanish heuristically discards values that are unlikely to be shares of Vanish encryption keys

Reverse engineering Shamir secret sharing: the current Vanish implementation splits encryption keys using Shamir secret sharing over the integers. As a result, the length of shares can vary significantly, depending on key length, number of shares n, and threshold k.

Unvanish records all values whose length falls within the range that shares can take.
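A length filter of this kind is simple to express. The sketch below is not the Unvanish code: the function names are invented, and the bounds in the example (16 and 51 bytes) are placeholder values that would have to be derived from the actual key length, n, and k.

```python
def looks_like_share(value: bytes, min_len: int, max_len: int) -> bool:
    """Keep a value only if its length could plausibly be a key share."""
    return min_len <= len(value) <= max_len


def filter_candidates(stored: dict, min_len: int, max_len: int) -> dict:
    """Discard stored values whose length rules them out as shares."""
    return {
        index: value
        for index, value in stored.items()
        if looks_like_share(value, min_len, max_len)
    }


# Hypothetical bounds, for illustration only.
seen = {"a": b"x" * 4, "b": b"x" * 20, "c": b"x" * 400}
print(sorted(filter_candidates(seen, min_len=16, max_len=51)))  # ['b']
```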

Unvanish decapsulates VDOs after they have supposedly expired. To minimize the harm to Vanish users, we discard the data we collect from the DHT after one week, though a real attacker could easily keep it indefinitely.

 

Strategy #02 : Advanced Hopping Implementation (ClearView) 

  • ClearView is a from-scratch reimplementation of the Vuze DHT protocol written in 2036 lines of C. It can run many DHT clients in a single process and can maintain several thousand concurrent Sybils on a single EC2 instance.
  • On startup, ClearView bootstraps multiple Vuze nodes in parallel, seeding the bootstrap process with a list of peers gleaned ahead of time from a scan of the network in order to avoid overloading the Vuze DHT root node.
  • ClearView then logs the content of each incoming STORE request for later processing.
  • ClearView reduces the amount of network traffic used in the attack by replying to incoming DHT commands only as necessary to collect stored data.

 

Issue Noticed : During preliminary experiments, we discovered that Sybils remain in the Vuze 

We achieved substantial cost savings by simply configuring the Linux firewall to block outgoing ICMP messages.

 

 
