This homework assignment has five questions; answer all of them. This assignment is due no later than 7:00 p.m. on Thursday, 13 December.
If you mail in your assignment, please submit a printable document
— a PostScript .ps
or PDF .pdf
document, for
example — and not a source document — a
Word .docx
or Latex .tex
document, for example.
The small-world topology has a many close-by connections and fewer long-distance connections. A node in a small-world network is either directly connected to, or a few local hops from, a connection to a far-way node.
Attacking the long-distance connections would take less effort than attacking other connections because there are fewer long-distance connections (1). Eliminating long-distance collection would partition the network into a set of disconnected, small-capacity sub-networks (2). Such a disconnected network would have greatly reduced and fragmented storage capacity (3).
The answers, approximately verbatim, and in no particular order:
1: Do small-world peer-to-peer networks have servers?
2: Is Napster based on a small-world topology?
3: A network with that property is connected, but a connected network doesn't necessarily have a small-world topology.
4: If there's a path between any two nodes, there's no “almost:” the network is fully connected.
5: Ok, but how does this attack make use of the small-world topology.
6: True, but the damage done by this attack seems linear in the number of nodes attacked, which doesn't provide large damage for small effort.
The trick to this question is to define what network characteristic exhibits a power-law relation. However, in general, a network exhibiting the power law will have a small percentage (≤ 10%) of x (usually nodes) showing a large percentage (≥ 80%) of some other property y, such as neighbors or stored items.
The query depends on x and y, say x = nodes and y = neighbors. A series of queries to the nodes of each network about their neighbors (1) would identify a small set of nodes with many neighbors in one network that were missing in the other network (2). The identified power-law network would have a few nodes that are good starting places for queries with short expected search paths while the other network would no obvious starting points and correspondingly longer search paths (3).
The answers, approximately verbatim, and in no particular order:
a The rank and out-degree power laws are the most effective in distinguishing different kinds of topologies.
b In a network topology has power-law characteristics then it should satisfy P(K) proportional to k power -y where P(K) is the probability that a node has k edges.12 Results Expected. Results show which network follows power law.2
3 By the expected results we can distinguish two networks.31: What are the queries?
2: How are these results characterized? Will any results do?
3: Distinguish how?
Source: dx.doi.org/10.1145/316194.316229
4: What parameters? How are they held?
5: How would the data be characterized in general?
(source: selfUFL(www.hpl.hp.com))
6: Are Internet routing topologies related to peer-to-peer network topologies, which is what the question asked about?
Assuming a random distribution of files to nodes, the probability that a node has a copy of the file is p = 16/1024 = 1/64 = 0.016, and that it doesn't have a file copy is q = 1 - 0.016 = 0.984. Assuming independence among nodes, which is a poor assumption given peer-to-peer routing algorithms, the probability that the file will be found after i hops is qi - 1p. The number of paths of length i from a particular node is related to the average fan-out f: fi. The expected value for path length is then the sum of fiqi -p for i = 0 to 1024 - 16, assuming a connected network and depth-first search.
The answers, approximately verbatim, and in no particular order:
1: Chances can't be greater than 1, so it's 16/1024.
In structured P2P systems,3 connections among peers in the network are fixed, and peers maintain information about the resources (e.g., shared content) that their neighbor peers possess. Hence, the data queries can be efficiently directed to the neighbor peers that have the desired data, even if the data is extremely rare.4
Unlike structured P2P systems, in unstructured P2P systems, connections among peers in the network are formed arbitrarily in a flat or hierarchical manner. In order to find as may peers that have the desired content as possible, peers in unstructured P2P systems query data based on several techniques such as flooding, random walking, and expanding-ring.52: If the neighbors return many high-quality results, why would they then forward the query to all their neighbors? Isn't the search done?
3: The problem's concerned with unstructured peer-to-peer networks.
4: Can “efficiently” be quantified in terms of the values given in the question?
5: And what should be expected of these techniques given the values in the question?
6: That's the worst case; what's the expected (average) case? And what about networks that aren't linear?
Generally, hash functions are used in peer-to-peer networks to randomize locations, providing wide accessibility and load-balancing. The SHA-1 crack allows for generating collisions at will (1), which could be used to defeat both accessibility and load-balancing as well as node hijacking, depending on how the hash is used (2). Some networks, such as Freenet, that place a premium on privacy and anonymity may find it difficult to maintain those properties using a compromised hash function.
The answers, approximately verbatim, and in no particular order:
1: What?
2: How does a cracked SHA-1 effect this?
Source: news.techworld.com/security/3156/crypto-world-in-panic-as-sha-1-broken/
It is also used in Chord server that has a unique 160-bit Chord node identifier which is produced by SHA-1 hash of the node's IP address.4
Attacking SHA-1 was done by a Chinese research team, and the team had an attack finding collisions in 269 trials.The team has reduced the amount of time needed to find two documents with the same signature by a factor of more than 2000. This attack is described as a collision attack.
(source:www.schnieier.com
, www.dcs.ed.ac.uk
)
3: What is consequence to these uses that SHA-1 has been cracked?
4: What is the consequence to Chord that SHA-1 has been cracked?
Finger tables would not be too useful in classic Gnutella because it doesn't have a meaningful addressing relation between items and nodes. Chord can relate items and nodes, so that an item identified as far away from a particular node n will probably be located in the neighborhood of a node that's also far away from n. In that case, a finger table, which allows for progressively larger jumps in the network, can be exploited to move the search rapidly to a neighborhood that should contain the search item.
Classic Gnutella has no such relation between items and nodes. There's no good reason to suppose that an item can be found in a particular neighborhood, and so there's no good reason to be able to jump directly to that neighborhood.
The answers, approximately verbatim, and in no particular order:
(source: Chord Implementation, www.dcs.ed.ac.uk
)
1: A Chord network, or a Gnutella network?
2: But what is it about ring structure that makes finger tables work?
3: How would adding structure to a Gnutella network make finger tables useful? Is finger-table value based on structure, or some other property?
Source: cs-www.cs.yale.edu/homes/arvind/cs425/doc/gutella-rc.pdf
4: What kind of system is it that should be dynamic and scalable? The finger table?
5: Sub-linear means what?