To answer the second part first: FTP uses TCP because the spec says it should (File Transfer Functions, sec. 4, and Connection Establishment, sec. 8, in File Transfer Protocol (FTP), RFC 959).
The answers, approximately verbatim, and in no particular order:
The advantages of TCP when implementing FTP are,
The disadvantages of TCP when implementing FTP are,
The advantages of UDP when implementing FTP are,
The disadvantages of UDP when implementing FTP are,
1: Transformation? Perhaps you meant “transmission”?
2: What does “interoperability” mean? Are you saying that UDP wouldn't provide interoperability?
3: Independently of what?
4: Doesn't UDP establish internetworking too?
5: More than what?
6: What are some of the intricacies? What kind of management? Is this at the end-points or in the network?
7: Less framework than what? Why is this important to FTP?
8: Good. You compared X to something, and included what X is being compared to.
9: Less latency than what?
10: What is application flexibility? Why is it important to FTP?
11: How do you have reliability when not transmitting data?
12: What kind of communication is lacking? Does this mean communication is unreliable?
UDP is a connectionless protocol which means it sends data immediately. That makes it a fast protocol.13 So, it is used with applications that require fast transmission of data. For instance, it is used for DNS, DHCP, and real time communication. Some UDP disadvantages are reliability which means it is possible to not receive data. It also sends data not ordered and data reaches in random order. (Vivek Gite, Cyberciti.biz, 15 May 2007. Web. 18 Sep. 2012.)
On the other hand, TCP is a connection oriented protocol and that means it tests and checks connection14 then start sending data, and it guarantees that all data will reach the destination. It also sends data ordered. For that, TCP is a reliable protocol, and it corrects mistakes and re-sends corrupted data again. So it is used for HTTPS, SMTP, and FTP. However, that makes TCP slower than UDP when sending data. (Erik Rodriguez, Skullbox.net, 26 July 2011. Web. 18 Sep. 2012.)
FTP uses TCP protocol.
13: Fast by what measure? Throughput? Delay? Low host overhead?
14: Test and checks the connection in what way? Does it do resource allocation?
15: Is this true? What kind of predictions are made? By what mechanism are predictions made?
16: Isn't that true for UDP too? Did you mean “reliably sending the data”?
17: Isn't that true for any network protocol?
18: Is multicast or broadcast for FTP?
19: Faster in what sense? Faster raw throughput? Faster usable throughput? Under what conditions?
http://www.bleepingcomputer.com/tutorials/TCP-and-UDP-ports-explained/
Section: “The two Internet workhorses: UDP and TCP”
TCP stands for Transmission Control Protocol. Using this method, the computer sending the data connects directly to the computer it is sending the data it to, and stays connected for the duration of the transfer. With this method, the two computers can guarantee that the data has arrived safely and correctly, and then they disconnect the connection. This method of transferring data tends to be quicker and more reliable, but puts a higher load on the computer as it has to monitor the connection and the data going across it. A real life comparison to this method would be to pick up the phone and call a friend. You have a conversation and when it is over, you both hang up, releasing the connection.
UDP stands for User Datagram Protocol. Using this method, the computer sending the data packages the information into a nice little package and releases it into the network with the hopes that it will get to the right place. What this means is that UDP does not connect directly to the receiving computer like TCP does, but rather sends the data out and relies on the devices in between the sending computer and the receiving computer to get the data where it is supposed to go properly. This method of transmission does not provide any guarantee that the data you send will ever reach its destination. On the other hand, this method of transmission has a very low overhead and is therefore very popular to use for services that are not that important to work on the first try. A comparison you can use for this method is the plain old US Postal Service. You place your mail in the mailbox and hope the Postal Service will get it to the proper location. Most of the time they do, but sometimes it gets lost along the way.
…FTP servers use TCP ports 20 and 21 to send and receive information, so you won't have any conflicts with the web server running on TCP port 80.20
20: Ugh, don't do this. Cutting and pasting is a complete waste of time: you learn nothing from it, and I can look it up from the citations if I need to.
HTTP://www.computer-networks.blurtit.com, The Advantages Of Using TCP Over UDP).
HTTP://tcpipguide.com, FTP overview).
HTTP://tcpipguide.com, FTP overview).
HTTP://www.en.wikipedia.org, TCP).
www.diffen.com, TCP vs. UDP).
HTTP://www.diffen.com, TCP vs. UDP).
HTTP://www.en.wikipedia.org, UDP).
21: Is it the case that UDP doesn't support multiple data types and file types? What are data and file types?
Advantages of TCP when implementing FTP are:
Disadvantages of TCP when implementing FTP are
Advantages of UDP when implementing FTP are
Disadvantages of UDP when implementing FTP are
22: Is this something important to FTP?
23: If UDP doesn't establish connections between end-points, how can it support broadcast or multicast? What does “support” mean?
24: Is that what the application's sending, IP datagrams?
25: Can't UDP, or IP for that matter, do the same thing?
26: Scalable in what sense?
27: Complex for which side? Complex in what way?
28: Does not preserve sequences of what?
The applications associated with FTP29 require all the data to be received in correct order. It is fairly simple that TCP provides this service and that's why FTP uses TCP, and not UDP.
In TCP lost packets are resent and thus it is reliable. It guarantees efficient delivery.30 TCP guarantees 3 things:
It has good throughput on a modem or LAN.31
Disadvantages of UDP
A packet may not be delivered in order or may be duplicated. And you get no indication that its been delivered or not unless the listening one says something.
UDP has no flow control.
No retransmission if data collides.
Disadvantages of TCP
Advantages of UDP while implementing FTP in it.
29: FTP is the application being considered. What are the applications associated with FTP?
30: Efficient in what sense? Efficient at the end-points? In the operating system? Over the network?
31: Why is throughput across a modem interesting?
32: So? Why is that important to FTP?
33: Does FTP care at all about broadcast or multicast?
'\n') is one character and appears at the end of every line, and 3) the encoded file contains only lines from the original file, there is no extra header or footer information added.
Base64 encodes a source sequence of three bytes as a result sequence of four bytes by encoding successive groups of six bits from the source sequence as a byte (eight bits) in the result sequence. If necessary, the source sequence is padded (appended at the end) with one or two null (all zero) bytes to make the sequence size in bytes evenly divisible by three, and each result byte that encodes only padding is replaced by a distinguished pad byte value (which must exist because the encoded representation uses only 64 of the possible 256 values available), so the number of pad bytes in the result equals the number of null bytes appended to the source sequence.
Source: Base 64 Encoding (section 4) in Base-N Encodings (rfc 4648).
A 5,000-byte sequence requires one pad byte to be divisible by three, and the Base64-encoded result sequence contains (5001/3)*4 = 1667*4 = 6668 bytes, the last of which is the pad byte. The result sequence divides into ceiling(6668/79) = 85 lines of at most 79 characters, each ending in a newline, and the total size of the result sequence is 6668 + 85 = 6753 bytes.
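The arithmetic can be checked with a short sketch built on Python's standard `base64` module. The function name `base64_file_size` is hypothetical, and the layout of at most 79 encoded characters followed by one newline per line is an assumption taken from this answer, not something RFC 4648 requires.

```python
import base64
import math

def base64_file_size(n_bytes, chars_per_line=79):
    """Return (encoded_chars, lines, total_bytes) for an n_bytes input.

    Assumes each output line holds at most chars_per_line encoded
    characters and ends with a single newline character.
    """
    # The input content does not affect the encoded length, so zeros will do.
    encoded = base64.b64encode(bytes(n_bytes))
    encoded_chars = len(encoded)            # includes any '=' pad characters
    lines = math.ceil(encoded_chars / chars_per_line)
    total_bytes = encoded_chars + lines     # one newline per line
    return encoded_chars, lines, total_bytes

chars, lines, total = base64_file_size(5000)
print(chars, lines, total)
```

Encoding real bytes rather than evaluating a formula keeps the padding rule out of the hand calculation: the library decides how many pad characters appear, and the line and newline counts follow from the encoded length.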
The answers, approximately verbatim, and in no particular order:
Input file size = 5000 bytes
Base64 | = | bytes + 2 [1] - ((bytes + 2) MOD 3)/3 * 4 |
       | = | 5000 + 2 - ((5000 + 2) % 3) / 3 * 4 |
       | = | 6668 bytes = 6.51172 KB [2] |
HTTP://www.obviex.com, How to Calculate the Size of Encrypted Data).
1: Why “+ 2”? What does it represent? Could it be related to padding? How?
2: Why translate from exact character units to approximate kilobyte units?
3: Does this include the newlines added to the encoded result?
The file size will be: 6668 Byte. [8]
Base64 = (File_size + 2 [5] - ((File_size + 2) MOD 3))/3 [6] * 4 [7]
Base64 = (5000 + 2 - ((5000 + 2) MOD 3))/3 * 4
4: Encoded, not encrypted, although the information is obscured after being run through Base64.
5: Why 2? What does it represent?
6: Anything mod 3 will be less than 3, and dividing anything mod 3 by 3 makes it less than 1. What does this value represent?
7: Is padding included somewhere in this equation? Is that what the + 2 is for?
8: Where does this calculation account for the newlines added to the encoded result?
9: Approximately? Why approximately? Why not exactly?
10: 137%? How did you get that number?
11: Again, why approximately?
12: Given assumption 1 in the question, is it always true that a line has 80 characters?
13: What's the answer? Is it 85 lines?
14: This is an unhelpful citation. Where would I go if I wanted to check your answer?
Output size | ((input_size - 1)/3)*4 + 4 [15] |
            | ((5000 - 1)/3)*4 + 4 = 6669 |
final size  | output_size + (output_size/80)*2 [16] |
            | 6669 + (6669/80)*2 = 6835.725 bytes [17] |
http://stackoverflow.com/questions/1533113/calculate-the-size-to-a-base-64-encoded-message; answer by kanaka.
15: Why does this formula subtract one from the input size? And what does that final “+ 4” represent?
16: What does multiplying by 2 represent?
17: Fractional bytes? There are bits left over? (Note that 5/8 = 0.625 < 0.725 < 0.75 = 6/8, so there are fractional bits too.) Is that how Base64 encoding works?
4[n/3]
(4[5000/3])/80 [18]
(4[1666.67])/80 [19]
6666.68/80 [20]
83.33 [21]
18: Is there any padding being described by this equation? Does Base64 do any padding?
19: Are the added newlines accounted for by this equation?
20: What units are 6666.68 in?
21: Units?
Base64 takes 3 bytes at one time and converts them to 4 Base64 characters. A 5,000 byte file would consist of 6666.67 characters after the conversion (5000 bytes/3 bytes per group = 1666.67 groups of 3 bytes. [22] 1666.67 groups * 4 characters per group = 6666.67 characters). 6667 characters (rounded up) [23] will require 84.39 lines in the file (6667/79 characters per line, \n is the 80th character on each line). Final answer: The file will be 85 lines long to sufficiently contain the 5,000 byte file encoded in Base64. [24]
http://email.about.com/cs/standards/a/base64_encoding.htm, section Base64 to the rescue (explanation and example).
22: How is it Base64 produces fractional characters?
23: How are characters rounded up?
24: But how long is a line?
The base64 converts 3 bytes of data into 4 characters. So 5000 bytes of data is converted into
\[\frac{5000 \times 4}{3} = 6666.66 = 6667\] characters [25]
Given that each line has 80 characters. So clearly number of lines = 6667/80 = 83.3 lines.
i.e. Total no. of lines = 84 [26]
25: How does the rounding up go? What extra data is added to produce an integer?
26: But how long is a line?
27: What kind of unit is characters? How many bits does it have?
28: Fractional characters? What do they look like?
29: Is that true? Is that what the problem states? (Hint: no)
Zipf's law was originally an observation about the relation between frequency and rank ordering in English words: the frequency of the ith most frequently occurring word is proportional to 1/i. If Zipf's law holds, the most frequent word (rank 1) has a frequency that's twice the frequency of the second most frequent word, three times the frequency of the third most frequent word, and so on.
A server offering a population of objects for which Zipf's law holds can exploit the law by creating a static, two-level cache. A two-level cache is sufficient because Zipf's law divides the population into popular (frequent) and unpopular (infrequent) objects, and the popular objects are much more popular than the unpopular objects. The cache can be static because accessing an unpopular object can be considered a rare event, and moving the object into the popular cache would be a waste of time because it's unlikely to be accessed again soon.
Source: HTTP://xlinux.nist.gov/dads/HTML/zipfslaw.html
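To see why a small static cache of popular objects can pay off, the following sketch computes the share of requests that land on the top-ranked objects when request frequency is exactly proportional to 1/i. The function names and the population and cache sizes (1,000 objects, 10 cached) are illustrative assumptions, not values from the question.

```python
def zipf_probabilities(n_objects):
    """Request probability for ranks 1..n_objects when frequency is proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def static_cache_hit_rate(n_objects, cache_size):
    """Fraction of requests served by a static cache holding the top cache_size ranks."""
    probs = zipf_probabilities(n_objects)
    return sum(probs[:cache_size])

probs = zipf_probabilities(1000)
print(probs[0] / probs[1])               # rank 1 compared with rank 2
print(static_cache_hit_rate(1000, 10))   # share of requests hitting the top 1%
```

Under these assumptions, caching one percent of the objects serves well over a third of the requests, which is the skew the two-level design relies on.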
The answers, approximately verbatim, and in no particular order:
HTTP://www.linkage.rockefeller.edu, introduction to Zipf's law).
## A File server can exploit Zipf's law to improve performance by identifying the most popular requested files on itself, and it can also tell how many times these files have been downloaded, modified, or even opened.1 So, the file server can make a prior access to these files and put them on the top of its files’ list. Thus, this process will improve the file server performance because it is going to speed up the access speed to the requested file.
## As mentioned previously, Zipf's law will be beneficial to identify what or where the popularity is and gives it the priority to be accessed; for example, Zipf's law can give a hand of help with search engines, etc.
1: What in particular about Zipf's law improves performance over the usual cache operation?
2: Always answer the question. Even a lame answer can earn some points, but no answer can only earn zero points.
Zipf's law states that the probability of occurrence of words or other items starts from high occurrence and then reduces off. Thus, many items occur rarely while a few occur very often. In other words, the frequency of occurrence of any word is inversely proportional to its rank in the frequency occurrences table.
Formula: P_x = 1/x^a, where P_x is the frequency of occurrence of the xth ranked item and a is close to 1.
A file server would exploit a Zipf's law to improve the performance by enhancing the performance of the cache i.e. this property leads to effective web caches, which contain the most popular objects and typically employ the least frequently used replacement policy due to which the server often achieves higher cache hit rates.
The expected benefits of exploiting the Zipf's law are:
3: Is this a property of Zipf's law, or of caching?
4: Ok, but what does this have to do with Zipf's law?
Zipf's Law (refers to George K. Zipf) describes the incidence of distinct objects5 in special sorts of collections.6 (Aaron Krowne, Planetmath.org. Version 4. Web. 18 Sep 2012.)
Server exploits Zipf's law can improve sorting and delivering data methods according to popularity and importantly of data.7 As a result, that can decrease access time to contents in the server which saves time and resources.
5: What does the description say about the incidence of distinct objects? Is there any particular relation described?
6: What's special about the collections?
7: Ok, but how might that be possible? What is it about Zipf's law that makes this possible?
Zipf's law: Zipf's law states that the relative probability of a request for the i'th most popular page is inversely proportional to ‘i’. It specifies that popularity objects are ranked according to their popularity, then the probability that the user chooses the ‘m’th item on the list is 1/m.
File server exploits Zipf's law by searching the request8 based on least occurrence of words by removing highly ranked words.9
8: Perhaps that phrase should have read “searching for the request”?
9: If I'm understanding the answer correctly, it's suggesting that commonly occurring words occur in too many results to be useful and should be thrown away in favor of less commonly occurring words. Ok, but what does that have to do with Zipf's law? Zipf's law relates frequency and rank ordering in a particular way, and this answer doesn't exploit that relation.
Zipf's law outlines how often individual objects in a set will occur. The frequency of an object in a set is inversely proportional to its overall frequency.10 A file server could use Zipf's law to stack the cache with more frequently used files.11 This act would increase the speed of transfers using this server.
Research Source: HTTP://planetmath.org/ZipfsLaw.html, section: Zipf's law (explanation, formulae, graph).
10: Isn't it more like the probability of occurrence is inversely proportional to the rank?
11: Doesn't replacement cache do this anyway? How does Zipf's law help make this better?
12: That's all Zipf's law states? That the most common words have a high rate of occurrence? Isn't that tautological?
13: What kind of information? Information (that is, queries) that comes from the clients, or information (that is files) that comes from the server? Or maybe both.
14: What kind of patterns?
15: What does repeating information handling involve?
The most frequently occurring words are ranked in increasing order of their occurrence. So most frequently occurring word is ranked 1. [16]
These most frequently occurring words are non-content words and the least frequently occurring words i.e. with high rank have more content in it.17
The file server exploits Zipf's law to improve performance by “effective caching”. This property provides an important tool in designing architectures of web caching. Zipf's law helps in selecting which objects to cache.18 In this case it uses Zipf's law to select content objects to cache to improve its performance. Popularity of a video file can be calculated using Zipf's law.19
16: Is this all Zipf's law says?
17: Is Zipf's law concerned with meaning (word content), or popularity?
18: How does Zipf's law help? How is it used to select objects for caching?
19: Is that what Zipf's law is concerned about? Isn't popularity determined by how many times an object is used?
A 640 × 420 frame has 268,800 pixels; at 16 bits of color per pixel, a frame has 4,300,800 bits. 50 frames per second is equivalent to 215,040,000 bits per second. Unencoded video transmission requires a little under a quarter gigabit per second.
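The same calculation as a minimal sketch, with units carried in the names so each step can be checked. The function name `raw_video_bitrate` is hypothetical.

```python
def raw_video_bitrate(width_px, height_px, bits_per_pixel, frames_per_second):
    """Bits per second needed to send uncompressed frames."""
    pixels_per_frame = width_px * height_px                 # pixels/frame
    bits_per_frame = pixels_per_frame * bits_per_pixel      # bits/frame
    return bits_per_frame * frames_per_second               # bits/second

bits_per_second = raw_video_bitrate(640, 420, 16, 50)
print(bits_per_second, "bits/second")
print(bits_per_second / 1e9, "gigabits/second (decimal giga)")
```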
The answers, approximately verbatim, and in no particular order:
Required Bandwidth = Frames/sec × Resolution × Color Depth [1]
Required Bandwidth = 50 × 640 × 420 × 16 = 215,040,000 Bits/second = 205.07 Megabits/second.
1: Do the units carry through in this equation?
Now, a video file contains 640*420 pixel frames = 268800 pixels
Bits of color per pixel = 16
display rate = 50 frames/sec.
Hence,
The data in one frame = no. of pixels * no. of bits/pixel = 268800*16 = 4300800 bits
The no of bits displayed in one second = display rate * data in one frame = 50*4300800 = 215040000 bits = 215 Mbps
So 215 Mbps bandwidth is required to stream this video file.
A video file with 640 × 420 pixel frames would normally have a size of 268,800 bytes; however, since the file uses 16 bits of color per pixel, this number is doubled (8 bits per byte normally) to 537,600 bytes. The file for this problem requires a frame rate of 50 frames per second, so the bandwidth that is needed to stream this video is 537,600 bytes multiplied by 50 frames/second, yielding a final answer of 26,800,000 bytes/sec.
Source: HTTP://www.pk3.org/Astro/index.htm?astrophoto_vesta_pro.htm
(Example) The reason of above facts is limited throughput of USB. Vesta Pro is using YUV420 codec, which requires 12 bits per pixel. That means, that, for example, 640x480 pixels frame has size 460800 bytes. For 5 fps video stream it requires 2304000 bytes/s - it is more than throughput of USB. That's why there must be used some compression of video data, which are sent through USB. As my measurements confirmed, the compression is lossy.2
2: Really, don't do any quoting; just the citation will be enough.
3: Ok, this was my mistake.
15KB [4] × 10 bits/byte [5] = 150; 150 × 50 frames/sec = 7500 Kbits/sec
http://www.imakenews.com/kin2/e_article000345313.cfm?x=b11,0,2
4: From where did 15KB come? What does it represent?
5: What does 10 bits/byte represent?
Bandwidth * (16 bits) * (50 frames/sec)
215 040 000 bits [6]
6: Do the units carry through to bits?
Total no. of pixels is 640 × 420 = 268,800.
Data in one frame is 268,800 × 16 = 4,300,800 bits.
Given display rate is 50 frames/sec.
So in one second no of bits displayed = 50 × 4,300,800 = 215,040,000 bits = 215 mbits.
So clearly bandwidth required to stream this video is 215 mbps.
Some transfer modes used by FTP require that the server close the data connection to indicate end-of-file.
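The end-of-file convention can be illustrated with a minimal sketch. It is not an FTP implementation: `socket.socketpair()` stands in for the TCP data connection, and the function names and payload are hypothetical. The point is that the receiver has no length or terminator to look for, so the only signal that the file is complete is the sender closing the connection.

```python
import socket
import threading

def send_file_stream_mode(conn, payload):
    """Send the bytes, then close: the close itself marks end-of-file."""
    conn.sendall(payload)
    conn.close()

def receive_file_stream_mode(conn):
    """Read until recv() returns b'', which means the peer closed the connection."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    conn.close()
    return b"".join(chunks)

# A connected pair of stream sockets standing in for the data connection.
server_end, client_end = socket.socketpair()
payload = b"file contents " * 1000

sender = threading.Thread(target=send_file_stream_mode, args=(server_end, payload))
sender.start()
received = receive_file_stream_mode(client_end)
sender.join()
print(received == payload)
```

Because the close is consumed as the end-of-file marker, the same connection cannot carry a second file; the next transfer needs a fresh connection.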
Reference: Transmission Modes (section 3.4) from File Transfer Protocol (rfc 959).
The answers, approximately verbatim, and in no particular order:
[picture]
1: Is this an acknowledgement in response to the server making the data connection, or to something else?
2: But why is the connection broken?
3: Even if there is no other data being transferred?
The FTP does not keep the port 20 data connection open for multiple file transfers because it breaks the data connection after every file transmission.
The FTP client initiates a connection [4] with the FTP server on its port 21. Port 21 is where the server is listening for commands issued to it, and in turn, which it will respond to. Hence here the TCP/IP handshake is complete. [5]
At this point, the client begins to listen on its ephemeral port + 1, and sends the PORT N + 1 command to the server on its port 21, i.e., if the ephemeral port in use by the client is 1026, then it would listen on port 1027.
Once this is done the data transfer port (port 20) on the FTP server would initiate a connection to the FTP client's ephemeral port plus 1, as indicated above. This is how an active FTP session is conducted by both the client and server.
Hence the port 20 data connection for multiple file transfers is not open because it breaks for every single connection after every file is transferred.6
4: Which connection? There are two of them: command and data.
5: Is this something that FTP does, or does TCP do the handshake?
6: That's true, but why does FTP break the connection? That's what the question's asking.
Breaking the data connection after every transfer is necessary to avoid confusion between different connections.7
Source: Notes from Networking and Internet Technologies class at Rutgers.
7: Which different connections? Between the control and data connections? How many other connections are there?
An FTP closes the port 20 data connection after each file transfer because it avoids confusion between the data connections. It's possible that old data from previous transfers could still be present, and reestablishing a connection each time helps avoid one transfer absorbing the old data from a previous one. It is also possible that a system could “lock up” because it is waiting for an end-of-transfer message [8] that it might not receive due to “time outs” on the firewall's side. [9] It is safer and more accurate for each transfer to have freshly opened ports.
Source: HTTP://msmvps.com/blogs/alunj/archive/2009/07/13/1300796.aspx
In typical Stream Mode operation, a new data connection is opened and closed for each data transfer, whether that’s an upload, a download, or a directory listing. To avoid confusion between different data connections, and as a recognition of the fact that networks may have old packets shuttling around for some time, these connections need to be distinguishable from one another.
Source: HTTP://www.ncftp.com/ncftpd/doc/misc/ftp_and_firewalls.html
Even if the client program is planning on ending the session, the FTP requires that the client program send a message ("QUIT") to the server indicating that the connection should be closed, and the server is then required to reply with another message indicating that the session is officially closed. The ramifications are that the client program could then lock up waiting for a reply to a "QUIT" message that the server will not receive since the firewall timed-out the session, unbeknownst to both client and server. The solution for this specific case, which some, but not all, FTP client programs do, is to either place a very short time-out on the reply to the "QUIT" message, or to simply close its end of the FTP session (which violates the FTP protocol, but is de facto behavior and is generally accepted).
8: From where does the end-of-transfer message come?
9: Firewall? Why is there a firewall involved?
The justification for the FTP breaking the data connection after every file transfer is that FTP uses port 21 for listening to commands, while port 20 is for receiving data files. The connection is broken so that the FTP can receive the commands about the data [10] before it receives the data, so there is no need to keep it open.
Sources: windosnetworking.com, understanding ftp protocol
10: What does breaking the data connection have to do with receiving commands? Aren't they separate connections?
## There are two FTP modes, and when FTP uses either active or passive mode, there will be a justification for breaking the data connection after every file transfer.
## The Active FTP mode is inefficient way to deal with a multiuser system because if many users made a lot of FTP requests, the system wouldn't be capable to match the all incoming FTP data connections to right users. Consequently, FTP does not keep the port 20 data connection open for multiple file transfers to make sure about the matching the incoming FTP data connection.11 Also firewalls might block connections on this mode.12
## The Passive FTP mode use to solve the firewalls issues that might block some connections. On this mode, the client initiates the data connection from its data port to the specified server data port to avoid blocking connections; as a result, the client opens a new connection port for each connection or every file transfer, so the server cannot deal with all incoming connections. Consequently, FTP does not keep the port 20 data connections open for multiple file transfers to make sure about the matching the incoming FTP data connections.
11: But doesn't the command connection have the same problem? And FTP doesn't close the command connection after each command.
12: If a firewall blocks connections, how can connections be broken?
13: Which rfc? There are over 6,000 of them.